Scaling in Research

Information about scaling in Research

Published on March 14, 2014

Author: manlymohan


Slide 1: Dr. V. Mohanasundaram, M.B.A., M.Phil., Ph.D., Professor and Head, Vivekanandha Institute of Engineering and Technology for Women, Tiruchengode, Tamilnadu

Slide 3: Measurement and Scaling

Slide 4: Measurement
- To collect data, you need to have something to measure.
- Measurement is the process of assigning numbers or scores to characteristics or attributes of the objects or people of interest.

Slide 5: Accuracy of Measurements
Why do scores on a measurement scale differ?
- A true difference in the characteristic being measured.
- Short-term personal factors (e.g., moods, time constraints).
- Situational factors (e.g., surroundings).
- Variations in the method of administering the survey.
- Sampling of items included in the questionnaire.
- Lack of clarity in the measurement instrument.
- Mechanical or instrument factors causing completion errors.

Slide 6: Measurement Process
- Define concepts to be measured
- Define attributes of the concepts
- Select scale of measurement (data type)
- Generate items/questions
  - Wording
  - Response format
- Layout and design questionnaire
- Pretest and refine

Slide 7: Definition
Scaling is the generation of a broadly defined continuum on which measured objects are located.

Slide 8: Four Basic Scales of Measurement
- Nominal scales
- Ordinal scales
- Interval scales
- Ratio scales

Slide 9: Nominal Scales
Nominal scales require the respondent only to provide some type of descriptor as the raw response.
Example: Please indicate your current marital status.
__ Married
__ Single
__ Single, never married
__ Widowed

Slide 10: Ordinal Scales
Ordinal scales allow the respondent to express “relative magnitude” between the raw responses to a question.
Example: Which one statement best describes your opinion of an Intel PC processor?
__ Higher than AMD’s PC processor
__ About the same as AMD’s PC processor
__ Lower than AMD’s PC processor

Slide 11: Interval Scales
Interval scales demonstrate the absolute differences between each scale point.

Slide 12: Interval scale
How likely are you going to buy a new automobile within the next six months? (Please check the most appropriate category.)
Definitely will not buy ___ 1
Probably will not buy ___ 2
May or may not buy ___ 3
Probably will buy ___ 4
Definitely will buy ___ 5

Slide 13: Ratio Scales
Ratio scales allow for the identification of absolute differences between each scale point, and absolute comparisons between raw responses.
Example 1: Please circle the number of children under 18 years of age currently living in your household.
0 1 2 3 4 5 6 7 (if more than 7, please specify ___.)

Slide 14
Number line: 0 1 2 3 4 5 6 7
Examples: height, weight, age, length, time, income, market share
1. What is your annual income before taxes? $ _______
2. How far is your workplace from home?
_______ kilometres

Slide 15: Levels of Measurements
Four levels of measurement:
- Nominal: measures categories
- Ordinal: categories + rank and order
- Interval: equal distance between any two consecutive measures
- Ratio: intervals + meaningful zeros

Slide 16: Criteria for Scale Selection
- Understanding of the questions
- Discriminatory power of scale descriptors
- Balanced versus unbalanced scales
- Forced or non-forced choice scales
- Desired measure of central tendency and dispersion

Slide 17: A classification of scaling techniques
Scaling techniques
- Comparative scales
  - Paired comparison
  - Rank order
  - Constant sum
  - Others
- Non-comparative scales
  - Continuous rating scales
  - Itemized rating scales
    - Likert
    - Semantic differential
    - Stapel

Slide 18: Types of scaling techniques
- Comparative scales involve the respondent directly comparing stimulus objects, e.g. "How does Pepsi compare with Coke on sweetness?"
- Non-comparative scales have the respondent scale each stimulus object independently of other objects, e.g. "How would you rate the sweetness of Pepsi on a scale of 1 to 10?"

Slide 19: Comparative scales: paired comparison items
If we have brands A, B, C and D, we would have respondents compare:
- A and B
- A and C
- A and D
- B and C
- B and D
- C and D
Usually limited to N < 15.

Slide 20: Comparative scales: paired comparison items
Please indicate which of the following airlines you prefer by circling your more preferred airline in each pair:
Air Canada | WestJet
Air Transat | Air Canada
Horizon Air | WestJet
WestJet | Air Transat
Air Canada | Horizon Air
Horizon Air | Air Transat

Slide 21: Comparative scales: constant sum scales
Allocate a total of 100 points among the following soft drinks depending on how favorable you feel toward each; the more highly you think of each soft drink, the more points you should allocate to it. (Please check that the allocated points add to 100.)
Coca-Cola _____ points
7-Up _____ points
Mirinda _____ points
Fanta _____ points
Pepsi-Cola _____ points
Total: 100 points

Slide 22: Comparative scales: rank order scales
Rank the following soft drinks from 1 (best) to 5 (worst) according to your taste preference:
Coca-Cola _____
7-Up _____
Fanta _____
Pepsi-Cola _____
Mountain Dew _____
√ Top and bottom rank choices are ‘easy’
√ Middle ranks are usually most ‘difficult’

Slide 23: Non-comparative scale: continuous scale
How would you rate Stat. Analysis compared to other courses this term?
The worst 0 10 20 30 40 50 60 70 80 90 100 The best
(X marks on the line indicate responses.)

Slide 24: Non-comparative scale

Slide 25: Itemized Rating Scales
- Semantic differential scale
- The Likert scale
- Stapel scale

Slide 26: Likert Scale
A Likert scale is an ordinal scale format that asks respondents to indicate the extent to which they agree or disagree with a series of mental or behavioral belief statements about a given object.

Slide 27: Likert Scales
- A very popular rating scale
- Measures the feelings/degree of agreement of the respondents
- Ideally, 4 to 7 points
- Examples of 5-point surveys:
  - Agreement: SD, D, ND/NA, A, SA
  - Satisfaction: SD, D, ND/NS, S, SS
  - Quality: VP, P, Average, G, VG

Slide 28: Summative Likert Scales
- Must contain multiple items
- Each individual item must measure something that has an underlying, quantitative measurement continuum
- There can be no right/wrong answers, as opposed to multiple-choice questions
- Items must be statements to which the respondent assigns a rating
- Cannot be used to measure knowledge or ability, but can measure familiarity

Slide 29: Semantic Differential Scale
A semantic differential scale is a unique bipolar ordinal scale format that captures a person’s attitudes and/or feelings about a given object.

Slide 30: Semantic Differential Scales
Uses a set of scales anchored by their extreme responses, using words of opposite meaning.
Example:
Dark ___ ___ ___ ___ ___ Light
Short ___ ___ ___ ___ ___ Tall
Evil ___ ___ ___ ___ ___ Good
Four to seven categories are ideal.

Slide 31: Magnitude Scaling
- Attempts to measure constructs along a numerical, ratio-level scale
- The respondent is given an item with a pre-assigned numerical value attached to it to establish a “norm”
- The respondent is asked to rate other items with numerical values as a proportion of the “norm”
- Very powerful if reliability is established

Slide 32: Thurstone Scales
- Items are formed
- A panel of experts assigns values from 1 to 11 to each item
- Mean or median scores are calculated for each item
- Select statements evenly spread across the scale

Slide 33: Thurstone Scales
Example: Please check the item that best describes your level of willingness to try new tasks.
- I seldom feel willing to take on new tasks (1.7)
- I will occasionally try new tasks (3.6)
- I look forward to new tasks (6.9)
- I am excited to try new tasks (9.8)

Slide 34: Guttman Scales
- Also known as scalograms
- Both the respondents and items are ranked
- Cutting points are determined (Goodenough-Edwards technique)
- Coefficient of Reproducibility (CR_eg): a measure of goodness of fit between the observed and predicted ideal response patterns
- Keep items with CR_eg of 0.90 or higher

Slide 35: A Stapel scale for measuring a store’s image
Select a plus number for words that you think describe the store accurately. The more accurately you think the word describes the store, the larger the plus number you should choose. Select a minus number for words you think do not describe the store accurately. The less accurately you think the word describes the store, the larger the minus number you should choose. You can therefore select any number from +5 for words that you think are very accurate all the way to -5 for words that you think are very inaccurate.

Tesco
     +5              +5              +5
     +4              +4              +4
     +3              +3              +3
     +2              +2              +2
     +1              +1              +1
HIGH QUALITY    POOR SERVICE    WIDE VARIETY
     -1              -1              -1
     -2              -2              -2
     -3              -3              -3
     -4              -4              -4
     -5              -5              -5

Slide 36: Some basic considerations when selecting a scale
- Balanced versus non-balanced alternatives
- Number of categories
- Odd or even number of scale categories
- Forced versus non-forced choice

Slide 37: Odd versus even
Odd: Strongly agree ___ Agree ___ Neutral ___ Disagree ___ Strongly disagree ___
Even: Strongly agree ___ Agree ___ Disagree ___ Strongly disagree ___
If neutral responses are likely, use an odd number.

Slide 38: Balanced vs. Unbalanced
Balanced: Very good ___ Good ___ Fair ___ Poor ___ Very poor ___
Unbalanced: Excellent ___ Very good ___ Good ___ Fair ___ Poor ___

Slide 39: Forced vs. Unforced
Forced: Extremely reliable ___ Very reliable ___ Somewhat reliable ___ Somewhat unreliable ___ Very unreliable ___ Extremely unreliable ___
Unforced: Extremely reliable ___ Very reliable ___ Somewhat reliable ___ Somewhat unreliable ___ Very unreliable ___ Extremely unreliable ___ Don’t know ___

Slide 40: Labeled vs. End anchored
Labeled: Excellent ___ Very good ___ Fair ___ Poor ___ Very poor ___
End anchored: Excellent ___ ___ ___ ___ ___ Poor

Slide 41: Labeled
Excellent ___ Very good ___ Fair ___ Poor ___ Very poor ___
- Intervals may not reflect the semantic meaning of the adjectives
- Intervals are not equal

Slide 42: Number of scale points
5 point: Excellent ___ ___ ___ ___ ___ Poor
10 point: Excellent ___ ___ ___ ___ ___ ___ ___ ___ ___ ___ Poor

Slide 43: Scale Construction
- Define constructs
  - Conceptual/theoretical basis from the literature
  - Are there sub-scales (dimensions) to the scale?
  - Multiple-item sub-scales
- Principle of parsimony
  - The simplest explanation among a number of equally valid explanations must be used

Slide 44: Characteristics of good measurement scales
1. Reliability: The degree to which a measure accurately captures a true outcome without error; synonymous with repetitive consistency.
2. Validity: The degree to which a measure faithfully represents the underlying concept (it asks the right questions).
3. Sensitivity: The ability to discriminate meaningful differences between attitudes.
The more categories, the more sensitive (but less reliable).

Slide 45: Validity and Reliability
- Reliability can be more easily determined than validity
- If it is reliable, it may or may not be valid
- If a measure is valid, it may or may not be reliable
- If it is not reliable, it cannot be valid
- If it is not valid, it may or may not be reliable

Slide 46: Reliability and Validity
(Figure with three panels: Neither Reliable Nor Valid; Reliable But Not Valid; Reliable And Valid.)

Slide 47: Example of low validity, high reliability
The scale is perfectly accurate, but is capturing the wrong thing; for example, it measures consumers’ interest in creative writing rather than preference for kinds of stationery.

Slide 48: Example of modest validity, low reliability
The scale genuinely measures consumers’ interest in kinds of stationery, but poorly worded items, sloppy administration, and data entry errors lead to random errors in the data.

Slide 49: Item Construction
- Agreement items: write declarative statements
  - Death penalty should be abolished
  - I like to listen to classical music
- Frequency items (how often)
  - I like to read
- Evaluation items
  - How well did your team play
  - How well does the police serve your community

Slide 50: Item Writing
- Mutually exclusive and collectively exhaustive items
- Use positively and negatively phrased questions
- Avoid colloquialisms, expressions and jargon
- Avoid the use of negatives to reverse the wording of an item
  - Don’t use: I am not satisfied with my job
  - Use: I hate my job!
- Be brief, focused, and clear
- Use simple, unbiased questions

Slide 51: Sources of Error
- Social desirability: giving politically correct answers
- Response sets: all yes, or all no responses
- Acquiescence: telling you what you want to hear
- Personal bias: wants to send a message

Slide 52: Sources of Error
- Response order
  - Recency: respondent stops reading once s/he gets to the response s/he likes
  - Primacy: remember better the initial choices
  - Fatigue
- Item order
  - Answers to later items may be affected by earlier items (simple, factual items first)
  - Respondent may not know how to answer earlier questions

Slide 53: Assessing Instruments
Three issues to consider:
- Validity: Does the instrument measure what it’s supposed to measure?
- Reliability: Does it consistently repeat the same measurement?
- Practicality: Is this a practical instrument?

Slide 54: Types of Validity
Face validity: Does the instrument, on its face, appear to measure what it is supposed to measure? For instance, if you prepare a test to measure whether students can perform multiplication, and the people you show it to all agree that it looks like a good test of multiplication ability, you have shown the face validity of your test.

Slide 55: Content validity
- Degree to which the content of the items adequately represents the universe of all relevant items under study
- Generally arrived at through a panel of experts
Example: Researchers aim to study mathematical learning and create a survey to test for mathematical skill. If these researchers only tested for multiplication and then drew conclusions from that survey, their study would not show content validity because it excludes other mathematical functions.
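The item-writing advice above (mix positively and negatively phrased items) and the "response sets" source of error can be illustrated with a small scoring sketch for a summative Likert scale. This is only an illustration under stated assumptions: the 5-point range, the item keys `q1` to `q3`, and the helper names `reverse_score`, `summated_score` and `is_response_set` are hypothetical and do not come from the presentation.

```python
from typing import Dict, Set

# Assumed 5-point agreement scale (1 = strongly disagree, 5 = strongly agree).
SCALE_MIN, SCALE_MAX = 1, 5


def reverse_score(value: int) -> int:
    """Flip a rating so that a negatively phrased item points the same way as the rest."""
    return SCALE_MAX + SCALE_MIN - value


def summated_score(responses: Dict[str, int], negative_items: Set[str]) -> int:
    """Add up all item ratings, reverse-scoring the negatively phrased items first."""
    return sum(
        reverse_score(rating) if item in negative_items else rating
        for item, rating in responses.items()
    )


def is_response_set(responses: Dict[str, int]) -> bool:
    """Flag a respondent who gave the identical raw rating to every item."""
    return len(set(responses.values())) == 1


# Hypothetical job-satisfaction items; q2 is the negatively phrased one.
negative = {"q2"}
careful = {"q1": 5, "q2": 1, "q3": 4}
straight_liner = {"q1": 4, "q2": 4, "q3": 4}

print(summated_score(careful, negative))   # 5 + (6 - 1) + 4
print(is_response_set(careful))
print(is_response_set(straight_liner))
```

Flagging identical raw ratings is only a rough screen: when a scale mixes positive and negative wording, giving the same answer everywhere is internally inconsistent, which is why such patterns are worth a second look rather than automatic exclusion.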
Slide 56: Criterion related
Criterion-related validity, also referred to as instrumental validity, is used to demonstrate the accuracy of a measure or procedure by comparing it with another measure or procedure which has been demonstrated to be valid. For example, imagine a hands-on driving test has been shown to be an accurate test of driving skills. By comparing the scores on the written driving test with the scores from the hands-on driving test, the written test can be validated using a criterion-related strategy.

Slide 57: Concurrent validity
Concurrent validity occurs when the criterion measures are obtained at the same time as the test scores. This indicates the extent to which the test scores accurately estimate an individual’s current state with regard to the criterion. For example, a test that measures levels of depression would be said to have concurrent validity if it measured the current levels of depression experienced by the test taker.

Slide 58: Predictive validity
- Criterion is measured after the passage of time
- Known-groups
Predictive validity occurs when the criterion measures are obtained at a time after the test. Examples of tests with predictive validity are career or aptitude tests, which are helpful in determining who is likely to succeed or fail in certain subjects or occupations.

Slide 59: Construct validity
Construct validity seeks agreement between a theoretical concept and a specific measuring device or procedure. For example, a researcher inventing a new IQ test might spend a great deal of time attempting to "define" intelligence in order to reach an acceptable level of construct validity.

Slide 60: Convergent validity and Discriminant validity
Construct validity can be broken down into two sub-categories: convergent validity and discriminant validity.
Convergent validity is the actual general agreement among ratings, gathered independently of one another, where measures should be theoretically related. Discriminant validity is the lack of a relationship among measures which theoretically should not be related.

Slide 61: Types of Reliability
Reliability is the extent to which an experiment, test, or any measuring procedure yields the same result on repeated trials. For researchers, four key types of reliability are:
- Equivalency reliability
- Stability reliability
- Internal consistency
- Inter-rater reliability

Slide 62: Stability Reliability
Stability reliability (sometimes called test-retest reliability) is the agreement of measuring instruments over time. To determine stability, a measure or test is repeated on the same subjects at a future date. Results are compared and correlated with the initial test to give a measure of stability.

Slide 63: Equivalence Reliability
- Degree to which alternative forms of the same measure produce the same or similar results
- Give parallel forms of the same test to the same group, with a short delay to avoid fatigue
- Look for high correlation between the scores of the two forms of the test
- Inter-rater reliability

Slide 64: Types of Reliability
Internal consistency
- Degree to which instrument items are homogeneous and reflect the same underlying constructs
- Split-half testing, where the test is split into two halves that contain the same types of questions
- Uses Cronbach’s alpha to determine internal consistency; only one administration of the test is required
- Kuder-Richardson (KR-20) for items with right and wrong answers

Slide 65: Practicality
- Is the survey economical?
  - Cost of producing and administering the survey
  - Time requirement
  - Common sense!
- Convenience
  - Adequacy of instructions
  - Easy to administer
- Can the measurement be interpreted by others?
  - Scoring keys
  - Evidence of validity and reliability
  - Established norms
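The reliability slides name Cronbach’s alpha for internal consistency and a correlation between two administrations for stability (test-retest) reliability, without showing the arithmetic. The sketch below is one way to compute both, using the standard textbook formula for alpha (k/(k-1) times one minus the ratio of summed item variances to the variance of total scores) and a plain Pearson correlation. The data and the names `cronbach_alpha`, `pearson_r`, `answers`, `first_test` and `retest` are made up for illustration and are not from the presentation.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table of ratings."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation, e.g. between first-test and retest scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical data: five respondents answering three 5-point items.
answers = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]

# Hypothetical total scores for the same five people on two occasions.
first_test = [13, 7, 14, 4, 9]
retest = [12, 8, 14, 5, 9]

print(f"Cronbach's alpha: {cronbach_alpha(answers):.3f}")
print(f"Test-retest correlation: {pearson_r(first_test, retest):.3f}")
```

The same `pearson_r` helper could be applied to scores from two parallel forms to look at equivalence reliability, since the slides describe that check as a correlation between the two forms as well.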
