Scatterplots, TARSOG and Correlation

Scatterplots
The scatterplot is the basic tool used to investigate relationships between two quantitative variables.
Types of variables: quantitative (measurements/counts) and qualitative (groups).

What do I see in these scatterplots?
[Figure: Mean January Air Temperatures for 30 New Zealand Locations; axes: Latitude (°S) and Temperature (°C)]
There appears to be a linear trend.
There appears to be moderate, constant scatter about the trend line.
Negative association.
No outliers or groupings visible.

What do I see in these scatterplots?
[Figure: % of population who are Internet Users vs GDP per capita for 202 Countries; axes: GDP per capita (thousands of dollars) and Internet Users (%)]
There appears to be a non-linear trend.
There appears to be non-constant scatter about the trend line.
Positive association.
One possible outlier (large GDP, low % Internet Users).

What do I see in these scatterplots?
[Figure: Average Age New Zealanders are First Married; axes: Year and Age]
Two non-linear trends (Male and Female).
Very little scatter about the trend lines.
Negative association until about 1970, then a positive association.
Gap in the data collection (Second World War).

What do I look for in scatterplots? Trend
Do you see a linear trend (a straight line) or a non-linear trend?

What do I look for in scatterplots? Association
Do you see a positive association (as one variable gets bigger, so does the other) or a negative association (as one variable gets bigger, the other gets smaller)?

What do I look for in scatterplots? Relationship
Do you see a strong relationship (little scatter) or a weak relationship (lots of scatter)?

What do I look for in scatterplots? Scatter
Do you see constant scatter (roughly the same amount of scatter as you look across the plot) or non-constant scatter (the scatter looks like a “fan” or “funnel”, or some areas are denser or sparser than others)?

What do I look for in scatterplots? Anything unusual
Do you see any outliers (points unusually far from the trend)?
Do you see any groupings?

Rank these relationships from weakest (1) to strongest (4):
[Figure: four scatterplots to be ranked]
Answers, in the order the plots are shown: 2, 1, 4, 3.
How did you make your decisions? The less scatter there is about the trend line, the stronger the relationship is.

Describing scatterplots - TARSOG
Trend: summary of strength, association and linear/non-linear
Association: positive or negative
Relationship: strong or weak
Scatter: constant or non-constant
Outliers
Groupings

Correlation
Correlation measures the strength of the linear association between two quantitative variables.
Get the correlation coefficient (r) from your calculator or computer.
r has a value between -1 and +1.
Correlation has no units.
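For readers getting r from a computer, here is a minimal Python sketch of the standard Pearson formula, r = Σ(x - x̄)(y - ȳ) / √(Σ(x - x̄)² · Σ(y - ȳ)²). The function name `pearson_r` and the data values are made up for illustration; they are not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products and squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data: temperature falls as latitude rises.
latitude = [35, 37, 39, 41, 43, 45]
temperature = [19.5, 19.0, 17.8, 17.1, 16.0, 15.2]

r = pearson_r(latitude, temperature)
print(round(r, 3))  # a value close to -1: strong negative linear association
```

Because the deviations are divided by their own spread, the units cancel, which is why correlation has no units.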

Correlation coefficient (r)
|r| = 1            perfect linear relationship
|r| > 0.8          strong linear relationship
0.6 < |r| < 0.8    moderate linear relationship
0.4 < |r| < 0.6    weak linear relationship
|r| < 0.4          no significant linear relationship
|r| means the absolute value of r, that is, ignore the negative sign.
A positive value of r indicates the trend is positive; a negative value indicates the trend is negative.
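The bands above can be sketched as a small lookup. The function name `describe_r` is hypothetical, and the slide does not say which band the exact boundary values 0.8, 0.6 and 0.4 belong to; this sketch assumes they fall into the lower band.

```python
def describe_r(r):
    """Describe a correlation coefficient using the bands from the slide."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # |r|: ignore the sign when judging strength
    if size == 1:
        strength = "perfect"
    elif size > 0.8:
        strength = "strong"
    elif size > 0.6:      # assumption: exactly 0.8 counts as moderate
        strength = "moderate"
    elif size > 0.4:      # assumption: exactly 0.6 counts as weak
        strength = "weak"
    else:                 # assumption: exactly 0.4 counts as not significant
        return "no significant linear relationship"
    # The sign of r gives the direction of the trend
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} linear relationship"

for value in (-1, -0.7, -0.4, 0, 0.3, 0.8, 1):
    print(value, "->", describe_r(value))
```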

Examples of Correlation
[Figure: example scatterplots with r = -1, r = -0.7, r = -0.4, r = 0, r = 0.3, r = 0.8 and r = 1]

What can go wrong?
Use correlation only if you have two quantitative variables (variables that can be measured). There is an association between gender and weight, but there isn’t a correlation between gender and weight, because gender is not quantitative.
Use correlation only if the relationship is linear.
Beware of outliers! They can have a big impact on correlation.

Deceptive situations
r is a measure of how one variable varies in a linear relation to the other.
An obvious pattern does not always indicate a high value of the coefficient of correlation (r).
A horizontal or vertical trend indicates that there is no relationship, and so r is (close to) zero.

Always plot the data before looking at the correlation!
[Figure: two example scatterplots, each captioned "No linear relationship, but there is a relationship!"]
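A small illustration of why plotting comes first: in the made-up data below, y is completely determined by x (y = x²), yet the correlation coefficient is zero because the relationship is not linear. The helper `pearson_r` is an illustrative implementation of the standard formula, not something from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient (standard formula)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: a perfect U-shaped (quadratic) relationship.
xs = list(range(-3, 4))      # -3, -2, ..., 3
ys = [x * x for x in xs]     # 9, 4, 1, 0, 1, 4, 9

r = pearson_r(xs, ys)
print(r)  # r is 0: no linear relationship, but there is a relationship
```

The falling left half and the rising right half cancel each other out, so r alone would wrongly suggest there is nothing going on.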

2007 For example: Rata's bean sprouts: r = 0
[Figure: axes: Minutes spent on measuring the plants and Height of sprout (mm)]

2007 For example: Jenny's bean sprouts: r = 0
[Figure: axes: Soil depth (cm) and Height of sprout (mm)]

Tick the plots where it would be OK to use a correlation coefficient to describe the strength of the relationship:
[Figure: Distances of Planets from the Sun; axes: Position Number and Distance (million miles)]
[Figure: Reaction Times (seconds) for 30 Year 10 Students; axes: Non-dominant Hand and Dominant Hand]
[Figure: Mean January Air Temperatures for 30 New Zealand Locations; axes: Latitude (°S) and Temperature (°C)]
[Figure: Average Weekly Income for Employed New Zealanders in 2001; axes: Female ($) and Male ($)]
Answers: two plots are ticked; the other two are annotated "Not linear" and "Remove two outliers, nothing happening".

What do I see in this scatterplot?
[Figure: Height and Foot Size for 30 Year 10 Students; axes: Foot size (cm) and Height (cm)]
Appears to be a linear trend, with a possible outlier (a tall person with a small foot size).
Appears to be constant scatter.
Positive association.

What will happen to the correlation coefficient if the tallest Year 10 student is removed?
[Figure: Height and Foot Size for 30 Year 10 Students; axes: Foot size (cm) and Height (cm)]
Options: It will get smaller / It won’t change / It will get bigger
Answer: It will get bigger.
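The effect can be checked numerically. The sketch below uses made-up foot sizes and heights (not the data behind the slide), with the last point playing the role of the tall student with small feet; `pearson_r` is an illustrative implementation of the standard formula.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient (standard formula)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data; the last student is tall but has a small foot size,
# so the point lies far from the trend.
foot_cm   = [22, 23, 24, 25, 26, 27, 28, 23]
height_cm = [152, 158, 163, 169, 174, 181, 186, 198]

r_all = pearson_r(foot_cm, height_cm)
r_without = pearson_r(foot_cm[:-1], height_cm[:-1])  # drop the outlier

print("with outlier:   ", round(r_all, 2))
print("without outlier:", round(r_without, 2))
```

An outlier that sits away from the trend adds scatter, so removing it makes r bigger.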

What do I see in this scatterplot?
[Figure: Life Expectancies and Gestation Period for a sample of non-human Mammals; axes: Gestation (Days) and Life Expectancy (Years); the elephant is labelled]
Appears to be a strong linear trend.
Outlier (the elephant).
Appears to be constant scatter.
Positive association.

What will happen to the correlation coefficient if the elephant is removed?
[Figure: Life Expectancies and Gestation Period for a sample of non-human Mammals; axes: Gestation (Days) and Life Expectancy (Years); the elephant is labelled]
Options: It will get smaller / It won’t change / It will get bigger
Answer: It will get smaller.
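This case goes the other way, and a rough numerical sketch shows why. The values below are made up (they are not the mammal data behind the slide); the last point stands in for the elephant, an extreme point that lies along the trend. `pearson_r` is an illustrative implementation of the standard formula.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient (standard formula)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data; the last point is the "elephant": far from the other
# points, but in line with the overall trend.
gestation_days = [20, 30, 60, 65, 110, 150, 200, 240, 280, 620]
life_years     = [3, 6, 5, 12, 8, 15, 10, 20, 14, 38]

r_all = pearson_r(gestation_days, life_years)
r_without = pearson_r(gestation_days[:-1], life_years[:-1])  # drop the elephant

print("with elephant:   ", round(r_all, 2))
print("without elephant:", round(r_without, 2))
```

An extreme point that follows the trend stretches the pattern and props up r, so removing it makes r smaller.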

Adapted from University of Auckland Department of Statistics
