Copyright 2016, 2013, 2010 Pearson Education, Inc.

Numerical Descriptive Measures
Chapter 3

Objectives
In this chapter, you learn to:
- Describe the properties of central tendency, variation, and shape in numerical data
- Construct and interpret a boxplot
- Compute descriptive summary measures for a population
- Calculate the covariance and the coefficient of correlation

Summary Definitions
- The central tendency is the extent to which the values of a numerical variable group around a typical or central value.
- The variation is the amount of dispersion or scattering away from a central value that the values of a numerical variable show.
- The shape is the pattern of the distribution of values from the lowest value to the highest value.

Measures of Central Tendency: The Mean
- The arithmetic mean (often just called the "mean") is the most common measure of central tendency
- For a sample of size n:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

where $\bar{X}$ is pronounced "x-bar", $n$ is the sample size, $X_1, \ldots, X_n$ are the observed values, and $X_i$ is the ith value.

Measures of Central Tendency: The Mean (cont.)
- The most common measure of central tendency
- Mean = sum of values divided by the number of values
- Affected by extreme values (outliers)

[Figure: two dot plots on an axis from 11 to 20; Mean = 13 and Mean = 14]

$$\frac{11 + 12 + 13 + 14 + 15}{5} = \frac{65}{5} = 13$$

$$\frac{11 + 12 + 13 + 14 + 20}{5} = \frac{70}{5} = 14$$

Measures of Central Tendency: The Median
- In an ordered array, the median is the "middle" number (50% above, 50% below)
- Less sensitive than the mean to extreme values

[Figure: two dot plots on an axis from 11 to 20; Median = 13 in each]

Measures of Central Tendency: Locating the Median
- The location of the median when the values are in numerical order (smallest to largest):

$$\text{Median position} = \frac{n + 1}{2} \text{ position in the ordered data}$$

- If the number of values is odd, the median is the middle number
- If the number of values is even, the median is the average of the two middle numbers

Note that $\frac{n+1}{2}$ is not the value of the median, only the position of the median in the ranked data.

Measures of Central Tendency: The Mode
- Value that occurs most often
- Not affected by extreme values
- Used for either numerical or categorical data
- There may be no mode
- There may be several modes

[Figure: dot plot on an axis from 0 to 14 with Mode = 9; dot plot on an axis from 0 to 6 with no mode]

Measures of Central Tendency: Review Example
House Prices:
$2,000,000
$500,000
$300,000
$100,000
$100,000
Sum: $3,000,000

- Mean: ($3,000,000 / 5) = $600,000
- Median: middle value of ranked data = $300,000
- Mode: most frequent value = $100,000

Measures of Central Tendency: Which Measure to Choose?
- The mean is generally used, unless extreme values (outliers) exist.
- The median is often used, since the median is not sensitive to extreme values. For example, median home prices may be reported for a region; the median is less sensitive to outliers.
- In some situations it makes sense to report both the mean and the median.

Measures of Central Tendency: Summary
- Arithmetic Mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
- Median: middle value in the ordered array
- Mode: most frequently observed value
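
The review example can be checked with a few lines of Python. This is a minimal sketch that assumes the standard-library `statistics` module is an acceptable stand-in for the hand calculations; the variable names are illustrative and do not come from the slides.

```python
import statistics

# House prices from the review example.
house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean_price = statistics.mean(house_prices)      # sum of values / number of values
median_price = statistics.median(house_prices)  # middle value of the ranked data
mode_price = statistics.mode(house_prices)      # most frequently observed value

# Position (not value) of the median in the ordered data: (n + 1) / 2.
median_position = (len(house_prices) + 1) / 2

# Replacing 15 with the extreme value 20 moves the mean but not the median.
without_outlier = [11, 12, 13, 14, 15]
with_outlier = [11, 12, 13, 14, 20]

print(mean_price, median_price, mode_price, median_position)
print(statistics.mean(without_outlier), statistics.median(without_outlier))
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```

The last two lines show why the median is often preferred when outliers are present: the mean shifts from 13 to 14 while the median stays at 13.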

Measures of Variation
- Measures of variation give information on the spread or variability or dispersion of the data values.
- Variation: Range, Variance, Standard Deviation, Coefficient of Variation

[Figure: same center, different variation]

Measures of Variation: The Range
- Simplest measure of variation
- Difference between the largest and the smallest values:

$$\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$$

Example:
[Figure: dot plot on an axis from 0 to 14; Range = 13 - 1 = 12]

Measures of Variation: Why The Range Can Be Misleading
- Does not account for how the data are distributed

[Figure: two dot plots on an axis from 7 to 12 with different distributions; Range = 12 - 7 = 5 in each]

- Sensitive to outliers

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
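
The sensitivity of the range to a single outlier can be reproduced directly. The sketch below uses the two data sets listed on the slide; the helper name `value_range` is an illustrative choice, not something defined in the source.

```python
def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

# Eleven 1s, eight 2s, four 3s, then 4 and a final value of 5 or 120.
concentrated = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]

print(value_range(concentrated))  # 4
print(value_range(with_outlier))  # 119
```

Only one of the 25 values changed, yet the range grew from 4 to 119, which is why the range alone says little about how the data are distributed.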

Measures of Variation: The Sample Variance
- Average (approximately) of squared deviations of values from the mean
- Sample variance:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

where $\bar{X}$ = arithmetic mean, $n$ = sample size, and $X_i$ = ith value of the variable $X$.

Measures of Variation: The Sample Standard Deviation
- Most commonly used measure of variation
- Shows variation about the mean
- Is the square root of the variance
- Has the same units as the original data
- Sample standard deviation:

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$

Measures of Variation: The Standard Deviation
Steps for Computing Standard Deviation
1. Compute the difference between each value and the mean.
2. Square each difference.
3. Add the squared differences.
4. Divide this total by n - 1 to get the sample variance.
5. Take the square root of the sample variance to get the sample standard deviation.

Measures of Variation: Sample Standard Deviation: Calculation Example
Sample Data ($X_i$): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{X}$ = 16

$$S = \sqrt{\frac{(10 - \bar{X})^2 + (12 - \bar{X})^2 + (14 - \bar{X})^2 + \cdots + (24 - \bar{X})^2}{n - 1}}$$

$$= \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$$

A measure of the "average" scatter around the mean.

Measures of Variation: Comparing Standard Deviations
[Figure: three dot plots on an axis from 11 to 21]
- Data A: Mean = 15.5, S = 3.338
- Data B: Mean = 15.5, S = 0.926
- Data C: Mean = 15.5, S = 4.567

Measures of Variation: Comparing Standard Deviations
[Figure: two distributions, one labeled "Smaller standard deviation" and one labeled "Larger standard deviation"]
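
The five steps for computing the sample standard deviation translate almost line for line into code. The following is a sketch under the assumption that plain Python lists are sufficient; the function name `sample_std_dev` is illustrative, and the result is cross-checked against `statistics.stdev`, which also divides by n - 1.

```python
import math
import statistics

def sample_std_dev(values):
    """Sample standard deviation, following the five steps on the slide."""
    n = len(values)
    mean = sum(values) / n
    differences = [x - mean for x in values]   # step 1: value minus mean
    squared = [d ** 2 for d in differences]    # step 2: square each difference
    total = sum(squared)                       # step 3: add the squared differences
    variance = total / (n - 1)                 # step 4: divide by n - 1
    return math.sqrt(variance)                 # step 5: take the square root

# Sample data from the calculation example (n = 8, mean = 16).
sample = [10, 12, 14, 15, 17, 18, 18, 24]
s = sample_std_dev(sample)

print(round(s, 4))                         # 4.3095
print(round(statistics.stdev(sample), 4))  # same value from the standard library
```

Dividing by n - 1 rather than n is what makes this the sample (not population) standard deviation.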

Measures of Variation: Summary Characteristics
- The more the data are spread out, the greater the range, variance, and standard deviation.
- The more the data are concentrated, the smaller the range, variance, and standard deviation.
- If the values are all the same (no variation), all these measures will be zero.
- None of these measures are ever negative.

Measures of Variation: The Coefficient of Variation
- Measures relative variation
- Always in percentage (%)
- Shows variation relative to mean
- Can be used to compare the variability of two or more sets of data measured in different units

$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$

Measures of Variation: Comparing Coefficients of Variation
- Stock A:
  - Average price last year = $50
  - Standard deviation = $5
- Stock B:
  - Average price last year = $100
  - Standard deviation = $5

$$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$

$$CV_B = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$

Both stocks have the same standard deviation, but stock B is less variable relative to its price.

Measures of Variation: Comparing Coefficients of Variation (cont.)
- Stock A:
  - Average price last year = $50
  - Standard deviation = $5
- Stock C:
  - Average price last year = $8
  - Standard deviation = $2

$$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$

$$CV_C = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$2}{\$8} \cdot 100\% = 25\%$$

Stock C has a much smaller standard deviation but a much higher coefficient of variation.
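
The three stock comparisons can be reproduced with a one-line helper. This is only a sketch: the function name `coefficient_of_variation` and the zero-mean guard are assumptions added for illustration, not part of the slides.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (S / X-bar) * 100%, returned as a percentage."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * std_dev / mean

# (standard deviation, average price) for the three stocks on the slides.
cv_a = coefficient_of_variation(5, 50)
cv_b = coefficient_of_variation(5, 100)
cv_c = coefficient_of_variation(2, 8)

print(cv_a, cv_b, cv_c)  # 10.0 5.0 25.0
```

Because the CV is unit free, it allows stock C (S = $2) to be recognized as relatively more variable than stock A (S = $5).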

Locating Extreme Outliers: Z-Score
- To compute the Z-score of a data value, subtract the mean and divide by the standard deviation.
- The Z-score is the number of standard deviations a data value is from the mean.
- A data value is considered an extreme outlier if its Z-score is less than -3.0 or greater than +3.0.
- The larger the absolute value of the Z-score, the farther the data value is from the mean.

Locating Extreme Outliers: Z-Score

$$Z = \frac{X - \bar{X}}{S}$$

where $X$ represents the data value, $\bar{X}$ is the sample mean, and $S$ is the sample standard deviation.
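
As a rough illustration of the Z-score rule, the sketch below standardizes the eight-value sample from the standard deviation calculation example and flags any value whose Z-score falls outside -3.0 to +3.0. The function name `z_scores` is an illustrative assumption, and the code assumes the data are not all identical (otherwise S would be zero).

```python
import statistics

def z_scores(values):
    """Z = (X - X-bar) / S for every value, using the sample standard deviation."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # divides by n - 1
    return [(x - mean) / s for x in values]

sample = [10, 12, 14, 15, 17, 18, 18, 24]
scores = z_scores(sample)

# A value is treated as an extreme outlier when |Z| > 3.0.
extreme_outliers = [x for x, z in zip(sample, scores) if abs(z) > 3.0]

for x, z in zip(sample, scores):
    print(x, round(z, 3))
print("Extreme outliers:", extreme_outliers)
```

Even the largest value, 24, is fewer than two standard deviations above the mean of 16, so nothing in this sample is flagged.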

Locating Extreme Outliers: Z-Score
- Suppose the mean math SAT score is 490, with a standard deviation of 100.
- Compute the Z-score for a test score of 620.

$$Z = \frac{X - \bar{X}}{S} = \frac{620 - 490}{100} = \frac{130}{100} = 1.3$$

A score of 620 is 1.3 standard deviations above the mean and would not be considered an outlier.

Shape of a Distribution
- Describes how data are distributed
- Two useful shape related statistics are:
  - Skewness: measures the extent to which data values are not symmetrical
  - Kurtosis: affects the peakedness of the curve of the distribution, that is, how sharply the curve rises approaching the center of the distribution

Shape of a Distribution (Skewness)
- Measures the extent to which data is not symmetrical
- Left-Skewed: Mean < Median; skewness statistic < 0
- Symmetric: Mean = Median; skewness statistic = 0
- Right-Skewed: Median < Mean; skewness statistic > 0

Shape of a Distribution (Kurtosis)
- Kurtosis measures how sharply the curve rises approaching the center of the distribution
- Sharper peak than bell-shaped: Kurtosis > 0
- Flatter than bell-shaped: Kurtosis < 0

[Table residue: comparisons of Median - Xsmallest with Xlargest - Median, of Q1 - Xsmallest with Xlargest - Q3, and of Median - Q1 with Q3 - Median; the column headings and comparison operators are missing.]

Chebyshev Rule
- At least $(1 - 1/k^2) \times 100\%$ of the values fall within k standard deviations of the mean (for k > 1)
- Examples:
  - At least $(1 - 1/2^2) \times 100\% = 75\%$ within k = 2 ($\mu \pm 2\sigma$)
  - At least $(1 - 1/3^2) \times 100\% = 88.89\%$ within k = 3 ($\mu \pm 3\sigma$)
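
The Chebyshev percentages quoted in the examples follow from a single expression. The sketch below assumes the usual statement of the rule, a lower bound of (1 - 1/k^2) x 100% for k > 1; the function name `chebyshev_lower_bound` is illustrative.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return (1 - 1 / k ** 2) * 100

for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k), 2))
```

The bound holds regardless of the shape of the distribution, which is why it is weaker than shape-specific rules.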

We Discuss Two Measures Of The Relationship Between Two Numerical Variables
- Scatter plots allow you to visually examine the relationship between two numerical variables. We now discuss two quantitative measures of such relationships:
  - The Covariance
  - The Coefficient of Correlation

The Covariance
- The covariance measures the strength of the linear relationship between two numerical variables (X & Y)
- The sample covariance:

$$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$$

- Only concerned with the strength of the relationship
- No causal effect is implied

Interpreting Covariance
- Covariance between two variables:
  - cov(X,Y) > 0: X and Y tend to move in the same direction
  - cov(X,Y) < 0: X and Y tend to move in opposite directions
  - cov(X,Y) = 0: X and Y have no linear relationship (zero covariance alone does not establish independence)
- The covariance has a major flaw:
  - It is not possible to determine the relative strength of the relationship from the size of the covariance

Coefficient of Correlation
- Measures the relative strength of the linear relationship between two numerical variables
- Sample coefficient of correlation:

$$r = \frac{\text{cov}(X, Y)}{S_X S_Y}$$

where

$$\text{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}, \quad S_X = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}, \quad S_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1}}$$

Features of the Coefficient of Correlation
- The population coefficient of correlation is referred to as $\rho$.
- The sample coefficient of correlation is referred to as r.
- Both $\rho$ and r have the following features:
  - Unit free
  - Range between -1 and +1
  - The closer to -1, the stronger the negative linear relationship
  - The closer to +1, the stronger the positive linear relationship
  - The closer to 0, the weaker the linear relationship

Scatter Plots of Sample Data with Various Coefficients of Correlation
[Figure: five scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = +.3, and r = +1]

The Coefficient of Correlation Using Microsoft Excel Function
[Figure: Excel worksheet screenshot]

The Coefficient of Correlation Using Microsoft Excel Data Analysis Tool
1. Select Data
2. Choose Data Analysis
3. Choose Correlation & Click OK
4. Input data range and select appropriate options
5. Click OK to get output

Interpreting the Coefficient of Correlation Using Microsoft Excel
- r = .733
- There is a relatively strong positive linear relationship between test score #1 and test score #2.
- Students who scored high on the first test tended to score high on the second test.

Pitfalls in Numerical Descriptive Measures
- Data analysis is objective
  - Should report the summary measures that best describe and communicate the important aspects of the data set
- Data interpretation is subjective
  - Should be done in a fair, neutral and clear manner

Ethical Considerations
Numerical descriptive measures:
- Should document both good and bad results
- Should be presented in a fair, objective and neutral manner
- Should not use inappropriate summary measures to distort facts

Chapter Summary
In this chapter we have discussed:
- Describing the properties of central tendency, variation, and shape in numerical data
- Constructing and interpreting a boxplot
- Computing descriptive summary measures for a population
- Calculating the covariance and the coefficient of correlation
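
For readers who want to check the sample covariance and coefficient of correlation formulas from this chapter numerically, the following sketch implements them directly. The function names and the small data sets are hypothetical, chosen only so that the perfectly linear cases are easy to verify by hand; they are not the test-score data from the Excel slides.

```python
import math

def sample_covariance(xs, ys):
    """cov(X, Y) = sum((Xi - X-bar) * (Yi - Y-bar)) / (n - 1)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_std_dev(values):
    """S = sqrt(sum((Xi - X-bar)^2) / (n - 1))."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

def correlation(xs, ys):
    """r = cov(X, Y) / (S_X * S_Y); unit free and between -1 and +1."""
    return sample_covariance(xs, ys) / (sample_std_dev(xs) * sample_std_dev(ys))

# Hypothetical data: ys rises with xs, ys_reversed falls with xs.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
ys_reversed = [10, 8, 6, 4, 2]

print(sample_covariance(xs, ys), correlation(xs, ys))
print(sample_covariance(xs, ys_reversed), correlation(xs, ys_reversed))
```

Rescaling `ys` (for example, multiplying every value by 10) changes the covariance but leaves r unchanged, which illustrates why the size of the covariance alone does not indicate the relative strength of the relationship.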