Confidence Interval Estimation
Chapter 8

Objectives
In this chapter, you learn:
- To construct and interpret confidence interval estimates for the population mean and the population proportion
- To determine the sample size necessary to develop a confidence interval for the population mean or population proportion

Chapter Outline
Content of this chapter
- Confidence Intervals for the Population Mean, $\mu$
  - when Population Standard Deviation $\sigma$ is Known
  - when Population Standard Deviation $\sigma$ is Unknown
- Confidence Intervals for the Population Proportion, $\pi$
- Determining the Required Sample Size

Point and Interval Estimates
- A point estimate is a single number
- A confidence interval provides additional information about the variability of the estimate
[Figure: a confidence interval showing the Lower Confidence Limit, the Point Estimate, the Upper Confidence Limit, and the width of the confidence interval]

Point Estimates
We can estimate a population parameter with a sample statistic (a point estimate):

Population Parameter     Sample Statistic (Point Estimate)
Mean, $\mu$              $\bar{X}$
Proportion, $\pi$        $p$

Confidence Intervals
- How much uncertainty is associated with a point estimate of a population parameter?
- An interval estimate provides more information about a population characteristic than does a point estimate
- Such interval estimates are called confidence intervals

Confidence Interval Estimate
- An interval gives a range of values:
  - Takes into consideration variation in sample statistics from sample to sample
  - Based on observations from 1 sample
  - Gives information about closeness to unknown population parameters
  - Stated in terms of level of confidence, e.g. 95% confident, 99% confident
  - Can never be 100% confident

Confidence Interval Example
Cereal fill example
- Population has $\mu = 368$ and $\sigma = 15$.
- If you take a sample of size $n = 25$, you know that $368 \pm 1.96 \cdot 15/\sqrt{25} = (362.12, 373.88)$. This interval contains 95% of the possible sample means $\bar{X}$.
- When you don't know $\mu$, you use $\bar{X}$ to estimate $\mu$.
- If $\bar{X} = 362.3$, the interval is $362.3 \pm 1.96 \cdot 15/\sqrt{25} = (356.42, 368.18)$.
- Since $356.42 \le \mu \le 368.18$, the interval based on this sample makes a correct statement about $\mu$.

But what about the intervals from other possible samples of size 25?

Sample #   $\bar{X}$   Lower Limit   Upper Limit   Contain $\mu$?
1          362.30      356.42        368.18        Yes
2          369.50      363.62        375.38        Yes
3          360.00      354.12        365.88        No
4          362.12      356.24        368.00        Yes
5          373.88      368.00        379.76        Yes

- In practice you only take one sample of size $n$.
- In practice you do not know $\mu$, so you do not know if the interval actually contains $\mu$.
- However, you do know that 95% of the intervals formed in this manner will contain $\mu$.
- Thus, based on the one sample you actually selected, you can be 95% confident your interval will contain $\mu$ (this is a 95% confidence interval).

Note: 95% confidence is based on the fact that we used $Z = 1.96$.
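
The relative frequency idea behind the cereal fill table can be checked with a small simulation. The sketch below is not part of the original slides: it assumes the fill weights are normally distributed (the slides do not say so), and the names `interval_from_sample` and `coverage` are made up for illustration.

```python
import random
from math import sqrt

# Cereal fill values from the example; normality of the population is an assumption.
MU, SIGMA, N, Z = 368.0, 15.0, 25, 1.96

def interval_from_sample(rng):
    """Draw one sample of size N and return X-bar +/- Z * sigma / sqrt(N)."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    xbar = sum(sample) / N
    half_width = Z * SIGMA / sqrt(N)
    return xbar - half_width, xbar + half_width

def coverage(num_samples=10_000, seed=0):
    """Fraction of simulated intervals that contain the true mean MU."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        lower, upper = interval_from_sample(rng)
        if lower <= MU <= upper:
            hits += 1
    return hits / num_samples

if __name__ == "__main__":
    print(f"Share of intervals containing mu: {coverage():.3f}")
```

With many simulated samples the printed share should land close to 0.95, while any single interval either contains $\mu$ or does not.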

Estimation Process
[Figure: a random sample is drawn from a population whose mean, $\mu$, is unknown; the sample has mean $\bar{X} = 50$, leading to the statement "I am 95% confident that $\mu$ is between 40 & 60."]

General Formula
- The general formula for all confidence intervals is:

  Point Estimate $\pm$ (Critical Value)(Standard Error)

Where:
- Point Estimate is the sample statistic estimating the population parameter of interest
- Critical Value is a table value based on the sampling distribution of the point estimate and the desired confidence level
- Standard Error is the standard deviation of the point estimate

Confidence Level, $(1-\alpha)$
- Confidence the interval will contain the unknown population parameter
- A percentage (less than 100%)
- Suppose confidence level = 95%
- Also written $(1-\alpha) = 0.95$ (so $\alpha = 0.05$)
- A relative frequency interpretation:
  - 95% of all the confidence intervals that can be constructed will contain the unknown true parameter
- A specific interval either will contain or will not contain the true parameter
  - No probability involved in a specific interval

[Figure: Confidence Intervals, branching to Population Mean ($\sigma$ Known, $\sigma$ Unknown) and Population Proportion]

Confidence Interval for $\mu$ ($\sigma$ Known)
- Assumptions
  - Population standard deviation $\sigma$ is known
  - Population is normally distributed
  - If population is not normal, use large sample ($n > 30$)
- Confidence interval estimate:

$$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

  where $\bar{X}$ is the point estimate, $Z_{\alpha/2}$ is the normal distribution critical value for a probability of $\alpha/2$ in each tail, and $\sigma/\sqrt{n}$ is the standard error

Finding the Critical Value, $Z_{\alpha/2}$
- Consider a 95% confidence interval: $1-\alpha = 0.95$, so $\alpha = 0.05$ and $\alpha/2 = 0.025$ in each tail
- $Z_{\alpha/2} = \pm 1.96$
[Figure: normal curve with central area 0.95 and area 0.025 in each tail; in Z units the limits are -1.96 and 1.96 around 0, and in X units they are the Lower Confidence Limit and Upper Confidence Limit around the Point Estimate]

Common Levels of Confidence
- Commonly used confidence levels are 90%, 95%, and 99%

Confidence Level   Confidence Coefficient, $1-\alpha$   $Z_{\alpha/2}$ value
80%                0.80                                 1.28
90%                0.90                                 1.645
95%                0.95                                 1.96
98%                0.98                                 2.33
99%                0.99                                 2.58
99.8%              0.998                                3.08
99.9%              0.999                                3.27

Intervals and Level of Confidence
[Figure: Sampling Distribution of the Mean with central area $1-\alpha$ and $\alpha/2$ in each tail, with confidence intervals from individual samples ($x_1$, $x_2$, ...) drawn beneath it]
- Intervals extend from $\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ to $\bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
- $(1-\alpha)100\%$ of intervals constructed contain $\mu$; $(\alpha)100\%$ do not.
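
As a rough illustration, the critical values for common confidence levels and the interval $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$ can be computed with Python's standard library. This is a sketch rather than part of the original slides: the function names `z_critical` and `z_interval` are made up for the example, and values computed this way can differ in the last digit from values read off a printed normal table.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(confidence):
    """Return Z_{alpha/2}, the standard normal value leaving alpha/2 in each tail."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def z_interval(xbar, sigma, n, confidence=0.95):
    """Return (lower, upper) for X-bar +/- Z_{alpha/2} * sigma / sqrt(n), sigma known."""
    half_width = z_critical(confidence) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

if __name__ == "__main__":
    for level in (0.80, 0.90, 0.95, 0.98, 0.99):
        print(f"{level:.0%}: Z = {z_critical(level):.3f}")
    # Cereal fill numbers used earlier in the chapter: X-bar = 362.3, sigma = 15, n = 25
    lower, upper = z_interval(362.3, 15, 25)
    print(f"95% interval: ({lower:.2f}, {upper:.2f})")
```

Passing the confidence level instead of a hard-coded 1.96 keeps the same function usable for 90% or 99% intervals.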

Example
- A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.
- Determine a 95% confidence interval for the true mean resistance of the population.
- Solution:

$$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.35/\sqrt{11}) = 2.20 \pm 0.2068$$

$$1.9932 \le \mu \le 2.4068$$

Interpretation
- We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms
- Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean

Do You Ever Truly Know $\sigma$?
- Probably not!
- In virtually all real world business situations, $\sigma$ is not known.
- If there is a situation where $\sigma$ is known then $\mu$ is also known (since to calculate $\sigma$ you need to know $\mu$).
- If you truly know $\mu$ there would be no need to gather a sample to estimate it.

Confidence Interval for $\mu$ ($\sigma$ Unknown)
- If the population standard deviation $\sigma$ is unknown, we can substitute the sample standard deviation, $S$
- This introduces extra uncertainty, since $S$ is variable from sample to sample
- So we use the t distribution instead of the normal distribution
- Assumptions
  - Population standard deviation is unknown
  - Population is normally distributed
  - If population is not normal, use large sample ($n > 30$)
- Use Student's t Distribution
- Confidence Interval Estimate:

$$\bar{X} \pm t_{\alpha/2}\frac{S}{\sqrt{n}}$$

  (where $t_{\alpha/2}$ is the critical value of the t distribution with $n-1$ degrees of freedom and an area of $\alpha/2$ in each tail)

Student's t Distribution
- The t is a family of distributions
- The $t_{\alpha/2}$ value depends on degrees of freedom (d.f.)
  - Number of observations that are free to
    vary after sample mean has been calculated
  - $\text{d.f.} = n - 1$

Degrees of Freedom (df)
Idea: Number of observations that are free to vary after sample mean has been calculated
Example: Suppose the mean of 3 numbers is 8.0
- Let $X_1 = 7$
- Let $X_2 = 8$
- What is $X_3$?
If the mean of these three values is 8.0, then $X_3$ must be 9 (i.e., $X_3$ is not free to vary)
Here, $n = 3$, so degrees of freedom $= n - 1 = 3 - 1 = 2$
(2 values can be any numbers, but the third is not free to vary for a given mean)

Student's t Distribution
[Figure: density curves centered at 0 for the Standard Normal (t with df = $\infty$), t (df = 13), and t (df = 5)]
- t-distributions are bell-shaped and symmetric, but have fatter tails than the normal
Note: $t \to Z$ as $n$ increases

Student's t Table

      Upper Tail Area
df    .10      .05      .025
1     3.078    6.314    12.706
2     1.886    2.920    4.303
3     1.638    2.353    3.182

The body of the table contains t values, not probabilities
Let: $n = 3$, $\text{df} = n - 1 = 2$, $\alpha = 0.10$, $\alpha/2 = 0.05$; then $t_{\alpha/2} = 2.920$
[Figure: t curve with upper tail area $\alpha/2 = 0.05$ to the right of $t = 2.920$]

Selected t distribution values
With comparison to the Z value

Confidence Level   t (10 d.f.)   t (20 d.f.)   t (30 d.f.)   Z ($\infty$ d.f.)
0.80               1.372         1.325         1.310         1.28
0.90               1.812         1.725         1.697         1.645
0.95               2.228         2.086         2.042         1.96
0.99               3.169         2.845         2.750         2.58

Note: $t \to Z$ as $n$ increases

Example of t distribution confidence interval
A random sample of $n = 25$ has $\bar{X} = 50$ and $S = 8$. Form a 95% confidence interval for $\mu$.
- $\text{d.f.} = n - 1 = 24$, so $t_{\alpha/2} = t_{0.025} = 2.0639$
- The confidence interval is

$$\bar{X} \pm t_{\alpha/2}\frac{S}{\sqrt{n}} = 50 \pm (2.0639)\frac{8}{\sqrt{25}}$$

$$46.698 \le \mu \le 53.302$$

- Interpreting this interval requires the assumption that the population you are sampling from is approximately a normal distribution (especially since n is only 25).
- This condition can be checked by creating a:
  - Normal probability plot or
  - Boxplot

Confidence Intervals for the Population Proportion, $\pi$
- An interval estimate for the population proportion ($\pi$) can be calculated by adding an allowance for uncertainty to the sample proportion ($p$)
- Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation

$$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$$

- We will estimate this with sample data:

$$\sqrt{\frac{p(1-p)}{n}}$$

Confidence Interval Endpoints
- Upper and lower confidence limits for the population proportion are calculated with the formula

$$p \pm Z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$

- where
  - $Z_{\alpha/2}$ is the standard normal value for the level of confidence desired
  - $p$ is the sample proportion
  - $n$ is the sample size
- Note: must have $np > 5$ and $n(1-p) > 5$

Example
- A random sample of 100 people shows that 25 are left-handed.
- Form a 95% confidence interval for the true proportion of left-handers.
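
The endpoint formula can be applied to the left-hander example in a few lines of Python. This is a minimal sketch, not the slides' own method: the function name `proportion_interval` is hypothetical, and the sample-size check assumes the rule of thumb is strict ($np > 5$ and $n(1-p) > 5$).

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for p +/- Z_{alpha/2} * sqrt(p(1-p)/n)."""
    p = successes / n
    # Assumed rule of thumb for the normal approximation: np > 5 and n(1-p) > 5.
    if n * p <= 5 or n * (1 - p) <= 5:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

if __name__ == "__main__":
    # 25 left-handers in a random sample of 100 people
    lower, upper = proportion_interval(25, 100)
    print(f"95% interval for the proportion: ({lower:.4f}, {upper:.4f})")
```

Raising an error when the check fails is one possible design choice; it makes the assumption behind the normal approximation hard to overlook.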

Solution:

$$p \pm Z_{\alpha/2}\sqrt{p(1-p)/n} = 25/100 \pm 1.96\sqrt{0.25(0.75)/100} = 0.25 \pm 1.96\,(0.0433)$$

$$0.1651 \le \pi \le 0.3349$$

Interpretation
- We are 95% confident that the true percentage of left-handers in the population is between 16.51% and 33.49%.
- Although the interval from 0.1651 to 0.3349 may or may not contain the true proportion, 95% of intervals formed from samples of size 100 in this manner will contain the true proportion.

Determining Sample Size
[Figure: Determining Sample Size, branching to For the Mean and For the Proportion]

Sampling Error
- The required sample size can be found to reach a desired margin of error ($e$) with a specified level of confidence $(1-\alpha)$
- The margin of error is also called sampling error
  - the amount of imprecision in the estimate of the population parameter
  - the amount added and subtracted to the point estimate to form the confidence interval

Determining Sample Size For the Mean

$$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Sampling error (margin of error):

$$e = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Now solve for $n$ to get

$$n = \frac{Z_{\alpha/2}^{2}\,\sigma^{2}}{e^{2}}$$

- To determine the required sample size for the mean, you must know:
  - The desired level of confidence $(1-\alpha)$, which determines the critical value, $Z_{\alpha/2}$
  - The acceptable sampling error, $e$
  - The standard deviation, $\sigma$

Required Sample Size Example
If $\sigma = 45$, what sample size is needed to estimate the mean within $\pm 5$ with 90% confidence?

$$n = \frac{Z_{\alpha/2}^{2}\,\sigma^{2}}{e^{2}} = \frac{(1.645)^{2}(45)^{2}}{5^{2}} = 219.19$$

So the required sample size is $n = 220$ (always round up)
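
The sample size calculation for the mean can be sketched the same way. The helper name `sample_size_for_mean` is hypothetical; it computes the critical value from the confidence level rather than looking it up, and rounds up as the example does.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence):
    """Return n = Z^2 * sigma^2 / e^2, rounded up to the next whole observation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / e) ** 2)

if __name__ == "__main__":
    # sigma = 45, margin of error 5, 90% confidence (the example above)
    print(sample_size_for_mean(sigma=45, e=5, confidence=0.90))
```

Rounding up rather than to the nearest integer keeps the achieved margin of error no larger than the one requested.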

If $\sigma$ is unknown
- If unknown, $\sigma$ can be estimated when using the required sample size formula
  - Use a value for $\sigma$ that is expected to be at least as large as the true $\sigma$
  - Select a pilot sample and estimate $\sigma$ with the sample standard deviation, $S$

Determining Sample Size For the Proportion

$$e = Z_{\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}}$$

Now solve for $n$ to get

$$n = \frac{Z_{\alpha/2}^{2}\,\pi(1-\pi)}{e^{2}}$$

- To determine the required sample size for the proportion, you must know:
  - The desired level of confidence $(1-\alpha)$, which determines the critical value, $Z_{\alpha/2}$
  - The acceptable sampling error, $e$
  - The true proportion of events of interest, $\pi$
    - $\pi$ can be estimated with a pilot sample if necessary (or conservatively use 0.5 as an estimate of $\pi$)

Required Sample Size Example
How large a sample would be necessary to estimate the true proportion of defectives in a large population within $\pm 3\%$, with 95% confidence? (Assume a pilot sample yields $p = 0.12$)

Solution:
For 95% confidence, use $Z_{\alpha/2} = 1.96$
$e = 0.03$
$p = 0.12$, so use this to estimate $\pi$

$$n = \frac{Z_{\alpha/2}^{2}\,\pi(1-\pi)}{e^{2}} = \frac{(1.96)^{2}(0.12)(1-0.12)}{(0.03)^{2}} = 450.74$$

So use $n = 451$

Ethical Issues
- A confidence interval estimate (reflecting sampling error) should always be included when reporting a point estimate
- The level of confidence should always be reported
- The sample size should be reported
- An interpretation of the confidence interval estimate should also be provided

Chapter Summary
In this chapter we discussed:
- The construction and interpretation of confidence interval estimates for the population mean and
  the population proportion
- The determination of the sample size necessary to develop a confidence interval for the population mean or population proportion

On Line Topic: Bootstrapping
Chapter 8

Bootstrapping Is A Method To Use When Population Is Not Normal
To estimate a population mean using bootstrapping, you would:
1. Select a random sample of size n without replacement from a population of size N.
2. Resample the initial sample by selecting n values with replacement from the initial sample.
3. Compute $\bar{X}$ from this resample.
4. Repeat steps 2 & 3 m different times.
5. Construct the resampling distribution of
   $\bar{X}$.
6. Construct an ordered array of the entire set of resampled $\bar{X}$s.
7. In this ordered array find the value that cuts off the smallest $\alpha/2\,(100\%)$ and the value that cuts off the largest $\alpha/2\,(100\%)$. These values provide the lower and upper limits of the bootstrap confidence interval estimate of $\mu$.

Bootstrapping Requires The Use of Software Such As Minitab or JMP
- Typically a very large number (thousands) of resamples are used.
- Software is needed to:
  - Automate the resampling process
  - Calculate the appropriate sample statistic
  - Create the ordered array
  - Find the lower and upper confidence limits

Bootstrapping Example - Processing Time of Life Insurance Applications
Sample of 27 times taken without replacement from population:
73 19 16 64 28 28 31 90 60 56 31 56 22 18 45 48 17 17 17 91 92 63 50 51 69 16 17
- From the boxplot, conclude that the population is not normal, so a t confidence interval is not appropriate.
- Use bootstrapping to form a confidence interval for $\mu$.

Comparing the original sample to the first resample with replacement
Sample of 27 times taken without replacement from population:
73 19 16 64 28 28 31 90 60 56 31 56 22 18 45 48 17 17 17 91 92 63 50 51 69 16 17
First resample, selected with replacement from the initial sample:
16 16 16 17 17 17 17 17 19 22 28 31 31 51 56 56 60 60 64 64 64 64 69 73 73 90 92
- The initial bootstrap resample omits some values (18, 45, 50, 63, and 91) that appear in the initial sample above.
- Note that the value of 73 appears twice even though it appears only once in the initial sample.

The Ordered Array of Sample Means for 100 Resamples
31.5926 33.9259 35.4074 36.5185 36.6296 36.9630 37.0370 37.0741 37.1481 37.3704
37.9259 38.1111 38.1481 38.2222 38.2963 38.7407 38.8148 38.8519 38.8889 39.0000
39.1852 39.3333 39.3704 39.6667 40.1481 40.5185 40.6296 40.9259 40.9630 41.2593
41.2963 41.7037 41.8889 42.0741 42.1111 42.1852 42.8519 43.0741 43.1852 43.3704
43.4444 43.7037 43.8148 43.8519 43.8519 43.9259 43.9630 44.1481 44.4074 44.5556
44.7778 45.0000 45.4444 45.5185 45.5556 45.6667 45.7407 45.8519 45.9630 45.9630
46.0000 46.1111 46.2963 46.2963 46.3333 46.3333 46.4815 46.6667 46.7407 46.9630
47.0741 47.2222 47.2963 47.3704 47.4815 47.4815 47.5556 47.6667 47.8519 48.5185
48.8889 49.0000 49.2222 49.4444 49.4815 49.4815 49.6296 4
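
The seven bootstrapping steps listed earlier can be sketched in Python using the 27 processing times from the example. This is an illustration only, not the Minitab or JMP procedure: the function name `bootstrap_mean_interval`, the 10,000 resamples, the fixed seed, and the simple index-based percentile cut are all assumptions made for the sketch.

```python
import random

# Processing times from the life insurance example (the initial sample, step 1)
TIMES = [73, 19, 16, 64, 28, 28, 31, 90, 60, 56, 31, 56, 22, 18,
         45, 48, 17, 17, 17, 91, 92, 63, 50, 51, 69, 16, 17]

def bootstrap_mean_interval(sample, confidence=0.95, resamples=10_000, seed=0):
    """Percentile bootstrap interval for the mean, following steps 2 to 7."""
    rng = random.Random(seed)
    n = len(sample)
    # Steps 2-4: draw n values with replacement and record each resample mean.
    means = [sum(rng.choices(sample, k=n)) / n for _ in range(resamples)]
    # Steps 5-6: the ordered array of resample means.
    means.sort()
    # Step 7: cut off the smallest and the largest alpha/2 share of the means.
    alpha = 1.0 - confidence
    cut = int(resamples * alpha / 2)
    return means[cut], means[resamples - cut - 1]

if __name__ == "__main__":
    lower, upper = bootstrap_mean_interval(TIMES)
    print(f"Sample mean: {sum(TIMES) / len(TIMES):.4f}")
    print(f"95% bootstrap interval: ({lower:.4f}, {upper:.4f})")
```

The exact limits depend on the seed and the number of resamples, so they will not reproduce the 100 resample means tabulated above.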