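1. Significance
Whether a population is spatially aggregated or randomly distributed is one of its ecological characteristics. Knowing which applies determines the sampling method and provides its theoretical basis.

2. Classification
Random distribution: Poisson distribution
Aggregated distribution: negative binomial distribution, Neyman distribution, Poisson-binomial distribution

The simplest view of spatial patterning can be obtained by adopting an individual orientation and asking the question: given the location of one individual, what is the probability that another individual is nearby? There are three possibilities:
1. This probability is increased: aggregated pattern
2. This probability is reduced: uniform pattern
3. This probability is unaffected: random pattern

Figure 4.3 Three possible types of spatial patterning of individual animals or plants in a population (panels: Random, Aggregated, Uniform).

3. Theoretical formulas for frequency distributions
(1) Poisson distribution

$$p(k;\lambda) = \frac{\lambda^{k}}{k!}\,e^{-\lambda}, \qquad k = 0, 1, 2, \ldots$$

where $\lambda$ is the parameter.

Example: the field distribution of locust nymphs.

Example: bus passenger flow was surveyed between about 10:30 and 11:47 one morning by counting the number of batches of passengers arriving in each 20-second interval, giving 230 records in total.

Batches arriving, i                               0      1      2      3      4      Total
Frequency, n_i                                    100    81     34     9      6      230
Relative frequency, f_i = n_i / n                 0.43   0.35   0.15   0.04   0.03
Poisson probability, p_i = λ^i e^{-λ} / i!        0.42   0.36   0.16   0.05   0.01

with $\lambda$ taken as

$$\lambda = \frac{100 \times 0 + 81 \times 1 + 34 \times 2 + 9 \times 3 + 6 \times 4}{230} = 0.869 \approx 0.87$$

Many random phenomena have been found to follow the Poisson distribution:
(1) Social life and service industries, e.g. the number of calls arriving at a telephone exchange, or the number of passengers arriving at a bus stop
(2) Physics: the number of particles from radioactive decay that fall in a given region
(3) The spatial distribution of individual insects

Taking calls at a telephone exchange as the example, the conditions are:
(1) Stationarity: the number of calls arriving in $[t_0, t_0 + t)$ depends only on the length $t$ of the interval and not on its starting time $t_0$.
(2) Independent increments (no after-effect): the event that $k$ calls arrive in $[t_0, t_0 + t)$ is independent of events that occurred before time $t_0$.
(3) Orderliness: in a sufficiently small time interval, at most one call arrives.

For the locust nymph example, the counts were:

Insects per sample, x    Frequency, f    f·x
0                        225             0
1                        130             130
2                        40              80
3                        10              30
4                        3               12
Total                    408             252

$$\bar{x} = \frac{\sum f x}{\sum f} = \frac{252}{408} = 0.618, \qquad p_0 = \frac{\lambda^{0}}{0!}\,e^{-\lambda} = e^{-0.618} = 0.5391$$

Expected number of samples with zero insects: $n\,p_0 = 408 \times 0.5391 = 219.9$
Expected number of samples with one insect: $n\,p_1 = 135.9$

Insects per sample, x    Observed (o)    Expected (c)    (o − c)²/c
0                        225             219.9           0.11
1                        130             135.9           0.26
2                        40              42.2            0.09
3                        10              8.7             0.21
4                        3               1.3             2.22
Total                                                    2.89

$$\chi^2 = \sum \frac{(o - c)^2}{c} = 2.89$$

From the table, with 3 degrees of freedom, $\chi^2_{0.05} = 7.815$. The computed $\chi^2 = 2.89 < \chi^2_{0.05}$, so the result is not a small-probability event ($p > 0.05$) and there is no reason to reject the hypothesis that the counts follow a Poisson distribution.

The $\chi^2$ test for discrete data
Pearson proposed using $\chi^2$ as a measure of the deviation between the actual counts (observed values)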
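and the predicted counts (theoretical values). It is defined as

$$\chi^2 = \sum \frac{(\text{actual count} - \text{predicted count})^2}{\text{predicted count}}$$

The predicted count in every class must be at least 5. When the predicted count of a class is below 5, that class must be merged with one or more neighbouring classes until the predicted count is no longer below 5, and only then is $\chi^2$ computed from the formula above.

Theory and method of the $\chi^2$ test

1. Formula

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

$O$ is the observed value and $E$ is the theoretically expected value. The basic principle is that the size of $\chi^2$ is determined by how far the observed values deviate from the theoretically expected values. The null hypothesis $H_0$ is that the frequencies of the theoretical distribution and of the observed distribution are the same; the alternative $H_1$ is that they differ, that is, the two samples come from different populations.

2. Properties of the $\chi^2$ distribution (figure: density curves for df = 1, df = 3, df = 5)
(1) The distribution lies on the interval $[0, +\infty)$. Its skewness increases as the degrees of freedom decrease, and when df = 1 the curve has the vertical axis as an asymptote.
(2) As the degrees of freedom df increase, the distribution becomes more nearly symmetric, and when df > 30 it is close to the normal distribution.

3. Basic steps of the $\chi^2$ test
(1) State the hypotheses $H_0$ and $H_1$ and fix the significance level, $\alpha = 0.05$.
(2) Compute the test statistic $\chi^2 = \sum_{i=1}^{k} (O_i - E_i)^2 / E_i$.
(3) Determine the probability $P$: compute the degrees of freedom $df = k - 1$ and, from $\alpha$ and df, look up the critical value $\chi^2_{\alpha, df}$ in the statistical table.
(4) Reach a decision by comparing $\chi^2$ with the critical value:

χ² value                 P          H0              Conclusion
χ² < χ²(0.05, df)        > 0.05     not rejected    difference not significant
χ² ≥ χ²(0.05, df)        ≤ 0.05     rejected        difference significant

Example: suppose the sex ratio of babies born in a certain area is assumed to be 1:1. A researcher draws a sample of 10,000 babies and finds 5100 boys and 4900 girls. Has the hypothesis been confirmed or refuted?

$H_0$: the sex ratio at birth in the area is 1:1
$H_1$: the sex ratio at birth is not 1:1

$$\chi^2 = \frac{(5100 - 5000)^2}{5000} + \frac{(4900 - 5000)^2}{5000} = 4$$

Since $\chi^2_{0.05,1} = 3.84$ and $\chi^2 = 4 > \chi^2_{0.05,1}$, $H_0$ is rejected.

Note: when df = 1, a continuity correction is required. The corrected statistic $\chi^2_c$ is

$$\chi^2_c = \sum_{i=1}^{k} \frac{(|O_i - E_i| - 0.5)^2}{E_i}$$

Goodness-of-fit test
A hypothesis test that compares observed counts with theoretical counts is called a goodness-of-fit test. In genetics, for example, the $\chi^2$ test is commonly used to check whether results agree with Mendel's law of segregation, the law of independent assortment, and so on.

Example: in a carp genetics experiment, red purse carp (red) were crossed with Xiangjiang wild carp (blue-grey). The numbers of fish of each body colour in the $F_2$ generation are listed in Table 5-2. Do the observed values agree with the Mendelian 3:1 blue-grey to red ratio expected for a single pair of alleles?

Table 5-2 Observed F2 results of the carp genetics experiment

Body colour             Blue-grey    Red    Total
F2 observed number      1503         99     1602

(1) $H_0$: the $F_2$ segregation of carp body colour fits a 3:1 ratio. $H_A$: it does not fit a 3:1 ratio.
(2) Take the significance level $\alpha = 0.05$.
(3) Compute the expected numbers and the corrected statistic:
Expected blue-grey: $E_1 = 1602 \times 3/4 = 1201.5$
Expected red: $E_2 = 1602 \times 1/4 = 400.5$

$$\chi^2_c = \frac{(|1503 - 1201.5| - 0.5)^2}{1201.5} + \frac{(|99 - 400.5| - 0.5)^2}{400.5} = 301.63$$

(4) Look up the $\chi^2$ table. With df = 1, $\chi^2_{0.05,1} = 3.84$, so $\chi^2_c > \chi^2_{0.05,1}$. $H_0$ is rejected and $H_A$ accepted: the $F_2$ segregation of carp body colour does not fit a 3:1 ratio.

The positive binomial distribution consists of the terms of the expansion of $(p + q)^n$, where $n$ is the total number of individuals and $p$ and $q$ are the expected proportions of the two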
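contrasting classes.

Student (1907): the negative binomial distribution is given by the expansion of $(q - p)^{-k}$, where $p = m/k$, $q = 1 + p$, and $m$ is the population mean. Expanding this expression, the probability that a sample unit contains $r$ individuals is

$$P_r = \frac{(k + r - 1)!}{r!\,(k - 1)!}\,\frac{p^{r}}{q^{k + r}}$$

$p$ and $k$ can be estimated by the method of moments:

$$p = \frac{s^2}{\bar{x}} - 1, \qquad k = \frac{\bar{x}}{p}$$

From this it follows that

$$V = m + \frac{m^2}{k}$$

where $V$ is the variance and $m$ the mean. As $k \to \infty$, $V \to m$ and the negative binomial tends to the Poisson.

Define

$$C = \frac{\sum (x_i - \bar{x})^2}{\bar{x}\,(n - 1)} = \frac{s^2}{\bar{x}}$$

$C$ is approximately normally distributed with mean 1 and variance $2n/(n-1)^2$, so the 95% confidence interval for $C$ is

$$1 \pm 2\sqrt{\frac{2n}{(n-1)^2}}$$

If $C$ falls inside this interval the distribution is random; if it falls outside, the distribution is aggregated.

For the locust nymph example above, $s^2 = 0.669$ and $\bar{x} = 0.618$, so

$$I = \frac{s^2}{\bar{x}} = \frac{0.669}{0.618} = 1.08, \qquad 2\sqrt{\frac{2n}{(n-1)^2}} = 2\sqrt{\frac{816}{165649}} = 2 \times 0.071 = 0.14$$

Since $0.86 < 1.08 < 1.14$, the locust nymphs follow a Poisson distribution.

A related index is $I = V/m - 1$: $I = 0$ indicates a random distribution and $I > 0$ an aggregated distribution. When random mortality reduces the population to a given fraction of its original size, this index of aggregation falls to the same fraction of its original value.

Index of Dispersion Test. We define an index of dispersion $I$ to be

$$I = \frac{\text{Observed variance}}{\text{Observed mean}} = \frac{s^2}{\bar{x}} \qquad (4.3)$$

For the theoretical Poisson distribution, the variance equals the mean, so the expected value of $I$ is always 1.0 in a Poisson world. The simplest test statistic for the index of dispersion is a chi-squared one:

$$\chi^2 = I\,(n - 1)$$

where $I$ = Index of dispersion (as defined in equation 4.3)
$n$ = Number of quadrats counted
$\chi^2$ = Value of chi-squared with $(n - 1)$ degrees of freedom

Example: 25 samples were taken to study the field distribution of earthworms.

Count per sample    0    1    2    3    4    5    6    Total
Frequency           4    8    2    5    2    3    1    25

$n = 25$, $\bar{x} = 2.24$, $s = 1.809$, $s^2 = 3.27$

$$I = \frac{s^2}{\bar{x}} = \frac{3.27}{2.24} = 1.46, \qquad \chi^2 = I\,(n - 1) = 1.46 \times (25 - 1) = 35.0$$

Since $\chi^2_{0.975} = 12.40 <$ observed chi-squared $= 35.0 < \chi^2_{0.025} = 39.36$, we accept the null hypothesis: the field distribution of earthworms fits a Poisson distribution.

The parameter $k$ of the negative binomial distribution

$$V = m + \frac{m^2}{k}$$

As $k \to \infty$, $V = m$, the negative binomial tends to the Poisson, and individuals are distributed completely at random. As $k \to 0$, $V \to \infty$: the population is distributed extremely unevenly and aggregation is extremely high. $1/k$ is therefore used as a measure of aggregation.
A property of $k$: when population density falls because of random mortality, $k$ stays unchanged. It expresses an intrinsic feature of the population's spatial distribution and is independent of density.

Taylor's power law. From a large body of biological data Taylor derived the relationship

$$s^2 = a\,m^{b}, \qquad \log s^2 = \log a + b \log m$$

When $\log a = 0$ and $b = 1$, $s^2 = m$: the population is randomly distributed at all densities.
When $\log a > 0$ and $b = 1$, $s^2 = a\,m$: the population is aggregated at all densities, but the degree of aggregation is not density dependent.
When $\log a > 0$ and $b > 1$: the population is aggregated at all densities, and the aggregation is density dependent.
When $b < 1$, $s^2/m = a\,m^{b-1} = a/m^{1-b}$, so the higher the density, the more uniform the distribution (the lower the aggregation).

Mean crowding

$$m^* = \frac{\sum_{j=1}^{n} x_j\,(x_j - 1)}{\sum_{j=1}^{n} x_j}$$

Example: four quadrats a, b, c and d contain 1, 0, 2 and 3 individuals, so $x_1 = 1$, $x_2 = 0$, $x_3 = 2$, $x_4 = 3$ and $n = 4$.
A: one individual living alone, $1 \times (1 - 1) = 0$
B: no individuals, so no neighbours
C: two individuals, each the other's neighbour, $2 \times (2 - 1) = 2$
D: three individuals, each with two neighbours, $3 \times (3 - 1) = 6$
The total number of "neighbours" is $0 + 0 + 2 + 6 = 8$, so

$$m^* = \frac{8}{6} = 1.33$$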
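The dispersion indices worked through above can be checked with a short Python sketch. It assumes the per-quadrat counts are available as a plain list; the function names are illustrative and do not come from the source. The sample variance uses the n − 1 divisor, which reproduces the values quoted in the examples.

```python
def expand_counts(freq):
    """Turn a {count: frequency} table into a flat list of per-sample counts."""
    return [x for x, f in sorted(freq.items()) for _ in range(f)]


def mean_and_variance(counts):
    """Sample mean and sample variance (n - 1 divisor)."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((x - mean) ** 2 for x in counts) / (n - 1)
    return mean, var


def index_of_dispersion(counts):
    """I = s^2 / x-bar (equation 4.3); about 1.0 for a Poisson pattern."""
    mean, var = mean_and_variance(counts)
    return var / mean


def dispersion_chi2(counts):
    """Chi-squared statistic I * (n - 1), with n - 1 degrees of freedom."""
    return index_of_dispersion(counts) * (len(counts) - 1)


def mean_crowding(counts):
    """m* = sum x(x - 1) / sum x: mean number of neighbours per individual."""
    return sum(x * (x - 1) for x in counts) / sum(counts)


# Earthworm example: 25 samples with the frequencies given in the text.
earthworms = expand_counts({0: 4, 1: 8, 2: 2, 3: 5, 4: 2, 5: 3, 6: 1})
print(index_of_dispersion(earthworms))  # about 1.46
print(dispersion_chi2(earthworms))      # about 35.07 before rounding I

# Four-quadrat example a, b, c, d with 1, 0, 2 and 3 individuals.
print(mean_crowding([1, 0, 2, 3]))      # 8 / 6
```

The chi-squared value comes out near 35.07 rather than exactly 35.0 because the text rounds I to 1.46 before multiplying by n − 1. The conclusion is unchanged, since the value still lies between 12.40 and 39.36.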
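In this example each individual has, on average, 1.33 neighbours.

It can be shown that

$$m^* = m + \left(\frac{s^2}{m} - 1\right)$$

For a random distribution $s^2 = m$, so $m^* = m$.

Index of aggregation $m^*/m$:
$m^*/m = 1$: random distribution
$m^*/m < 1$: uniform distribution
$m^*/m > 1$: aggregated distribution

6. Iwao (1968, 1972): the $m^*$ on $m$ regression method
Iwao found that $m^*$ is linearly related to $m$:

$$m^* = \alpha + \beta\,m$$

(for a random distribution, $\alpha = 0$ and $\beta = 1$).

[...] the article states: "Any experiment may be regarded as an individual from a population of many experiments that might be performed under the same conditions. A series of experiments is then a sample drawn from this population."

1. Population and sampling
Suppose a cotton field contains $N$ cotton plants, and the numbers of a certain pest on the plants are $X_1, X_2, \ldots, X_N$.

Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$
Population variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2$
Population standard deviation: $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2}$

From the population of $N$ plants, a sample of $n$ plants ($n < N$) is drawn at random, with pest counts $x_1, x_2, \ldots, x_n$.

Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Sample variance: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$
Sample standard deviation: $S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$

Aim: to draw inferences about the population from the sample.

In 1908 "Student" published the t distribution:

$$t = \frac{\bar{x} - \mu}{S_{\bar{x}}}, \qquad \text{d.f.} = n - 1$$

(1) As $n \to \infty$, the t distribution tends to the normal distribution.
(2) The t distribution is symmetric, and the centre line of its curve is at $t = 0$.

Example: 50 cotton plants are surveyed at random to estimate the number of pests in the field.

Mean number of insects per plant: $\bar{x} = \frac{1}{n}\sum x_i = \frac{287}{50} = 5.74$
Variance: $S^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2 = 20.95$
Standard deviation: $S = \sqrt{20.95} = 4.57$
Standard error: $S_{\bar{x}} = \frac{S}{\sqrt{n}} = \frac{4.57}{\sqrt{50}} = 0.646$

$$\bar{x} - t_{0.05} S_{\bar{x}} \le \mu \le \bar{x} + t_{0.05} S_{\bar{x}}: \qquad 5.74 - 2 \times 0.64 \le \mu \le 5.74 + 2 \times 0.64, \qquad 4.46 \le \mu \le 7.02$$

Sample size for a given allowable error
With allowable error $d$,

$$d = t\,S_{\bar{x}} = \frac{t\,s}{\sqrt{n}}, \qquad n = \frac{t^2 s^2}{d^2}$$

At a probability of 95%, $t \approx 2$, so $n = 4 s^2 / d^2$.

Example: locust area at Hongze Lake

Insects per sample, x    Number of samples, f    f·x
0                        17                      0
1                        53                      53
2                        18                      36
3                        10                      30
4                        2                       8
Total                    100                     127

$\bar{x} = 127/100 = 1.27$, $s^2 = 0.865$.
With $\alpha = 0.05$, $t = 2$ and allowable error $d = 0.5$: $n = \frac{4 \times 0.865}{0.25} = 13.84$
If $d = 0.1$: $n = \frac{4 \times 0.865}{0.01} = 346$

If we introduce the coefficient of variation $CV = s/\bar{x}$, where $s$ = standard deviation and $\bar{x}$ = observed mean, then the absolute error $d = t\,s_{\bar{x}}$ can be written as a relative error (in percentage form):

$$r = \frac{100\,t\,s_{\bar{x}}}{\bar{x}} = \frac{100\,t\,s}{\bar{x}\sqrt{n}}$$

Solving for $n$:

$$n = \frac{100^2\,t^2 s^2}{\bar{x}^2 r^2} = \left(\frac{100\,CV\,t}{r}\right)^2 \qquad (1)$$

and, with $t_{0.05} \approx 2$,

$$n \approx \left(\frac{200\,CV}{r}\right)^2$$

Comparison of two means
Suppose we want to compare the weights of the same fish species in two ponds. The typical approach is to draw a certain number of fish from each pond and use a t-test to test whether the two sample means differ. But how can we decide, before sampling, how many samples to take? Snedecor and Cochran (1967, 113) proposed the following approximate formula, generally for $n > 50$:

$$n \approx \frac{2\,(Z_\alpha + Z_\beta)^2 s^2}{d^2}$$

where $n$ = Sample size to be drawn from each of the two populations
$Z_\alpha$ = Standard normal deviate for level $\alpha$ ($Z_{0.05} = 1.96$; $Z_{0.01} = 2.576$)
$Z_\beta$ = Standard normal deviate for the Type II error probability $\beta$ (see the table below)
$s^2$ = Variance of the measurements (known or estimated)
$d = \mu_A - \mu_B$ = Smallest difference between the two means that you wish to detect with probability $1 - \beta$

Type II error (β)    Power (1 − β)    Two-tailed Z_β
0.40                 0.60             0.25
0.20                 0.80             0.84
0.10                 0.90             1.28
0.05                 0.95             1.64
0.01                 0.99             2.33
0.001                0.999            2.58

Example: suppose that in the case above the difference in means we wish to detect is $d = \mu_A - \mu_B = 4$ g, with $s = 9.4$ g known from earlier studies. If $\alpha = 0.01$ and $\beta = 0.05$, then

$$n \approx \frac{2\,(2.576 + 1.64)^2 (9.4)^2}{4^2} = 196.3 \text{ fish}$$

2. SAMPLE SIZE FOR DISCRETE VARIABLES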
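Before turning to counts, the sample-size formulas for means given in the preceding section can be collected into a short Python sketch, which also serves as a baseline for the count-based formulas that follow. The function names are hypothetical, and the default t = 2.0 is the approximation used in the text for α = 0.05.

```python
import math


def n_for_absolute_error(s2, d, t=2.0):
    """n = t^2 s^2 / d^2 for an allowable absolute error d."""
    return t ** 2 * s2 / d ** 2


def n_for_relative_error(cv, r, t=2.0):
    """n = (100 CV t / r)^2 for a relative error r given in percent."""
    return (100.0 * cv * t / r) ** 2


def n_two_means(s2, d, z_alpha, z_beta):
    """Approximate n per group: 2 (Z_alpha + Z_beta)^2 s^2 / d^2."""
    return 2.0 * (z_alpha + z_beta) ** 2 * s2 / d ** 2


# Hongze Lake locust example: s^2 = 0.865, mean = 1.27.
print(n_for_absolute_error(0.865, 0.5))  # about 13.84
print(n_for_absolute_error(0.865, 0.1))  # about 346

# The same case expressed as a relative error: d = 0.5 is 100 * 0.5 / 1.27 percent.
cv = math.sqrt(0.865) / 1.27
print(n_for_relative_error(cv, 100 * 0.5 / 1.27))  # again about 13.84

# Two-pond fish example: s = 9.4 g, d = 4 g, alpha = 0.01, beta = 0.05.
print(n_two_means(9.4 ** 2, 4.0, 2.576, 1.64))  # about 196.3
```

The absolute and relative forms give the same answer for the locust data, which is a quick check that the two formulas are algebraically equivalent.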
Counts of the numbers of plants in a quadrat or the numbers of eggs in a nest differ from continuous variables in their statistical properties. The frequency distribution of counts will often be described by either the binomial distribution, the Poisson distribution or the negative binomial distribution (Elliott 1977). The sampling properties of these distributions differ, so we require a different approach to estimating sample sizes needed for counts.

(1) Proportions and Percentages
Proportions like the sex ratio or fraction of juveniles in a population are described statistically by the binomial distribution. All the organisms are classified into two classes, and the distribution has only two parameters:
$p$ = Proportion of one type in the population
$q = 1 - p$ = Proportion of the other type in the population

If sample size is above 20, we can use the normal approximation to the confidence interval:

$$\hat{p} \pm t_\alpha\,s_{\hat{p}}$$

where $\hat{p}$ = Observed proportion
$t_\alpha$ = Value of Student's t-distribution for $n - 1$ degrees of freedom
$s_{\hat{p}}$ = Standard error of $\hat{p}$ $= \sqrt{\hat{p}\hat{q}/n}$

Thus the desired margin of error is

$$d = t_\alpha\,s_{\hat{p}} = t_\alpha \sqrt{\frac{\hat{p}\hat{q}}{n}}$$

Solving for $n$, the sample size required is

$$n = \frac{t_\alpha^2\,\hat{p}\hat{q}}{d^2}$$

where $n$ = Sample size needed for estimating the proportion $p$
$d$ = Desired margin of error in our estimate

As a
first approximation for $t_\alpha$ we can use 2.0. We need to have an approximate value of $p$ to use in this equation. Prior information, or a guess, should be used; the only rule of thumb is that when in doubt, pick a value of $p$ closer to 0.5 than you guess. This will make your answer conservative.

As an example, suppose you wish to estimate the sex ratio of a deer population. You expect $p$ to be about 0.40, and you would like to estimate $p$ within an error limit of $\pm 0.02$ with $\alpha = 0.05$. Taking $t_{0.05} \approx 2.0$, the equation above gives

$$n = \frac{(2.0)^2 (0.40)(1 - 0.40)}{(0.02)^2} = 2400 \text{ deer}$$

(2) Counts from a Poisson Distribution
Sample size estimation is very simple for any variable
that can be described by the Poisson distribution, in which the variance equals the mean. From this it follows that

$$CV = \frac{s}{\bar{x}} = \frac{\sqrt{\bar{x}}}{\bar{x}} \qquad \text{or} \qquad CV = \frac{1}{\sqrt{\bar{x}}}$$

Thus from equation (1), assuming $\alpha = 0.05$:

$$n \approx \left(\frac{200\,CV}{r}\right)^2 = \left(\frac{200}{r}\right)^2 \frac{1}{\bar{x}} \qquad (2)$$

where $n$ = Sample size required for a Poisson variable
$r$ = Desired relative error (as percentage)
$CV$ = Coefficient of variation $= 1/\sqrt{\bar{x}}$

For example, if you are counting eggs in starling nests and know that these counts fit a Poisson distribution and that the mean is about 6.0, then if you wish to estimate this mean with precision of $\pm 5\%$ (width of confidence interval), you have:

$$n \approx \left(\frac{200}{5}\right)^2 \frac{1}{6.0} = 266.7 \text{ nests}$$

Equation (2) can be simplified for the normal range of relative errors as follows:
For $\pm 10\%$ precision: $n \approx 400/\bar{x}$

3. STATISTICAL POWER ANALYSIS

                                     Decision
State of real world                  Do not reject null hypothesis            Reject the null hypothesis
Null hypothesis is actually true     Correct decision (probability = 1 − α)   Type I error (probability = α)
Null hypothesis is actually false    Type II error (probability = β)          Correct decision (probability = (1 − β) = power)

Most ecologists worry about $\alpha$, the probability of a Type I error, but there is abundant evidence now that we should worry just as much or more about $\beta$, the probability of a Type II error (Peterman 1990; Fairweather 1991). Power analysis can
be carried out before you begin your study (a priori, or prospective power analysis) or after you have finished (retrospective power analysis). Here we discuss a priori power analysis as it is used for the planning of experiments. Thomas (1997) discussed retrospective power analysis.

The key point you should remember is that there are four variables affecting any statistical inference:
Sample size
Probability of a Type I error ($\alpha$)
Probability of a Type II error ($\beta$)
Magnitude of the effect = effect size

These four variables are interconnected, and once any three of them are fixed, the fourth is automatically determined. Looked at from another perspective, given any three of these, you can determine the fourth.

SUMMARY
The most common question in ecological research is: how large a sample should I take? This chapter attempts to give a general answer to this question by providing a series of equations from which sample size may
be calculated. It is always necessary to know something about the population you wish to analyze unless you use guesswork or prior observations. You must also make some explicit decision about how much error you will allow in your estimates (or how small a confidence interval you wish to have).

For continuous variables like weight or length, we can assume a normal distribution and calculate the required sample sizes for means and for variances quite precisely. For counts, we need to know the underlying statistical distribution (binomial, Poisson, or negative binomial) before we can specify sample sizes needed.

Power analysis explores the relationships between the four interconnected variables $\alpha$ (probability of Type I error), $\beta$ (probability of Type II error), effect size, and sample size. Fixing three of these automatically fixes the fourth, and ecologists should explore these relationships before they begin their
experiments. Significant effect sizes should be specified on ecological grounds before a study is begun.

Sampling Designs: Random, Adaptive and Systematic Sampling
(1) Simple Random Sampling
(2) Stratified Random Sampling
(3) Adaptive Sampling
(4) Systematic Sampling

Simple random sampling is the easiest and most common sampling design. Each possible sample unit must have an equal chance of being selected to obtain a random sample. All the formulas of statistics are based on random sampling, and probability theory is the foundation of statistics. Thus you should always sample randomly when you have a choice.

In some cases the statistical population is finite in size, and the idea of a finite population correction must be added into formulas for variances and standard errors. These formulas are reviewed for measurements, ratios, and proportions.

Often a statistical population can be subdivided into homogeneous
subpopulations, and random sampling can be applied to each subpopulation separately. This is stratified random sampling, and represents the single most powerful sampling design that ecologists can adopt in the field with relative ease. Stratified sampling is almost always more precise than simple random
sampling, and every ecologist should use it whenever possible. Sample size allocation in stratified sampling can be determined using proportional or optimal allocation. To use optimal allocation, you need rough estimates of the variances in each of the strata and the cost of sampling each stratum. Optimal allocation is more precise than proportional allocation, and is to be preferred. Some simple rules are presented to allow you to estimate the optimal number of strata you should define in setting up a program of stratified random sampling.

If organisms are rare and patchily distributed, you should consider using adaptive cluster sampling to estimate abundance. When a randomly placed quadrat contains a rare species, adaptive sampling adds quadrats in the vicinity of the original quadrat to sample the potential cluster. This additional nonrandom sampling requires special formulas to estimate abundance without bias.

Systematic sampling is easier to apply in the field than random sampling, but may produce biased estimates of means and confidence limits if there are periodicities in the data. In field ecology this is usually not the case, and systematic samples seem to be the equivalent of random samples in many field situations. If a gradient exists in the ecological community, systematic sampling will be better than random sampling for describing it.

Step 1. Calculate the average abundance of each of the networks:

$$w_i = \frac{\sum_{j=1}^{k} y_j}{m_i} \qquad (8.35)$$

where $w_i$ = Average abundance of the i-th network
$y_j$ = Abundance of the organism in each of the $k$ quadrats in the i-th network
$m_i$ = Number of quadrats in the i-th network

Step 2. From these values we obtain an estimator of the mean abundance as follows:

$$\bar{x} = \frac{\sum_{i=1}^{n} w_i}{n} \qquad (8.36)$$

where $\bar{x}$ = Unbiased estimate of mean abundance from adaptive cluster sampling
$n$ = Number of initial sampling units selected via random sampling

If the initial sample is selected with replacement, the variance of this mean is given by:

$$\operatorname{var}(\bar{x}) = \frac{\sum_{i=1}^{n} (w_i - \bar{x})^2}{n\,(n - 1)} \qquad (8.37)$$

where $\operatorname{var}(\bar{x})$ = Estimated variance of mean abundance for sampling with replacement, and all other terms are as defined above.

If the initial sample is selected without replacement, the variance of the mean is given by:

$$\operatorname{var}(\bar{x}) = \frac{N - n}{N} \cdot \frac{\sum_{i=1}^{n} (w_i - \bar{x})^2}{n\,(n - 1)} \qquad (8.38)$$

where $N$ = Total number of possible sample quadrats in the sampling universe

For the example shown in Figure 8.3, with an initial random sample of $n = 10$ quadrats, equation (8.36) gives:

$$\bar{x} = \frac{\sum w_i}{n} = 0.0869 \text{ plants per quadrat}$$

Since we were sampling without replacement, we use equation (8.38), with $N = 400$, to estimate the variance of this mean:

$$\operatorname{var}(\bar{x}) = \frac{400 - 10}{400} \cdot \frac{\sum_{i=1}^{10} (w_i - 0.0869)^2}{10\,(10 - 1)} = 0.00137429$$

We can obtain confidence limits from these estimates in the usual way:

$$\bar{x} \pm t_\alpha \sqrt{\operatorname{var}(\bar{x})}$$

For this example with $n = 10$, for 95% confidence limits $t_\alpha = 2.262$ (9 degrees of freedom), and the confidence limits
become $0.0869 \pm 2.262\sqrt{\operatorname{var}(\bar{x})}$, or from 0.0 to 0.171 plants per quadrat.

When should one consider using adaptive sampling? Much depends on the abundance and the spatial pattern of the animals or the plants being studied. In general, the more clustered the population and the rarer the organism, the more efficient it will be to use adaptive cluster sampling. Thompson (1992) shows, in Figure 8.2, that adaptive sampling is about 12% more efficient than simple random sampling for $n = 10$ quadrats and nearly 50% more efficient when $n = 30$ quadrats. In any particular situation it may well pay to conduct a pilot experiment with simple random