(数据库原理课件)Chapter10-Query-Processing.ppt

上传人(卖家):晟晟文业 文档编号:4980734 上传时间:2023-01-30 格式:PPT 页数:117 大小:888KB
下载 相关 举报
(数据库原理课件)Chapter10-Query-Processing.ppt_第1页
第1页 / 共117页
(数据库原理课件)Chapter10-Query-Processing.ppt_第2页
第2页 / 共117页
(数据库原理课件)Chapter10-Query-Processing.ppt_第3页
第3页 / 共117页
(数据库原理课件)Chapter10-Query-Processing.ppt_第4页
第4页 / 共117页
(数据库原理课件)Chapter10-Query-Processing.ppt_第5页
第5页 / 共117页
点击查看更多>>
资源描述

1、Database System Concepts,6th Ed.Silberschatz,Korth and SudarshanSee www.db- for conditions on re-use Silberschatz,Korth and Sudarshan12.2Database System Concepts-6th EditionnOverview nMeasures of Query CostnSelection Operation nSorting nJoin Operation nOther OperationsnEvaluation of ExpressionsSilbe

2、rschatz,Korth and Sudarshan12.3Database System Concepts-6th Edition1.Parsing and translation2.Optimization3.EvaluationSilberschatz,Korth and Sudarshan12.4Database System Concepts-6th EditionnParsing and translationltranslate the query into its internal form.This is then translated into relational al

3、gebra.lParser checks syntax,verifies relationsnEvaluationlThe query-execution engine takes a query-evaluation plan,executes that plan,and returns the answers to the query.Silberschatz,Korth and Sudarshan12.5Database System Concepts-6th EditionnRDBMS查询处理阶段查询处理阶段:1.查询分析2.查询检查3.查询优化 4.查询执行 Silberschatz

4、,Korth and Sudarshan12.6Database System Concepts-6th EditionCS 245Notes 66ExampleSelect B,DFrom R,SWhere R.A=“c”S.E=2 R.C=S.CSilberschatz,Korth and Sudarshan12.7Database System Concepts-6th EditionCS 245Notes 67 RABC S CDEa11010 x2b12020y2c21030z2d23540 x1e34550y3Silberschatz,Korth and Sudarshan12.8

5、Database System Concepts-6th EditionCS 245Notes 68 RABC S CDEa11010 x2b12020y2c21030z2d23540 x1e34550y3AnswerB D2 xSilberschatz,Korth and Sudarshan12.9Database System Concepts-6th EditionCS 245Notes 69 How do we execute query?-Do Cartesian product-Select tuples-Do projectionOne ideaSilberschatz,Kort

6、h and Sudarshan12.10Database System Concepts-6th EditionCS 245Notes 610RXSR.A R.B R.C S.C S.D S.E a 1 10 10 x 2 a 1 10 20 y 2 .c 2 10 10 x 2 .Silberschatz,Korth and Sudarshan12.11Database System Concepts-6th EditionCS 245Notes 611RXSR.A R.B R.C S.C S.D S.E a 1 10 10 x 2 a 1 10 20 y 2 .c 2 10 10 x 2

7、.Bingo!Got one.Silberschatz,Korth and Sudarshan12.12Database System Concepts-6th EditionCS 245Notes 612Relational Algebra-can be used to describe plans.Ex:Plan IB,D sR.A=“c”S.E=2 R.C=S.C XRSSilberschatz,Korth and Sudarshan12.13Database System Concepts-6th EditionCS 245Notes 613Relational Algebra-can

8、 be used to describe plans.Ex:Plan IB,D sR.A=“c”S.E=2 R.C=S.C XRSOR:B,D sR.A=“c”S.E=2 R.C=S.C(RXS)Silberschatz,Korth and Sudarshan12.14Database System Concepts-6th EditionCS 245Notes 614Another idea:B,D sR.A=“c”sS.E=2R SPlan II natural joinSilberschatz,Korth and Sudarshan12.15Database System Concept

9、s-6th EditionCS 245Notes 615 R SA B C (R)(S)C D Ea 1 10 A B C C D E 10 x 2b 1 20c 2 10 10 x 2 20 y 2c 2 10 20 y 2 30 z 2d 2 35 30 z 2 40 x 1e 3 45 50 y 3Silberschatz,Korth and Sudarshan12.16Database System Concepts-6th EditionCS 245Notes 616Plan III Use R.A and S.C Indexes(1)Use R.A index to select

10、R tuples with R.A=“c”(2)For each R.C value found,use S.C index to find matching tuplesSilberschatz,Korth and Sudarshan12.17Database System Concepts-6th EditionCS 245Notes 617Plan III Use R.A and S.C Indexes(1)Use R.A index to select R tuples with R.A=“c”(2)For each R.C value found,use S.C index to f

11、ind matching tuples(3)Eliminate S tuples S.E 2(4)Join matching R,S tuples,project B,D attributes and place in resultSilberschatz,Korth and Sudarshan12.18Database System Concepts-6th EditionCS 245Notes 618 R SA B C C D Ea 1 10 10 x 2b 1 20 20 y 2c 2 10 30 z 2d 2 35 40 x 1e 3 45 50 y 3ACI1I2Silberscha

12、tz,Korth and Sudarshan12.19Database System Concepts-6th EditionCS 245Notes 619ACI1I2=“c”R SA B C C D Ea 1 10 10 x 2b 1 20 20 y 2c 2 10 30 z 2d 2 35 40 x 1e 3 45 50 y 3Silberschatz,Korth and Sudarshan12.20Database System Concepts-6th Edition R SA B C C D Ea 1 10 10 x 2b 1 20 20 y 2c 2 10 30 z 2d 2 35

13、 40 x 1e 3 45 50 y 3CS 245Notes 620ACI1I2=“c”Silberschatz,Korth and Sudarshan12.21Database System Concepts-6th Edition R SA B C C D Ea 1 10 10 x 2b 1 20 20 y 2c 2 10 30 z 2d 2 35 40 x 1e 3 45 50 y 3CS 245Notes 621ACI1I2=“c”check=2?output:Silberschatz,Korth and Sudarshan12.22Database System Concepts-

14、6th EditionCS 245Notes 622next tuple:R SA B C C D Ea 1 10 10 x 2b 1 20 20 y 2c 2 10 30 z 2d 2 35 40 x 1e 3 45 50 y 3ACI1I2=“c”check=2?output:Silberschatz,Korth and Sudarshan12.23Database System Concepts-6th EditionnA relational algebra expression may have many equivalent expressionslE.g.,ssalary7500

15、0(salary(instructor)is equivalent to salary(ssalary75000(instructor)nEach relational algebra operation can be evaluated using one of several different algorithmslCorrespondingly,a relational-algebra expression can be evaluated in many ways.nAnnotated expression specifying detailed evaluation strateg

16、y is called an evaluation-plan.lE.g.,can use an index on salary to find instructors with salary 75000,lor can perform complete relation scan and discard instructors with salary 75000Silberschatz,Korth and Sudarshan12.24Database System Concepts-6th EditionnQuery Optimization:Amongst all equivalent ev

17、aluation plans choose the one with lowest cost.l Cost is estimated using statistical information from the database catalog4e.g.number of tuples in each relation,size of tuples,etc.nIn this chapter we studylHow to measure query costslAlgorithms for evaluating relational algebra operationslHow to comb

18、ine algorithms for individual operations in order to evaluate a complete expressionnIn next chapterlWe study how to optimize queries,that is,how to find an evaluation plan with lowest estimated costSilberschatz,Korth and Sudarshan12.25Database System Concepts-6th EditionnCost is generally measured a

19、s total elapsed time for answering querylMany factors contribute to time cost4disk accesses,CPU,or even network communicationnTypically disk access is the predominant cost,and is also relatively easy to estimate.Measured by taking into accountlNumber of seeks *average-seek-costlNumber of blocks read

20、 *average-block-read-costlNumber of blocks written*average-block-write-cost4Cost to write a block is greater than cost to read a block data is read back after being written to ensure that the write was successfulSilberschatz,Korth and Sudarshan12.26Database System Concepts-6th EditionnFor simplicity

21、 we just use the number of block transfers from disk and the number of seeks as the cost measuresltT time to transfer one blockltS time for one seeklCost for b block transfers plus S seeks b*tT+S*tS nWe ignore CPU costs for simplicitylReal systems do take CPU cost into accountnWe do not include cost

22、 to writing output to disk in our cost formulaeSilberschatz,Korth and Sudarshan12.27Database System Concepts-6th EditionnSeveral algorithms can reduce disk IO by using extra buffer space lAmount of real memory available to buffer depends on other concurrent queries and OS processes,known only during

23、 execution4We often use worst case estimates,assuming only the minimum amount of memory needed for the operation is availablenRequired data may be buffer resident already,avoiding disk I/OlBut hard to take into account for cost estimation28HeJing(2007)School of Software2.5 Relational Algebra 2.5 Rel

24、ational Algebra 关系代数关系代数Fundamental Operations of Relational Algebraset-theoretic operations 集合运算集合运算native relational operations 自然关系运算自然关系运算29HeJing(2007)School of Software2.5 Relational Algebra 2.5 Relational Algebra 关系代数关系代数Fundamental Operations of Relational Algebraset-theoretic operations 集合运

25、算集合运算native relational operations 自然关系运算自然关系运算3.3 Simple Select StatementspSyntaxSELECT ALL|DISTINCT|TOP N,AS INTO new-table-nameFROM,WHERE GROUP BY HAVING ORDER BY ASC|DESC;Silberschatz,Korth and Sudarshan12.31Database System Concepts-6th Edition1.选择操作的实现选择操作的实现 n例1Select*from student where ;考虑的几种情

26、况:C1:无条件;C2:Sno200215121;C3:Sage20;C4:SdeptCS AND Sage20;Silberschatz,Korth and Sudarshan12.32Database System Concepts-6th Edition选择操作的实现(续)选择操作的实现(续)n选择操作典型实现方法:选择操作典型实现方法:l1.简单的全表扫描方法简单的全表扫描方法 对查询的基本表顺序扫描,逐一检查每个元组是否满足选择条件,把满足条件的元组作为结果输出 适合小表,不适合大表Silberschatz,Korth and Sudarshan12.33Database Syste

27、m Concepts-6th Edition学号学号Sno姓名姓名Sname性别性别Ssex年龄年龄Sage所在系所在系Sdept200215121李勇李勇男男20CS200215122刘晨刘晨女女19IS200215123王敏王敏女女18MA200215125张立张立男男19ISSilberschatz,Korth and Sudarshan12.34Database System Concepts-6th Editionl2.索引索引(或散列或散列)扫描方法扫描方法 适合选择条件中的属性上有索引(例如B+树索引或Hash索引)通过索引先找到满足条件的元组主码或元组指针,再通过元组指针直接在

28、查询的基本表中找到元组 Silberschatz,Korth and Sudarshan12.35Database System Concepts-6th Edition选择操作的实现(续)选择操作的实现(续)n例1-C2 以C2为例,Sno200215121,并且Sno上有索引(或Sno是散列码):Select*from student where Sno200215121l使用索引(或散列)得到Sno为200215121 元组的指针l通过元组指针在student表中检索到该学生Silberschatz,Korth and Sudarshan12.36Database System Conc

29、epts-6th Edition200215121200215122200215123200215124学号学号Sno姓名姓名Sname性别性别Ssex年龄年龄Sage所在系所在系Sdept200215121李勇李勇男男20CS200215122刘晨刘晨女女19IS200215123王敏王敏女女18MA200215125张立张立男男19ISSilberschatz,Korth and Sudarshan12.37Database System Concepts-6th Editionn例1-C3 以C3为例,Sage20,并且Sage 上有B+树索引:Select*from student w

30、here Sage20l使用B+树索引找到Sage20的索引项,以此为入口点在B+树的顺序集上得到Sage20的所有元组指针l通过这些元组指针到student表中检索到所有年龄大于20的学生。Silberschatz,Korth and Sudarshan12.38Database System Concepts-6th EditionSilberschatz,Korth and Sudarshan12.39Database System Concepts-6th Edition选择操作的实现(续)选择操作的实现(续)n例1-C4 以C4为例,SdeptCS AND Sage20,如果Sdep

31、t和Sage上都有索引:Select*from student where SdeptCS AND Sage20l算法一:算法一:分别用上面两种方法分别找到SdeptCS的一组元组指针和Sage20的另一组元组指针求这2组指针的交集到student表中检索得到计算机系年龄大于20的学生Silberschatz,Korth and Sudarshan12.40Database System Concepts-6th Edition学号学号Sno姓名姓名Sname性别性别Ssex年龄年龄Sage所在系所在系Sdept200215121李勇李勇男男21CS200215122刘晨刘晨女女19IS200

32、215123王敏王敏女女18MA200215125张立张立男男19ISCSISMA181921系系别别索引索引年年龄龄索引索引Silberschatz,Korth and Sudarshan12.41Database System Concepts-6th Edition选择操作的实现(续)选择操作的实现(续)n例1-C4 以C4为例,SdeptCS AND Sage20,如果Sdept和Sage上都有索引:Select*from student where SdeptCS AND Sage20l算法二:算法二:找到SdeptCS的一组元组指针,通过这些元组指针到student表中检索对得到的

33、元组检查另一些选择条件(如Sage20)是否满足把满足条件的元组作为结果输出。Silberschatz,Korth and Sudarshan12.42Database System Concepts-6th Edition学号学号Sno姓名姓名Sname性别性别Ssex年龄年龄Sage所在系所在系Sdept200215121李勇李勇男男21CS200215122刘晨刘晨女女19IS200215123王敏王敏女女18MA200215125张立张立男男19ISCSISMA系系别别索引索引Silberschatz,Korth and Sudarshan12.43Database System Co

34、ncepts-6th EditionnFile scan nAlgorithm A1(linear search).Scan each file block and test all records to see whether they satisfy the selection condition.lCost estimate=br block transfers+1 seek4br denotes number of blocks containing records from relation rlIf selection is on a key attribute,can stop

35、on finding record4cost=(br/2)block transfers+1 seeklLinear search can be applied regardless of 4selection condition or4ordering of records in the file,or 4availability of indicesnNote:binary search generally does not make sense since data is not stored consecutivelylexcept when there is an index ava

36、ilable,land binary search requires more seeks than index searchSilberschatz,Korth and Sudarshan12.44Database System Concepts-6th EditionnIndex scan search algorithms that use an indexlselection condition must be on search-key of index.nA2(primary index,equality on key).Retrieve a single record that

37、satisfies the corresponding equality condition lCost=(hi+1)*(tT+tS)nA3(primary index,equality on nonkey)Retrieve multiple records.lRecords will be on consecutive blocks4Let b=number of blocks containing matching recordslCost=hi*(tT+tS)+tS+tT*bSilberschatz,Korth and Sudarshan12.45Database System Conc

38、epts-6th EditionnA4(secondary index,equality on nonkey).lRetrieve a single record if the search-key is a candidate key4Cost=(hi+1)*(tT+tS)lRetrieve multiple records if search-key is not a candidate key4each of n matching records may be on a different block 4Cost=(hi+n)*(tT+tS)Can be very expensive!S

39、ilberschatz,Korth and Sudarshan12.46Database System Concepts-6th EditionnCan implement selections of the form sAV(r)or sA V(r)by usingl a linear file scan,l or by using indices in the following ways:nA5(primary index,comparison).(Relation is sorted on A)4For sA V(r)use index to find first tuple v an

40、d scan relation sequentially from there4For sAV(r)just scan relation sequentially till first tuple v;do not use indexnA6(secondary index,comparison).4For sA V(r)use index to find first index entry v and scan index sequentially from there,to find pointers to records.4For sAV(r)just scan leaf pages of

41、 index finding pointers to records,till first entry v4In either case,retrieve records that are pointed to requires an I/O for each record Linear file scan may be cheaperSilberschatz,Korth and Sudarshan12.47Database System Concepts-6th EditionnConjunction:s1 2.n(r)nA7(conjunctive selection using one

42、index).lSelect a combination of i and algorithms A1 through A7 that results in the least cost for si(r).l Test other conditions on tuple after fetching it into memory buffer.nA8(conjunctive selection using composite index).lUse appropriate composite(multiple-key)index if available.nA9(conjunctive se

43、lection by intersection of identifiers).lRequires indices with record pointers.lUse corresponding index for each condition,and take intersection of all the obtained sets of record pointers.lThen fetch records from filelIf some conditions do not have appropriate indices,apply test in memory.Silbersch

44、atz,Korth and Sudarshan12.48Database System Concepts-6th EditionnDisjunction:s1 2.n(r).nA10(disjunctive selection by union of identifiers).lApplicable if all conditions have available indices.4Otherwise use linear scan.lUse corresponding index for each condition,and take union of all the obtained se

45、ts of record pointers.lThen fetch records from filenNegation:s(r)lUse linear scan on filelIf very few records satisfy,and an index is applicable to 4 Find satisfying records using index and fetch from fileSilberschatz,Korth and Sudarshan12.49Database System Concepts-6th EditionnWe may build an index

46、 on the relation,and then use the index to read the relation in sorted order.May lead to one disk block access for each tuple.nFor relations that fit in memory,techniques like quicksort can be used.For relations that dont fit in memory,external sort-merge is a good choice.n原因nSQL查询可以指定对输出进行排序n关系运算的某

47、些操作,如连接运算,排序后实现高效n对于可放进内存的关系,使用如快排序之类的技术。对不能放进内存的关系,使用外排序外排序n内排序n当数据集小于可用内存时,采用快速排序算法n快速排序的思想来源于分治策略。将数据块划分为两个序列,第一个序列的值小于第二个序列,在两个序列中按照递归排序的思想再次进行上述的划分,这样直到没有办法划分为止n外排序n创建有序段N路归并n所有的输入数据最初分成许多有序的归并段文件,然后不断归并成许多更大的归并段文件,直到剩下一个文件为止第一阶段,建立多个排序的归并段文件i=o;repeat读入M块关系数据或剩下的不足M块的数据;在内存中对关系的这一部分进行排序;将排好序的数

48、据写到归并段文件Ri中;i=i+1;until 到达关系的末尾第二阶段,对归并段文件进行归并为N个归并段文件Ri各分配一页内存缓冲页,分别读入一数据块repeat在所有的缓冲页中按序挑选第一个元组;把该元组作为输出写出,并且将它从缓冲区删除;if 任何一个归并段文件Ri的缓冲页为空并且没有到达文件末尾then 读入该归并段文件的下一块到相应的缓冲页中;until 所有的缓冲页为空n删除重复元组可通过散列或排序实现.n排序后,重复元组将相邻,保留一个,删除其余n优化:在外归并排序中,删除重复元组可以在序段生成和中间结果归并阶段进行n散列的情况类似 重复元组将进入相同桶n投影的实现方法:在每个元组

49、上执行投影,再进行重复元组删除.Silberschatz,Korth and Sudarshan12.56Database System Concepts-6th Edition1.Create sorted runs.Let i be 0 initially.Repeatedly do the following till the end of the relation:(a)Read M blocks of relation into memory (b)Sort the in-memory blocks (c)Write sorted data to run Ri;increment i

50、.Let the final value of i be N2.Merge the runs(next slide).Let M denote memory size(in pages).Silberschatz,Korth and Sudarshan12.57Database System Concepts-6th Edition2.Merge the runs(N-way merge).We assume(for now)that N M.1.Use N blocks of memory to buffer input runs,and 1 block to buffer output.R

展开阅读全文
相关资源
猜你喜欢
相关搜索

当前位置:首页 > 办公、行业 > 各类PPT课件(模板)
版权提示 | 免责声明

1,本文((数据库原理课件)Chapter10-Query-Processing.ppt)为本站会员(晟晟文业)主动上传,163文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。
2,用户下载本文档,所消耗的文币(积分)将全额增加到上传者的账号。
3, 若此文所含内容侵犯了您的版权或隐私,请立即通知163文库(发送邮件至3464097650@qq.com或直接QQ联系客服),我们立即给予删除!


侵权处理QQ:3464097650--上传资料QQ:3464097650

【声明】本站为“文档C2C交易模式”,即用户上传的文档直接卖给(下载)用户,本站只是网络空间服务平台,本站所有原创文档下载所得归上传人所有,如您发现上传作品侵犯了您的版权,请立刻联系我们并提供证据,我们将在3个工作日内予以改正。


163文库-Www.163Wenku.Com |网站地图|