1、nDBMS contains information about a particular enterpriselCollection of interrelated datalSet of programs to access the data lAn environment that is both convenient and efficient to usenDatabase Applications:lBanking:transactionslAirlines:reservations,scheduleslUniversities:registration,gradeslSales:
2、customers,products,purchaseslOnline retailers:order tracking,customized recommendationslManufacturing:production,inventory,orders,supply chainlHuman resources:employee records,salaries,tax deductionsnDatabases can be very large.nDatabases touch all aspects of our livesnApplication program exampleslA
3、dd new students,instructors,and courseslRegister students for courses,and generate class rosterslAssign grades to students,compute grade point averages(GPA)and generate transcriptsnIn the early days,database applications were built directly on top of file systemslData redundancy and inconsistency4Mu
4、ltiple file formats,duplication of information in different fileslDifficulty in accessing data 4Need to write a new program to carry out each new tasklData isolation multiple files and formatslIntegrity problems4Integrity constraints (e.g.,account balance 0)become“buried”in program code rather than
5、being stated explicitly4Hard to add new constraints or change existing oneslAtomicity of updates4Failures may leave database in an inconsistent state with partial updates carried out4Example:Transfer of funds from one account to another should either complete or not happen at alllConcurrent access b
6、y multiple users4Concurrent access needed for performance4Uncontrolled concurrent accesses can lead to inconsistencies Example:Two people reading a balance(say 100)and updating it by withdrawing money(say 50 each)at the same timelSecurity problems4Hard to provide user access to some,but not all,data
7、Database systems offer solutions to all the above problemsnPhysical level:describes how a record(e.g.,customer)is stored.nLogical level:describes data stored in database,and the relationships among the data.type instructor=recordID:string;name:string;dept_name:string;salary:integer;end;nView level:a
8、pplication programs hide details of data types.Views can also hide information(such as an employees salary)for security purposes.An architecture for a database system nSimilar to types and variables in programming languages the logical structure of the database lExample:The database consists of info
9、rmation about a set of customers and accounts and the relationship between themlAnalogous to type information of a variable in a programlPhysical schema:database design at the physical levellLogical schema:database design at the logical levelnInstance the actual content of the database at a particul
10、ar point in time lAnalogous to the value of a variablenPhysical Data Independence the ability to modify the physical schema without changing the logical schemalApplications depend on the logical schemalIn general,the interfaces between the various levels and components should be well defined so that
11、 changes in some parts do not seriously influence others.nA collection of tools for describing lData lData relationshipslData semanticslData constraintsnRelational modelnEntity-Relationship data model(mainly for database design)nObject-based data models(Object-oriented and Object-relational)nSemistr
12、uctured data model (XML)nOther older models:lNetwork model lHierarchical modelnRelational model(Chapter 2)nExample of tabular data in the relational modelColumnsRowsnLanguage for accessing and manipulating the data organized by the appropriate data modellDML also known as query languagenTwo classes
13、of languages lProcedural user specifies what data is required and how to get those data lDeclarative(nonprocedural)user specifies what data is required without specifying how to get those datanSQL is the most widely used query languagenSpecification notation for defining the database schemaExample:c
14、reate table instructor(ID char(5),name varchar(20),dept_name varchar(20),salary numeric(8,2)nDDL compiler generates a set of table templates stored in a data dictionarynData dictionary contains metadata(i.e.,data about data)lDatabase schema lIntegrity constraints4Primary key(ID uniquely identifies i
15、nstructors)4Referential integrity(references constraint in SQL)e.g.dept_name value in any instructor tuple must appear in department relationlAuthorizationnSQL:widely used non-procedural languagelExample:Find the name of the instructor with ID 22222selectnamefrominstructorwhereinstructor.ID=22222lEx
16、ample:Find the ID and building of instructors in the Physics dept.select instructor.ID,department.buildingfrom instructor,departmentwhere instructor.dept_name=department.dept_name and department.dept_name=Physics nApplication programs generally access databases through one oflLanguage extensions to
17、allow embedded SQLlApplication program interface(e.g.,ODBC/JDBC)which allow SQL queries to be sent to a databasenChapters 3,4 and 5The process of designing the general structure of the database:nLogical Design Deciding on the database schema.Database design requires that we find a“good”collection of
18、 relation schemas.lBusiness decision What attributes should we record in the database?lComputer Science decision What relation schemas should we have and how should the attributes be distributed among the various relation schemas?nPhysical Design Deciding on the physical layout of the database nIs t
19、here any problem with this design?nNormalization Theory(Chapter 8)lFormalize what designs are bad,and test for themnEntity Relationship Model(Chapter 7)lModels an enterprise as a collection of entities and relationships4Entity:a“thing”or“object”in the enterprise that is distinguishable from other ob
20、jects Described by a set of attributes4Relationship:an association among several entitieslRepresented diagrammatically by an entity-relationship diagram:nModels an enterprise as a collection of entities and relationshipslEntity:a“thing”or“object”in the enterprise that is distinguishable from other o
21、bjects4Described by a set of attributeslRelationship:an association among several entitiesnRepresented diagrammatically by an entity-relationship diagram:What happened to dept_name of instructor and student?nRelational model:flat,“atomic”valuesnObject Relational Data ModelslExtend the relational dat
22、a model by including object orientation and constructs to deal with added data types.lAllow attributes of tuples to have complex types,including non-atomic values such as nested relations.lPreserve relational foundations,in particular the declarative access to data,while extending modeling power.lPr
23、ovide upward compatibility with existing relational languages.nDefined by the WWW Consortium(W3C)nOriginally intended as a document markup language not a database languagenThe ability to specify new tags,and to create nested tag structures made XML a great way to exchange data,not just documentsnXML
24、 has become the basis for all new generation data interchange formats.nA wide variety of tools is available for parsing,browsing and querying XML documents/datanStorage manager is a program module that provides the interface between the low-level data stored in the database and the application progr
25、ams and queries submitted to the system.nThe storage manager is responsible to the following tasks:lInteraction with the file manager lEfficient storing,retrieving and updating of datanIssues:lStorage accesslFile organizationlIndexing and hashing1.Parsing and translation2.Optimization3.EvaluationnAl
26、ternative ways of evaluating a given querylEquivalent expressionslDifferent algorithms for each operationnCost difference between a good and a bad way of evaluating a query can be enormousnNeed to estimate the cost of operationslDepends critically on statistical information about relations which the
27、 database must maintainlNeed to estimate statistics for intermediate results to compute cost of complex expressionsnWhat if the system fails?nWhat if more than one user is concurrently updating the same data?nA transaction is a collection of operations that performs a single logical function in a da
28、tabase applicationnTransaction-management component ensures that the database remains in a consistent(correct)state despite system failures(e.g.,power failures and operating system crashes)and transaction failures.nConcurrency-control manager controls the interaction among the concurrent transaction
29、s,to ensure the consistency of the database.DatabaseThe architecture of a database systems is greatly influenced by the underlying computer system on which the database is running:nCentralizednClient-servernParallel(multi-processor)nDistributed n1950s and early 1960s:lData processing using magnetic
30、tapes for storage4Tapes provided only sequential accesslPunched cards for inputnLate 1960s and 1970s:lHard disks allowed direct access to datalNetwork and hierarchical data models in widespread uselTed Codd defines the relational data model4Would win the ACM Turing Award for this work4IBM Research b
31、egins System R prototype4UC Berkeley begins Ingres prototypelHigh-performance(for the era)transaction processingn1980s:lResearch relational prototypes evolve into commercial systems4SQL becomes industrial standardlParallel and distributed database systemslObject-oriented database systemsn1990s:lLarge decision support and data-mining applicationslLarge multi-terabyte data warehouseslEmergence of Web commercenEarly 2000s:lXML and XQuery standardslAutomated database administrationnLater 2000s:lGiant data storage systems4Google BigTable,Yahoo PNuts,Amazon,.