Graph-based RDF Data Management

RDF and Semantic Web
  RDF is a language for the conceptual modeling of information about web resources
  A building block of the semantic web
    Facilitates exchange of information
    Search engines can retrieve more relevant information
    Facilitates data integration (mashups)
  Machine understandable
    Understand the information on the web and the interrelationships among them

What's Semantic Web: A Simple Example (RDFa)
  Example page:
    Lei Zou
    Email:
    Publications: Lei Zou, Jinghui Mo, Lei Chen, M. Tamer Özsu, Dongyan Zhao, gStore: Answering SPARQL Queries Via Subgraph Matching, VLDB, 2011
  The traditional Web (HTML) only considers the display of the content.
    How is the page displayed, such as which font and the format of the pictures?
  The Semantic Web considers the semantics of the content.
    What does the content in the page mean? E.g., what do “” and “VLDB” mean?

What's Semantic Web: Google Snippet

What's Semantic Web: Facebook Social Graph

What's Semantic Web: From Two Perspectives
  Expressiveness
    RDF, RDFS, OWL, OWL Full, OWL 2
    More semantics; more powerful reasoning
  Scalability
    Open Linked Data, Web-scale Triple Store, Semantic Wiki
    How to get more data? How to manage the Web-scale semantic data?
  More interesting applications
    Apple Siri; Google Knowledge Graph; IBM Watson
  More areas
    Broadcasting: BBC
    Publishing: Thomson Reuters
    Life: Eli Lilly and Company

Some Interesting Products
  IBM Watson
  EVI, acquired by Amazon in October 2012. William Tunstall-Pedoe: True Knowledge: Open-Domain Question Answering Using Structured Knowledge and Inference. AI Magazine 31(3): 80-92 (2010)
  Google Knowledge Graph

RDF Uses
  Yago and DBPedia extract facts from Wikipedia and represent them as RDF (structural queries)
  Communities build RDF data
    E.g., biologists: Bio2RDF and Uniprot RDF
  Web data integration
    Linked Data Cloud

RDF Data Volumes ... are growing, and fast
  Linked data cloud currently consists of 325 datasets with 25B triples
  Size almost
  doubling every year
[Figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch, shown at three snapshots: March 2009 (89 datasets), September 2010 (203 datasets) and September 2011 (295 datasets). Legend categories: media, geographic, publications, user-generated content, government, cross-domain, life sciences.]
April
14: 1091 datasets, ? triples
Max Schmachtenberg, Christian Bizer, and Heiko Paulheim: Adoption of Linked Data Best Practices in Different Topical Domains. In Proc. ISWC, 2014.

Outline
  RDF Introduction
  gStore: a graph-based SPARQL query engine
    Answering SPARQL queries using graph pattern matching (Zou et al., PVLDB 2011, VLDB J 2014)
  gAnswer: Natural Language Question Answering over RDF
    A Data Driven Approach (Zou et al., SIGMOD 2014; Zheng et al., SIGMOD 2015)

RDF Introduction
  Everything is a uniquely named resource
  Namespaces can be used to scope the names
  Properties of resources can be defined
  Relationships with other resources can be defined
  Resources can be contributed by different people/groups and can be located anywhere in the web
    Integrated web “database”

  xmlns:y=
  y:Abraham Lincoln
    Abraham Lincoln:hasName “Abraham Lincoln”
    Abraham Lincoln:bornOnDate “1809-02-12”
    Abraham Lincoln:diedOnDate “1865-04-15”
    Abraham Lincoln:diedIn y:Washington DC

  Subject            Predicate     Object
  Abraham Lincoln    hasName       “Abraham Lincoln”
  Abraham Lincoln    bornOnDate    “1809-02-12”
  Abraham Lincoln    diedOnDate    “1865-04-15”

RDF Data Model
  Triple: Subject, Predicate (Property), Object (s, p, o)
    Subject: the entity that is described (URI or blank node)
    Predicate: a feature of the entity (URI)
    Object: value of the feature (URI, blank node or literal)
  $(s, p, o) \in (U \cup B) \times U \times (U \cup B \cup L)$
  A set of RDF triples is called an RDF graph
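
As a rough illustration of the definition above, the following Python sketch models the three kinds of terms (URIs, blank nodes and literals, i.e. the sets U, B and L) and checks the membership condition for a triple before adding it to a graph. The class names, the `RDFGraph` helper and identifiers such as `y:Abraham_Lincoln` are assumptions made for this example; they come neither from the slides nor from any RDF library.

```python
from dataclasses import dataclass
from typing import Set, Tuple, Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    value: str


Term = Union[URI, BlankNode, Literal]


def is_valid_triple(s: Term, p: Term, o: Term) -> bool:
    """Check (s, p, o) in (U ∪ B) × U × (U ∪ B ∪ L)."""
    return (
        isinstance(s, (URI, BlankNode))
        and isinstance(p, URI)
        and isinstance(o, (URI, BlankNode, Literal))
    )


class RDFGraph:
    """An RDF graph is simply a set of valid triples (hypothetical helper)."""

    def __init__(self) -> None:
        self.triples: Set[Tuple[Term, Term, Term]] = set()

    def add(self, s: Term, p: Term, o: Term) -> None:
        if not is_valid_triple(s, p, o):
            raise ValueError(f"not a valid RDF triple: {(s, p, o)}")
        self.triples.add((s, p, o))


# Illustrative identifiers; the slides write the resource as "y:Abraham Lincoln".
lincoln = URI("y:Abraham_Lincoln")
graph = RDFGraph()
graph.add(lincoln, URI("hasName"), Literal("Abraham Lincoln"))
graph.add(lincoln, URI("bornOnDate"), Literal("1809-02-12"))
graph.add(lincoln, URI("diedIn"), URI("y:Washington_DC"))
graph.add(lincoln, URI("diedIn"), URI("y:Washington_DC"))  # duplicate, ignored by the set
print(len(graph.triples))  # 3
```

Because the graph is a set, adding the same triple twice has no effect. A literal in subject position, or a blank node in predicate position, is rejected by `is_valid_triple`.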
  Subject: $U \cup B$; Predicate: $U$; Object: $U \cup B \cup L$
  U: set of URIs; B: set of blank nodes; L: set of literals

RDF Example Instance
  Prefix: y=
  Subjects are URIs; objects are literals or URIs.

  Subject               Predicate     Object
  y:Abraham Lincoln     hasName       “Abraham Lincoln”
  y:Abraham Lincoln     bornOnDate    “1809-02-12”
  y:Abraham Lincoln     diedOnDate    “1865-04-15”
  y:Abraham Lincoln     bornIn        y:Hodgenville KY
  y:Abraham Lincoln     diedIn        y:Washington DC
  y:Abraham Lincoln     title         “President”
  y:Abraham Lincoln     gender        “Male”
  y:Washington DC       hasName       “Washington D.C.”
  y:Washington DC       foundingYear  “1790”
  y:Hodgenville KY      hasName       “Hodgenville”
  y:United States       hasName       “United States”
  y:United States       hasCapital    y:Washington DC
  y:United States       foundingYear  “1776”
  y:Reese Witherspoon   bornOnDate    “1976-03-22”
  y:Reese Witherspoon   bornIn        y:New Orleans LA
  y:Reese Witherspoon   hasName       “Reese Witherspoon”
  y:Reese Witherspoon   gender        “Female”
  y:Reese Witherspoon   title         “Actress”
  y:New Orleans LA      foundingYear  “1718”
  y:New Orleans LA      locatedIn     y:United States
  y:Franklin Roosevelt  hasName       “Franklin D. Roosevelt”
  y:Franklin Roosevelt  bornIn        y:Hyde Park NY
  y:Franklin Roosevelt  title         “President”
  y:Franklin Roosevelt  gender        “Male”
  y:Hyde Park NY        foundingYear  “1810”
  y:Hyde Park NY        locatedIn     y:United States
  y:Marilyn Monroe      gender        “Female”
  y:Marilyn Monroe      hasName       “Marilyn Monroe”
  y:Marilyn Monroe      bornOnDate    “1926-07-01”
  y:Marilyn Monroe      diedOnDate    “1962-08-05”

RDF Graph
[Figure: the example instance drawn as a graph. Resources and literals are nodes; predicates (hasName, bornOnDate, diedOnDate, bornIn, diedIn, title, gender, foundingYear, hasCapital, locatedIn) label the edges.]

RDF Query Model
  Query Model: SPARQL Protocol and RDF Query Language
  Given U (set of URIs), L (set of literals), and V (set of variables), a SPARQL expression is defined recursively:
    an atomic triple pattern, which is an element of $(U \cup V) \times (U \cup V) \times (U \cup V \cup L)$
      ?x hasName “Abraham Lincoln”
    P FILTER R, where P is a graph pattern expression and R is a built-in SPARQL condition (i.e., analogous to a SQL predicate)
      ?x price ?p FILTER(?p < 30)
    P1 AND/OPT/UNION P2, where P1 and P2 are graph pattern expressions
Example:
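
To make the recursive definition concrete, here is a minimal Python sketch that evaluates a conjunction (AND) of atomic triple patterns over a small list of triples and then applies a FILTER condition. It is a naive nested-loop evaluation written for illustration, not how gStore or any SPARQL engine works; OPT and UNION are not covered. The function names and the underscore-style identifiers are assumptions, and the data is a subset of the example instance.

```python
import re
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# A few triples from the example instance; identifiers are illustrative.
DATA: List[Triple] = [
    ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
    ("y:Abraham_Lincoln", "bornOnDate", "1809-02-12"),
    ("y:Marilyn_Monroe", "hasName", "Marilyn Monroe"),
    ("y:Marilyn_Monroe", "bornOnDate", "1926-07-01"),
]


def is_var(term: str) -> bool:
    """Variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")


def match(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check consistency with an earlier binding.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return extended


def evaluate(
    patterns: Sequence[Triple],
    data: Sequence[Triple],
    condition: Callable[[Binding], bool] = lambda b: True,
) -> List[Binding]:
    """Evaluate P1 AND P2 AND ... and then apply FILTER `condition`."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        next_bindings: List[Binding] = []
        for b in bindings:
            for t in data:
                ext = match(pattern, t, b)
                if ext is not None:
                    next_bindings.append(ext)
        bindings = next_bindings
    return [b for b in bindings if condition(b)]


# Atomic triple pattern: ?x hasName "Abraham Lincoln"
atomic = evaluate([("?x", "hasName", "Abraham Lincoln")], DATA)
print(atomic)  # [{'?x': 'y:Abraham_Lincoln'}]

# Two patterns joined on ?x, with FILTER(regex(str(?bd), "1926"))
filtered = evaluate(
    [("?x", "hasName", "?n"), ("?x", "bornOnDate", "?bd")],
    DATA,
    condition=lambda b: re.search("1926", b["?bd"]) is not None,
)
print([b["?n"] for b in filtered])  # ['Marilyn Monroe']
```

The query that follows on the slide has the same shape: four triple patterns joined on ?m and ?city, plus a regex FILTER on ?bd.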
  SELECT ?name
  WHERE {
    ?m <bornIn> ?city .
    ?m <hasName> ?name .
    ?m <bornOnDate> ?bd .
    ?city <foundingYear> "1718" .
    FILTER(regex(str(?bd), "1976"))
  }
[Figure: query graph. ?m has edges bornIn to ?city, hasName to ?name and bornOnDate to ?bd; ?city has an edge foundingYear to “1718”.]

SPARQL Queries
  The same query, with the filter condition highlighted: FILTER(regex(str(?bd), "1976"))

  Subject    Property    Object
  y:Abraham Lincoln
  y:Abraha