知识图谱梳理专题培训课件.ppt

上传人(卖家):三亚风情 文档编号:2671149 上传时间:2022-05-17 格式:PPT 页数:27 大小:3.88MB
下载 相关 举报
知识图谱梳理专题培训课件.ppt_第1页
第1页 / 共27页
知识图谱梳理专题培训课件.ppt_第2页
第2页 / 共27页
知识图谱梳理专题培训课件.ppt_第3页
第3页 / 共27页
知识图谱梳理专题培训课件.ppt_第4页
第4页 / 共27页
知识图谱梳理专题培训课件.ppt_第5页
第5页 / 共27页
点击查看更多>>
资源描述

1、知识图谱架构知识图谱一般架构:来源自百度百科复旦大学知识图谱架构:早期知识图谱架构知识图谱一般架构:来源自百度百科架构讨论早期知识图谱架构知识抽取实体概念抽取实体概念映射关系抽取质量评估KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014A sampler of research problemsGrowth: knowledge graphs are incomplete!Link prediction: add relationsOntology

2、matching: connect graphsKnowledge extraction: extract new entities and relations from web/textValidation: knowledge graphs are not always correct!Entity resolution: merge duplicate entities, split wrongly merged onesError detection: remove false assertionsInterface: how to make it easier to access k

3、nowledge?Semantic parsing: interpret the meaning of queriesQuestion answering: compute answers using the knowledge graphIntelligence: can AI emerge from knowledge graphs?Automatic reasoning and planningGeneralization and abstraction9关系抽取 定义: 常见手段: 语义模式匹配频繁模式抽取,基于密度聚类,基于语义相似性 层次主题模型弱监督KDD 2014 Tutori

4、al on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014Methods and techniquesSupervised modelsSemi-supervised modelsDistant supervision2. Entity resolutionSingle entity methodsRelational methods3. Link predictionRule-based methodsProbabilistic modelsFactorization methodsE

5、mbedding models80Not in this tutorial: Entity classification Group/expert detection Ontology alignment Object ranking1. Relation extraction:KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014 Extracting semantic relations between sets of grounded entiti

6、esNumerous variants:Undefined vs pre-determined set of relationsBinary vs n-ary relations, facet discoveryExtracting temporal informationSupervision: fully, un, semi, distant-supervisionCues used: only lexical vs full linguistic features82Relation ExtractionKobeBryantLA LakersplayForthe franchise pl

7、ayer ofonce again savedman of the match forthe Lakers”his team”Los Angeles”“Kobe Bryant,“Kobe“Kobe Bryant?KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014Supervised relation extractionSentence-level labels of relation mentionsApple CEO Steve Jobs sai

8、d. = (SteveJobs, CEO, Apple)Steve Jobs said that Apple will. = NILTraditional relation extraction datasetsACE 2004MUC-7Biomedical datasets (e.g BioNLP clallenges)Learn classifiers from +/- examplesTypical features: context words + POS, dependency path betweenentities, named entity tags, token/parse-

9、path/entity distance83KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014Semi-supervised relation extractionGeneric algorithm(遗传算法遗传算法)1.2.3.4.5.Start with seed triples / golden seed patternsExtract patterns that match seed triples/patternsTake the top-

10、k extracted patterns/triplesAdd to seed patterns/triplesGo to 2Many published approaches in this category:Dual Iterative Pattern Relation Extractor Brin, 98Snowball Agichtein & Gravano, 00TextRunner Banko et al., 07 almost unsupervisedDiffer in pattern definition and selection86founderOfKDD 2014 Tut

11、orial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014Distantly-supervised relation extraction88Existing knowledge base + unlabeled text generate examplesLocate pairs of related entities in textHypothesizes that the relation is expressedGoogle CEO Larry Page announced

12、 that.Steve Jobs has been Apple for a while.Pixar lost its co-founder Steve Jobs.I went to Paris, France for the summer.GoogleCEOcapitalOfLarryPageFranceAppleCEOPixarSteveJobsDistant supervision: modeling hypothesesTypical architecture:1. Collect many pairs of entities co-occurring in sentences from

13、 text corpus2. If 2 entities participate in a relation, several hypotheses:1.All sentences mentioning them express it Mintz et al., 09“Barack Obama is the 44th and current President of the US.” (BO, employedBy, USA)89KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York,

14、August 24, 2014KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014Sentence-level featuresLexical: words in between and around mentions and their parts-of-speech tags (conjunctive form)Syntactic: dependency parse path between mentions along withside node

15、sNamed Entity Tags: for the mentionsConjunctions of the above featuresDistant supervision is used on to lots of data sparsity of conjunctiveforms not an issue92Distant supervision: modeling hypothesesTypical architecture:1. Collect many pairs of entities co-occurring in sentences from text corpus2.

16、If 2 entities participate in a relation, several hypotheses:1.2.All sentences mentioning them express it Mintz et al., 09At least one sentence mentioning them express it Riedel et al., 10“Barack Obama is the 44th and current President of the US.” (BO, employedBy, USA)“Obama flew back to the US on We

17、dnesday.” (BO, employedBy, USA)95KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014Distant supervision: modeling hypothesesTypical architecture:1. Collect many pairs of entities co-occurring in sentences from text corpus2. If 2 entities participate in

18、a relation, several hypotheses:1.2.3.All sentences mentioning them express it Mintz et al., 09At least one sentence mentioning them express it Riedel et al., 10At least one sentence mentioning them express it and 2 entities can expressmultiple relations Hoffmann et al., 11 Surdeanu et al., 12“Barack

19、 Obama is the 44th and current President of the US.” (BO, employedBy, USA)“Obama flew back tothe US justWednesday.” said.” employedBy, USA)98KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014was born in on he always (BO, (BO, bornIn,KDD 2014 Tutorial o

20、n Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014Distant supervisionProsCan scale to the web, as no supervision requiredGeneralizes to text from different domainsGenerates a lot more supervision in one iterationConsNeeds high quality entity-matchingRelation-expression h

21、ypothesis can be wrongCan be compensated by the extraction model, redundancy, language modelDoes not generate negative examplesPartially tackled by matching unrelated entities101KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014104KobeBryantGasolteamma

22、tebornInplayInLeagueBlackMambaEntity resolutionLA LakersplayForplayForPau35ageKobe B.BryantVanessaL. BryantmarriedTo1978Single entityresolutionRelational entityresolutionEntity resolution / deduplication Multiple mentions of the same entity is wrong and confusing.KDD 2014 Tutorial on Constructing an

23、d Mining Web-scale Knowledge Graphs, New York, August 24, 2014Single-entity entity resolutionEntity resolution without using the relational context of entitiesMany distances/similarities for single-entity entity resolution:Edit distance (Levenshtein, etc.)Set similarity (TF-IDF, etc.)Alignment-based

24、Numeric distance between valuesPhonetic SimilarityEquality on a boolean predicateTranslation-basedDomain-specific105KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014Relational entity resolution Simple strategies Enrich model with relational features r

25、icher context for matchingRelational features:Value of edge or neighboring attributeSet similarity measuresOverlap/JaccardAverage similarity between set membersAdamic/Adar: two entities are more similar if they share more items that areoverall less frequentSimRank: two entities are similar if they a

26、re related to similar objectsKatz score: two entities are similar if they are connected by shorter paths114KobeBryant1978teammatebornInplayForplayInLeagueBlackMambaLA LakersplayFor35agePauGasolKDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014KobeBryan

27、t1978teammatebornInplayForplayInLeagueBlackMambaLA LakersplayFor35agePauGasolRelational entity resolution Advanced strategiesDependency graph approaches Dong et al., 05Relational clustering Bhattacharya & Getoor, 07Probabilistic Relational Models Pasula et al., 03Markov Logic Networks Singla & Domin

28、gos, 06Probabilistic Soft Logic Broecheler & Getoor, 10115KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014LINK PREDICTION116KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014117KobeBryantLink prediction

29、NY KnicksPauGasolteammateplayInLeagueteamInLeagueopponentplayForLA LakersplayFor Add knowledge from existing graph No external source Reasoning within the graph1. Rule-based methods2. Probabilistic models3. Factorization models4. Embedding modelsKDD 2014 Tutorial on Constructing and Mining Web-scale

30、 Knowledge Graphs, New York, August 24, 2014First Order Inductive Learner FOIL learns function-free Horn clauses:118Gasolgiven positive negative examples of a concepta set of background-knowledge predicatesFOIL inductively generates a logical rule for the concept that cover all + and no -LALakerspla

31、yForplayForPauteammate(x,y) playFor(y,z) playFor(x,z)teammateKobeBryant Computationally expensive: huge search space large, costly Horn clauses Must add constraints high precision but low recall Inductive Logic Programming: deterministic and potentially problematicKDD 2014 Tutorial on Constructing a

32、nd Mining Web-scale Knowledge Graphs, New York, August 24, 2014S(KB, playFor,LAL)iplayForh(pai(KB,LAL)ipathsPath Ranking Algorithm Lao et al., 11119LALakersplayForPauGasolplayForteammateKobeBryantRandom walks on the graph are used to sample pathsPaths are weighted with probability of reaching target

33、 from sourcePaths are used as ranking experts in a scoring functionNYKnicksplayInLeagueteamInLeagueopponenth(Pa2(KB,LAL) = 0.2h(Pa1(KB,LAL) = 0.95KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014Link prediction with scoring functionsA scoring function

34、 alone does not grant a decisionThresholding: determine a threshold (KB, playFor, LAL) is True iff120S(KB, playFor,LAL)Ranking: The most likely relation between Kobe Bryant and LA Lakers is:rel argmaxrrelsS(KB,r,LAL) The most likely team for Kobe Bryant is:obj argmaxeentsS(KB, playFor,e)As prior for extraction models (cf. Knowledge Vault)No calibration of scores like probabilities

展开阅读全文
相关资源
猜你喜欢
相关搜索

当前位置:首页 > 办公、行业 > 各类PPT课件(模板)
版权提示 | 免责声明

1,本文(知识图谱梳理专题培训课件.ppt)为本站会员(三亚风情)主动上传,163文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。
2,用户下载本文档,所消耗的文币(积分)将全额增加到上传者的账号。
3, 若此文所含内容侵犯了您的版权或隐私,请立即通知163文库(发送邮件至3464097650@qq.com或直接QQ联系客服),我们立即给予删除!


侵权处理QQ:3464097650--上传资料QQ:3464097650

【声明】本站为“文档C2C交易模式”,即用户上传的文档直接卖给(下载)用户,本站只是网络空间服务平台,本站所有原创文档下载所得归上传人所有,如您发现上传作品侵犯了您的版权,请立刻联系我们并提供证据,我们将在3个工作日内予以改正。


163文库-Www.163Wenku.Com |网站地图|