Genome Assembly Techniques Courseware (基因组组装技术课件.pptx)

Uploaded by (seller): 晟晟文业 | Document ID: 3706023 | Uploaded: 2022-10-06 | Format: PPTX | Pages: 19 | Size: 19.31 MB

Genome Assembly
2019.10.29

I. Genome survey

K-mer: a contiguous nucleic acid sequence of length K bp.

Suppose every K-mer in the genome is unique; a genome of size G then yields G different K-mers. When a read of length L is generated, the probability that a given K-mer is sequenced is (L-K+1)/G. Because L/G is very small and the number of reads n_r is very large, the number of times a K-mer is sequenced follows a Poisson distribution. So, with d_k the expected K-mer depth and n_k the total number of K-mers sequenced:

d_k = (L-K+1)/G * n_r
n_k = (L-K+1) * n_r

then

G = n_k / d_k

Quality control and filtering

The following reads are filtered out:
1. Reads in which N bases make up more than 10% of the read length.
2. Reads from short insert-size libraries in which more than 65% of bases have quality ≤ 7, and reads from large insert-size libraries in which more than 80% of bases have quality ≤ 7.
3. Pairs of paired-end reads whose read 1 and read 2 are completely identical (and thus considered to be products of PCR duplication).

Error correction is performed before assembly.

II. SOAPdenovo algorithm

SOAPdenovo was developed to assemble large genomes, such as human; it also works well for small genomes like bacteria. It includes five major steps:
1. De Bruijn graph construction
2. Graph simplification to obtain contigs
3. Paired-end read mapping to contigs
4. Scaffold construction
5. Gap filling with paired-end reads

Sequence assembly refers to aligning and merging fragments into a much longer DNA sequence in order to reconstruct the original sequence.

[Figure: overlaps build a contig (Ge + en + no + om + mi + ic + cs -> Genomics); paired-end links build a scaffold (Genome*assembly)]

1. De Bruijn graph construction

Reads:
AGATCTTGTTATT
GTTATTGATCTCC

De Bruijn graph construction:
1. Slide along each read to take K-mers, storing the links between neighboring K-mers.
2. If a K-mer already exists, merge its links with those of the first occurrence.

[Figure: De Bruijn graph with K = 5; nodes AGATC, ATCTT, CTTGT, TTGTT, TGTTA, GTTAT, ATCTC, TCTCC, GATCT, TCTTG, TTATT, TATTG, TTGAT, ATTGA, TGATC]

2. Graph simplification

Contigs: GATCTTGTTATTGATCT, GATCTCC, AGATCT
With the -R parameter set, contigs: AGATCTTGTTATTGATCTCC

Read 1: AGATCTTGTTATT
Read 2: GTTATTGATCTCC

[Figure: simplified De Bruijn graph for the two reads, with edges numbered 1 to 4]

3. Paired-end mapping to contigs

4. Scaffold construction

Note:
1. For mate-pair libraries (≥ 2 kb), the read orientation is reversed.
2. A reliable link is built between two contigs when the number of paired-end/mate-pair reads supporting it is larger than the set threshold.
3. The gap size is estimated from the insert size of each read pair.

5. Gap closure

Get the reads located in the gap and then do a local assembly.
(1) Close the gap using paired-end information (one end mapped on the contig, the other end falling in the gap).
(2) Do a local assembly using the reads that fall in the gap to get a sequence connecting the edges of the two contigs.

Note: Gap closure here also means extending contigs.

[Figure: schematic overview]

III. Evaluation of assembly result

Length: contig (scaffold) N50 size, N90 size, total length, coverage ratio of the genome.
Accuracy: coverage of gene sequences, compared with EST or transcriptome sequences; comparison with a gold standard (such as BAC/fosmid).

Evaluation of gene region coverage
Comparison with gold standard
Comparative genomic analysis
Accuracy of gene structures

Thank you for listening!
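The genome survey formulas (d_k, n_k and G = n_k / d_k) can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, the numbers are invented, and it assumes d_k is supplied by the user (in practice it is typically read off the main peak of a K-mer depth histogram).

```python
def total_kmers(read_length, k, n_reads):
    """n_k = (L - K + 1) * n_r: K-mers contributed by all reads."""
    return (read_length - k + 1) * n_reads


def expected_kmer_depth(read_length, k, n_reads, genome_size):
    """d_k = (L - K + 1) / G * n_r: mean number of times a K-mer is sequenced."""
    return total_kmers(read_length, k, n_reads) / genome_size


def estimate_genome_size(read_length, k, n_reads, kmer_depth):
    """G = n_k / d_k."""
    return total_kmers(read_length, k, n_reads) / kmer_depth


# Illustrative numbers only (not taken from the slides):
# L = 100 bp, K = 17, one million reads, observed K-mer depth 21.
n_k = total_kmers(100, 17, 1_000_000)
genome_size = estimate_genome_size(100, 17, 1_000_000, kmer_depth=21)
print(n_k, genome_size)
```

With these made-up inputs each read contributes 84 K-mers, so the estimate is 84,000,000 / 21 = 4,000,000 bp.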
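The read filtering criteria from the quality control slide could be expressed roughly as follows. The helper names are hypothetical, qualities are assumed to be already decoded into integer scores, and the low-quality cutoff is assumed to mean "quality of 7 or lower".

```python
def too_many_n(seq, max_fraction=0.10):
    """True if N bases make up more than 10% of the read."""
    return seq.upper().count("N") > max_fraction * len(seq)


def too_many_low_quality(quals, library="short", low_q=7):
    """True if the share of bases with quality <= low_q exceeds the limit.

    Limit is 65% for short insert-size libraries, 80% for large ones.
    """
    limits = {"short": 0.65, "large": 0.80}
    low = sum(1 for q in quals if q <= low_q)
    return low > limits[library] * len(quals)


def drop_duplicate_pairs(pairs):
    """Keep the first copy of each (read1, read2) pair; later identical
    pairs are treated as PCR duplicates and dropped."""
    seen = set()
    kept = []
    for read1, read2 in pairs:
        if (read1, read2) not in seen:
            seen.add((read1, read2))
            kept.append((read1, read2))
    return kept


print(too_many_n("ACGTNNACGT"))                          # 2 of 10 bases are N
print(too_many_low_quality([5] * 7 + [30] * 3, "short"))  # 7 of 10 bases are low quality
```

Real pipelines stream FASTQ records rather than holding all pairs in memory; the set-based duplicate check here is only meant to show the rule.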
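The two construction rules for the De Bruijn graph (slide along each read taking K-mers and recording links between neighbors; merge links when a K-mer is seen again) can be sketched on the two example reads with K = 5. This is a simplified illustration, not SOAPdenovo's implementation: K-mers are used as nodes as on the slide, and reverse complements and sequencing errors are ignored.

```python
def build_de_bruijn(reads, k):
    """Return {kmer: set of K-mers that directly follow it in some read}."""
    graph = {}
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        for kmer in kmers:
            # A K-mer seen before keeps its existing links (rule 2).
            graph.setdefault(kmer, set())
        for left, right in zip(kmers, kmers[1:]):
            # Link each K-mer to its neighbor one base to the right (rule 1).
            graph[left].add(right)
    return graph


reads = ["AGATCTTGTTATT", "GTTATTGATCTCC"]
graph = build_de_bruijn(reads, k=5)
print(len(graph))              # number of distinct K-mers (nodes)
print(sorted(graph["GATCT"]))  # GATCT branches because it occurs in both reads
```

The node GATCT ends up with two outgoing links (to ATCTT and ATCTC), which is the kind of branch that graph simplification then has to resolve.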
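The scaffolding notes (a link is reliable when its read-pair support is larger than a set number; gap size is estimated from the insert size) might be illustrated as below. The coordinate convention and all names are assumptions made for this sketch: read 1 starts at `start_in_a` on the left contig, read 2 ends at `end_in_b` on the right contig, and the insert spans from one to the other.

```python
from collections import defaultdict
from statistics import mean


def gap_from_pair(insert_size, len_a, start_in_a, end_in_b):
    """Insert = (part covered on contig A) + gap + (part covered on contig B)."""
    return insert_size - (len_a - start_in_a) - end_in_b


def reliable_links(pair_gaps, min_support):
    """pair_gaps: iterable of (contig_a, contig_b, gap_estimate), one per read pair.

    Keep links supported by more than min_support pairs and report the mean gap.
    """
    by_link = defaultdict(list)
    for contig_a, contig_b, gap in pair_gaps:
        by_link[(contig_a, contig_b)].append(gap)
    return {
        link: mean(gaps)
        for link, gaps in by_link.items()
        if len(gaps) > min_support
    }


# Illustrative numbers only: 500 bp inserts, contig c1 is 1000 bp long.
pair_gaps = [
    ("c1", "c2", gap_from_pair(500, 1000, 800, 150)),
    ("c1", "c2", gap_from_pair(500, 1000, 850, 210)),
    ("c1", "c3", gap_from_pair(500, 1000, 900, 310)),
]
print(reliable_links(pair_gaps, min_support=1))
```

Only the c1 to c2 link survives here, because the c1 to c3 link has a single supporting pair.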
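For the length metrics in the evaluation section, N50 and N90 can be computed with the usual definition: sort the contig (or scaffold) lengths from longest to shortest and report the length at which the running total first reaches 50% (or 90%) of the assembly. The function name `nx` and the example lengths are made up for illustration.

```python
def nx(lengths, percent):
    """Length of the contig at which the cumulative sum, taken from the
    longest contig downward, first reaches `percent`% of the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 100 >= percent * total:
            return length
    raise ValueError("lengths must be non-empty")


contig_lengths = [80, 70, 50, 40, 30, 20, 10]  # total length 300
print(nx(contig_lengths, 50), nx(contig_lengths, 90))
```

In this toy assembly the two longest contigs already cover half of the 300 bp total, so N50 is 70; reaching 90% takes the five longest contigs, so N90 is 30.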
