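Introduction to High-Throughput Sequencing Technologies and Principles

Tong Yigang (童贻刚)
Institute of Microbiology and Epidemiology, Academy of Military Medical Sciences

Sequencing systems compared

Company   | System            | Read length (bp) | Advantages                               | Disadvantages
----------|-------------------|------------------|------------------------------------------|-----------------------------------------------
Roche/454 | FLX System        | 200-700          | Longest reads; high throughput           | Homopolymer errors; costly instrument and reagents
Illumina  | HiSeq 2000/miSeq  | 2 x 150          | Very high throughput                     | Expensive; complex downstream analysis
ABI/SOLiD | 5500 xl SOLiD     | 25-35            | High throughput; low reagent consumption | Reads too short
Helicos   | HeliScope         | 25-30            | High throughput                          | Reads too short

Sequencing platforms: read length, output and run time

Platform           | Read length   | Earlier read lengths       | Output  | Run time
-------------------|---------------|----------------------------|---------|---------
SOLiD              | 30 bp         | 15 bp                      | 30G     | 10 days
Solexa Hiseq2000   | 150 bp x 2    | 30 bp, 50 bp, 75 bp, 100 bp | 600G    | 14 days
454                | 750 bp        | 100 bp, 400 bp             | 0.7G    | 7 hours
3730               | 1000 bp x 96  | 300 bp, 600 bp             | 0.0001G | 2 hours

Illumina workflow
- Sample preparation: shearing, adapter ligation
- Cluster generation: bridge PCR
- Sequencing on Genome Analyzer IIx: RTA (Real Time Analysis)
- Analysis pipeline: offline analysis, alignment, SNP calling, read counting; visualize the data and report the results

Sequencing process
1. Fragment DNA
2. Repair ends / add A overhang
3. Ligate adapters
4. Select ligated DNA
5. Hybridize to flow cell
6. Extend hybridized oligos
7. Perform bridge amplification
8. Perform sequencing on forward strand
9. Re-generate reverse strand
10. Perform sequencing on reverse strand

CONFIDENTIAL DO NOT DISTRIBUTE

Workflow timing
1. Library prep (6 hrs)
2. Automated cluster generation (5 hrs), 1-8 samples
3. Sequencing (46 to 120 hrs), 1-8 samples

Sample prep: resequencing
Library fragment layout (figure labels):
- Surface-bound adapter 1
- Sequencing primer binding site
- Surface-bound adapter 2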
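The output and run-time columns of the platform comparison table are easier to compare once they are put on a common scale. The following is a minimal sketch, assuming that "G" in the table means gigabases and that a day is 24 hours; the dictionary name `PLATFORMS` and the helper `gb_per_day` are illustrative names, not part of the original slides.

```python
# Output (G) and run time taken from the platform comparison table.
# Assumption: "G" means gigabases; run times are normalized to days.
PLATFORMS = {
    "SOLiD": (30.0, 10.0),              # 30G in 10 days
    "Solexa HiSeq 2000": (600.0, 14.0), # 600G in 14 days
    "454": (0.7, 7 / 24),               # 0.7G in 7 hours
    "3730": (0.0001, 2 / 24),           # 0.0001G in 2 hours
}


def gb_per_day(output_gb, run_days):
    """Return throughput in gigabases per day for one run."""
    return output_gb / run_days


throughput = {name: gb_per_day(*vals) for name, vals in PLATFORMS.items()}

# Print platforms from highest to lowest daily throughput.
for name, rate in sorted(throughput.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:18s} {rate:10.4f} Gb/day")
```

On this scale the 454 run, although short, yields a daily throughput of the same order as SOLiD, while the HiSeq 2000 is more than an order of magnitude higher.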
Flow cell: key to the simplified workflow
- Clonal clusters are generated in a contained environment (no clean rooms needed).
- Sequencing is also performed in the flow cell, on the generated clusters.
- The flow cell has 8 channels.
- The surface of the flow cell is coated with a lawn of oligo pairs.

Cluster generation: hybridize fragment and extend
- 50 M single molecules hybridize, through their adapter sequence, to the lawn of primers.
- Bound molecules are then extended by polymerases (3' extension).

Cluster generation: denature double-stranded DNA
- The double-stranded molecule is denatured.
- The original template is washed away (discarded).
- The newly synthesized strand is covalently attached to the flow cell surface.

Cluster generation: covalently bound, spatially separated single molecules
- Single molecules are bound to the flow cell in a random pattern.

Cluster generation: bridge amplification
- The single strand flips over to hybridize to an adjacent primer, forming a bridge.
- The hybridized primer is extended by polymerases.

Cluster generation: bridge amplification
- A double-stranded bridge is formed.
Cluster generation: bridge amplification
- The double-stranded bridge is denatured.
- Result: two copies of covalently bound single-stranded templates.

Cluster generation: bridge amplification
- The single strands flip over to hybridize to adjacent primers, forming bridges.
- The hybridized primer is extended by polymerase.
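Because each bridge, extend, and denature round turns one surface-bound strand into two, the number of strands in a cluster grows geometrically. The sketch below is an idealized model only: it assumes every strand finds a free neighbouring primer in every round unless a lower `efficiency` is given, and the target of 1000 copies is a hypothetical figure chosen for illustration, not a value from the slides.

```python
def strands_after(cycles, start=1, efficiency=1.0):
    """Idealized strand count after `cycles` rounds of bridge amplification.

    efficiency is the fraction of strands that are successfully copied in
    each round (1.0 means perfect doubling). This is an assumption of the
    model, not a measured value.
    """
    n = float(start)
    for _ in range(cycles):
        n += n * efficiency  # each copied strand adds one new bound strand
    return n


def cycles_to_reach(target, start=1, efficiency=1.0):
    """Smallest number of rounds needed to reach at least `target` strands."""
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    n, cycles = float(start), 0
    while n < target:
        n += n * efficiency
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(strands_after(10))              # perfect doubling for 10 rounds
    print(cycles_to_reach(1000))          # rounds to reach a hypothetical 1000 copies
    print(cycles_to_reach(1000, efficiency=0.5))
```

Lowering `efficiency` shows how quickly the required number of rounds rises when not every strand is copied in each round.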
Cluster generation: bridge amplification
- The bridge amplification cycle is repeated until multiple bridges are formed.

Cluster generation
- The dsDNA bridges are denatured.
- Reverse strands are cleaved and washed away, leaving a cluster with forward strands only.

Cluster generation
- Free 3' ends are blocked to prevent unwanted DNA priming.

Sequencing
- The sequencing primer is hybridized to the adapter sequence.

Sequencing (repeated x 36 cycles)
- Add 4 Fl-NTPs + polymerase.
- The incorporated Fl-NTP is imaged.
- The terminator and fluorescent dye are cleaved from the Fl-NTP.

Flow cell imaging
- Total Internal Reflection Fluorescence
- Figure labels: fluidics port, flow cell, prism, fluidics port
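The sequencing cycle above (add terminated Fl-NTPs, image the incorporated base, cleave terminator and dye, repeat) can be sketched as a loop that reads exactly one base per cycle. This is a toy model under stated assumptions: the template is written 3' to 5' so that the synthesized read comes out 5' to 3', incorporation is error-free, and the function name `sequence_by_synthesis` is illustrative.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template_3to5, cycles=36):
    """Return the read produced by `cycles` rounds of one-base extension.

    template_3to5: template strand written 3'->5' (assumption of this sketch).
    cycles: number of sequencing cycles; 36 matches the "x 36" on the slide.
    """
    read = []
    for base in template_3to5[:cycles]:
        # 1. The reversible terminator lets exactly one Fl-NTP be incorporated.
        incorporated = COMPLEMENT[base]
        # 2. Imaging identifies which of the four dyes was incorporated.
        read.append(incorporated)
        # 3. Terminator and dye are cleaved, so the next cycle can extend.
    return "".join(read)


if __name__ == "__main__":
    template = "TACGGATCCA"
    print(sequence_by_synthesis(template, cycles=5))
```

The read length is capped by the number of cycles, which is why the read lengths quoted for this chemistry track the cycle count.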
Paired-end sequencing
- The sequenced strand is stripped off and the 3' ends are unblocked.
- Bridge formation, followed by 3' extension.
- The double-stranded DNA is denatured.
- The 3' ends are blocked and the original forward strand is cleaved.

Sequencing the reverse strand (repeated x 36-50 cycles)
- Hybridize the sequencing primer.
- Add 4 Fl-NTPs + polymerase.
- The incorporated Fl-NTP is imaged.
- The terminator and fluorescent dye are cleaved from the Fl-NTP.

SOLEXA: flow cell in GAIIx

Image re-analysis pipeline
Image analysis -> Base calling -> Sequence analysis

GA Analysis Pipeline
- Data transfer: Instrument PC -> Analysis PC/cluster
- Input: images (.tif) for Lane 1..8 and Cycle 1..36 (Tile_Cycle_Image_a, Tile_Cycle_Image_c, Tile_Cycle_Image_g, Tile_Cycle_Image_t) and the .params file
- Image analysis, for each tile: cluster intensities, cluster noise
- Base calling, for each tile: corrected cluster intensities, cluster sequence, cluster probabilities
- Sequence analysis, for all data: quality filtering, sequence alignment, run statistics, visualization

Bustard
- The base with the highest corrected intensity is called.
- Filtering removes low-quality base calls.
- Chastity: C = I_A / (I_A + I_B), where I_A and I_B are the highest and second-highest intensities; default threshold 0.6.
- Other filters include purity, similarity, neighbor and neighborhood.

GERALD
- GEneration of Recursive Analyses Linked by Dependency

Bustard output: *_qseq.txt
Fields, in order: machine name, run number, lane number, tile number, X coord, Y coord, index, read, sequence, quality, passed filter.

EAS1 89 1 59 111 525 AACCTT 2 TGACCAGCGTCAACCAGTACTACGTCTTTGTCGATAG aaaaa_V_OYOZZYUPJZRX 1
EAS1 89 1 59 111 726 AACCTT 2 TCTGGATGAAGAACGATCCGCTGCAGAGGTGCTGGCA _FNXXZWFZ_YYTYMUVBBBBBBBBBBB 0
EAS1 89 1 59 111 860 AACCTT 2 TATCGCGTAGTGTAGCACGGCCTTTTTTTCGTCCACC aaaXFUWQUHVN_ZRWZZXFWYFTX 1
EAS1 89 1 59 112 377 AACCTT 2 TTTTCTTCTCCTTCGCCATCAGCGACAAAATCAAGCA abbbabbbbbbaaaTaaaaaY_YNaZZ 1
EAS1 89 1 59 112 538 AACCTT 2 TGTGAATTAACAGTATTGGCGTAGTTACAGGCAGTGT aa_aabbaaa_aSYZYUBBBBB 1
EAS1 89 1 59 112 576 AACCTT 2 TCTCCTTCGTCTTCTTCCATCAGTTGTTCGACCGGCT GJRNGBBBBBBBBBBBBBBBBBBBBBBBBBBBBB 0
EAS1 89 1 59 112 607 AACCTT 2 TCCACCATCAACTGGTTGCCAGTGCGCGGGCAGTTAA aabaaaaaaX_YTTHTTZQYTX 1
EAS1 89 1 59 112 255 AACCTT 2 TGATGCTGATAAGCAGCGTGCTCACAACCCAGATTTG aaba_abaabbbbabababbbbb_aba_Zabbb 1

Fastq format
- http://www.bioinformatics.babraham.ac.uk/projects/fastqc/

GERALD sequence: Summary.html
- PF: pass filter

FastQC

Third-party software
- Brief Bioinform. 2011 Jan 18

NGS technical forums
- SEQwiki: http:/
- http:/

PAIRED-END VS. MATE-PAIR

SOLiD system mate-paired library preparation
1. Genomic DNA
2. Shearing and end repair: sheared DNA
3. Methylation: sheared, methylated DNA
4. Ligation: EcoP15I CAP linkers ligated on to the sheared, methylated DNA
5. Circularization: circularized DNA with biotinylated internal adaptors
6. Digestion: biotinylated internal adaptors with 25-27 bp tags from genomic DNA
7. Ligation: FDV-RDV ligated library molecules

SOLEXA mate pair library I
SOLEXA mate pair library II
454 long span paired-end I
454 long span paired-end II

Four key enzymes
1. Digestion of non-circular DNA: degrades linear DNA (ATP-dependent Plasmid-Safe DNase)
2. Nick translation (DNA polymerase I)
3. T7 exonuclease digestion: further degrades linear DNA in the 5'->3' direction, leaving 3' single-stranded overhangs
4. S1 nuclease digestion: removes the 3' single-stranded overhangs

Thank you!
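Returning to the Bustard slide, the base-calling rule (call the base with the highest corrected intensity) and the chastity value can be sketched in a few lines. This is a minimal sketch, not Bustard's actual code: it assumes non-negative corrected intensities, takes chastity as the highest intensity divided by the sum of the two highest, and uses a simplified pass rule in which every supplied cycle must reach the 0.6 threshold. The function names `call_base` and `passes_chastity` are hypothetical.

```python
def call_base(intensities):
    """Call one base from four corrected channel intensities.

    intensities: dict mapping 'A', 'C', 'G', 'T' to a non-negative number.
    Returns (called_base, chastity), where chastity is the highest intensity
    divided by the sum of the two highest intensities.
    """
    ranked = sorted(intensities.items(), key=lambda kv: kv[1], reverse=True)
    (base, top), (_, second) = ranked[0], ranked[1]
    total = top + second
    chastity = top / total if total > 0 else 0.0
    return base, chastity


def passes_chastity(cycle_intensities, threshold=0.6):
    """Simplified filter: pass only if every cycle has chastity >= threshold.

    The per-cycle "all must pass" rule is an assumption of this sketch.
    """
    return all(call_base(c)[1] >= threshold for c in cycle_intensities)


if __name__ == "__main__":
    clean = {"A": 900.0, "C": 100.0, "G": 50.0, "T": 20.0}
    mixed = {"A": 500.0, "C": 450.0, "G": 30.0, "T": 20.0}
    print(call_base(clean))
    print(call_base(mixed))
    print(passes_chastity([clean, mixed]))
```

A cluster whose two brightest channels are nearly equal, as in `mixed`, gets a chastity close to 0.5 and is rejected, which is the behaviour one would expect for overlapping clusters.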