
Introduction to Next-Generation Integrated Chips (courseware slides): 下一代集成芯片介绍课件.pptx

IC Technology: What Will the Next Node Offer Us?

MOORE'S LAW
[Figure: transistors per microprocessor (10^4 to 10^10) versus year, 1971 to 2017]
Source: Karl Rupp, 40 Years of Microprocessor Trend Data.

MOORE'S LAW: DENSITY AND COST PER FUNCTION
[Figure: relative manufacturing cost per component versus number of components per integrated circuit; curve labels 1960, 1965, 1970]
Source: G. Moore, Electronics, 1965.

MOORE'S LAW IS WELL AND ALIVE
DENSITY: A NECESSARY ATTRIBUTE
[Figure: relative density versus year, 1970 to 2020. Series: standard cell inverter, high density SRAM, logic gates, transistor density (microprocessors)]

IMAGINE: TRANSISTOR PERFORMANCE W/O DENSITY
Not enough memory
No multi-core chips
No accelerators
Wire delay slows big chips

TECHNOLOGY LEADERSHIP
N7
World's first 7 nm
Participated in all the products on 7 nm
Best performance
Highest density
Extensive EUV layers
Design ecosystem ready
In risk production
Nodes shown: N7, N5(P), N3

THE ELEPHANT IN THE ROOM
[Figure: length scale from 10^-1 m down to 10^-10 m]
Tennis ball: 10 cm
Strand of hair: 0.1 mm
Bacteria: 2 µm
Virus: 50 nm
Carbon nanotube: 1.2 nm
Water molecule (H2O): 0.28 nm
Hydrogen atom: 0.1 nm
FinFET (marked on the same scale)

CONTINUOUS BENEFITS NODE AFTER NODE
MOORE'S LAW: A HISTORY OF INNOVATIONS
Dennard scaling
Strained Si, high-k/metal gate
FinFET/DTCO

CONTINUOUS BENEFITS NODE AFTER NODE
MULTIPLE ROADS LEAD TO ROME
Innovations

INTEGRATING CHIPS INTO SYSTEMS
"It may prove to be more economical to build large systems out of smaller functions, which are separately packaged and interconnected. The availability of large functions, combined with functional design and construction, should allow the manufacturer of large systems to design and construct a considerable variety of equipment both rapidly and economically."
Source: G. Moore, Electronics, 1965.

CoWoS SYSTEM INTEGRATION
TSMC CoWoS fully assembled test chip: 1 SoC + 2 DRAMs
Source: 2013 TSMC Technology Symposium.

CoWoS SYSTEM INTEGRATION
2500 mm2 interposer: 2 processors (600 mm2) + 8 HBM DRAM

SYSTEM INTEGRATION TECHNOLOGIES
[Figure labels: integrated Si/package area, reticle, I/O pin count, package size, interposer size (mm2)]
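The 2500 mm2 interposer quoted above can be put in perspective against a single lithography exposure field. The sketch below is a rough area-only estimate, assuming the commonly cited 26 mm x 33 mm scanner field; that field size and the names `RETICLE_W_MM`, `RETICLE_H_MM`, and `reticle_fields` are assumptions for illustration and do not come from the slides.

```python
import math

# Assumed maximum scanner exposure field (not stated on the slides).
RETICLE_W_MM = 26.0
RETICLE_H_MM = 33.0

# Interposer area taken from the CoWoS slide.
INTERPOSER_AREA_MM2 = 2500.0


def reticle_fields(area_mm2: float, field_w_mm: float, field_h_mm: float):
    """Return (area ratio, minimum whole number of fields) for a given area.

    This is an area-only lower bound: real stitching layouts can need
    more exposures depending on the interposer's aspect ratio.
    """
    field_area = field_w_mm * field_h_mm
    ratio = area_mm2 / field_area
    return ratio, math.ceil(ratio)


ratio, min_fields = reticle_fields(INTERPOSER_AREA_MM2, RETICLE_W_MM, RETICLE_H_MM)
print(f"interposer area = {ratio:.2f} x one field, at least {min_fields} fields")
```

Under that assumption the interposer is larger than one exposure field, which is why interposer size is tracked relative to the reticle in the figure labels above.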

Devices shown: GP100 (courtesy of Nvidia), GV100 (courtesy of Nvidia), 7V580T heterogeneous integration (courtesy of Xilinx), 7V2000T homogeneous integration (courtesy of Xilinx), XCVU440 (courtesy of Xilinx)

CHIPLETS INTEGRATION REDUCES SYSTEM COST PER FUNCTION
[Figure: values shown 2X, 1X, 1.5X]

SEMICONDUCTOR TECHNOLOGY EVOLVES, DRIVEN BY CHANGING APPLICATION LANDSCAPE
Application eras: Transistor Radio, Mini-Computer, PC/Internet, Mobile, AI/5G
1947: Invention of point-contact transistor
1958: Invention of IC
1971: Intel 4004
1973: Mobile phone
1974: Transistor Scaling Principle
1984: Flash Memory
1995: Pentium CPU
1999: FinFET
2002: 3G
2007: iPhone
2009: 4G
2017: GPU (21B transistors)
2018: 7nm FinFET
2020: 5nm CMOS
2050 and beyond

DATA MOVEMENT HITS THE MEMORY WALL
ABUNDANT-DATA APPLICATIONS: ENERGY MEASUREMENTS
Deep learning accelerators
Intel performance counter monitors: 2 CPUs, 8 cores/CPU + 128 GB DRAM
Workloads: ResNet-152 (CNN), AlexNet (CNN), Language Model (LSTM)
[Figure: energy split with legend Memory, Compute; values shown 15%/85%, 8%/92%, 20%/80%]
Source: S. Mitra (Stanford).

DEEP NEURAL NETWORKS REQUIRE LARGE MEMORY CAPACITY
Network (application)   Type   Training/Inference   Model Size    Memory Usage (GBytes)
ResNet (vision)         CNN    Training             120 MBytes    21*
                               Inference                          0.12
Language Model (NLP)    LSTM   Training             2.5 GBytes    40*
                               Inference                          2.5
* Training memory usage: batch size 64, word size 64-bit; memory can increase with greater batch sizes, footprint of activations, weights, errors and gradients.
Source: M. Lee, W. Hwang, Prof. S. Mitra (Stanford), M. Aly (NTU, Singapore), Y. Wang, K. Akarvardar (TSMC).

ON-CHIP SRAM CAPACITY: NEVER ENOUGH
[Figure: estimated on-chip SRAM (MB, 0 to 60) versus launch year, 2006 to 2018, for CPUs and GPUs: Intel Xeon X5355, NVIDIA Tesla K40, NVIDIA Tesla V100, Intel Xeon E7-8890 v4. Annotation: 3.8 Gbytes, 1.4 nm node]
Source: W. Hwang, Prof. S. Mitra (Stanford).

CAN WE PUT LOTS OF MEMORY ON-CHIP?
WHAT KINDS OF MEMORY, FOR WHICH APPLICATION?

SUPER AI ACCELERATOR ENABLED BY CoWoS
Heterogeneous integration: GPU + High Bandwidth Memory (HBM2) CoWoS module
Superior processing power that equals 100 CPUs
300 B transistors
[Figure: GPU surrounded by four HBM2 stacks]
Source: "Inside Volta", Nvidia GPU Tech. Conf., May 10, 2017.

COMPUTE-MEMORY INTEGRATION
2D System (traditional baseline): printed circuit board, Si logic die, off-chip DRAM, limited I/O connectivity
2.5D System: HBM-type DRAM, Si logic die, Si interposer, micron scale connectivity
3D TSV System: HBM-type DRAM, Si logic die, TSV + bump connectivity (micron scale)
N3XT System: Si logic die, dense ILV connectivity (nanometer scale), energy efficient logic (thin device layers), high density on-chip nonvolatile memory, high speed on-chip nonvolatile memory, energy efficient memory access transistors, nonvolatile memory cells
Source: W. Hwang, W. Wan, Y. Malviya, H. Li, M. Lee, M. Aly, H.-S. P. Wong, S. Mitra. Work in progress 2017-2019 w/ TSMC.

"NEW" MEMORIES FOR COMPUTE-MEMORY INTEGRATION
Random access, non-volatile, no erase before write, on-chip integration
PCM (phase change memory): top electrode, phase change material, switching region, oxide isolation, bottom electrode
RRAM (resistive switching random access memory): top electrode, metal oxide, filament of oxygen vacancies, oxygen ions, bottom electrode
CBRAM (conductive bridge random access memory): active top electrode, solid electrolyte, filament of metal atoms, bottom electrode
STT-MRAM (spin torque transfer magnetic random access memory): soft magnet, tunnel barrier (oxide), pinned magnet, current
FERAM (ferro-electric random access memory): top gate, ferroelectric layer, interface layer, n+ regions, p-Si
Source: H.-S. P. Wong, S. Salahuddin, Nature Nanotech (2015).

NEW MEMORY: HIGH-BANDWIDTH, HIGH-CAPACITY, ON-CHIP
High bandwidth and high capacity are both critical.
2D baseline system: accelerator cores, SRAM on-chip memory, off-chip DRAM (LPDDR3)
  Off-chip DRAM (LPDDR3): capacity 4 GBytes, latency 50 ns, BW 12 GBytes/s, read/write energy 17 pJ/bit
New system: accelerator cores, SRAM on-chip memory, on-chip new memory, off-chip DRAM (LPDDR3)
  Off-chip DRAM (LPDDR3): capacity (4 GBytes minus new memory capacity), latency 50 ns, BW 12 GBytes/s, read/write energy 17 pJ/bit
  On-chip new memory: capacity sweep (up to 4 GBytes), latency sweep (down to 3 ns), BW sweep (up to 128 GBytes/s), read/write energy 5 pJ/bit
Source: Stanford/NTU: M. Aly, S. Mitra; TSMC: Yih (Eric) Wang, K. Akarvardar, 2019.

NEW MEMORY ESSENTIAL REQUIREMENT: ON-CHIP CAPACITY MUST EXCEED DATA SIZE
EDP benefits
Fixed parameters: 5 ns memory access latency, 5 pJ/bit access energy
Workloads: ResNet-152 (CNN), 120 MByte data size; Language model (LSTM), 2.5 GByte data size
[Figures: EDP benefit versus on-chip capacity (1 MByte to 4 GByte) and bandwidth (10 to 128 GBytes/s), shown as 2D maps and as 3D surfaces. Annotated EDP benefits: 1.3x, 2.1x, 2.9x-3.6x, 4.2x, 8x, 15x, 30x, 50x. Marked points: 120 MByte, 2.5 GByte, 64 GBytes/s, 100 GBytes/s]
Source: Stanford/NTU: M. Aly, S. Mitra; TSMC: Yih (Eric) Wang, K. Akarvardar, 2019.
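To make the energy-delay product (EDP) metric concrete, here is a minimal memory-only sketch that plugs the LPDDR3 and new-memory parameters from the slides into a very simple cost model. The model itself (one access latency plus a bandwidth-limited transfer, energy proportional to bits moved) and the names `Memory`, `stream_cost`, and `edp` are assumptions for illustration. It is not the system-level simulation behind the slides and is not expected to reproduce their EDP numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    """Memory parameters in SI units."""
    name: str
    latency_s: float          # access latency, seconds
    bandwidth_bytes_s: float  # bandwidth, bytes per second
    energy_j_per_bit: float   # read/write energy, joules per bit


def stream_cost(mem: Memory, data_bytes: float):
    """Return (energy in J, time in s) to stream data_bytes once.

    Assumed model (not from the slides): a single access latency followed
    by a bandwidth-limited transfer; compute time and energy are ignored.
    """
    time_s = mem.latency_s + data_bytes / mem.bandwidth_bytes_s
    energy_j = data_bytes * 8 * mem.energy_j_per_bit
    return energy_j, time_s


def edp(mem: Memory, data_bytes: float) -> float:
    """Energy-delay product: energy multiplied by execution time."""
    energy_j, time_s = stream_cost(mem, data_bytes)
    return energy_j * time_s


# Parameters quoted on the slides.
LPDDR3 = Memory("off-chip DRAM (LPDDR3)", 50e-9, 12e9, 17e-12)
# One point of the swept design space: 5 ns, 128 GBytes/s, 5 pJ/bit.
NEW_MEM = Memory("on-chip new memory", 5e-9, 128e9, 5e-12)

DATA_BYTES = 120e6  # ResNet-152 data size (120 MBytes)

ratio = edp(LPDDR3, DATA_BYTES) / edp(NEW_MEM, DATA_BYTES)
print(f"memory-only EDP ratio (LPDDR3 / new memory): {ratio:.1f}x")
```

Because this toy model counts only memory traffic, its ratio is an upper-end illustration of where the benefit comes from (lower energy per bit times shorter transfer time), while the slides' figures account for the whole accelerator.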

NEW MEMORY ESSENTIAL REQUIREMENT: HIGH BANDWIDTH MORE CRITICAL THAN LATENCY
EDP benefits
Fixed parameter: 5 pJ/bit access energy
Workloads: ResNet-152 (CNN), 120 MByte data size; Language model (LSTM), 2.5 GByte data size
[Figures: EDP benefit versus bandwidth (10 to 128 GBytes/s) and latency (3 to 80 ns), shown as 2D maps and as 3D surfaces. Annotated EDP benefits: 1.1x, 1.1x-3x, 2.4x, 3.9x, 4.1x, 4.2x, 15x, 20x, 30x, 35x, 50x. Marked points: 10 ns, 20 ns, 64 GBytes/s, 100 GBytes/s]
Source: Stanford/NTU: M. Aly, S. Mitra; TSMC: Yih (Eric) Wang, K. Akarvardar, 2019.

N3XT: UP TO 2,000X ENERGY EFFICIENCY BENEFITS
System-level benefits
Workload: inference on ML accelerator
Workloads: Lang. Model (LSTM), AlexNet (CNN), Captioning (LSTM), ResNet152 (CNN), VGG19 (CNN)
[Figure: bar chart on a log scale from 1 to 10000; axis label "Energy Execution Time"; bar labels 1971X, 525X, 320X, 159X, 63X]
N3XT benefits: relative to 2D baseline system (28nm silicon CMOS, LPDDR3)
Inference: 16-bit data, batch size of 1
Source: Stanford/NTU: M. Aly, T. Wu, A. Bartolo, H.-S. P. Wong, S. Mitra et al., Proc. IEEE 2019.

N3XT SYSTEM
Si logic die
Energy efficient logic (thin device layers)
Dense ILV connectivity (nanometer scale)
High density on-chip nonvolatile memory
High speed on-chip nonvolatile memory
Energy efficient memory access transistors
Nonvolatile memory cells

NANOMETER-THIN TRANSISTOR CHANNEL
1D carbon nanotube (CNT), 1 nm
2D TMD (MoS2, WSe2, WS2), 1 nm
Photo credit: B. Radisavljevic et al., Nature Nanotech., p. 147, 2011.
[Figure: mobility (cm2/V-s, 1 to 10,000) versus channel thickness (0 to 4 nm) for MoS2, WS2, WSe2, Si, Ge, CNT. Filled symbols: electron; open symbols: hole]
Source: S.-K. Su, L.-J. Li (TSMC), Nature Nanotech., 2019.

2D LAYERED MATERIALS (WS2, WSe2)
Photo credit: User Mstroeck on en.wikipedia.
[Figure: mobility (cm2/V-s) versus effective mass (m0, 0.1 to 0.5), with I_ON (µA/µm) contours from 200 to 800, for MoS2 (electrons) and WSe2 (holes); device sketches with gate, source and drain labeled 20 nm and 10 nm]
Source: C.-C. Cheng et al. (TSMC), Symp. VLSI Tech. 2019.

SHORT-CHANNEL CARBON NANOTUBE TRANSISTORS
10 nm gate length and 5 nm gate length
[Figures: Ids (A, 10^-10 to 10^-5) versus Vgs (V) and versus Vgs - Vt (V). Labels: VDS = -0.4 V, VDS = 0.4 V, SS = 70 mV/dec; Lg = 5 nm, VDS = -0.1 V, SS = 73 mV/dec]
Source: C. Qiu, L-M. Peng (PKU), Science, 2017.

CARBON NANOTUBE COMPUTER
Blocks: instruction fetch, data fetch, arithmetic block, write-back
Source: M. Shulaker, H.-S. P. Wong, S. Mitra (Stanford), Nature, 2013.

CARBON NANOTUBE FET CMOS SRAM
Kbit 6T SRAM (6144 CNFETs)
Source: P. Kanhaiya, M. Shulaker (MIT), Symp. VLSI Tech., 2019.

MEMORY INTEGRATION ON LOGIC PLATFORM
Better transistor alone
Transistors integrated with memory in 3D

SYSTEM INTEGRATION: A CONTINUUM FROM FAR BACK-END TO FRONT-END
[Figure: normalized density (10^0 to 10^8) for interposer, chip-on-wafer, wafer-on-wafer, monolithic 3D]
Source: IMEC.

SOCIETAL NEEDS FOR ADVANCED TECHNOLOGY IS INSATIABLE
ADVANCED TECHNOLOGY: A KEY DIFFERENTIATOR

CONTINUOUS BENEFITS NODE AFTER NODE
MULTIPLE ROADS LEAD TO ROME
Continuous transistor & memory advances
Memory logic integration
System integration with high connectivity

A CALL TO ACTION: EARLY ENGAGEMENT
System, technology
Academia, industry, research

End of Talk
Questions?

COMMITTED TO PROVIDING THE MOST ADVANCED TECHNOLOGIES
