SoC Design Methodology and Implementation (SoC设计方法与实现)
Guo Wei (郭炜), Guo Zheng (郭筝), Xie Jing (谢憬)
Chapter 12: Backend Design

Outline
- Backend Design Flow
- Floorplan
- Place & Route
- Physical Verification
- Signal Integrity
- DFM/DFY

Steps of Backend/Physical Design
- Synthesis
- Floor planning
- Placement
- Scan chain insertion and re-ordering (optional)
- Clock tree synthesis
- Routing
- Parasitic and netlist extraction
- Power analysis
- Signal integrity checking
- Final timing analysis (STA and simulation)
- ECO (optional)
- LVS/DRC
- Export GDSII
- LVS/DRC using sign-off tools

Backend Flow with ECO

Engineering Change Order (ECO)
- Achieved by adding a small number of cells in a limited area, sizing buffers, and routing the connections
- Prevent disturbing the placement and routing of the rest of the chip
- Keep in mind: performance, power, size, reliability
- It is not impossible to develop "plug & play" tools

Floorplanning
- Based on the netlist, create areas of functionality on your chip
- Determine the placement of blocks
- Determine the placement of I/O pins
- Determine the power supply strategy
- Give feedback on how easy your floorplan might be to wire (global routing) and how big the chip is

Chip Floorplanning Considerations
Chip-level floorplanning
- High-speed block issue
  - Location affects the timing performance
- Analog block issue
  - Clean Vdd/Vss; minimum spacing to digital blocks; I/O location
- Die size issue
  - Pin limited; core limited
- Power-ground routing issue
  - Power ring width according to power analysis
  - Power strip/mesh spacing
- Pin placement and I/O ring issue (to be covered in the next class)
  - Pad pitch vs. bonding rule; ESD; noise isolation

Die Size Issue (cont.)
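The pin-limited versus core-limited distinction in the die size issue can be made concrete with a rough estimate. The sketch below is only an illustration under simplifying assumptions that are not in the slides: a square die, pads spread evenly over four sides, and a core area equal to the standard-cell area divided by a target utilization plus the macro area. All function names, parameters, and numbers are hypothetical.

```python
import math


def core_limited_edge_um(cell_area_um2, macro_area_um2, utilization, io_ring_um):
    """Edge of a square die whose size is set by the core area."""
    # Standard cells only fill `utilization` of the region reserved for them.
    core_area_um2 = cell_area_um2 / utilization + macro_area_um2
    return math.sqrt(core_area_um2) + 2 * io_ring_um


def pad_limited_edge_um(num_pads, pad_pitch_um, corner_um):
    """Edge of a square die whose size is set by the pad count."""
    pads_per_side = math.ceil(num_pads / 4)  # pads spread evenly over four sides
    return pads_per_side * pad_pitch_um + 2 * corner_um


def estimate_die(cell_area_um2, macro_area_um2, utilization,
                 io_ring_um, num_pads, pad_pitch_um, corner_um):
    """Return (die edge in um, which constraint dominates)."""
    core_edge = core_limited_edge_um(cell_area_um2, macro_area_um2,
                                     utilization, io_ring_um)
    pad_edge = pad_limited_edge_um(num_pads, pad_pitch_um, corner_um)
    if pad_edge > core_edge:
        return pad_edge, "pin limited"
    return core_edge, "core limited"


if __name__ == "__main__":
    # All numbers below are made up for illustration.
    edge, limit = estimate_die(4.0e6, 2.0e6, 0.8, 200.0, 200, 80.0, 200.0)
    print(f"{edge:.0f} um per side, {limit}")
```

Lowering the utilization target (to leave room for clock tree buffers, scan chains, or routing) grows the core-limited edge, while a pin-limited die is unaffected until the core estimate overtakes the pad estimate.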
- Determine the area for standard cells
- "Utilization": 70%? 80%? 90%?
  - Extra space for clock tree synthesis
  - Extra space for scan chain
  - Layers for routing

Hard Macro Placement
- Macros are generally placed around the peripheral I/O ring
  - Leaves a contiguous area for standard cells
  - Gives more freedom to your place-and-route
    tools during placement and routing of the standard cells
- The goal of macro placement is to:
  - Reduce timing-critical paths between the macros and interfacing logic
  - Reduce interconnections in the following order:
    - Chip I/O to macros
    - Macro to macro
    - Macro to standard cell blocks

Power/Ground Development
IR Drop and Electromigration
- Power-net IR drop degrades the supply voltage level
- Excessive current density in a metal wire causes electromigration failure, which breaks the metal connection
- The IR drop effect is more significant as Vdd gets smaller
- Current density is higher when the metal wire is narrower
[Figure labels: Vdd, R]

Power/Ground Development (cont.)
Ring structure
- Power rings around all layout blocks
- Major power trunks between layout blocks
- Difficult to guarantee the worst-case IR drop
Strap structure
- Simple, easy for routing
Mesh structure
- Distributes IR drop evenly
- Spacing of power strips must be considered
IR drop analysis
- Fix the problem at an early stage

P/G Structures

Beware of the Maximum Width Rule
- Maximum wire width is limited by thermal stress and local density rules
- Slotting vs. a "bus" of thin wires
- Disadvantages of slotting:
  - Slots may not be aligned with current flow
  - True IR drop is not known until after slotting
- Happens especially for
  power/ground rings
[Figure: M1 wires labeled GND; commonly used for power/ground]

Placement
- Based on a given floorplan, determine the location of the cells in a given netlist
- Goals and objectives
  - Routability: guarantee the router can complete the routing step (global routing)
  - Timing: minimize all critical net delays
  - Minimize die size: make the chip as dense as possible
  - Signal integrity
- Check feasibility of routing after placement
  - Logical effort: for paths with positive slack, reduce cell size

Congestion and Fix
[Figure: congestion areas before and after the fix]

Routing
- Complete power/ground/clock routing (clock tree synthesis)
- Complete detailed wire routing, conforming to the wiring rules and order
- Improve the density
- Minimize layer changes
- Improve the critical path and meet timing requirements
- Produce a routed design free of DRC/LVS violations

General Routing Flow
- Clock Tree Synthesis
  - Add buffers/inverters; minimize clock skew and delay
- Post-Placement Optimization (PPO)
  - Fix setup violations
- Pre-Route Standard Cells
  - VDD/VSS rails on metal 1
  - Verify PG connection and routing
- Route Group Net
  - Clocks
  - Bus routing
- Post-Route CTO
  - Fix clock skew and insertion delay
- Global Routing
  - Critical path
  - Long wires, interconnection

Routing Flow (cont.)
- Track Assignment & Detail Routing
  - Wire connection
- Search & Repair (DRC/LVS)
  - Fix routing violations (unconnected nets, shorts)
- Post-Route Optimization
  - Fix timing
- Coarse LVS & DRC checking
  - Metal width, notch and gap checking
- Data Output
  - Stream out: GDSII format
  - Verilog out: hierarchical (for PT) / non-hierarchical (for Hercules)
  - Parasitic out: SPEF format (cell view)

Clock Tree Synthesis
Objective:
- Minimize clock skew
- Optimize clock buffers

Basic CTS Flow & Concepts

Clock Constraint
Define:
- Clock source: root pin, target insertion delay, target transition time at the clock port
- Clock endpoint: synchronous pin, ignore/exclude pin
- Driving cell, clock cell, delay cell: buffers, inverters, special clock cells
- DRC: maximum transition time, maximum net capacitance, maximum fanout, number of clock buffer levels

Clock Skew
Global Skew and Local Skew
- Global skew
  - Global skew is the maximum clock arrival time difference between any two flip-flops.
- Local skew
  - Local skew is the clock arrival time difference between two flip-flops that are adjacent through combinational logic.

Concept of Useful Skew
- Useful skew is a method of intentionally skewing a clock to improve the timing of a circuit.
- It is also commonly used in ECO.
- Warning: it could cause problems in DFT scan insertion.

Use CTS for High-Fanout Net Synthesis
- High-fanout pins: reset, scan_en
- High-fanout pins need to be balanced to guarantee functionality
- Using the CTS tool: synthesize high-fanout nets by inserting a balanced buffer tree
  - To minimize both skew and insertion delay
  - But avoid large buffers, to save power

Large SoC Clock Distribution
- Partition the design into several
  blocks
- CTS for each block
- Clock tree network at the top level
[Figure: external clock, PLL, global clock net, IP core or module, core-internal clock net]

H-Tree for Top Clock Network
- Use big buffers to balance delay and clock skew
  - Equal distance, equal loads, equal driving ability

Clock Distribution Case Study: Pentium Spines
[Figure label: from PLL]
Kurd et al., "A multigigahertz clocking scheme for the Pentium 4 microprocessor," JSSC 2001

Clock Distribution Case Study: Intel's Itanium H-Tree Clocking
Tam et al., "Clock generation and distribution for the first IA-64 microprocessor," JSSC 2000

Issues
- A large number of clock buffers is added on the clock tree
  - Power consumption
  - Noise on supply lines
- Reduce power consumption
  - Wide wire widths
  - Clock gating cell placement
  - Limit the use of large clock buffer cells
- Reduce noise
  - Special clock buffer cells with decoupling capacitors

Extraction
- When detailed routing is complete
  - Write out the hierarchical netlist and parasitics for back-annotation
- Data management for huge files of extracted parasitic data
- Accurate RC and timing models for nanometer design
  - Width and spacing dependence
  - Resistance shielding
  - Local density effect

SDF Back Annotation
- Used in cell-based design flow
- Performs delay calculation on parasitic RCs in interconnect wires
- DSPF: Detailed Standard Parasitic Format
- SPEF: Standard Parasitic Exchange Format
- SDF: Standard Delay Format, used for post-layout simulation
  - Can be generated from PrimeTime

Physical Verification
- DRC: Design Rule Check
  - Verify the manufacturing rules, for example:
    - Internal layer checks
    - Wide metal checks
    - Metal slotting needed for wide metal
    - Layer-to-layer checks
    - DFM/DFY
    - Example: antenna rule check
- LVS: Layout vs. Schematic
  - Compare layout to schematic: every cell and net

DRC Trends and Challenges
- 75% of the time is spent on metal layer and via checks
- ERC-type checks are increasing
- Rise of pre-tapeout DFM utilities
[Figure: number of design rules by process node; nodes 350, 250, 180, 150, 130, 90 nm; vertical axis 0 to 800]

LVS
Layout vs. Schematic (LVS)
- Check the physical layout against the functional gate-level schematic to ensure all intended connectivity has been maintained
- Steps:
  - Extract the netlist from the layout (GDSII)
  - Compare the netlist with the one after routing and optimization
- Hints:
  - Most LVS errors are caused by manual layout or congestion
  - "Virtual connect" (connected by text) could cause a killer failure

Signal Integrity
- Signal integrity is the ability of a signal to generate the correct response in a circuit
  - The signal has digital levels at the appropriate and required voltage levels at the required instants of time
- Crosstalk, IR drop, electromigration

Layout Parasitics vs. Circuit Performance
- Interconnect parasitic resistors, capacitors, and inductors cause extra timing delay
- Additional power consumption is caused by parasitic RC
- Inter-wire capacitances cause coupling noise and will dominate interconnect wire delays
- Parasitic resistances in the power supply cause voltage drop and may degrade circuit performance
- Higher current density in the power net may cause electromigration failure

Inductance Effects
- The inductive coupling effect is significant for
  long interconnects and for very fast signal edge rates
- Inductive coupling is negligible for short interconnects, since the signal edge time is long compared to the flight time of the signal
- Inductance extraction and simulation are more difficult than for capacitance
[Figure labels: C, L]

Crosstalk Analysis
Definition
- Aggressor: generating crosstalk
- Victim: receiving crosstalk
Timing sensitive
- Crosstalk analysis that takes signal transition timing windows into account can eliminate pessimistic delay calculation
- The crosstalk spike is related to the capacitance value and the victim driver impedance

Crosstalk Analysis (cont.)
Timing sensitive

Crosstalk Prevention
Prevent crosstalk from the synthesis stage
- Minimize the drive strength on non-critical paths to reduce the number of aggressors
- Apply a maximum transition time (set_max_transition) in physical synthesis/placement to avoid long nets

Crosstalk Prevention (cont.)
From the routing stage
- Effective spacing between noisy regions and quiet regions
- Shielding between critical paths

Crosstalk Prevention (cont.)
From the routing stage (cont.)
- Buffer insertion
  - An inserted buffer breaks up the coupling capacitance of a long wire

Crosstalk Prevention (cont.)
From the routing stage (cont.)
- Buffer sizing
  - Increase the driver size of the victim
  - Decrease the driver size of the aggressor
- Track reordering
  - Track reordering is based on timing windows

Crosstalk Prevention (cont.)
For inductive crosstalk
- Coplanar shields
- Reference plane
- Staggered inverters/buffers

Electromigration Effects
- Electrons flowing through the wires collide with metal atoms, producing
  a force that causes the wires to break
- Caused by high current densities and high frequencies in long, very thin metal wires
- MTTF (Mean Time To Failure) decreases when current density and temperature increase
- Can be avoided by using appropriate wire sizing
[Figure: open circuit and short circuit]

Fix EM
- Controlling current density to limit electromigration failure is needed in design and verification
- Layout optimization:
  - Increase the power line width, layers
  - Increase the number of power pads
  - Increase the connections
- Issues
  - More metal (adds 8% cost per layer)
  - Larger, slower designs (grow in x and y)

Other Considerations
- ESD (to be covered in the next class)
- Package vs. performance (to be covered in the next class)

DFM/DFY
- 90 nm and below technologies: challenges in yield
- DFM: Design for Manufacturability
- DFY: Design for Yield

DFM and DFY
- DFM is the management of technology constraints (sizing rules) applied to the layout
- A manufacturable design, however, is not necessarily a highly robust or high-yielding design.
- DFY, as part of Design for Manufacturability, concentrates on the development and quality of the circuit design in the pre- and post-layout phases.
- DFY is the management of design sensitivities to the manufacturing
  process and helps to guarantee high-yielding devices

DFM/DFY Methodology
- Optimal resolution enhancement technology (RET)
  - Mask and exposure
  - Optical Proximity Correction (OPC)
  - Phase Shift Mask (PSM)
- Yield enhancement and optimization technology
  - DFM rules implementation
  - To overcome the limits of OPC
  - Yield checking during the layout stage
  - Supported by EDA tools

Why Is RET Needed?
[Figure: wavelength used vs. process generation]

Design for Manufacturing
- Not everything can be done by mask and exposure:
  - Corrections are not complete
  - Some designs cannot be built at all with certain RET technologies
  - Of those that CAN be built,
    some are more manufacturable after RET than others
- DFM/DFY-driven routing
  - OPC-driven routing
  - PSM-driven placement
  - DFM rule implementation

DFM/Y Rules
- Limit the use of minimal poly-enclosed gates, minimally enclosed vias, and singly contacted lines
  - Better yield
  - Lower resistance
- Example: via void rules, doubled vias

Current DFM/Y Design Flow Supported
- Load design
- Perform antenna fixes
- Add contacts/vias
- Metal fill & slotting
- Verify LVS and DRC

Why Are Double Vias Needed?
- Copper processing causes new problems for vias
  - Voids in Cu migrate under thermal stress towards vias
  - If enough voids migrate to a via, it can cause failure
  - Worse at 90/65 nm due to increased stress in smaller vias
- Voids can migrate long distances (10 microns)
- Voids can migrate around corners

Yield vs. Area

Antenna Rules
- Antenna rules have nothing to do with the traditional definition of an antenna
  - Really a collector of static charge, not electromagnetic radiation
- The antenna problem only happens during manufacturing
  - Plasma-based processes for etching and oxide deposition
  - The plasma etcher induces a voltage on a floating wire, stressing the thin gate oxides
- Not a new problem, but sub-100 nm materials may make it a lot worse

Antenna Effect
- Depends on the gate size and the length of the wire
- The metal or poly leads act like an antenna: they collect charge (negative charge)
- Results in gate oxide breakdown
[Figure: driver (diffusion), poly gate, M1; driver (diffusion), poly gate, M1, M2]

Fix Antenna Rule Violations
- The antenna effect check is part of DRC/ERC
- Solutions:
  - Add an antenna diode near the input pin to provide a conducting path to GND
  - Add a jumper to minimize the amount of charge collected by a floating node
  - Add a buffer to cut the wire and provide a discharge path

Metal Filling
- A narrow metal wire separated from other metal receives a higher density of etchant than closely spaced wires
- The narrow
  metal can get over-etched
  - This changes the thickness of the metal line
- Minimum metal density rules are used to control this
- Metal fill places metal shapes in empty tracks to meet the minimum metal density rules

Metal Filling (cont.)
- Caution: metal fill changes the parasitics
  - Width and spacing dependent
- Smart parasitic extraction is needed
- Timing-driven metal fill

Problems Faced at 90 nm and Below
- DFM techniques such as wire spacing, wire widening, redundant via insertion, and metal fill impact crosstalk and timing significantly
- The era of interconnect synthesis

Trend of Backend Tools
- Considering timing, area, power, and DFM/DFY at the same time

Thank you