TESLA Computational Fluid Dynamics Module

Note: GPU Perf is compared against a multi-core x86 CPU socket for the listed GPU features, and may be a kernel-to-kernel perf comparison.

Application | GPU Features | GPU Perf | Release Status | Notes
Altair AcuSolve | Linear eqn solver | 2x Total | Today, release 1.8a | FE unstructured NS, multi-GPU
ANSYS Fluent | Radiation heat transfer model | 10x RHT Model, 2x AMG solver (beta) | Today, release 14.5 | Multi-GPU RHT model, single-GPU solver
Autodesk Moldflow | Linear eqn solver | 1.5x | Today, release 2013 | FE unstructured NS, single-GPU
FluiDyna Culises-OpenFOAM | Linear eqn solvers | 3x Solver | Today, release 1.2 | Unstructured NS, single-GPU
FluiDyna LBultra | LBM, particle CFD | 20x Total | Today, release 2.0 | Structured LBM, multi-GPU
Vratis SpeedIT-OpenFOAM Solver | Linear eqn solvers | 6x Solver | Today, release 1.2 | Unstructured NS, multi-GPU
Vratis ARAEL | Linear eqn solvers | 3x Solver | Today, release 1.0 | Single-GPU
Prometech Particleworks | MPS, particle CFD | 4x-9x Total | Today, release 3.0 | Particle based, multi-GPU
Sandia NL S3D | Chemistry kernel | 8x SP, 5x DP kernel | Demonstration | Structured DNS, multi-GPU
SD+ (SU-Jameson) | Explicit solver | 15x Total | In development | FE unstructured NS, multi-GPU
FEFLO (GMU Lohner) | Explicit solver | 2-10x Total | In development | FE unstructured NS, multi-GPU
Turbostream | Explicit solver | 19x Total | Today, release 2.0 | Structured grid NS, multi-GPU

Simulation of fluid flow for product development
Speed of simulations is critical to this work
Higher resolution of physics, more complex/real-world geometries, better turbulence treatment
More problems become practical with GPU acceleration

FluiDyna LBultra: 20x acceleration with 4 GPUs vs. 2 x 6-core CPUs
CPU: Intel Xeon X5670 2.93 GHz; GPU: Tesla M2070

GPU READY APPLICATIONS
Altair AcuSolve
Autodesk Moldflow
FluiDyna Culises for OpenFOAM
FluiDyna LBultra
Vratis SpeedIT for OpenFOAM
Prometech Particleworks
Sandia NL and ORNL S3D
SD+ (SU-Jameson)
FEFLO (GMU-Lohner)
Turbostream

ANSYS CFD 14.0 Offers First GPU Capability
ANSYS CFD preliminary results of radiation heat transfer (RHT) view-factor computations on GPUs vs. CPUs
RHT on GPUs will release in 14.0 as beta
NOTE: Growing CPU time of view-factor computations inhibits proper inclusion of radiation HT effects
NOTE: GPU time remains low even as view-factor computations grow very large
Radiation HT Applications: underhood cooling, cabin comfort HVAC, furnace simulations, solar loads on buildings, combustor in turbine, electronics passive cooling
Other ANSYS CFD Evaluations: models (e.g. disperse phase), implicit equation solvers

OpenFOAM on GPUs
ISVs: FluiDyna and Vratis
SpeedIT: 3 GPUs, 6x vs. 4-core i7 CPU

Prometech and Particle-based CFD for Multi-GPUs
MPS-based method developed at the University of Tokyo, Prof. Koshizuka
Results shown for Particleworks 2.5, released in 2011
Performance is relative to 4 cores of Intel i7 CPU
Contact Prometech for license details: http://www.prometech.co.jp

Turbostream CFD for Gas Turbine Engines
Turbostream simulation speed-up: 19x
Sources: www.turbostream- | www.many-core.group.cam.ac.uk/ukgpucc2/talks/Brandvik.pdf | www.hpc.cam.ac.uk/services/darwin.html

University of Cambridge DARWIN Cluster
CUDA Center of Excellence since 2008
GPU sub-cluster: Dell T5500 servers, 32 dual-socket CPUs; Tesla S1070 GPUs, 4 GPUs per socket, for a total of 128 GPUs

Turbostream GPU Simulations
Sample Turbostream GPU simulations: typical routine simulation, large-scale simulation
19x speedup
http://www.turbostream-

Highlights on CFD Applications From the Top 500
Tokyo Institute of Technology, AOKI Laboratory: CFD research on #5 of Top 500, TSUBAME 2.0
Simulations that scale to 4000 Fermi GPUs
Presentation at Supercomputing 2010 Conference: "Large-scale CFD Applications on TSUBAME 2", Dr. Takayuki Aoki, Global Scientific Information and Computing Center (GSIC) of Tokyo Institute of Technology (Tokyo Tech)
http:/

Published CFD Developments on Tesla GPU
FEFLO: Porting of an Edge-Based CFD Solver to GPUs [AIAA-2010-0523]. Andrew Corrigan, Ph.D., Naval Research Lab; Rainald Lohner, Ph.D., GMU
FAST3D: Using GPU on HPC Applications to Satisfy Low Power Computational Requirement [AIAA-2010-0524]. Gopal Patnaik, Ph.D., US Naval Research Lab
OVERFLOW: Rotor Wake Modeling with a Coupled Eulerian and Vortex Particle Method [AIAA-2010-0312]. Chris Stone, Ph.D., Intelligent Light
SOLAR: Unstructured CFD Solver on GPUs. Jamil Appa, Ph.D., BAE Systems Advanced Technology Centre
elsA: Recent Results with elsA on Many-Cores. Michel Gazaix and Steve Champagneux, ONERA / Airbus France
Turbostream: Turbostream: A CFD Solver for Many-Core Processors. Tobias Brandvik, Ph.D., Whittle Lab, University of Cambridge
OVERFLOW: Acceleration of a CFD Code with a GPU. Dennis Jespersen, NASA Ames Research Center
Conferences:
48th AIAA Aerospace Sciences Meeting | Jan 2010 | Orlando, FL, USA
CFD on Future Architectures | Oct 2009 | DLR Braunschweig, DE
Parallel CFD 2009 | May 2009 | NASA Ames, Moffett Field, CA, USA

GPUs Highlights from ParCFD 2011
23rd ParCFD 2011 | 16-20 May 2011 | Barcelona, ES
Total 110 technical papers: 32, or 30%, included GPU developments, up from 12 papers in 2010 (Taipei, TW) and 4 papers in 2009 (NASA, US)
Included an invited full-day workshop on CUDA and GPUs for CFD applications, attended by more than 100 delegates
GPUs in talks from 6 of 7 plenary speakers:
GPU-specific CFD: Aoki (Tokyo Inst Tech), Lohner (GMU), Barber (BU)
GPU evaluation: Chalot (Dassault Aviation), González (Next Limit), Jägersküpper (DLR)

Stanford University, Aerospace Computing Lab, Prof. Antony Jameson
GPU Application: Jameson-developed CFD software SD+ for high-order-method aerodynamic simulations
GPU Benefit: Use of 16 x Tesla M2070: 15 hrs vs. 202 hrs for 16 x Xeon X5670
Fast turnaround of complex LES simulations that would otherwise be impractical for CPU-only use
Case: transitional flow over SD7003 airfoil, 21M DOF, Ma=0.2, Re=60K, AoA=4, 4th order, 400K RK iters

COMAC and SJTU, Commercial Aircraft Corporation of China
GPU Application: SJTU-developed CFD software NUS3D for aerodynamic simulations of wing shapes
GPU Benefit: Use of Tesla C2070: 20x-37x vs. single-core Intel Core i7 CPU
Faster simulations for more wing design candidates vs. wind tunnel testing
Expanding to multi-GPU and full aircraft
Figure labels: COMAC Wing Candidate; ONERA M6 Wing; CFD Simulation

BAE Systems, Technology and Engineering Services
GPU Application: BAE-developed CFD software Veloxi for aerodynamic simulation of aircraft
GPU Benefit: Use of 2 x Tesla C2050: 15x vs. quad-core Intel i7 CPU
Faster simulations enabled design exploration of full aerodynamic envelope
Figure label: GPU Speed-up vs. Multi-core

U.S. DoD Naval Research Lab, Lab for Computational Physics and Fluid Dynamics
GPU Application: NRL-developed CFD software JENRE for simulation of jet engine acoustics
GPU Benefit: Use of Tesla M2070: 3x vs. hex-core Intel (Westmere) CPU
More detailed mesh simulations possible for longer durations of jet engine transient conditions

NAVAIR and EM Photonics, U.S. DoD Naval Air Weapons Center, Pax River MD
GPU Application: EM Photonics-developed CFD software for unsteady aerodynamic simulations
GPU Benefit: Use of Tesla C2070: 54x for CFD kernel vs. 8-core Intel i7 CPUs
Fast turnaround of simulations enables more flight conditions and aircraft approach directions
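
Several of the figures above are given as run times or paper counts rather than as ratios. The short Python sketch below shows how those ratios could be derived; the function names `speedup` and `share_percent` are hypothetical helpers introduced here for illustration, and the inputs are only the numbers quoted above (202 hrs vs. 15 hrs for SD+, and 32 of 110 ParCFD 2011 papers).

```python
def speedup(cpu_hours: float, gpu_hours: float) -> float:
    """Return how many times faster the GPU run is than the CPU run."""
    if cpu_hours <= 0 or gpu_hours <= 0:
        raise ValueError("run times must be positive")
    return cpu_hours / gpu_hours


def share_percent(part: int, total: int) -> float:
    """Return part/total as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# SD+ case: 202 hrs on 16 x Xeon X5670 vs. 15 hrs on 16 x Tesla M2070.
sd_speedup = speedup(cpu_hours=202.0, gpu_hours=15.0)

# ParCFD 2011: 32 of 110 technical papers included GPU developments.
gpu_paper_share = share_percent(part=32, total=110)

print(f"SD+ speed-up: {sd_speedup:.1f}x")
print(f"GPU paper share: {gpu_paper_share:.0f}%")
```

Under these inputs the run-time ratio comes out near 13.5x, in the same range as the 15x Total listed for SD+ in the application table, and the paper share is about 29%, which the slide rounds to 30%.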