BigDataBench: A Big Data and AI Benchmark Suite.pptx

Big Data Analytics and Ecosystem Forum
BigDataBench: A Big Data and AI Benchmark Suite

1. Background
2. Basic principles of benchmarking
3. Benchmarking methodology
4. A big data and AI benchmark suite: BigDataBench

1. Background

Foundations of the technology shift
- Technology
  - End of Dennard scaling: power becomes the key constraint
  - Ending of Moore's Law: transistor improvement slows
- Architectural
  - Limitations and inefficiencies in exploiting instruction-level parallelism end the uniprocessor era in 2004
  - Amdahl's Law and its implications end the "easy" multicore era
- Products
  - PC/Server → IoT, Mobile/Cloud
Source: A New Golden Age for Computer Architecture: Domain-Specific Hardware/Software Co-Design, Enhanced Security, Open Instruction Sets, and Agile Chip Development. John Hennessy and David Patterson, Stanford and UC Berkeley. June 4, 2018.

Opportunities in the technology shift
- Software-centric
  - Modern scripting languages are interpreted, dynamically typed, and encourage reuse
  - Efficient for programmers but not for execution
- Hardware-centric
  - The only path left is Domain Specific Architectures
  - Just do a few tasks, but extremely well
- Combination
  - Domain Specific Languages & Architectures
Source: Hennessy and Patterson, A New Golden Age for Computer Architecture, June 4, 2018.

Key problems
- Understanding workloads
- Domain-specific hardware and software co-design
- Open-source software and hardware
Source: Hennessy and Patterson, A New Golden Age for Computer Architecture, June 4, 2018.

2. Basic principles of benchmarking

Benchmark
"The process of running a specific program or workload on a specific machine or system and measuring the resulting performance."
Saavedra, R. H., Smith, A. J.: Analysis of benchmark characteristics and benchmark performance prediction, ACM Transactions on Computer Systems, vol. 14, no. 4, (1996) 344-384

Benchmark suite
- A popular measure of performance with a variety of applications
- Overcomes the danger of placing too many eggs in one basket
- The weakness of any one benchmark is lessened by the presence of the other benchmarks
- Characterizes the relative performance
- e.g. EEMBC, SPEC

Constructing a benchmark suite
- A good benchmark is: Relevant, Portable, Scalable, Simple

TPC benchmark suites
- The Transaction Processing Performance Council
- Domain-specific TPC benchmarks, as presented by Charles Levine in 1997
  - No single metric is possible
  - The more general the benchmark, the
    less useful it is for anything in particular.
  - A benchmark is a distillation of the essential attributes of a workload
- Principles (Charles Levine: TPC-C: The OLTP Benchmark, SIGMOD, 1997)
  - Relevant: meaningful within the target domain
  - Simple
  - Good metric(s): linear, orthogonal, monotonic
  - Portable: applicable to a broad spectrum of hardware/architecture
  - Coverage: does not oversimplify the typical environment
  - Acceptance: vendors and users embrace it

SPEC benchmark suites
- Systems Performance Evaluation Cooperative
- Principles
  - Application-oriented: test "real-life" situations
  - Portability: written in a platform-neutral programming language
  - Repeatable and reliable
  - Consistency and fairness: each specification must define clear rules for executing and reporting results

PARSEC benchmark suite
- A parallel benchmark suite for multiprocessors
- Principles: flexibility and ease of use
  - Automatization: single, common interface
  - Modularity: simple handling
  - Abstraction: abstract from details
  - Encapsulation: details encapsulated in standardized configuration files
  - Logging: logging important information for recreation
Source: Christian Bienia: Benchmarking Modern Multiprocessors, 2011

Big data benchmark suites
- Proposed by the Big Data Benchmarking Community (http://clds.sdsc.edu/bdbc)
  - Simple to implement and execute
  - Cost effective
  - Timely? (not fully understood)
  - Verifiable

3. Benchmarking methodology

Approaches to constructing benchmark programs
- Top-down: representative program selection
  - Can yield accurate representations of the program space of interest
  - Usually impossible to make any form of hard statement about the representativeness
- Bottom-up: diverse range of characteristics
  - Program characteristics are quantities that can be measured and compared
  - Not all portions of the characteristics space are equally important
Source: C. Bienia. Benchmarking modern multiprocessors. PhD thesis, Princeton University, 2011.

TPC-C construction methodology
- Functions of abstraction
  - A mid-weight read-write transaction (i.e., New-Order)
  - A light-weight read-write transaction (i.e., Payment)
  - A mid-weight read-only transaction (i.e., Order-Status)
  - A batch of mid-weight read-write transactions (i.e., Delivery)
  - A heavy-weight read-only transaction (i.e., Stock-Level)
- Functional workload model
  - Captures, in an implementation-independent manner, the load that the system needs to service

Primitive abstractions of relational algebra
- Relational algebra
  - Five primitive and fundamental operators: Select, Project, Product, Union, Difference
  - Theoretical foundation of databases
  - Strong expressive power
  - Operators compose into complex queries
Source: E. F. Codd, A Relational Model of Data for Large Shared Data Banks. Communications of the ACM, vol. 13, no. 6, 1970.

Abstractions of data computation
- Seven motifs would be important for the next decade
- Proposed by Phillip Colella: simulation in the physical sciences is carried out using various combinations of the following core algorithms
- 7 "motifs": Unstructured grids, Structured grids, FFT, Particles, Dense linear algebra, Sparse linear algebra, Monte Carlo
- Each is a distinctive combination of computation and data access
Source: P. Colella, "Defining software requirements for scientific computing," 2004.

Abstractions of parallel computing
- Landscape of Parallel Computing Research
  - Berkeley research group
  - Defines building blocks for creating libraries & frameworks
  - Each dwarf is a pattern of computation and communication
- 13 dwarfs, including: Unstructured grids, Structured grids, Dense linear algebra, Sparse linear algebra, Spectral method, N-body method, Monte Carlo, Combinational logic, Finite state machine, Graph traversal, Dynamic programming, Backtrack and branch & bound
Source: K. Asanovic, R. Bodik, B. C. Catanzaro, J. J. Gebis, P. Husbands, K. Keutzer, D. A. Patterson, et al., "The landscape of parallel computing research: A view from Berkeley," Technical Report UCB/EECS-2006-183, EECS Department, University of California, Berkeley, 2006.

4. A big data and AI benchmark suite: BigDataBench

Algorithm analysis: SIFT workloads
[Figure: algorithm analysis of SIFT workloads]

Algorithm analysis: AlexNet
[Figure: algorithm analysis of AlexNet]

Data motif
- Data motifs: abstractions of time-consuming units of computation
- Eight classes of units of computation
- The impacts of data type, source, size, and pattern
[Figure: run time breakdown of big data & AI workloads as a pipeline of units of computation (Analysis, Statistic, Summarize) over initial or intermediate data inputs]
Source: Wanling Gao, Jianfeng Zhan, Lei Wang, et al. Data Motifs: A Lens Towards Fully Understanding Big Data and AI Workloads. PACT'18.

Constructing data motifs
- Data motif
- Big data and AI workloads
- Units of computation
- Methodology: algorithmic analysis, profiling analysis

Extracting data motifs
- 40+ algorithms covering a broad spectrum
  - Data mining/Machine learning
  - Natural language processing
  - Computer vision
  - Bioinformatics

Operations   Description
Matrix       Matrix/vector operations
Sampling     Selecting a subset of samples according to a certain statistical population
Logic        Bit manipulation operations
Transform    FFT, DCT, wavelet transform
Set          Union, intersection, complement
Graph        Graph-theoretical computations, e.g. graph traversal
Statistic    Statistical computations
Sort         Sorting the elements in a certain order

Data motifs: matrix computation
- Operations on one or more rectangular arrays of numbers or other objects
  - Vector-vector
  - Vector-matrix
  - Matrix-matrix

Data motifs: sampling
- The selection of a subset of the original data
  - Random sampling
  - Importance sampling
  - Acceptance sampling
  - Monte Carlo sampling

Data motifs: transform computation
- Converts an equation from its original domain into another domain
  - Fourier transform
  - Laplace transform

Data motifs: graph computation
- Nodes represent entities and edges represent dependencies
  - Community detection
  - PageRank

Data motifs: logic computation
- Bit manipulation
  - AND
  - OR
  - XOR

Data motifs: set computation
- Operations on one or more collections of distinct objects
- Set theory
  - Union
  - Intersection
  - Complement
- Similarity analysis

Data motifs: sort
- A sorting algorithm puts the elements of a list in a certain order
- Top-K sort
  - Memory sort
  - External sort
- Sort algorithms
  - QuickSort
  - BubbleSort

Data motifs: basic statistic computation
- Data models and statistics
  - Probability distribution
  - Count statistics
  - Time-series analysis

Data motif implementations
- Multiple software stacks
  - Hadoop, Spark, TensorFlow, Pthreads

BigDataBench construction methodology
- Data motif-based scalable methodology
  - Micro benchmark: a single data motif
  - Component benchmark: a combination of data motifs with different weights
  - Application benchmark: an end-to-end application model

BigDataBench 4.0
Unified Big Data and AI Benchmark Suite - http:/
[Figure: overview of BigDataBench 4.0. Recoverable labels:
- Benchmark levels: Benchmark (label truncated), Component Benchmark, Application Benchmark
- Large scale system-level benchmarks; proxy benchmarks for simulation
- 100X runtime speedup; 90%+ average accuracy; cross data configuration adaptability; architecture adaptability
- 47 workloads covering 7 types: Offline Analytics, Online Service, AI, Graph, Data Warehouse, Streaming, NoSQL
- Data: Table, Text, Graph, Matrix, Image, Audio; structured, semi-structured, unstructured; real-world datasets and data generation tools
- 16 software stacks, including MPI, NoSql, Impala, Shark, DataMPI, Hadoop, RDMA]

Micro Benchmarks

Component Benchmarks

Thank you!
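As an illustration of the construction methodology above, the difference between a micro benchmark (a single data motif) and a component benchmark (a weighted combination of motifs) might be sketched as follows. This is only a minimal sketch and not BigDataBench's implementation: the function names (`sort_motif`, `set_motif`, `statistic_motif`, `micro_benchmark`, `component_benchmark`, `make_data`) are hypothetical, the motifs are toy pure-Python versions, and treating a weight as a repetition count is an assumption made for this example.

```python
import random
import time


def sort_motif(data):
    """Sort motif: put the elements in ascending order."""
    return sorted(data)


def set_motif(data):
    """Set motif: intersect the even-indexed and odd-indexed elements."""
    return set(data[0::2]) & set(data[1::2])


def statistic_motif(data):
    """Statistic motif: count how often each value occurs."""
    counts = {}
    for x in data:
        counts[x] = counts.get(x, 0) + 1
    return counts


def micro_benchmark(motif, data):
    """Micro benchmark: time one data motif on one input (seconds)."""
    start = time.perf_counter()
    motif(data)
    return time.perf_counter() - start


def component_benchmark(weighted_motifs, data):
    """Component benchmark: run several motifs with different weights.

    Assumption for this sketch: a weight is the number of times the
    motif is repeated. Returns total seconds per motif name.
    """
    timings = {}
    for motif, weight in weighted_motifs:
        total = 0.0
        for _ in range(weight):
            total += micro_benchmark(motif, data)
        timings[motif.__name__] = total
    return timings


def make_data(n, seed=0):
    """Generate n pseudo-random integers in [0, n) reproducibly."""
    rng = random.Random(seed)
    return [rng.randrange(n) for _ in range(n)]


if __name__ == "__main__":
    data = make_data(100_000)
    print("micro (sort):", micro_benchmark(sort_motif, data))
    print("component:", component_benchmark(
        [(sort_motif, 3), (set_motif, 1), (statistic_motif, 2)], data))
```

Changing the weights changes how much each motif contributes to the total run time, which is the sense in which a component benchmark is a weighted combination of motifs; an application benchmark would instead run a full end-to-end application.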
