Note that “compute” is not a simple metric like “instructions” or “floating-point operations”. The assumption that “memory accesses overlap with other memory accesses about as well as they do in the STREAM Benchmark” was adopted after trying many other formulations and getting poor consistency with observations.

In most current machines, memory frequency can be configured by the user at boot time, while CPU frequency can be varied on a live system. Over time, the mechanisms introduced for reducing energy consumption (first in laptops) became available more broadly. When I started this work over 20 years ago, most of the parameters I was varying could only be changed in the vendor’s laboratory.

Bounds on overlap are discussed near the end of the presentation, in the slides titled “Analysis”. The specific models and results presented here fit the data better when it is assumed that there is no overlap between components. Why no overlap? The model actually includes some kinds of overlap (this will be discussed in the context of specific models below) and can be extended to include overlap between components.

The model can provide a much better fit to the data if the HW and SW are restricted, as we will see in the next section…

Here I assumed a particular analytical function for the amount of memory traffic as a function of cache size to scale the bandwidth time. Details are not particularly important, since I am trying to model something that is a geometric mean of 14 individual values and the results are across many architectures and compilers. Doing separate models for the 14 benchmarks does not reduce the variance much further: about 15% remains unexplainable in such a broad dataset.
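To make the no-overlap idea concrete, here is a minimal sketch in Python of an additive two-component model: a compute time that scales with CPU frequency plus a bandwidth time that scales with sustained memory bandwidth. Everything specific in it is an assumption for illustration only: the names `Machine`, `memory_traffic_gb`, and `predicted_time_s`, the units, and especially the power-law form and exponent used for the cache-size dependence. The text above does not say which analytical function was actually used.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    cpu_ghz: float        # core clock frequency in GHz
    bandwidth_gbs: float  # sustained (STREAM-like) memory bandwidth in GB/s
    cache_mib: float      # cache size in MiB


def memory_traffic_gb(base_traffic_gb, cache_mib, ref_cache_mib=1.0, exponent=0.5):
    """Memory traffic as a function of cache size.

    ASSUMPTION: a power law relative to a reference cache size. The
    functional form and the exponent are placeholders, not fitted values.
    """
    return base_traffic_gb * (ref_cache_mib / cache_mib) ** exponent


def predicted_time_s(machine, cpu_gcycles, base_traffic_gb):
    """Predicted run time with no overlap between components."""
    # "Compute" time: work expressed in billions of core cycles.
    t_cpu = cpu_gcycles / machine.cpu_ghz
    # Bandwidth time: cache-scaled traffic divided by sustained bandwidth.
    t_bw = memory_traffic_gb(base_traffic_gb, machine.cache_mib) / machine.bandwidth_gbs
    # No overlap: the component times simply add.
    return t_cpu + t_bw


if __name__ == "__main__":
    slow = Machine(cpu_ghz=2.0, bandwidth_gbs=10.0, cache_mib=4.0)
    fast = Machine(cpu_ghz=4.0, bandwidth_gbs=10.0, cache_mib=4.0)
    print(predicted_time_s(slow, cpu_gcycles=100.0, base_traffic_gb=200.0))
    print(predicted_time_s(fast, cpu_gcycles=100.0, base_traffic_gb=200.0))
```

With these made-up inputs, doubling the CPU frequency halves only the compute term, so the predicted time falls by less than a factor of two. That is the kind of behavior a single-number metric cannot capture.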
(Also from the 2007 presentation.) To show that I can criticize my own work as well, here I show that sustained memory bandwidth (using an approximation to the STREAM Benchmark) is also inadequate as a single figure of merit. (It is better than peak MFLOPS, but still has roughly a factor of three range when projecting in either direction.)

It is not surprising that there is a lot of scatter, but the factor of four range in Peak MFLOPS at fixed SPECfp_rate2000/core and the factor of four range in SPECfp_rate2000/core at fixed Peak MFLOPS were larger than I expected….
This includes all architectures, all compilers, all operating systems, and all system configurations. All of the SPECfp_rate2000 results were downloaded and sorted by processor type, and “peak floating-point operations per cycle” was manually added for each processor type.

With a staff of slightly over 175 full-time employees (fewer than half in consulting roles), we must therefore focus on highly-leveraged performance analysis projects, rather than labor-intensive ones.

An earlier presentation on this topic (including extensions of the method to incorporate cost modeling) is from 2007: “System Performance Balance, System Cost Balance, Application Balance, & the SPEC CPU2000/CPU2006 Benchmarks” (invited presentation at the SPEC Benchmarking Joint US/Europe Colloquium, June 22, 2007, Dresden, Germany).

We have over 5,000 direct users (logging into the systems and running jobs) and tens of thousands of indirect users (who access TACC resources via web portals). Most of TACC’s supercomputer systems are national resources, open to (unclassified) scientific research in all areas.

The presentation discusses a family of simple performance models that I developed over the last 20 years, originally in support of processor and system design at SGI (1996-1999), IBM (1999-2005), and AMD (2006-2008), but more recently in support of system procurements at The Texas Advanced Computing Center (TACC) (2009-present).
This was a keynote presentation at the “2nd International Workshop on Performance Modeling: Methods and Applications” (PMMA16), June 23, 2016, Frankfurt, Germany (in conjunction with ISC16).