https://github.com/converged-computing/supermarket-fish-problem

What architecture do you get for your cloud instance? It's like white fish at the supermarket - maybe you don't know.



The Supermarket Fish Problem

This is part of the Performance study and the single-node-benchmark analysis. The analyses generated a lot of intermediate data and a web interface, so they were moved here for better organization.


Have you ever been to the supermarket and ordered white fish? You may be getting tilapia, flounder, branzino, catfish, cod, haddock, hake, halibut, pollock, sea bass, sole, or whiting. The same is true for cloud CPU architectures. You may know that you are getting some flavor of Intel, but it's unclear if it's Skylake, Ice Lake, Sandy Bridge, or something else. We ran a large performance study in August 2024 that looked across many different environments, clouds, and instance types, and can now reflect on what we found. When we find a potpourri of architectures, we call this the supermarket fish problem.
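As a rough illustration of how such a mix could be tallied, the sketch below counts CPU model names across nodes. It assumes each node's /proc/cpuinfo has been saved as text. The function names and the sample strings are hypothetical placeholders, and this is not the processing code used in the study.

```python
from collections import Counter


def model_name(cpuinfo_text):
    """Return the first 'model name' value in a /proc/cpuinfo dump, or None."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            return value.strip()
    return None


def count_models(cpuinfo_by_node):
    """Count how many nodes report each CPU model name."""
    return Counter(model_name(text) for text in cpuinfo_by_node.values())


# Hypothetical sample: placeholder model strings, not data from the study.
sample = {
    "node-0": "processor\t: 0\nmodel name\t: Example CPU A\ncpu MHz\t\t: 2000.000\n",
    "node-1": "processor\t: 0\nmodel name\t: Example CPU B\ncpu MHz\t\t: 2650.000\n",
    "node-2": "processor\t: 0\nmodel name\t: Example CPU A\ncpu MHz\t\t: 2000.000\n",
}
counts = count_models(sample)
print(counts)
```

If the counter for a single instance type holds more than one key, that instance type is showing the supermarket fish problem.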

Under development: data processing is underway, and a table will be added to each view!

TODO:

  • summary data file for each output file
  • table that summarizes each environment (with counts)
  • flags and bugs should have some kind of Venn diagram that crosses spaces
  • sysbench metrics should be plots (not tables)
  • cpuinfo -> cpu MHz and bogomips also need plots (values are all over the place)
  • Not sure if this is interesting, but data/azure/cyclecloud/cpu/256/node-0/raw/dmidecode has Core Enabled for each of 32 and 64.

Generate

Make some pngs (they render better in react):

```bash
# Convert every machine.svg under the current directory to a PNG next to it
for filename in $(find . -name machine.svg)
do
    echo "$filename"
    directory=$(dirname "$filename")
    outpng="$directory/machine.png"
    echo inkscape "$filename" -o "$outpng"
    inkscape "$filename" -o "$outpng"
done
```
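For anyone who prefers to stay in Python, here is a minimal sketch of the same conversion. The helper names `png_commands` and `convert_all` are hypothetical and not part of this repository. It assumes an Inkscape 1.x binary named `inkscape` is on the PATH and defaults to a dry run that only prints the commands.

```python
import subprocess
from pathlib import Path


def png_commands(root="."):
    """Build one inkscape command per machine.svg found under root."""
    commands = []
    for svg in sorted(Path(root).rglob("machine.svg")):
        png = svg.with_name("machine.png")
        commands.append(["inkscape", str(svg), "-o", str(png)])
    return commands


def convert_all(root=".", dry_run=True):
    """Print each command and, unless dry_run is set, run it."""
    for command in png_commands(root):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run by default: shows what would be converted without calling inkscape.
    convert_all(".", dry_run=True)
```

Unlike the shell loop, this version also handles paths that contain spaces, since each path is passed as a single argument.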

To generate data for the gallery:

```bash
python 1-generate-gallery.py
```

Note that I manually added the index.html and script.js to each directory and tweaked them (titles, dimensions) for each. The following generates the table (requires pip install pandas):

```bash
python 2-generate-table.py
```

Again, I copied and pasted the same table snippet into the UI, where it reads the data generated by the script.

Results

Here are some one-off result images:

CPU Clock Speed

We can see that there is a hidden supermarket fish problem for AWS and clock speed. When a group doesn't show up (e.g., Google and Azure in many of the plots), it's because the values are all the same. I think these are the lines we see in the graph without color: they are histograms for a single value.
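One way to check that explanation is to separate environments that report a single distinct clock speed from those that report several. The sketch below is only an illustration: the function name and the sample values are hypothetical and do not come from the study data.

```python
def split_by_spread(speeds_by_env):
    """Separate environments with one distinct clock speed from those with several."""
    constant, varied = {}, {}
    for env, speeds in speeds_by_env.items():
        distinct = sorted(set(speeds))
        if not distinct:
            continue  # no measurements for this environment
        if len(distinct) == 1:
            constant[env] = distinct[0]
        else:
            varied[env] = distinct
    return constant, varied


# Hypothetical values for illustration only; not measurements from the study.
sample = {
    "env-a": [2000.0, 2000.0, 2000.0],
    "env-b": [2650.0, 3500.0, 2900.0],
}
constant, varied = split_by_spread(sample)
print("single value:", constant)
print("multiple values:", varied)
```

Environments that land in `constant` would collapse to a single-bin histogram, which is consistent with the colorless lines described above.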

Clock Speed CPU Size 32

web/img/clock-speeds-cpu-size-32.png

Clock Speed CPU Size 64

web/img/clock-speeds-cpu-size-64.png

Clock Speed CPU Size 128

web/img/clock-speeds-cpu-size-128.png

Clock Speed CPU Size 256

web/img/clock-speeds-cpu-size-256.png

Clock Speed GPU

web/img/clock-speeds-gpu.png

Max and Current Speeds

CPU Speeds

```console
CPU Size: 32
Max speed: 2000.0 for google-gke-cpu
Max speed: 3725.0 for aws-eks-cpu
Max speed: 3725.0 for aws-parallel-cluster-cpu
Max speed: 3525.0 for azure-cyclecloud-cpu
Max speed: 3525.0 for azure-aks-cpu
CPU Size: 64
Max speed: 2000.0 for google-gke-cpu
Max speed: 3725.0 for aws-eks-cpu
Max speed: 3725.0 for aws-parallel-cluster-cpu
Max speed: 3525.0 for azure-cyclecloud-cpu
Max speed: 3525.0 for azure-aks-cpu
CPU Size: 128
Max speed: 2000.0 for google-gke-cpu
Max speed: 3725.0 for aws-eks-cpu
Max speed: 3525.0 for azure-cyclecloud-cpu
Max speed: 3525.0 for azure-aks-cpu
CPU Size: 256
Max speed: 2000.0 for google-gke-cpu
Max speed: 3725.0 for aws-eks-cpu
Max speed: 3525.0 for azure-cyclecloud-cpu
Max speed: 3525.0 for azure-aks-cpu
```

```console
CPU Size: 32
Current speed: 2000.0 for google-gke-cpu
Current speed: 2650.0 for aws-eks-cpu
Current speed: 2650.0 for aws-parallel-cluster-cpu
Current speed: 1850.0 for azure-cyclecloud-cpu
Current speed: 1850.0 for azure-aks-cpu
CPU Size: 64
Current speed: 2000.0 for google-gke-cpu
Current speed: 2650.0 for aws-eks-cpu
Current speed: 2650.0 for aws-parallel-cluster-cpu
Current speed: 1850.0 for azure-cyclecloud-cpu
Current speed: 1850.0 for azure-aks-cpu
CPU Size: 128
Current speed: 2000.0 for google-gke-cpu
Current speed: 2650.0 for aws-eks-cpu
Current speed: 1850.0 for azure-cyclecloud-cpu
Current speed: 1850.0 for azure-aks-cpu
CPU Size: 256
Current speed: 2000.0 for google-gke-cpu
Current speed: 2650.0 for aws-eks-cpu
Current speed: 1850.0 for azure-cyclecloud-cpu
Current speed: 1850.0 for azure-aks-cpu
```
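To compare max and current speeds programmatically, the printed summary can be parsed back into a lookup table. This is a minimal sketch that assumes the line format shown in the CPU output above. The `parse_speeds` helper and its regular expressions are hypothetical and are not part of the repository scripts.

```python
import re

SIZE_RE = re.compile(r"^(CPU|GPU) Size: (\d+)$")
SPEED_RE = re.compile(r"^(Max|Current) speed: ([\d.]+) for (\S+)$")


def parse_speeds(text):
    """Parse the printed summary into {(kind, size, environment): speed}."""
    speeds = {}
    size = None
    for raw in text.splitlines():
        line = raw.strip()
        size_match = SIZE_RE.match(line)
        if size_match:
            size = int(size_match.group(2))
            continue
        speed_match = SPEED_RE.match(line)
        if speed_match and size is not None:
            kind, value, env = speed_match.groups()
            speeds[(kind.lower(), size, env)] = float(value)
    return speeds


# A few lines copied from the CPU output above.
sample = """\
CPU Size: 32
Max speed: 2000.0 for google-gke-cpu
Max speed: 3725.0 for aws-eks-cpu
Current speed: 2650.0 for aws-eks-cpu
"""
speeds = parse_speeds(sample)
ratio = speeds[("current", 32, "aws-eks-cpu")] / speeds[("max", 32, "aws-eks-cpu")]
print(f"aws-eks-cpu current/max at size 32: {ratio:.2f}")
```

Lines that do not match either pattern are skipped, so irregular entries are left out rather than guessed at.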

GPU Speeds

```console
GPU Size: 4
Current speed: 2000.0 for google-gke-gpu
Current speed: 2000.0 for google-compute-engine-gpu
Current speed: 3700.0 for azure-cyclecloud-gpu
Current speed: 3700.0 for azure-aks-gpu
GPU Size: 8
Current speed: 2000.0 for google-gke-gpu
Current speed: 2000.0 for google-compute-engine-gpu
Current speed: 3500.0 for aws-eks-gpu
Current speed: 3700.0 for azure-cyclecloud-gpu
Current speed: 3700.0 for azure-aks-gpu
GPU Size: 16
Current speed: 2000.0 for google-gke-gpu
Current speed: 2000.0 for google-compute-engine-gpu
Current speed: 3500.0 for aws-eks-gpu
Current speed: 3700.0 for azure-cyclecloud-gpu
GPU Size: 32
Current speed: 2000.0 for google-gke-gpu
Current speed: 2000.0 for google-compute-engine-gpu
Current speed: 3700.0 for azure-cyclecloud-gpu
Current speed: 3700.0 2300.0 for azure-aks-gpu
```

Sysbench Plots

web/img/sysbench-cpu-run-cpu_speed_events_per_second-cpu.png
web/img/sysbench-cpu-run-cpu_speed_events_per_second-gpu.png
web/img/sysbench-cpu-run-latency_ms_95th_percentile-cpu.png
web/img/sysbench-cpu-run-latency_ms_95th_percentile-gpu.png
web/img/sysbench-cpu-run-latency_ms_avg-cpu.png
web/img/sysbench-cpu-run-latency_ms_avg-gpu.png
web/img/sysbench-cpu-run-latency_ms_max-cpu.png
web/img/sysbench-cpu-run-latency_ms_max-gpu.png
web/img/sysbench-cpu-run-latency_ms_min-cpu.png
web/img/sysbench-cpu-run-latency_ms_min-gpu.png
web/img/sysbench-cpu-run-latency_ms_sum-cpu.png
web/img/sysbench-cpu-run-latency_ms_sum-gpu.png
web/img/sysbench-cpu-run-total_number_events-cpu.png
web/img/sysbench-cpu-run-total_number_events-gpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-fsyncs_per_second-cpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-fsyncs_per_second-gpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-latency_ms_95th_percentile-cpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-latency_ms_95th_percentile-gpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-latency_ms_avg-cpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-latency_ms_avg-gpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-latency_ms_max-cpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-latency_ms_max-gpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-latency_ms_min-cpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-latency_ms_min-gpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-latency_ms_sum-cpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-latency_ms_sum-gpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-total_number_events-cpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-total_number_events-gpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-writes_per_second-cpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-writes_per_second-gpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-written_mib_per_second-cpu.png
web/img/sysbench-fileio-run-file-test-modeseqwr-written_mib_per_second-gpu.png
web/img/sysbench-mutex-run-latency_ms_95th_percentile-cpu.png
web/img/sysbench-mutex-run-latency_ms_95th_percentile-gpu.png
web/img/sysbench-mutex-run-latency_ms_avg-cpu.png
web/img/sysbench-mutex-run-latency_ms_avg-gpu.png
web/img/sysbench-mutex-run-latency_ms_max-cpu.png
web/img/sysbench-mutex-run-latency_ms_max-gpu.png
web/img/sysbench-mutex-run-latency_ms_min-cpu.png
web/img/sysbench-mutex-run-latency_ms_min-gpu.png
web/img/sysbench-mutex-run-latency_ms_sum-cpu.png
web/img/sysbench-mutex-run-latency_ms_sum-gpu.png
web/img/sysbench-threads-run-latency_ms_95th_percentile-cpu.png
web/img/sysbench-threads-run-latency_ms_95th_percentile-gpu.png
web/img/sysbench-threads-run-latency_ms_avg-cpu.png
web/img/sysbench-threads-run-latency_ms_avg-gpu.png
web/img/sysbench-threads-run-latency_ms_max-cpu.png
web/img/sysbench-threads-run-latency_ms_max-gpu.png
web/img/sysbench-threads-run-latency_ms_min-cpu.png
web/img/sysbench-threads-run-latency_ms_min-gpu.png
web/img/sysbench-threads-run-latency_ms_sum-cpu.png
web/img/sysbench-threads-run-latency_ms_sum-gpu.png
web/img/sysbench-threads-run-total_number_events-cpu.png
web/img/sysbench-threads-run-total_number_events-gpu.png

License

HPCIC DevTools is distributed under the terms of the MIT license. All new contributions must be made under this license.

See LICENSE, COPYRIGHT, and NOTICE for details.

SPDX-License-Identifier: (MIT)

LLNL-CODE-842614
