https://github.com/atlarge-research/trace-archive

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.4%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: atlarge-research
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: master
  • Size: 115 MB
Statistics
  • Stars: 0
  • Watchers: 9
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme License

README.md

Trace Archive

This repository contains traces of various real-world infrastructures, obtained from our partners and collaborators or from open-source resources. The repository is a work in progress: it evolves rapidly, and we continuously add new traces.

The Trace Archive started as part of M3SA, a Multi- and Meta-Simulation and Analysis tool for ICT infrastructure. Double-blinded link: https://anonymous.4open.science/r/m3sa/.


Contents


Marconi-22

Marconi-22 is one of the most detailed open-source workload traces currently available. It traces scientific jobs, as well as multiple operational layers and the electrical network, of Marconi HPC, a powerful supercomputing facility hosted by CINECA in Italy [1,2].

Solvinity-13

Solvinity-13 is a business-critical workload hosted in a datacenter of Solvinity[3] (formerly BitBrains), a mid-sized cloud service provider in the Netherlands. Solvinity-13 has been used in highly cited, peer-reviewed publications on business-critical workloads[4]. Solvinity-13 traces long-running jobs, with an average of 2,722 CPU-hours per job.

SURF-22

SURF-22 is a scientific workload trace used in peer-reviewed experiments[5]. SURF-22 was traced in one of the largest HPC facilities at SURF, the Netherlands. It traces scientific batch jobs run in production, with an average duration of 39.52 CPU-hours, and also records detailed energy use over the same period.
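The per-job CPU-hour averages quoted above for Solvinity-13 and SURF-22 are the kind of statistic that can be derived from a job table. The following is a minimal sketch of that arithmetic, assuming each job records a core count and a wall-clock runtime in seconds. The `Job` class and its field names `cpu_count` and `duration_s` are hypothetical and may not match the schema of the traces in this archive; the example jobs are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical schema: the archive's actual column names may differ.
    cpu_count: int      # number of CPU cores allocated to the job
    duration_s: float   # wall-clock runtime in seconds


def cpu_hours(job: Job) -> float:
    """CPU-hours consumed by one job: cores multiplied by runtime in hours."""
    return job.cpu_count * job.duration_s / 3600.0


def mean_cpu_hours(jobs: list[Job]) -> float:
    """Average CPU-hours per job over a non-empty list of jobs."""
    if not jobs:
        raise ValueError("need at least one job")
    return sum(cpu_hours(j) for j in jobs) / len(jobs)


# Made-up jobs for illustration only, not data from any trace.
example_jobs = [
    Job(cpu_count=4, duration_s=3600),    # 4 CPU-hours
    Job(cpu_count=16, duration_s=7200),   # 32 CPU-hours
    Job(cpu_count=1, duration_s=1800),    # 0.5 CPU-hours
]

print(mean_cpu_hours(example_jobs))
```

When applying this to a real trace, the core count and runtime would first need to be mapped from whatever columns and units that trace actually uses.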

ENTSOE-23

ENTSO-E[6], an association of 40 European Transmission System Operators (TSOs), has provided open datasets on energy and sustainability metrics through its Transparency Platform since 2015. In this work, we use traces from 29 countries from the ENTSO-E datasets.

References

[1] Marconi-22: https://zenodo.org/records/7590583

[2] A. Borghesi et al., "M100 ExaData: a data collection campaign on the CINECA's Marconi100 Tier-0 supercomputer," Nature Scientific Data, 2023.

[3] Solvinity: https://www.solvinity.com/

[4] S. Shen et al., "Statistical Characterization of Business-Critical Workloads Hosted in Cloud Datacenters," CCGrid, 2015.

[5] D. Niewenhuis et al., "FootPrinter: Quantifying Data Center Carbon Footprint," ICPE, 2024.

[6] ENTSO-E: https://www.entsoe.eu/

License

Trace Archive is distributed under the MIT license. See LICENSE.

Owner

  • Name: @Large Research
  • Login: atlarge-research
  • Kind: organization
  • Email: info@atlarge-research.com

Massivizing Computer Systems

GitHub Events

Total
  • Issues event: 1
  • Member event: 1
  • Push event: 12
  • Create event: 3
Last Year
  • Issues event: 1
  • Member event: 1
  • Push event: 12
  • Create event: 3