carbontracker

Track and predict the energy consumption and carbon footprint of training deep learning models.

https://github.com/lfwa/carbontracker

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 7 committers (14.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.8%) to scientific vocabulary
Last synced: 7 months ago · JSON representation

Repository


Basic Info
  • Host: GitHub
  • Owner: lfwa
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 2.86 MB
Statistics
  • Stars: 457
  • Watchers: 13
  • Forks: 36
  • Open Issues: 18
  • Releases: 7
Created almost 6 years ago · Last pushed 7 months ago
Metadata Files
Readme Contributing License

README.md

carbontracker


Website

About

carbontracker is a tool for tracking and predicting the energy consumption and carbon footprint of training deep learning models as described in Anthony et al. (2020).

Citation

Kindly cite our work if you use carbontracker in a scientific publication:

    @misc{anthony2020carbontracker,
      title={Carbontracker: Tracking and Predicting the Carbon Footprint of Training Deep Learning Models},
      author={Lasse F. Wolff Anthony and Benjamin Kanding and Raghavendra Selvan},
      howpublished={ICML Workshop on Challenges in Deploying and monitoring Machine Learning Systems},
      month={July},
      note={arXiv:2007.03051},
      year={2020}
    }

Installation

PyPi

pip install carbontracker

Basic usage

Command Line Mode

Wrap any of your scripts (python, bash, etc.):

carbontracker python script.py

Embed into Python Scripts

Required arguments

  • epochs: Total epochs of your training loop.

Optional arguments

  • epochs_before_pred (default=1): Epochs to monitor before outputting predicted consumption. Set to -1 for all epochs. Set to 0 for no prediction.
  • monitor_epochs (default=1): Total number of epochs to monitor. Outputs actual consumption when reached. Set to -1 for all epochs. Cannot be less than epochs_before_pred or equal to 0.
  • update_interval (default=10): Interval in seconds between power usage measurements.
  • interpretable (default=True): If set to True then the CO2eq are also converted to interpretable numbers such as the equivalent distance travelled in a car, etc. Otherwise, no conversions are done.
  • stop_and_confirm (default=False): If set to True then the main thread (with your training loop) is paused after epochs_before_pred epochs to output the prediction and the user will need to confirm to continue training. Otherwise, prediction is output and training is continued instantly.
  • ignore_errors (default=False): If set to True then all errors will cause energy monitoring to be stopped and training will continue. Otherwise, training will be interrupted as with regular errors.
  • components (default="all"): Comma-separated string of which components to monitor. Options are: "all", "gpu", "cpu", or "gpu,cpu".
  • devices_by_pid (default=False): If True, only devices (under the chosen components) running processes associated with the main process are measured. If False, all available devices are measured (see Section 'Notes' for jobs running on SLURM or in containers). Note that this requires your devices to have active processes before instantiating the CarbonTracker class.
  • log_dir (default=None): Path to the desired directory to write log files. If None, then no logging will be done.
  • log_file_prefix (default=""): Prefix to add to the log file name.
  • verbose (default=1): Sets the level of verbosity.
  • decimal_precision (default=6): Desired decimal precision of reported values.
  • sim_cpu (default=None): Name of the simulated CPU. If set, will use simulated CPU power measurements.
  • sim_cpu_tdp (default=None): Thermal Design Power (TDP) in Watts for the simulated CPU. Required if sim_cpu is set.
  • sim_cpu_util (default=None): CPU utilization factor between 0 and 1. If not set, defaults to 0.5 (50% utilization).
  • sim_gpu (default=None): Name of the simulated GPU. If set, will use simulated GPU power measurements.
  • sim_gpu_watts (default=None): Power consumption in Watts for the simulated GPU. Required if sim_gpu is set.
  • sim_gpu_util (default=None): GPU utilization factor between 0 and 1. If not set, defaults to 0.5 (50% utilization).
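
The rules stated above for epochs_before_pred and monitor_epochs can be checked before a tracker is constructed. The helper below is a minimal sketch and not part of carbontracker: the name resolve_epoch_settings is hypothetical, and replacing -1 with the total number of epochs is an assumption about how "Set to -1 for all epochs" would be applied.

```python
def resolve_epoch_settings(epochs, epochs_before_pred=1, monitor_epochs=1):
    """Hypothetical helper (not part of carbontracker) that applies the
    documented rules for epochs_before_pred and monitor_epochs."""
    if epochs < 1:
        raise ValueError("epochs must be a positive integer")
    # Assumption: -1 means "all epochs", so substitute the total.
    if epochs_before_pred == -1:
        epochs_before_pred = epochs
    if monitor_epochs == -1:
        monitor_epochs = epochs
    if epochs_before_pred < 0:
        raise ValueError("epochs_before_pred must be -1, 0, or positive")
    # Documented rule: monitor_epochs cannot be equal to 0.
    if monitor_epochs < 1:
        raise ValueError("monitor_epochs must be -1 or positive")
    # Documented rule: monitor_epochs cannot be less than epochs_before_pred.
    if monitor_epochs < epochs_before_pred:
        raise ValueError("monitor_epochs cannot be less than epochs_before_pred")
    return epochs_before_pred, monitor_epochs


# Defaults: predict after one epoch, monitor one epoch.
print(resolve_epoch_settings(epochs=100))
# Predict after five epochs, monitor the whole run.
print(resolve_epoch_settings(epochs=100, epochs_before_pred=5, monitor_epochs=-1))
```

The returned pair could then be passed as the epochs_before_pred and monitor_epochs arguments of CarbonTracker.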

Example usage

```python
from carbontracker.tracker import CarbonTracker

max_epochs = 10  # Example value; use the number of epochs in your training loop.

tracker = CarbonTracker(epochs=max_epochs)

# Training loop.
for epoch in range(max_epochs):
    tracker.epoch_start()

    # Your model training.

    tracker.epoch_end()

# Optional: Add a stop in case of early termination before all monitor_epochs has
# been monitored to ensure that actual consumption is reported.
tracker.stop()
```
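
When a training loop can end early (for example through early stopping or an exception), calling tracker.stop() in a finally block is one way to make sure the actual consumption is still reported. The sketch below relies only on the three methods used above (epoch_start, epoch_end, stop). The callbacks train_one_epoch and should_stop are hypothetical, and RecordingTracker is a stand-in so the example runs without carbontracker installed.

```python
class RecordingTracker:
    """Stand-in exposing the same three methods as CarbonTracker; records calls."""

    def __init__(self):
        self.calls = []

    def epoch_start(self):
        self.calls.append("epoch_start")

    def epoch_end(self):
        self.calls.append("epoch_end")

    def stop(self):
        self.calls.append("stop")


def train(tracker, max_epochs, train_one_epoch, should_stop):
    """Run up to max_epochs epochs and always call tracker.stop() afterwards."""
    completed = 0
    try:
        for epoch in range(max_epochs):
            tracker.epoch_start()
            train_one_epoch(epoch)
            tracker.epoch_end()
            completed += 1
            if should_stop(epoch):
                break  # Early termination, e.g. early stopping.
    finally:
        # Runs on normal completion, early break, and exceptions alike.
        tracker.stop()
    return completed


tracker = RecordingTracker()
done = train(
    tracker,
    max_epochs=5,
    train_one_epoch=lambda epoch: None,    # Placeholder for real training.
    should_stop=lambda epoch: epoch == 1,  # Stop after the second epoch.
)
print(done, tracker.calls)
```

With a real CarbonTracker instance passed in place of RecordingTracker, the loop body stays the same.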

Example output

Default settings

    CarbonTracker:
    Actual consumption for 1 epoch(s):
            Time:   0:00:10
            Energy: 0.000038 kWh
            CO2eq:  0.003130 g
            This is equivalent to:
            0.000026 km travelled by car
    CarbonTracker:
    Predicted consumption for 1000 epoch(s):
            Time:   2:52:22
            Energy: 0.038168 kWh
            CO2eq:  4.096665 g
            This is equivalent to:
            0.034025 km travelled by car
    CarbonTracker: Finished monitoring.

verbose=2

    CarbonTracker: The following components were found: CPU with device(s) cpu:0.
    CarbonTracker: Average carbon intensity during training was 82.00 gCO2/kWh at detected location: Copenhagen, Capital Region, DK.
    CarbonTracker:
    Actual consumption for 1 epoch(s):
            Time:   0:00:10
            Energy: 0.000041 kWh
            CO2eq:  0.003357 g
            This is equivalent to:
            0.000028 km travelled by car
    CarbonTracker: Carbon intensity for the next 2:59:06 is predicted to be 107.49 gCO2/kWh at detected location: Copenhagen, Capital Region, DK.
    CarbonTracker:
    Predicted consumption for 1000 epoch(s):
            Time:   2:59:06
            Energy: 0.040940 kWh
            CO2eq:  4.400445 g
            This is equivalent to:
            0.036549 km travelled by car
    CarbonTracker: Finished monitoring.
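
The CO2eq figures in the verbose output are consistent with multiplying the reported energy by the reported carbon intensity. The sketch below redoes that arithmetic with the numbers printed above. The function name co2eq_grams is hypothetical, and small differences from the reported values are expected because the printed energy figures are rounded.

```python
def co2eq_grams(energy_kwh, intensity_g_per_kwh):
    """CO2eq in grams = energy (kWh) x carbon intensity (gCO2/kWh)."""
    return energy_kwh * intensity_g_per_kwh


# Measured epoch: 0.000041 kWh at an average of 82.00 gCO2/kWh (reported: 0.003357 g).
actual = co2eq_grams(0.000041, 82.00)
# Prediction: 0.040940 kWh at a predicted 107.49 gCO2/kWh (reported: 4.400445 g).
predicted = co2eq_grams(0.040940, 107.49)

print(f"actual: {actual:.6f} g, predicted: {predicted:.6f} g")
```

Note that the prediction uses a different carbon intensity than the measured epoch, which is why the predicted CO2eq is not simply 1000 times the measured value.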

Parsing log files

Aggregating log files

carbontracker supports aggregating all log files in a specified directory to a single estimate of the carbon footprint.

Example usage

```python
from carbontracker import parser

parser.print_aggregate(log_dir="./my_log_directory/")
```

Example output

The training of models in this work is estimated to use 4.494 kWh of electricity contributing to 0.423 kg of CO2eq. This is equivalent to 3.515 km travelled by car. Measured by carbontracker (https://github.com/lfwa/carbontracker).

Convert logs to dictionary objects

Log files can be parsed into dictionaries using parser.parse_all_logs() or parser.parse_logs().

Example usage

```python
from carbontracker import parser

logs = parser.parse_all_logs(log_dir="./logs/")
first_log = logs[0]

print(f"Output file name: {first_log['output_filename']}")
print(f"Standard file name: {first_log['standard_filename']}")
print(f"Stopped early: {first_log['early_stop']}")
print(f"Measured consumption: {first_log['actual']}")
print(f"Predicted consumption: {first_log['pred']}")
print(f"Measured GPU devices: {first_log['components']['gpu']['devices']}")
```

Example output

    Output file name: ./logs/2020-05-17T19:02Z_carbontracker_output.log
    Standard file name: ./logs/2020-05-17T19:02Z_carbontracker.log
    Stopped early: False
    Measured consumption: {'epochs': 1, 'duration (s)': 8.0, 'energy (kWh)': 6.5e-05, 'co2eq (g)': 0.019201, 'equivalents': {'km travelled by car': 0.000159}}
    Predicted consumption: {'epochs': 3, 'duration (s)': 25.0, 'energy (kWh)': 0.000196, 'co2eq (g)': 0.057604, 'equivalents': {'km travelled by car': 0.000478}}
    Measured GPU devices: ['Tesla T4']
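
Dictionaries in this shape can be combined with ordinary Python. The sketch below sums the measured ('actual') energy and CO2eq over a list of parsed logs. The function total_measured is hypothetical, the sample entries are made-up values in the format shown above, and this is not necessarily how parser.print_aggregate computes its estimate.

```python
def total_measured(logs):
    """Sum measured energy (kWh) and CO2eq (g) over parsed log dictionaries.

    Entries without an 'actual' block are skipped (an assumption about how
    logs with no measured consumption would look).
    """
    energy_kwh = 0.0
    co2eq_g = 0.0
    for log in logs:
        actual = log.get("actual")
        if not actual:
            continue
        energy_kwh += actual["energy (kWh)"]
        co2eq_g += actual["co2eq (g)"]
    return energy_kwh, co2eq_g


# Made-up sample data using the keys from the example output.
sample_logs = [
    {"actual": {"epochs": 1, "energy (kWh)": 6.5e-05, "co2eq (g)": 0.019201}},
    {"actual": {"epochs": 2, "energy (kWh)": 0.00013, "co2eq (g)": 0.038402}},
    {"actual": None},
]

energy, co2eq = total_measured(sample_logs)
print(f"{energy:.6f} kWh, {co2eq:.6f} g CO2eq")
```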

Compatibility

carbontracker is compatible with:

  • NVIDIA GPUs that support NVIDIA Management Library (NVML)
  • Intel CPUs that support Intel RAPL
  • Slurm
  • Google Colab / Jupyter Notebook

Notes

Availability of GPUs and Slurm

  • Available GPU devices are determined by first checking the environment variable CUDA_VISIBLE_DEVICES (only if devices_by_pid=False; otherwise devices are found by PID). This ensures that on Slurm we only fetch GPU devices associated with the current job and not the entire cluster. If this fails, all available GPUs are measured.
  • NVML cannot find processes for containers spawned without --pid=host. This affects the devices_by_pid parameter and means that it will never find any active processes for GPUs in affected containers.
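
The device-selection rule in the first note can be illustrated with a small parser for CUDA_VISIBLE_DEVICES. This is a sketch under assumptions, not carbontracker's implementation: visible_gpu_indices is a hypothetical name, only integer indices are handled (UUID-style entries count as a failure), and the fallback returns every device index.

```python
import os


def visible_gpu_indices(total_devices, environ=None):
    """Return GPU indices listed in CUDA_VISIBLE_DEVICES, or all devices
    if the variable is unset or cannot be parsed as valid integer indices."""
    environ = os.environ if environ is None else environ
    all_devices = list(range(total_devices))

    raw = environ.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return all_devices  # Variable not set: measure everything.

    try:
        indices = [int(part) for part in raw.split(",") if part.strip()]
    except ValueError:
        return all_devices  # e.g. UUID entries: treated as a failure here.

    if any(i < 0 or i >= total_devices for i in indices):
        return all_devices  # Out-of-range index: fall back to all devices.
    return indices


# A Slurm-style job that was allocated GPUs 0 and 2 on a four-GPU node.
print(visible_gpu_indices(4, {"CUDA_VISIBLE_DEVICES": "0,2"}))
# Variable unset: every device is returned.
print(visible_gpu_indices(4, {}))
```

Passing the environment as a parameter keeps the function easy to test; calling it without environ reads the real process environment.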

Extending carbontracker

See CONTRIBUTING.md.

carbontracker in media

  • The official press release from the University of Copenhagen is available in English and Danish: en da

  • Carbontracker has received some attention in popular science forums within and outside Denmark [1][2][3][4][5][6][7][8]

Owner

  • Login: lfwa
  • Kind: user
  • Location: Denmark/Switzerland

GitHub Events

Total
  • Create event: 18
  • Release event: 2
  • Issues event: 10
  • Watch event: 69
  • Delete event: 9
  • Member event: 1
  • Issue comment event: 8
  • Push event: 37
  • Pull request event: 14
  • Fork event: 6
Last Year
  • Create event: 18
  • Release event: 2
  • Issues event: 10
  • Watch event: 69
  • Delete event: 9
  • Member event: 1
  • Issue comment event: 8
  • Push event: 37
  • Pull request event: 14
  • Fork event: 6

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 187
  • Total Committers: 7
  • Avg Commits per committer: 26.714
  • Development Distribution Score (DDS): 0.695
Past Year
  • Commits: 48
  • Committers: 5
  • Avg Commits per committer: 9.6
  • Development Distribution Score (DDS): 0.417
Top Committers
Name Email Commits
Lasse l****y@g****m 57
Rasmus Hag Løvstad r****d@g****m 44
kanding b****2@l****k 39
Pedram Bakh 5****h 31
Raghav r****v@d****k 14
Laurențiu Nicola l****a 1
Andreas Fehlner f****r@a****e 1

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 67
  • Total pull requests: 48
  • Average time to close issues: 12 months
  • Average time to close pull requests: 8 days
  • Total issue authors: 25
  • Total pull request authors: 9
  • Average comments per issue: 1.93
  • Average comments per pull request: 0.06
  • Merged pull requests: 37
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 5
  • Pull requests: 28
  • Average time to close issues: 20 days
  • Average time to close pull requests: 1 day
  • Issue authors: 5
  • Pull request authors: 5
  • Average comments per issue: 0.8
  • Average comments per pull request: 0.07
  • Merged pull requests: 20
  • Bot issues: 0
  • Bot pull requests: 1
Top Authors
Issue Authors
  • lfwa (21)
  • leondz (8)
  • raghavian (7)
  • kanding (3)
  • lnicola (3)
  • PedramBakh (2)
  • Princec711 (2)
  • LukasHedegaard (2)
  • Snailed (2)
  • mpariente (2)
  • Teamsusai (1)
  • tuanaqeelbohoran (1)
  • HishamAbulfeilat (1)
  • sagnik (1)
  • ib31iat (1)
Pull Request Authors
  • Snailed (37)
  • yoviny (2)
  • andife (2)
  • DaniG2106 (2)
  • lfwa (1)
  • leondz (1)
  • dependabot[bot] (1)
  • snehaaprabhu (1)
  • lnicola (1)
Top Labels
Issue Labels
enhancement (22) bug (22) question (3) API (3) help wanted (1) duplicate (1)
Pull Request Labels
enhancement (1) dependencies (1) github_actions (1)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 1,984 last-month
  • Total docker downloads: 41
  • Total dependent packages: 2
    (may contain duplicates)
  • Total dependent repositories: 4
    (may contain duplicates)
  • Total versions: 44
  • Total maintainers: 3
pypi.org: carbontracker

Tracking and predicting the carbon footprint of training deep learning models.

  • Documentation: https://carbontracker.readthedocs.io/
  • License: MIT License Copyright (c) 2020 Lasse F. Wolff Anthony & Benjamin Kanding Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
  • Latest release: 2.3.1
    published 10 months ago
  • Versions: 29
  • Dependent Packages: 2
  • Dependent Repositories: 4
  • Downloads: 1,984 Last month
  • Docker Downloads: 41
Rankings
Dependent packages count: 3.2%
Docker downloads count: 3.3%
Average: 5.1%
Downloads: 6.4%
Dependent repos count: 7.5%
Maintainers (3)
Last synced: 7 months ago
proxy.golang.org: github.com/lfwa/carbontracker
  • Versions: 15
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.4%
Average: 5.6%
Dependent repos count: 5.8%
Last synced: 7 months ago

Dependencies

.github/workflows/publish.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
  • pypa/gh-action-pypi-publish v1.4.2 composite
.github/workflows/test.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
pyproject.toml pypi
  • geocoder *
  • importlib-metadata *
  • numpy *
  • pandas *
  • psutil *
  • pynvml *
  • requests *