emissioncommonds

Measuring the carbon footprint of common data science

https://github.com/bgmeulem/emissioncommonds

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (2.2%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: bgmeulem
  • License: other
  • Language: Python
  • Default Branch: main
  • Size: 54.7 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 3 years ago · Last pushed about 3 years ago
Metadata Files
Readme · License · Citation

README.md

EmissionCommonDS

Measuring the carbon footprint of common data science

Run the automated data science project (inside a Python virtual environment) by running:

    sudo -E PATH=$PATH ./run_automated_DS_project.sh [suffix] [-sample]

Owner

  • Name: Bjorge Meulemeester
  • Login: bgmeulem
  • Kind: user
  • Location: Bonn, Germany
  • Company: @mpinb

Doctoral researcher at Max-Planck Institute for Neurobiology of Behaviour (MPINB).

Citation (citation.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: >-
  How sustainable is “common” data science in terms of power
  consumption?
message: 'If you use this software, please cite it as below.'
type: software
authors:
  - family-names: Meulemeester
    given-names: Bjorge
    orcid: 'https://orcid.org/0000-0003-3935-2006'
  - family-names: Martens
    given-names: David
    orcid: 'https://orcid.org/0000-0001-8397-2937'
identifiers:
  - type: doi
    value: 10.1016/j.suscom.2023.100864
    description: >-
      Published in Sustainable Computing: Informatics and
      Systems
url: 'https://github.com/bgmeulem/EmissionCommonDS'
abstract: >-
  Continuous developments in data science have brought forth
  an exponential increase in complexity of machine learning
  models. Additionally, data scientists have become
  ubiquitous in the private market and academic
  environments. All of these trends are on a steady rise,
  and are associated with an increase in power consumption
  and associated carbon footprint. The increasing carbon
  footprint of large-scale advanced data science has already
  received attention, but the latter trend has not. This
  work aims to estimate the contribution of the increasingly
  popular “common” data science to the global carbon
  footprint. To this end, the power consumption of several
  typical tasks in the aforementioned common data science
  is measured and compared to: large-scale “advanced”
  data science, common computer-related tasks, and everyday
  non-computer related tasks. An automated data science
  project is also run on various hardware architectures. To
  assess its sustainability in terms of carbon emission, the
  measurements are converted to an equivalent unit of
  “km driven by car”. Our main findings are: “common” data
  science consumes 2.57 times more power than regular computer
  usage, but less than some common everyday power-consuming
  tasks such as lighting or heating; advanced data science
  consumes substantially more power than common data
  science, and can either be on par with or vastly surpass common
  everyday power-consuming tasks, depending on the scale of
  the project. In addition to the reporting of these
  results, this work also aims to inspire researchers to
  include power usage and estimated carbon emission as a
  secondary result in their work.
keywords:
  - ai
  - sustainable
  - carbon
  - emission
  - co2
  - data science
license: CC-BY-NC-ND-4.0
version: 2.0.4
date-released: '2022-09-06'
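
The abstract describes converting power measurements to an equivalent number of "km driven by car". The following is a minimal sketch of how such a conversion could work, not the authors' implementation. The function names and both constants (grid carbon intensity and per-kilometre car emissions) are hypothetical placeholders chosen for illustration; they are not taken from the paper or the repository.

```python
# Illustrative constants only: NOT values from the paper or the repository.
CARBON_INTENSITY_G_PER_KWH = 300.0  # assumed grid carbon intensity (gCO2eq per kWh)
CAR_EMISSIONS_G_PER_KM = 120.0      # assumed average car emissions (gCO2eq per km)


def energy_to_co2_grams(energy_kwh, intensity_g_per_kwh=CARBON_INTENSITY_G_PER_KWH):
    """Convert consumed energy (kWh) to emitted CO2-equivalent (grams)."""
    if energy_kwh < 0:
        raise ValueError("energy_kwh must be non-negative")
    return energy_kwh * intensity_g_per_kwh


def co2_to_car_km(co2_grams, car_g_per_km=CAR_EMISSIONS_G_PER_KM):
    """Express an amount of CO2-equivalent (grams) as kilometres driven by car."""
    if car_g_per_km <= 0:
        raise ValueError("car_g_per_km must be positive")
    return co2_grams / car_g_per_km


def energy_to_car_km(energy_kwh,
                     intensity_g_per_kwh=CARBON_INTENSITY_G_PER_KWH,
                     car_g_per_km=CAR_EMISSIONS_G_PER_KM):
    """Chain both conversions: kWh -> gCO2eq -> equivalent km driven by car."""
    co2 = energy_to_co2_grams(energy_kwh, intensity_g_per_kwh)
    return co2_to_car_km(co2, car_g_per_km)


if __name__ == "__main__":
    measured_kwh = 0.5  # hypothetical measurement for one data science task
    km = energy_to_car_km(measured_kwh)
    print(f"{measured_kwh} kWh is roughly equivalent to {km:.2f} km driven by car")
```

Keeping the two conversion steps separate makes it easy to swap in a region-specific carbon intensity or a different vehicle emission factor without touching the rest of the calculation.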

Dependencies

requirements.txt pypi
  • carbontracker *
  • category_encoders *
  • imblearn *
  • matplotlib *
  • numpy *
  • openpyxl *
  • pandas *
  • scikit-learn *
  • seaborn *
  • sklearn *