emissioncommonds
Measuring the carbon footprint of common data science
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (2.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: bgmeulem
- License: other
- Language: Python
- Default Branch: main
- Size: 54.7 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
- Created: over 3 years ago
- Last pushed: about 3 years ago
README.md
EmissionCommonDS
Measuring the carbon footprint of common data science
Run the automated data science project (inside a Python virtual environment) with:
sudo -E PATH=$PATH ./run_automated_DS_project.sh [suffix] [-sample]
Owner
- Name: Bjorge Meulemeester
- Login: bgmeulem
- Kind: user
- Location: Bonn, Germany
- Company: @mpinb
- Website: https://mpinb.mpg.de/en/research-groups/groups/in-silico-brain-sciences/group-members.html
- Repositories: 2
- Profile: https://github.com/bgmeulem
Doctoral researcher at Max-Planck Institute for Neurobiology of Behaviour (MPINB).
Citation (citation.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: >-
  How sustainable is “common” data science in terms of power
  consumption?
message: 'If you use this software, please cite it as below.'
type: software
authors:
  - family-names: Meulemeester
    given-names: Bjorge
    orcid: 'https://orcid.org/0000-0003-3935-2006'
  - family-names: Martens
    given-names: David
    orcid: 'https://orcid.org/0000-0001-8397-2937'
identifiers:
  - type: doi
    value: 10.1016/j.suscom.2023.100864
    description: >-
      Published in Sustainable Computing: Informatics and
      Systems
url: 'https://github.com/bgmeulem/EmissionCommonDS'
abstract: >-
  Continuous developments in data science have brought forth
  an exponential increase in complexity of machine learning
  models. Additionally, data scientists have become
  ubiquitous in the private market and academic
  environments. All of these trends are on a steady rise,
  and are associated with an increase in power consumption
  and associated carbon footprint. The increasing carbon
  footprint of large-scale advanced data science has already
  received attention, but the latter trend has not. This
  work aims to estimate the contribution of the increasingly
  popular “common” data science to the global carbon
  footprint. To this end, the power consumption of several
  typical tasks in the aforementioned common data science
  is measured and compared to: large-scale “advanced”
  data science, common computer-related tasks, and everyday
  non-computer-related tasks. An automated data science
  project is also run on various hardware architectures. To
  assess its sustainability in terms of carbon emission, the
  measurements are converted to an equivalent unit of
  “km driven by car”. Our main findings are: “common” data
  science consumes 2.57 times more power than regular computer
  usage, but less than some common everyday power-consuming
  tasks such as lighting or heating; advanced data science
  consumes substantially more power than common data
  science, and can either be on par with or vastly surpass
  common everyday power-consuming tasks, depending on the
  scale of the project. In addition to the reporting of these
  results, this work also aims to inspire researchers to
  include power usage and estimated carbon emission as a
  secondary result in their work.
keywords:
  - ai
  - sustainable
  - carbon
  - emission
  - co2
  - data science
license: CC-BY-NC-ND-4.0
version: 2.0.4
date-released: '2022-09-06'
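
The abstract above mentions converting measured power consumption to an equivalent "km driven by car". The following is a minimal sketch of how such a conversion could work, not the authors' implementation. The function name `energy_to_car_km` and both conversion factors are hypothetical; the caller must supply a grid carbon intensity and a per-km car emission figure appropriate to their own setting.

```python
def energy_to_car_km(energy_kwh, grid_g_co2_per_kwh, car_g_co2_per_km):
    """Convert consumed energy into an equivalent distance driven by car.

    All three arguments are supplied by the caller; this sketch does not
    assume any particular grid mix or vehicle (hypothetical helper).
    """
    if energy_kwh < 0:
        raise ValueError("energy_kwh must be non-negative")
    if grid_g_co2_per_kwh < 0 or car_g_co2_per_km <= 0:
        raise ValueError("conversion factors must be positive")

    # Step 1: energy (kWh) -> emitted CO2 (g), via the grid carbon intensity.
    emitted_g_co2 = energy_kwh * grid_g_co2_per_kwh

    # Step 2: emitted CO2 (g) -> distance (km), via the car's emission rate.
    return emitted_g_co2 / car_g_co2_per_km


if __name__ == "__main__":
    # Illustrative placeholder factors only, not values from the paper.
    km = energy_to_car_km(2.0, grid_g_co2_per_kwh=500.0, car_g_co2_per_km=100.0)
    print(f"{km:.1f} km")  # prints "10.0 km"
```

Keeping the two factors as explicit arguments makes the assumptions visible at every call site, which matters because both vary widely by country and vehicle.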
Dependencies
requirements.txt
pypi
- carbontracker *
- category_encoders *
- imblearn *
- matplotlib *
- numpy *
- openpyxl *
- pandas *
- scikit-learn *
- seaborn *
- sklearn *