building-data-genome-project-2

Whole building non-residential hourly energy meter data from the Great Energy Predictor III competition

https://github.com/buds-lab/building-data-genome-project-2

Science Score: 33.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 5 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, researchgate.net, nature.com, zenodo.org
  • Committers with academic emails
    1 of 3 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.2%) to scientific vocabulary

Keywords

building-automation building-energy electricity-consumption electricity-meter energy-consumption energy-efficiency open-data open-data-science open-source smart-city smart-meter
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: buds-lab
  • License: other
  • Language: Jupyter Notebook
  • Default Branch: master
  • Homepage: https://www.budslab.org/
  • Size: 422 MB
Statistics
  • Stars: 233
  • Watchers: 16
  • Forks: 91
  • Open Issues: 4
  • Releases: 1
Created almost 6 years ago · Last pushed over 2 years ago
Metadata Files
  • Readme
  • License

README.md

The Building Data Genome 2 (BDG2) Data-Set

Data-set description

BDG2 is an open data set made up of 3,053 energy meters from 1,636 buildings. The time-series data cover two full years (2016 and 2017) at an hourly frequency, with measurements from electricity, heating and cooling water, steam, and irrigation meters. A subset of the data was used in the Great Energy Predictor III (GEPIII) competition hosted by ASHRAE in late 2019. A full overview of the GEPIII competition can be found in an article in the journal Science and Technology for the Built Environment, with a preprint available on arXiv.

The GEPIII sub-set includes hourly data from 2,380 meters from 1,449 buildings that were used in a machine learning competition for long-term prediction with an application to measurement and verification in the building energy analysis domain. This data set can be used to benchmark various statistical learning algorithms and other data science techniques. It can also be used simply as a teaching or learning tool to practice dealing with measured performance data from large numbers of non-residential buildings. The charts below illustrate the breakdown of the buildings according to primary use category and subcategory, industry and subindustry, timezone and meter type.

[Figure: cat_features (breakdown of buildings by primary use, industry, timezone, and meter type)]

Getting Started

We recommend you download the Anaconda Python Distribution and use Jupyter to get an understanding of the data.
  • Temporal meter data are found in /data/meters/
  • Metadata is found in /data/metadata/
  • To join all raw meter data into one dataset, follow this notebook
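As a rough illustration of what joining meter data into one dataset can involve, the sketch below reshapes a wide table (one column per building) into long rows tagged with a meter type. It is not the repository's own notebook. The wide layout, the `timestamp` column name, the building column names, the sample values, and the function name `wide_to_long` are all assumptions made for illustration; check the actual files under /data/meters/ before relying on them. It uses only the Python standard library, although in a Jupyter workflow a dataframe library would be a common choice.

```python
import csv
import io


def wide_to_long(wide_csv, meter_type):
    """Reshape a wide meter table into long rows.

    Assumes (hypothetically) one `timestamp` column plus one column
    per building, with empty cells marking missing readings.
    """
    rows = []
    for record in csv.DictReader(wide_csv):
        timestamp = record.pop("timestamp")
        for building_id, value in record.items():
            rows.append({
                "timestamp": timestamp,
                "building_id": building_id,
                "meter": meter_type,
                # Empty cells are kept as None rather than dropped.
                "meter_reading": float(value) if value else None,
            })
    return rows


# Invented sample standing in for one file from /data/meters/.
sample = io.StringIO(
    "timestamp,Building_A,Building_B\n"
    "2016-01-01 00:00:00,12.5,\n"
    "2016-01-01 01:00:00,13.0,7.25\n"
)
long_rows = wide_to_long(sample, "electricity")
```

Calling the function once per meter file and concatenating the resulting lists would give a single long table that can then be matched to the metadata by building identifier, assuming the two share one.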

Example notebooks are found in /notebooks/. A few good overview examples:
  • Exploratory Data Analysis of metadata
  • Exploratory Data Analysis of weather
  • Exploratory Data Analysis of meter reading

Detailed Documentation

The detailed documentation of how this data set was created can be found in the repository's wiki and in the following publication:

Citation of BDG2 Data-Set

Miller, C., Kathirgamanathan, A., Picchetti, B. et al. The Building Data Genome Project 2, energy meter data from the ASHRAE Great Energy Predictor III competition. Sci Data 7, 368 (2020). https://doi.org/10.1038/s41597-020-00712-x

```

@ARTICLE{Miller2020-yc,
  title     = "The Building Data Genome Project 2, energy meter data from the {ASHRAE} Great Energy Predictor {III} competition",
  author    = "Miller, Clayton and Kathirgamanathan, Anjukan and Picchetti, Bianca and Arjunan, Pandarasamy and Park, June Young and Nagy, Zoltan and Raftery, Paul and Hobson, Brodie W and Shi, Zixiao and Meggers, Forrest",
  abstract  = "This paper describes an open data set of 3,053 energy meters from 1,636 non-residential buildings with a range of two full years (2016 and 2017) at an hourly frequency (17,544 measurements per meter resulting in approximately 53.6 million measurements). These meters were collected from 19 sites across North America and Europe, with one or more meters per building measuring whole building electrical, heating and cooling water, steam, and solar energy as well as water and irrigation meters. Part of these data was used in the Great Energy Predictor III (GEPIII) competition hosted by the American Society of Heating, Refrigeration, and Air-Conditioning Engineers (ASHRAE) in October-December 2019. GEPIII was a machine learning competition for long-term prediction with an application to measurement and verification. This paper describes the process of data collection, cleaning, and convergence of time-series meter data, the meta-data about the buildings, and complementary weather data. This data set can be used for further prediction benchmarking and prototyping as well as anomaly detection, energy analysis, and building type classification. Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.13033847",
  journal   = "Scientific Data",
  publisher = "Nature Publishing Group",
  volume    = 7,
  pages     = "368",
  month     = oct,
  year      = 2020,
  language  = "en"
}

```
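The measurement counts quoted in the abstract can be checked with a short calculation. This is a minimal sketch, assuming a gap-free hourly index that runs from 2016-01-01 00:00 up to, but not including, 2018-01-01 00:00; the variable names are illustrative only.

```python
from datetime import datetime

# 2016 is a leap year (366 days) and 2017 is not (365 days).
start = datetime(2016, 1, 1)
end = datetime(2018, 1, 1)

# Number of whole hours in the two-year window.
hours_per_meter = int((end - start).total_seconds() // 3600)

meter_count = 3053  # meters reported for the full data set
total_measurements = hours_per_meter * meter_count

print(hours_per_meter)     # 17544
print(total_measurements)  # 53561832
```

The product, 53,561,832, is consistent with the "approximately 53.6 million measurements" stated in the abstract. It is an upper bound on the number of valid readings, since it counts every hourly slot whether or not a value was recorded.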

Preprints

Publications or Projects that use BDG2 data-set

Please update this list if you add notebooks or R-Markdown files to the notebook folder. The naming convention is a number (for ordering), the creator's initials, and a short hyphen-delimited description, e.g. 1.0-jqp-initial-data-exploration.
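The naming convention can be checked mechanically. The sketch below is one possible reading of it, not a rule enforced by the repository: the regular expression, the lowercase-only restriction, and the helper name `parse_notebook_name` are assumptions for illustration. It expects the file name without its extension.

```python
import re

# order: digits with optional dotted parts, e.g. "1.0"
# initials: lowercase letters, e.g. "jqp" (lowercase is an assumption)
# description: lowercase words or digits joined by hyphens
NOTEBOOK_NAME = re.compile(
    r"^(?P<order>\d+(?:\.\d+)*)"
    r"-(?P<initials>[a-z]+)"
    r"-(?P<description>[a-z0-9]+(?:-[a-z0-9]+)*)$"
)


def parse_notebook_name(stem):
    """Return the parts of a notebook name, or None if it does not match."""
    match = NOTEBOOK_NAME.match(stem)
    return match.groupdict() if match else None


parsed = parse_notebook_name("1.0-jqp-initial-data-exploration")
rejected = parse_notebook_name("Untitled")
```

Here `parsed` holds the order `1.0`, the initials `jqp`, and the description `initial-data-exploration`, while `rejected` is `None` because the name has no ordering number or initials.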

  • (publication here)

Repository structure

building-data-genome-project-2
├─ README.md          <- BDG2 README for developers using this data-set
├─ data
│  ├─ metadata        <- buildings metadata
│  ├─ weather         <- weather data
│  └─ meters
│     ├─ raw          <- all meter reading datasets
│     ├─ cleaned      <- cleaned meter data based on several filtering steps
│     └─ kaggle       <- the 2017 meter data that aligns with the Kaggle competition
├─ notebooks          <- Jupyter notebooks, named after the naming convention
└─ figures            <- figures created during exploration of BDG 2.0 Data-set

Owner

  • Name: Building and Urban Data Science (BUDS) Group
  • Login: buds-lab
  • Kind: organization
  • Email: clayton@nus.edu.sg
  • Location: Singapore

Building and Urban Data Science (BUDS) at the National University of Singapore

GitHub Events

Total
  • Watch event: 44
  • Fork event: 15
Last Year
  • Watch event: 44
  • Fork event: 15

Committers

All Time
  • Total Commits: 81
  • Total Committers: 3
  • Avg Commits per committer: 27.0
  • Development Distribution Score (DDS): 0.198
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Pony Biam! 4****m@u****m 65
Clayton Miller m****n@g****m 14
Clayton Miller c****n@n****g 2
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

All Time
  • Total issues: 26
  • Total pull requests: 2
  • Average time to close issues: 29 days
  • Average time to close pull requests: less than a minute
  • Total issue authors: 7
  • Total pull request authors: 1
  • Average comments per issue: 2.38
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • cmiller8 (15)
  • ponybiam (5)
  • david-waterworth (2)
  • rinzebloem (1)
  • mai-n-coleman (1)
  • zixiaoshawnshi (1)
  • oso5 (1)
Pull Request Authors
  • cmiller8 (2)
Top Labels
Issue Labels
enhancement (1)
Pull Request Labels