https://github.com/camels-de/camelsp

Temporary camels-processing repo


Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.4%) to scientific vocabulary
Last synced: 10 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: CAMELS-DE
  • License: MIT
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 15.5 MB
Statistics
  • Stars: 1
  • Watchers: 2
  • Forks: 2
  • Open Issues: 0
  • Releases: 2
Created over 3 years ago · Last pushed over 1 year ago
Metadata Files
  • Readme
  • License

README.md

camelsp

This repo helps process the data dumps received from the authorities.

It is important to install the package locally and with the editable flag:

```bash
git clone git@github.com:camels-de/camelsp
cd camelsp
pip install -e .
```

Then `wget` the data into the input data folder:

```bash
cd input_data
wget https://bwsyncandshare.kit.edu/s/<SHARETOKEN>/download
```

NOTE: Downloading the input data with wget only works if you are part of the CAMELS-DE team and have access to the bwsyncandshare folder.
Unfortunately, we are not allowed to share the raw data provided by the federal states openly.

On https://hub.camels-de.org, the installation can be skipped.

Scripts

Functions for handling one specific state are located in the respective submodule; utility functions that are helpful for all states are imported at the top level.

Helper scripts that show the usage for possible re-processing are located in the top-level scripts folder.

NUTS

We use NUTS codes to derive ids and to build a folder structure. The package contains a case-insensitive utility function for looking up the top-level NUTS codes, which should accept many existing abbreviations:

```python
from camelsp import nuts

nuts('Ba-Wü')                   # gives DE1
nuts('NRW')                     # gives DEA
nuts('Pfalz')                   # gives DEB
nuts('mecklenburg Vorpommern')  # gives DE8
```
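To illustrate how such a case-insensitive lookup might work, here is a minimal, self-contained sketch. It is not the camelsp implementation: the `lookup_nuts` name, the alias table and the normalization rules are assumptions made for this example. Only the four codes are taken from the calls above.

```python
import re

# Hypothetical alias table (assumption); the real package may accept other spellings.
_ALIASES = {
    "DE1": ["baden-württemberg", "ba-wü", "bw"],
    "DEA": ["nordrhein-westfalen", "nrw"],
    "DEB": ["rheinland-pfalz", "pfalz", "rlp"],
    "DE8": ["mecklenburg-vorpommern", "mv"],
}


def _normalize(name: str) -> str:
    """Case-fold the name and drop whitespace, hyphens and underscores."""
    return re.sub(r"[\s\-_]+", "", name.casefold())


# Reverse index from normalized alias (or the code itself) to the code.
_INDEX = {
    _normalize(alias): code
    for code, aliases in _ALIASES.items()
    for alias in aliases
}
_INDEX.update({code.casefold(): code for code in _ALIASES})


def lookup_nuts(name: str) -> str:
    """Return the code for a federal state name or abbreviation."""
    try:
        return _INDEX[_normalize(name)]
    except KeyError:
        raise ValueError(f"Unknown federal state: {name!r}") from None
```

Normalizing both the aliases and the query with the same function keeps the table short, because spelling variants such as `'mecklenburg Vorpommern'` and `'Mecklenburg-Vorpommern'` collapse to the same key.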

save new data

The Bundesland context manager accepts any kind of data to be added to the CAMELS dataset. Within a context, the manager provides the save_timeseries function, which takes a pd.DataFrame with columns ['date', <variable>, 'flag'] and merges it with any existing data:

```python
from camelsp import Bundesland

# read_in_function is a placeholder for your own reader
df = read_in_function()

with Bundesland('Bayern') as bl:
    bl.save_timeseries(df, series_id='DE210060')
```
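The merge step can be pictured with plain pandas. The sketch below is an assumption about the semantics, not the code camelsp actually runs: it handles one variable (here a hypothetical column `q`), and it assumes that new rows replace existing rows with the same date. `merge_timeseries` is a made-up helper name.

```python
import pandas as pd


def merge_timeseries(existing: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """Combine two ['date', <variable>, 'flag'] frames; new rows win on duplicate dates."""
    combined = pd.concat([existing, new], ignore_index=True)
    combined["date"] = pd.to_datetime(combined["date"])
    return (
        combined
        .drop_duplicates(subset="date", keep="last")  # keep the row from `new`
        .sort_values("date")
        .reset_index(drop=True)
    )


# Illustrative data only
existing = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-01-02"]),
    "q": [1.0, 2.0],
    "flag": [True, True],
})
new = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-02", "2020-01-03"]),
    "q": [2.5, 3.0],
    "flag": [False, True],
})

merged = merge_timeseries(existing, new)
```

Here `merged` has three rows, and the value for 2020-01-02 comes from `new`.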

metadata

There are two ways to read the current metadata.

This reads the metadata of all federal states into a single DataFrame:

```python
from camelsp import get_metadata

get_metadata()
```

Or for only one federal state:

```python
from camelsp import Bundesland

with Bundesland('Sachsen') as bl:
    print(bl.metadata)
```

The context manager can also update the metadata. This can only be done at the federal state level. The new metadata needs to reference an existing column in the metadata; this key column defaults to 'camels_id' or 'provider_id' if not given. All other columns in the new DataFrame will be updated (created or overwritten) for the respective federal state only.

```python
# get the new metadata from somewhere
new_metadata = read_new_metadata()

with Bundesland('Sachsen-Anhalt') as bl:
    # either use the property setter shortcut - this can only handle updates on camels or provider id
    bl.metadata = new_metadata

    # OR the function - use other column optionally
    bl.update_metadata(new_metadata, 'existing_primary_key')
```
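The update rule described above (match on a key column, then create or overwrite all other columns) can be sketched with plain pandas. This is an illustration under assumptions, not the camelsp implementation: the standalone `update_metadata` function, the choice to ignore unknown keys, and the sample ids and values are all made up for this example.

```python
from typing import Optional

import pandas as pd


def update_metadata(metadata: pd.DataFrame, new: pd.DataFrame,
                    key: Optional[str] = None) -> pd.DataFrame:
    """Return a copy of `metadata` with the columns of `new` created or overwritten."""
    if key is None:
        # Fall back to 'camels_id', then 'provider_id', if present in `new`
        key = next((c for c in ("camels_id", "provider_id") if c in new.columns), None)
    if key is None or key not in metadata.columns or key not in new.columns:
        raise KeyError("the key column must exist in both DataFrames")

    current = metadata.set_index(key)
    incoming = new.set_index(key)

    # Assumption: rows whose key is not yet in the metadata are ignored
    known = incoming.index.intersection(current.index)
    incoming = incoming.loc[known]

    for column in incoming.columns:
        if column in current.columns:
            current.loc[known, column] = incoming[column]              # overwrite
        else:
            current[column] = incoming[column].reindex(current.index)  # create
    return current.reset_index()


# Illustrative data only; ids and values are invented
metadata = pd.DataFrame({
    "camels_id": ["DEE10000", "DEE10010"],
    "provider_id": ["A1", "B2"],
    "name": ["old", "keep"],
})
new_metadata = pd.DataFrame({
    "camels_id": ["DEE10000"],
    "name": ["updated"],
    "area": [12.5],
})

updated = update_metadata(metadata, new_metadata)
```

In `updated`, the `name` of the first station is overwritten, the new `area` column is created, and stations missing from `new_metadata` keep their old values (with a missing value in `area`).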

Docker container:

```bash
docker run -v ./input_data:/camelsp/input_data -v ./output_data_docker:/camelsp/output_data -it --rm camelsp
```

Owner

  • Name: CAMELS-DE
  • Login: CAMELS-DE
  • Kind: organization
  • Location: Germany

GitHub Events

Total
  • Push event: 1
  • Fork event: 1
Last Year
  • Push event: 1
  • Fork event: 1