https://github.com/aariq/foresttime-builder

Scripts to generate a forestTIME database. Ancestral code is in forestTIME and automatic-trees, which has a bloated git history.

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links (links to: zenodo.org)
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 12.8%, to scientific vocabulary)

Last synced: 10 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: Aariq
License: gpl-3.0
Language: R
Default Branch: pre_carbon
Size: 12.8 MB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 1
Releases: 0

Fork of mekevans/forestTIME-builder

Created over 1 year ago · Last pushed over 1 year ago

https://github.com/Aariq/forestTIME-builder/blob/pre_carbon/

# forestTIME-builder
Scripts to generate a forestTIME database. Ancestral code is in forestTIME and automatic-trees, which has a bloated git history.

The code in this repo downloads raw FIA data from DataMart, processes it to create forestTIME tables, and stores these tables as (currently) .parquet files. This part of the workflow can run in parallel, with one job per state. The state-level tables are then stacked to create one database with forestTIME tables for the whole country. This database can then be uploaded and shared, e.g. via Zenodo or Box.
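As a rough illustration of the stacking step, here is a minimal sketch that reads every state-level .parquet file for one table into a single DuckDB table. It is not the code in `R` or `scripts`: the helper `stack_state_tables()`, the `<table>_<STATE>.parquet` naming scheme, and the toy `tree` table are all assumptions made for the example. It relies only on the `DBI` and `duckdb` packages and on DuckDB's `read_parquet()` glob support.

```r
library(DBI)
library(duckdb)

# Hypothetical helper: stack all state-level parquet files for one table
# into a single DuckDB table. The "<table>_<STATE>.parquet" naming scheme
# is an assumption, not this repository's actual layout.
stack_state_tables <- function(con, parquet_dir, table_name) {
  pattern <- file.path(parquet_dir, paste0(table_name, "_*.parquet"))
  sql <- paste0(
    "CREATE OR REPLACE TABLE ", DBI::dbQuoteIdentifier(con, table_name),
    " AS SELECT * FROM read_parquet(", DBI::dbQuoteString(con, pattern), ")"
  )
  DBI::dbExecute(con, sql)
  invisible(table_name)
}

# Toy demonstration: write two small "state" files, then stack them.
con <- DBI::dbConnect(duckdb::duckdb(), dbdir = ":memory:")
parquet_dir <- file.path(tempdir(), "state_parquet")
dir.create(parquet_dir, showWarnings = FALSE)

for (state in c("AZ", "RI")) {
  toy <- data.frame(state = state, tree_id = 1:3)
  duckdb::duckdb_register(con, "toy", toy)
  out <- file.path(parquet_dir, paste0("tree_", state, ".parquet"))
  DBI::dbExecute(con, paste0(
    "COPY (SELECT * FROM toy) TO ", DBI::dbQuoteString(con, out),
    " (FORMAT PARQUET)"
  ))
  duckdb::duckdb_unregister(con, "toy")
}

stack_state_tables(con, parquet_dir, "tree")
n_rows <- DBI::dbGetQuery(con, "SELECT COUNT(*) AS n FROM tree")$n
DBI::dbDisconnect(con, shutdown = TRUE)
```

Keeping the per-state output as separate files is what lets the state jobs run independently. Only the final stacking step needs to see all of them at once.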

I anticipate that *most* users of forestTIME will not run this code. Instead, it will run automatically via GitHub Actions and push the finished database somewhere accessible to users, who will then download and query it. Functions to query an already-generated database can be found in https://github.com/mekevans/forestTIME/. However, anyone who wants to can download this repo and run the scripts to create a database locally.
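For anyone who just wants to look inside a downloaded database before reaching for the forestTIME query functions, a minimal sketch using plain `DBI` and `duckdb` might look like the following. The helper name `open_foresttime()` and the file path in the usage comment are hypothetical. The sketch lists tables rather than assuming any table names.

```r
library(DBI)
library(duckdb)

# Hypothetical helper: open a downloaded forestTIME .duckdb file read-only,
# so that exploring it cannot modify the database.
open_foresttime <- function(db_path) {
  if (!file.exists(db_path)) {
    stop("Database file not found: ", db_path)
  }
  DBI::dbConnect(duckdb::duckdb(), dbdir = db_path, read_only = TRUE)
}

# Usage (the path is an assumption; point it at wherever you saved the file):
# con <- open_foresttime("path/to/foresttime.duckdb")
# DBI::dbListTables(con)                 # see which tables are available
# DBI::dbDisconnect(con, shutdown = TRUE)
```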

## The `pre_carbon` branch

This branch, the default branch, contains the stable version of forestTIME prior to the addition of NSVB carbon estimation.
For the (WIP) branch with Renata's work towards annualized carbon estimates, see the `add-annual-carbon-good` branch [here](https://github.com/diazrenata/forestTIME-builder/tree/add-annual-carbon-good).

## Organization

- `R` contains functions to download data, create tables, and add them to the database. These functions rarely change.
- `scripts` contains the workflows that run the functions in `R` to generate a database and push it to Zenodo. These workflows have changed a lot recently to navigate trade-offs (local vs. automated, all at once vs. state by state, etc.). To generate a forestTIME .duckdb, run the scripts in `scripts` in order, following the instructions in the comments.

## Automation and Zenodo push

- These scripts run automatically via GitHub Actions, currently on every push to this branch. This can be changed to a scheduled job.
- One workflow runs for each state, generating state-level database tables that are stored as .parquet files and saved as GitHub artifacts. A final workflow stacks all of the state-level tables into one database, which is uploaded to a Zenodo archive. The archive is currently private, located at: https://zenodo.org/records/13377070. It can be made public when we are ready.
- To modify the workflow scripts, *don't* edit the files in `scripts/01-state-by-state` directly. Instead, modify the text in `scripts/create-workflow-yml.R` and then run that script to regenerate the state-level scripts.
- To set up a push to Zenodo from GitHub Actions, generate a Zenodo token in your Zenodo account and store it as an Actions secret in the GitHub repository, so the workflow can read it as an environment variable.
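The "edit the generator, not the generated files" approach from the list above can be sketched roughly as follows. This is not the contents of `scripts/create-workflow-yml.R`: the template text, the `{{STATE}}` placeholder, the `build-<STATE>.yml` file names, and the `scripts/build-state.R` script it refers to are all made up for illustration.

```r
# Hypothetical workflow template; {{STATE}} is replaced per state.
# The job steps and the script path are assumptions, not this repo's workflow.
workflow_template <- c(
  "name: Build {{STATE}}",
  "on: push",
  "jobs:",
  "  build:",
  "    runs-on: ubuntu-latest",
  "    steps:",
  "      - uses: actions/checkout@v4",
  "      - run: Rscript scripts/build-state.R {{STATE}}"
)

# Write one workflow file per state by filling in the template.
render_state_workflows <- function(states, template, out_dir) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  paths <- file.path(out_dir, paste0("build-", states, ".yml"))
  for (i in seq_along(states)) {
    filled <- gsub("{{STATE}}", states[i], template, fixed = TRUE)
    writeLines(filled, paths[i])
  }
  invisible(paths)
}

# Example: generate files for two states in a temporary directory.
workflow_paths <- render_state_workflows(
  c("AZ", "RI"), workflow_template, file.path(tempdir(), "workflows")
)
```

Because every state-level file comes from the same template, a change made once in the generator propagates to all states on the next run, and hand edits to the generated files would simply be overwritten.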

------------------------------------------------------------------------
Developed in collaboration with the University of Arizona [CCT Data Science](https://datascience.cct.arizona.edu/) team.

Owner

Name: Eric R. Scott
Login: Aariq
Kind: user
Company: University of Arizona, @cct-datascience

Website: www.ericrscott.com
Twitter: leafyericscott
Repositories: 125
Profile: https://github.com/Aariq

Scientific Programmer & Educator at University of Arizona

GitHub Events

Total

Delete event: 2
Push event: 16
Pull request event: 2
Create event: 1

Last Year

Delete event: 2
Push event: 16
Pull request event: 2
Create event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 0
Total pull requests: 2
Average time to close issues: N/A
Average time to close pull requests: 3 months
Total issue authors: 0
Total pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 2
Average time to close issues: N/A
Average time to close pull requests: 3 months
Issue authors: 0
Pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0
