https://github.com/aariq/foresttime-builder

Scripts to generate a forestTIME database. Ancestral code is in forestTIME and automatic-trees, which has a bloated git history.

Repository

Basic Info
  • Host: GitHub
  • Owner: Aariq
  • License: gpl-3.0
  • Language: R
  • Default Branch: pre_carbon
  • Size: 12.8 MB
Fork of mekevans/forestTIME-builder
Created over 1 year ago · Last pushed over 1 year ago

https://github.com/Aariq/forestTIME-builder/blob/pre_carbon/

# forestTIME-builder
Scripts to generate a forestTIME database. Ancestral code is in forestTIME and automatic-trees, which has a bloated git history. 

The code in this repo downloads raw FIA data from DataMart, processes it to create forestTIME tables, and stores these tables as (currently) .parquet files. This part of the workflow can run in parallel, with one job per state. The state-level tables are then stacked to create one database with forestTIME tables for the whole country. This database can then be uploaded and shared, e.g. via Zenodo or Box.
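
As a rough illustration of the stacking step, here is a minimal sketch using the `DBI` and `duckdb` packages. It is not the code in `R` or `scripts`: the function names, the table name `tree`, and the `<table>_<STATE>.parquet` file naming are all assumptions made for this example.

```r
# Sketch only: assumes each state job wrote files named like
# <parquet_dir>/<table>_<STATE>.parquet (hypothetical naming).

# Build the SQL that stacks every state-level parquet file for one table.
stack_table_sql <- function(table, parquet_dir) {
  glob <- file.path(parquet_dir, paste0(table, "_*.parquet"))
  sprintf(
    "CREATE OR REPLACE TABLE %s AS SELECT * FROM read_parquet('%s', union_by_name = true)",
    table, glob
  )
}

# Run the stacking SQL for each table against one DuckDB file.
# Requires the DBI and duckdb packages to be installed.
stack_states <- function(tables, parquet_dir, db_path) {
  con <- DBI::dbConnect(duckdb::duckdb(), dbdir = db_path)
  on.exit(DBI::dbDisconnect(con, shutdown = TRUE), add = TRUE)
  for (tbl in tables) {
    DBI::dbExecute(con, stack_table_sql(tbl, parquet_dir))
  }
  invisible(db_path)
}

# Example call (hypothetical table name and paths):
# stack_states("tree", "data/parquet", "data/foresttime.duckdb")
```

Letting DuckDB read the files through a glob avoids loading every state into R memory at once, which may matter for a country-wide database.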

I anticipate that *most* users of forestTIME will not run this code. Instead, it will run automatically via GitHub Actions and push the finished database to a location accessible to users, who can then download and query it. Functions to query an already-generated database can be found in https://github.com/mekevans/forestTIME/. However, anyone who wants to can download this repo and run the scripts to create a database locally.

## The `pre_carbon` branch

This branch, the default branch, contains the stable version of forestTIME prior to the addition of NSVB carbon estimation. 
For the (WIP) branch with Renata's work towards annualized carbon estimates, see the `add-annual-carbon-good` branch [here](https://github.com/diazrenata/forestTIME-builder/tree/add-annual-carbon-good).

## Organization

- `R` contains functions to download data, create tables, and add them to the database. These functions hardly ever change.
- `scripts` contains a workflow that runs the functions in `R` to generate a database and push it to Zenodo. These workflows have changed a lot recently to navigate trade-offs between local and automated runs, all-at-once and state-by-state processing, etc. To generate a forestTIME .duckdb, run the scripts in `scripts` in order, following the instructions in their comments.
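
For a local run, the "in order" step could look something like the sketch below. It assumes the top-level scripts carry zero-padded numeric prefixes (e.g. `01-...R`, `02-...R`), which is a guess based on the `01-state-by-state` directory name; `run_scripts_in_order()` is a hypothetical helper, not a function from this repo. Read the comments in each script before relying on anything like this.

```r
# Sketch only: source every numerically prefixed .R file in a directory,
# in alphabetical (and therefore numeric-prefix) order.
run_scripts_in_order <- function(dir = "scripts", dry_run = FALSE) {
  # Only top-level files whose names start with digits are picked up.
  files <- sort(list.files(dir, pattern = "^[0-9]+.*\\.R$", full.names = TRUE))
  for (f in files) {
    message("Running ", f)
    # Each script gets its own environment so objects do not leak between steps.
    if (!dry_run) source(f, local = new.env())
  }
  invisible(files)
}

# Preview which scripts would run, without executing them:
# run_scripts_in_order("scripts", dry_run = TRUE)
```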

## Automation and Zenodo push

- These scripts run automatically via GitHub Actions, currently on a push to this branch. This can be updated to a scheduled job.
- One workflow runs for each state, generating state-level database tables which are stored as .parquet files. The .parquet files are stored as GitHub artifacts. A final workflow then stacks all of the state-level tables into one database, which is uploaded to a Zenodo archive. The archive is currently private, located at: https://zenodo.org/records/13377070. It can be made public when we are ready.
- To modify the workflow scripts, *don't* modify the files in `scripts/01-state-by-state`. Instead, modify the text in `scripts/create-workflow-yml.R` and then run that script to automatically regenerate the state-level scripts.
- To set up a push to Zenodo from GitHub Actions, generate a Zenodo token in your Zenodo account and store it as an Actions secret in the GitHub repository so that the workflow can read it as an environment variable.
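
The idea of generating the state-level files from a single template can be sketched as follows. This is not the content of `scripts/create-workflow-yml.R`: the `{{STATE}}` placeholder, the template lines, the `state-<STATE>.R` file names, and `write_state_scripts()` are all invented for illustration.

```r
# Hypothetical template; "{{STATE}}" is a placeholder invented for this sketch.
template <- c(
  "# Generated file: edit the template, not this script.",
  "state_to_use <- \"{{STATE}}\"",
  "message(\"Building forestTIME tables for \", state_to_use)"
)

# Write one script per state by substituting the placeholder in the template.
# Returns the paths of the files it wrote.
write_state_scripts <- function(states, out_dir, template) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  vapply(states, function(st) {
    path <- file.path(out_dir, paste0("state-", st, ".R"))
    writeLines(gsub("{{STATE}}", st, template, fixed = TRUE), path)
    path
  }, character(1), USE.NAMES = FALSE)
}

# Example call (writes to a temporary directory rather than the repo):
# write_state_scripts(c("AZ", "CT"), file.path(tempdir(), "state-by-state"), template)
```

Keeping a single template as the source of truth means a change is made once and propagated to every state, which is why hand-editing the generated files is discouraged.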


------------------------------------------------------------------------
Developed in collaboration with the University of Arizona [CCT Data Science](https://datascience.cct.arizona.edu/) team

Owner

  • Name: Eric R. Scott
  • Login: Aariq
  • Kind: user
  • Company: University of Arizona, @cct-datascience

Scientific Programmer & Educator at University of Arizona
