neotoma_lakes
A repository for managing the matching of lake data between national hydrographic databases and Neotoma records.
Science Score: 41.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 4 committers (25.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.6%) to scientific vocabulary
Keywords
Repository
A repository for managing the matching of lake data between national hydrographic databases and Neotoma records.
Basic Info
- Host: GitHub
- Owner: NeotomaDB
- License: MIT
- Language: HTML
- Default Branch: master
- Size: 34.8 MB
Statistics
- Stars: 1
- Watchers: 6
- Forks: 1
- Open Issues: 5
- Releases: 1
Topics
Metadata Files
README.md
Revisiting Neotoma Lakes
A number of data records within Neotoma have been obtained from publications and legacy records from COHMAP and other sources. Older records were often transformed from degree-minute-second (DMS) coordinates to decimal degrees. This transformation introduces two issues (illustrated in the sketch after this list):
- The record's spatial location appears more accurate/precise than the original measurement supports.
- The location of the sample site is not centered within the depositional basin from which the sample was obtained.
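The conversion itself is simple arithmetic; the problem is that a coordinate recorded only to the nearest minute carries roughly 1-2 km of uncertainty that the decimal value no longer shows. A minimal sketch of the conversion (the helper function and coordinates are illustrative, not taken from the repository):

```r
# Convert a degree-minute-second coordinate to decimal degrees.
# A value reported only to the nearest minute is uncertain by ~1.8 km in
# latitude, but the converted number looks spuriously precise.
dms_to_dd <- function(deg, min = 0, sec = 0) {
  sign(deg) * (abs(deg) + min / 60 + sec / 3600)
}

dms_to_dd(45, 17)       # 45.28333...
dms_to_dd(-89, 41, 30)  # -89.69167
```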
In particular, pollen-based reconstruction methods (for climate or vegetation) often require knowledge of the size of the depositional basin. For example, REVEALS (Sugita, 2007) requires knowledge of lake size when reconstructing vegetation.
Work underway for Northern Hemisphere reconstructions requires knowledge of lake sizes to accurately estimate vegetation for the band from 40°N to 80°N, covering most of North America. This repository contains code and summary output for a hybrid process that combines numerical analysis and GIS with iterative hand editing by individuals to align lacustrine and palustrine pollen datasets within North America with their depositional basins.
Contributions
- Simon Goring (code and repository management)
- Bailey Zak
- Claire Rubbelke - University of Wisconsin
- Andria Dawson - Mount Royal University
- Mathias Trachel - University of Wisconsin
Project Components
Rendered Document
The rendered document can be viewed as an HTML file from the Neotoma Open Computing pages.
The summary document, rendered from an Rmd file to HTML using rmarkdown::render(), provides an overview of the code and its overall operation. The hope is to develop a process that works interactively both with the Neotoma Paleoecology Database directly and with a web interface, providing the opportunity to dynamically examine and update lake locations.
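Rendering is a single call; the file name below is an assumption about the summary Rmd, so substitute the actual document in the repository:

```r
# Render the summary Rmd to HTML (file name assumed for illustration).
rmarkdown::render("neotoma_lakes.Rmd", output_format = "html_document")
```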
Currently there are two parts to the re-analysis:
- Geographic co-location with existing hydrological databases: This code is run in two steps using the files R/GetLakeAreas_usa.R and R/GetLakeAreas_canada.R (see the sketch after this list). This analysis requires downloading a large volume of data and is not recommended to be run on an individual user's computer.
- Manual adjustment and measurement of lakes: This work uses a GIS application such as ArcMap or QGIS to locate individual sites, adjust locations if necessary, and subsequently measure the basin in which each site is located.
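A minimal sketch of invoking the first step, under the assumption that each script is self-contained; the README notes they run from the R directory, and the data-volume warning above applies:

```r
# Step 1: geographic co-location. Each script downloads national
# hydrography archives, so expect long runtimes and large temporary files.
setwd("R")
source("GetLakeAreas_usa.R")
source("GetLakeAreas_canada.R")
```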
Geographic Co-Location
For this work two files (R/GetLakeAreas_usa.R and R/GetLakeAreas_canada.R) are used. These files run from the R directory and download ZIP files of Canadian and US hydrological data from their respective servers. These (often large) shapefiles are placed in temporary files using R's tempfile() function. Intermediate data are saved to data/runs.RDS so that partial runs can be managed. Output is saved to data/usa_lakes.csv with the following format (a sketch of the download-and-cache pattern follows the table):
| Parameter | Variable Type |
| --- | --- |
| site.id | int |
| DatasetID | int |
| site.name | char |
| long | num |
| lat | num |
| state | char |
| GNISNAME | char |
| GNISID | int |
| AREASQKM | num |
| distmatch | num |
| datasource | char |
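A rough sketch of that pattern, assuming base R only; the URL, archive name, and object names are placeholders rather than the actual sources used by the scripts:

```r
# Download one hydrography archive to a temporary file and extract it.
zip_tmp <- tempfile(fileext = ".zip")
download.file("https://example.org/hydro_lakes.zip", zip_tmp, mode = "wb")  # placeholder URL
unzip(zip_tmp, exdir = tempdir())

# Keep a running log of completed downloads/matches so interrupted
# runs can be resumed from data/runs.RDS.
runs <- if (file.exists("data/runs.RDS")) readRDS("data/runs.RDS") else list()
runs[["hydro_lakes"]] <- TRUE
saveRDS(runs, "data/runs.RDS")

# Matched sites would then be written to the CSV described above, e.g.:
# write.csv(matched_sites, "data/usa_lakes.csv", row.names = FALSE)
```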
Manual Adjustment
Manual adjustment involved working with CSV files and shapefiles in a workflow that evolved through time. Intermediate results were saved in a folder called data/by_hand. Within this folder there are a number of estimates for site locations and lake areas; in some cases these values overlap, with two individuals having edited the same site.
Importing these files happens through the function load_lakes(), which is contained in the file R/load_entered_data.R. The function recognizes either csv or shp files and processes them accordingly, returning a single output data.frame with the following columns (a simplified sketch follows the table):
| Variable | Data Type |
| --- | --- |
| stid | int |
| edited | int |
| notes | char |
| area | num |
| type | char |
| geometry | wkt (char) |
| source | char |
| dsid | int |
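A simplified sketch of what a loader like this might do, assuming the sf package for shapefile handling; the actual implementation in R/load_entered_data.R may differ:

```r
library(sf)
library(tools)

# Read a single hand-edited file (CSV or shapefile) into a common layout.
load_lakes_sketch <- function(path) {
  ext <- tolower(tools::file_ext(path))
  if (ext == "csv") {
    out <- read.csv(path, stringsAsFactors = FALSE)
  } else if (ext == "shp") {
    shp <- sf::st_read(path, quiet = TRUE)
    out <- as.data.frame(shp)
    # Store geometry as WKT text to match the column layout above.
    out$geometry <- sf::st_as_text(sf::st_geometry(shp))
  } else {
    stop("Unsupported file type: ", ext)
  }
  out$source <- basename(path)
  out
}
```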
These are combined with the aligned lakes from the earlier process to create a composite dataset, which is saved to file using a version number defined early in the Rmd file. The output is then two files (see the sketch after this list):
- area_lakes_....csv: All lakes with defined areas.
- dataset_....shp: Lakes that still require defined areas.
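A hedged sketch of that final write step; the version string, object names, and use of sf::st_write() are assumptions, and the elided portions of the file names correspond to that version number:

```r
library(sf)

version <- "0.1"  # illustrative; the actual value is set near the top of the Rmd

# Lakes with a defined area go to a CSV; unresolved lakes keep their
# geometry and go to a shapefile for further manual editing.
write.csv(lakes_with_area, paste0("area_lakes_", version, ".csv"), row.names = FALSE)
sf::st_write(lakes_needing_area, paste0("dataset_", version, ".shp"), delete_layer = TRUE)
```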
Owner
- Name: The Neotoma Paleoecology Database Collective
- Login: NeotomaDB
- Kind: organization
- Location: Global
- Website: https://neotomadb.org
- Repositories: 56
- Profile: https://github.com/NeotomaDB
Data and code supporting collaboration and outreach around the Neotoma Paleoecology Database
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Goring"
    given-names: "Simon"
    orcid: "https://orcid.org/0000-0002-2700-4605"
title: "Neotoma Lakes Assignment"
version: 0.1
doi: 10.5281/zenodo.10535435
date-released: 2024-01-19
url: "https://github.com/NeotomaDB/neotoma_lakes"
GitHub Events
Total
- Delete event: 1
- Push event: 1
- Pull request event: 4
- Create event: 2
Last Year
- Delete event: 1
- Push event: 1
- Pull request event: 4
- Create event: 2
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Simon | s****g@g****m | 53 |
| Erik Zepeda | E****9@g****m | 2 |
| Simon Goring | s****5@w****u | 2 |
| mtrachs | m****s@g****m | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 5
- Total pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: about 3 hours
- Total issue authors: 2
- Total pull request authors: 2
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- SimonGoring (4)
- andydawson (1)
Pull Request Authors
- ErikZepeda59 (2)
- SimonGoring (1)
- mtrachs (1)