https://github.com/crim-ca/stac-populator
Workflow logic to populate STAC catalog with demo datasets.
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json file)
- ✓ .zenodo.json file (found .zenodo.json file)
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: crim-ca
- License: mit
- Language: Python
- Default Branch: master
- Size: 522 KB
Statistics
- Stars: 2
- Watchers: 7
- Forks: 2
- Open Issues: 28
- Releases: 0
Metadata Files
README.md
STAC Catalog Populator
This repository contains the STACpopulator framework, which can be used to implement concrete populators (see implementations) that build a STAC Catalog, Collections and Items from various dataset/catalog sources and push them to a server node through the STAC API.
It can also be used to export data from an existing STAC API or catalog to files on disk. These can then later
be used to populate a STAC API with the DirectoryLoader implementation.
Framework
The framework is centered around a Python Abstract Base Class: STACpopulatorBase that implements all the logic
for populating a STAC catalog. This class provides abstract methods that should be overridden by implementations that
contain all the logic for constructing the STAC representation for an item in the collection that is to be processed.
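The pattern described above can be sketched with Python's `abc` module. This is a minimal, self-contained illustration and not the actual STACpopulatorBase code: the class names `PopulatorSketch` and `DemoPopulator`, the `create_stac_item` and `ingest` method names, and the shape of the input data are all assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Iterable


class PopulatorSketch(ABC):
    """Hypothetical stand-in for STACpopulatorBase (not the real class)."""

    def __init__(self, source: Iterable[tuple[str, dict[str, Any]]]) -> None:
        # Assumed input shape: pairs of (item name, raw metadata).
        self.source = source

    @abstractmethod
    def create_stac_item(self, item_name: str, item_data: dict[str, Any]) -> dict[str, Any]:
        """Build the STAC representation of a single item."""

    def ingest(self) -> list[dict[str, Any]]:
        # Shared logic stays in the base class: walk the source and
        # delegate item construction to the concrete implementation.
        return [self.create_stac_item(name, data) for name, data in self.source]


class DemoPopulator(PopulatorSketch):
    """Concrete implementation that only knows how to build one item."""

    def create_stac_item(self, item_name: str, item_data: dict[str, Any]) -> dict[str, Any]:
        return {
            "type": "Feature",
            "stac_version": "1.0.0",
            "id": item_name,
            "geometry": None,
            "properties": {"datetime": item_data["datetime"]},
            "links": [],
            "assets": {},
        }


items = DemoPopulator([("demo-item", {"datetime": "2020-01-01T00:00:00Z"})]).ingest()
print(items[0]["id"])
```

Because `create_stac_item` is abstract, instantiating `PopulatorSketch` directly raises a `TypeError`, which is how the base class forces each implementation to supply its own item-construction logic.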
Implementations
Provided implementations of STACpopulatorBase:
| Implementation | Description |
|----------------|-------------|
| CMIP6_UofT | Crawls a THREDDS Catalog for CMIP6 NCML-annotated NetCDF references to publish corresponding STAC Collection and Items. |
| DirectoryLoader | Crawls a subdirectory hierarchy of pre-generated STAC Collections and Items to publish to a STAC API endpoint. |
| CORDEX-CMIP6_Ouranos | Crawls a THREDDS Catalog for CORDEX-CMIP6 NetCDF references to publish corresponding STAC Collection and Items. |
Installation and Execution
Either with Python directly (in an environment of your choosing):
```shell
pip install .
# OR
make install
```
With development packages:
```shell
pip install .[dev]
# OR
make install-dev
```
You should then be able to call the STAC populator CLI with the following commands:
```shell
# obtain the installed version of the STAC populator
stac-populator --version

# obtain general help about available commands
stac-populator --help

# obtain general help about available STAC populator implementations
stac-populator run --help

# obtain help specifically for the execution of a STAC populator implementation
stac-populator run [implementation] --help

# obtain general help about exporting STAC catalogs to a directory on disk
stac-populator export --help
```
CMIP6 extension: extra requirements
The CMIP6 stac-populator extension requires that the pyessv-archive data
files be installed. To install this package to the default location in your home directory at ~/.esdoc/pyessv-archive:
```shell
git clone https://github.com/ES-DOC/pyessv-archive ~/.esdoc/pyessv-archive
# OR
make setup-pyessv-archive
```
You can also choose to install them to a location on disk other than the default:
```shell
git clone https://github.com/ES-DOC/pyessv-archive /some/other/place
# OR
PYESSV_ARCHIVE_HOME=/some/other/place make setup-pyessv-archive
```
Note:
If you have installed the pyessv-archive data files to a non-default
location, you need to specify that location with the PYESSV_ARCHIVE_HOME environment variable. For example,
if you've installed the pyessv-archive files to /some/other/place then run the following before executing
any of the example commands above:
```shell
export PYESSV_ARCHIVE_HOME=/some/other/place
```
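To illustrate the lookup described in this note, the sketch below resolves the archive location from the PYESSV_ARCHIVE_HOME environment variable and falls back to the default path in the home directory. It mirrors the behaviour described here only; the helper name `resolve_pyessv_archive` is hypothetical and this is not code taken from pyessv or from this repository.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default install location named in the instructions above.
DEFAULT_ARCHIVE = Path.home() / ".esdoc" / "pyessv-archive"


def resolve_pyessv_archive(environ: Optional[Mapping[str, str]] = None) -> Path:
    """Return the directory expected to hold the pyessv-archive data files.

    Hypothetical helper for illustration: prefers PYESSV_ARCHIVE_HOME when it
    is set and non-empty, otherwise uses the default location.
    """
    env = os.environ if environ is None else environ
    override = env.get("PYESSV_ARCHIVE_HOME")
    return Path(override).expanduser() if override else DEFAULT_ARCHIVE


archive = resolve_pyessv_archive()
print(f"pyessv-archive expected at: {archive} (exists: {archive.is_dir()})")
```

A check like this can be handy before running the CMIP6 implementation, since it shows which directory would be used and whether it actually exists.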
Docker
You can also use the pre-built Docker image, which can be called as follows,
where [command] corresponds to any of the above example operations.
```shell
docker run -ti ghcr.io/crim-ca/stac-populator:0.9.0 [command]
```
Note:
If files need to be provided as input or obtained as output when running a command with Docker, you will need to either
mount files individually or mount a workspace directory using -v {local-path}:{docker-path}, so that they are
accessible to the command inside the Docker container.
Testing
The provided docker-compose configuration file can be used to launch a test STAC server.
Consider using make docker-start to start this server, and make docker-stop to stop it.
Alternatively, you can also use your own STAC server accessible from any remote location.
To run the STAC populator, follow the steps from Installation and Execution.
For further validation, you can also run the test suite with coverage analysis.
```shell
make test-cov
```
Contributing
We welcome any contributions to this codebase. To submit suggested changes, please do the following:
- create a new feature branch off of master
- update the code, write/update tests, write/update documentation
- submit a pull request targeting the master branch
Coding Style
This codebase uses the ruff formatter and linter to enforce style policies.
To check that your changes conform to these policies please run:
```sh
ruff format
ruff check
```
You can also set up pre-commit hooks that will run these checks before you create any commit in this repo:
```sh
pre-commit install
```
Writing tests
Unit tests use the pytest-recording package to cache network responses. This allows the tests to be run offline and allows them to reliably pass regardless of whether a remote resource is available or not.
Whenever you're writing tests that make a request to an external resource, please use the @pytest.mark.vcr
decorator and record a new cassette (response cache) which can be committed to version control with the new
tests.
Owner
- Name: crim-ca
- Login: crim-ca
- Kind: organization
- Repositories: 79
- Profile: https://github.com/crim-ca
GitHub Events
Total
- Issues event: 11
- Delete event: 24
- Issue comment event: 33
- Push event: 67
- Pull request event: 57
- Pull request review comment event: 145
- Pull request review event: 155
- Create event: 28
Last Year
- Issues event: 11
- Delete event: 24
- Issue comment event: 33
- Push event: 67
- Pull request event: 57
- Pull request review comment event: 145
- Pull request review event: 155
- Create event: 28
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 8
- Total pull requests: 32
- Average time to close issues: 5 days
- Average time to close pull requests: about 1 month
- Total issue authors: 2
- Total pull request authors: 4
- Average comments per issue: 0.0
- Average comments per pull request: 0.78
- Merged pull requests: 18
- Bot issues: 0
- Bot pull requests: 4
Past Year
- Issues: 8
- Pull requests: 32
- Average time to close issues: 5 days
- Average time to close pull requests: about 1 month
- Issue authors: 2
- Pull request authors: 4
- Average comments per issue: 0.0
- Average comments per pull request: 0.78
- Merged pull requests: 18
- Bot issues: 0
- Bot pull requests: 4
Top Authors
Issue Authors
- fmigneault (8)
- dchandan (2)
- mishaschwartz (2)
- huard (1)
Pull Request Authors
- mishaschwartz (17)
- fmigneault (12)
- huard (7)
- dchandan (7)
- dependabot[bot] (4)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- actions/checkout v2 composite
- docker/build-push-action v3 composite
- docker/login-action f054a8b539a109f9f41c372932f1ae047eff08c9 composite
- boto3 *
- fsspec *
- pystac *
- pyyaml *
- requests *
- shapely *
- siphon *
- stac-generator *
- xarray *
- python 3.11-slim build
- colorlog *
- pystac *
- pyyaml *
- siphon *