woudc-data-registry
WOUDC Data Registry is a platform that manages Ozone and Ultraviolet Radiation data in support of the World Ozone and Ultraviolet Radiation Data Centre (WOUDC), one of six World Data Centres as part of the Global Atmosphere Watch programme of the WMO.
Repository
Basic Info
- Host: GitHub
- Owner: woudc
- License: other
- Language: Python
- Default Branch: master
- Homepage: https://woudc.org
- Size: 1 MB
Statistics
- Stars: 4
- Watchers: 2
- Forks: 10
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
WOUDC Data Registry
Overview
WOUDC Data Registry is a platform that manages ozone and ultraviolet radiation data in support of the World Ozone and Ultraviolet Radiation Data Centre (WOUDC), one of six World Data Centres as part of the Global Atmosphere Watch programme of the WMO.
Installation
Requirements
- Python 3 and above
- virtualenv
- Elasticsearch (5.5.0 and above)
- woudc-extcsv
Dependencies
Dependencies are listed in requirements.txt and are installed automatically during installation.
Installing woudc-data-registry
```bash
# setup virtualenv
python3 -m venv woudc-data-registryenv
cd woudc-data-registryenv
source bin/activate

# clone woudc-extcsv and install
git clone https://github.com/woudc/woudc-extcsv.git
cd woudc-extcsv
pip install -r requirements.txt
pip install .
cd ..

# clone codebase and install
git clone https://github.com/woudc/woudc-data-registry.git
cd woudc-data-registry
pip install .

# optional: for PostgreSQL backends
pip install -r requirements-pg.txt

# set system environment variables
cp default.env foo.env
vi foo.env  # edit database connection parameters, etc.
. foo.env
```
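It is easy to edit foo.env and then forget to source it in the current shell. As a minimal sketch, assuming the env file consists of shell-style `export NAME=value` or `NAME=value` lines (the actual layout of default.env is not shown here), the following Python helper lists variables that the file assigns but that are absent from the environment. The function names and the `EXAMPLE_*` variable names are hypothetical and used only for illustration.

```python
import os
import re

# Matches shell-style assignments such as `export NAME=value` or `NAME=value`.
ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)=")


def env_names(text):
    """Return the variable names assigned in shell-style env file text."""
    names = []
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        match = ASSIGNMENT.match(line)
        if match:
            names.append(match.group(1))
    return names


def missing_from_environment(text, environ=None):
    """List names assigned in the env file that are absent from the environment."""
    environ = os.environ if environ is None else environ
    return [name for name in env_names(text) if name not in environ]


# Hypothetical file content; the real default.env may use different names.
sample = (
    "# database settings\n"
    "export EXAMPLE_DB_HOST=localhost\n"
    "EXAMPLE_DB_PORT=5432\n"
)
print(missing_from_environment(sample, environ={"EXAMPLE_DB_HOST": "localhost"}))
```

In practice the text would come from reading foo.env, and an empty result suggests that the file has been sourced.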
Initializing the Database
```bash
# NOTE: -v/--verbosity option applies to all CLI commands

# create database
make ENV=foo.env createdb

# drop database
make ENV=foo.env dropdb

# show configuration
woudc-data-registry admin config

# show configuration and set output verbosity
woudc-data-registry admin config --verbosity DEBUG

# initialize model (database tables)
woudc-data-registry admin registry setup

# initialize search engine
woudc-data-registry admin search setup

# load core metadata
woudc-data-registry admin init -d data/

# cleanups

# re-initialize model (database tables)
woudc-data-registry admin registry teardown
woudc-data-registry admin registry setup

# re-initialize search engine
woudc-data-registry admin search teardown
woudc-data-registry admin search setup

# if required, reinitialize the StationDobsonCorrections table and index
woudc-data-registry admin setup-dobson-correction -d data/
```
Running woudc-data-registry
TIP: autocompletion can be made available in some shells via:

```bash
eval "$(_WOUDC_DATA_REGISTRY_COMPLETE=source woudc-data-registry)"
```
Core Metadata Management
```bash
# list all instances of foo (where foo is one of:
#  project|dataset|contributor|country|station|instrument|deployment)
woudc-data-registry <foo> list
# e.g.
woudc-data-registry contributor list

# show a specific instance of foo with a given registry identifier
woudc-data-registry <foo> show <identifier>
# e.g.
woudc-data-registry station show 023
woudc-data-registry instrument show ECC:2Z:4052:002:OzoneSonde

# add a new instance of foo (contributor|country|station|instrument|deployment)
woudc-data-registry <foo> add <options>
# e.g.
woudc-data-registry deployment add -s 001 -c MSC:WOUDC
woudc-data-registry contributor add -id foo -n "Contributor name" -c Canada -w IV -u https://example.org -e you@example.org -f foouser -g -75,45

# update an existing instance of foo with a given registry identifier
woudc-data-registry <foo> update <options>
# e.g.
woudc-data-registry station update -n "New station name"
woudc-data-registry deployment update --end-date 'Deployment end date'

# delete an instance of foo with a given registry identifier
woudc-data-registry <foo> delete <identifier>
# e.g.
woudc-data-registry deployment delete 018:MSC:WOUDC

# for more information about options on operation (add|update):
woudc-data-registry <foo> <operation> --help
# e.g.
woudc-data-registry instrument update --help
```
Data Processing
```bash
# gather the files from the FTP account
woudc-data-registry data gather /path/to/dir

# ingest directory of files (walks directory recursively)
woudc-data-registry data ingest /path/to/dir

# ingest single file
woudc-data-registry data ingest foo.dat

# ingest without prompting for permission checks
woudc-data-registry data ingest foo.dat -y

# verify directory of files (walks directory recursively)
woudc-data-registry data verify /path/to/dir

# verify single file
woudc-data-registry data verify foo.dat

# verify core metadata only
woudc-data-registry data verify foo.dat -l

# ingest with only core metadata checks
woudc-data-registry data ingest /path/to/dir -l
```
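One possible way to combine the commands above is to verify a path first and ingest it only if verification succeeds. The Python sketch below builds the argument lists from the documented `data verify` and `data ingest` commands and their `-l` and `-y` flags. The helper names `build_command` and `verify_then_ingest` are hypothetical, and the sketch assumes that the CLI signals failure through a non-zero exit status, which this README does not confirm.

```python
import subprocess


def build_command(action, path, core_only=False, assume_yes=False):
    """Build a `woudc-data-registry data <action> <path>` argument list."""
    if action not in ("verify", "ingest"):
        raise ValueError(f"unsupported action: {action}")
    command = ["woudc-data-registry", "data", action, path]
    if core_only:
        command.append("-l")  # core metadata checks only
    if assume_yes and action == "ingest":
        command.append("-y")  # ingest without prompting for permission
    return command


def verify_then_ingest(path, run=subprocess.run):
    """Run verify, then ingest only if verify exits with status 0.

    Assumption: the CLI reports failure through a non-zero exit status.
    """
    if run(build_command("verify", path)).returncode != 0:
        return False
    return run(build_command("ingest", path, assume_yes=True)).returncode == 0


# Show the commands that would be executed, without running them.
print(build_command("verify", "/path/to/dir", core_only=True))
print(build_command("ingest", "/path/to/dir", assume_yes=True))
```

Passing a different `run` callable makes it possible to dry-run or test the workflow without the CLI installed.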
Dobson Section Corrections
```bash
# correct both AD and CD data from TotalOzone Dobson data
woudc-data-registry correction dobson-correction /path/to/dir --mode [test|ops]

# --code selects a specific code to correct
woudc-data-registry correction dobson-correction /path/to/dir --code [AD|CD] --mode [test|ops]

# --weeklyingest outputs the files in a specific folder structure, similar to incoming folders
woudc-data-registry correction dobson-correction /path/to/dir --mode [test|ops] --weeklyingest
```
Search Index Generation
```bash
# sync all data and metadata tables (except data product tables) to Elasticsearch
woudc-data-registry admin search sync

# sync the data product tables (uvindexhourly, totalozone, and ozonesonde) to Elasticsearch
woudc-data-registry admin search product-sync
```
UV Index Generation
```bash
# teardown and generate entire uvindexhourly table
woudc-data-registry product uv-index generate /path/to/archive/root

# only generate uvindexhourly records within year range
woudc-data-registry product uv-index update -sy start-year -ey end-year /path/to/archive/root
```
Total Ozone Generation
```bash
# teardown and generate entire totalozone table
woudc-data-registry product totalozone generate /path/to/archive/root
```
OzoneSonde Generation
```bash
# teardown and generate entire ozonesonde table
woudc-data-registry product ozonesonde generate /path/to/archive/root
```
Report Generation
The woudc-data-registry data ingest command accepts a -r/--report option, whose value is the path to a directory.
When that option is provided, an operator report and a run report are automatically written to that directory
while the files are being processed.
woudc-data-registry data ingest /path/to/dir -r /path/to/reports/location
The run report is named run_report. The file contains a series of blocks,
one per contributor in a processing run, in the following format:
<contributor acronym>
<status>: <filepath>
<status>: <filepath>
<status>: <filepath>
...
Where <status> is either Pass or Fail, depending on how the file reported in that line fared in processing.
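The block format above is simple enough to parse in a few lines. The following Python sketch groups the `<status>: <filepath>` lines by contributor acronym. It assumes that any non-empty line that does not start with `Pass:` or `Fail:` is a contributor acronym, and the function name `parse_run_report`, the acronyms, and the file paths in the sample are hypothetical.

```python
import re

# A status line looks like "Pass: /some/path" or "Fail: /some/path".
STATUS_LINE = re.compile(r"^(Pass|Fail):\s*(.+)$")


def parse_run_report(text):
    """Group (status, filepath) pairs by contributor acronym."""
    report = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines between blocks
        match = STATUS_LINE.match(line)
        if match:
            if current is None:
                raise ValueError("status line found before any contributor")
            report[current].append((match.group(1), match.group(2)))
        else:
            # Assumption: any other non-empty line starts a contributor block.
            current = line
            report.setdefault(current, [])
    return report


# Hypothetical report content for illustration only.
sample = """MSC
Pass: /path/to/dir/file1.csv
Fail: /path/to/dir/file2.csv

ABC
Pass: /path/to/dir/file3.csv
"""

report = parse_run_report(sample)
failed = [path for entries in report.values()
          for status, path in entries if status == "Fail"]
print(failed)
```

Collecting the failed paths this way gives a quick list of files to re-check after a processing run.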
The operator report is a more in-depth error log in CSV format, with a filename like operator-report-<date>.csv.
Operator reports contain one line per error or warning that happened during the processing run. The operator report
is meant to be a human-readable log which makes specific errors easy to find and diagnose.
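Because the operator report is CSV, it can also be summarized programmatically. The sketch below counts rows per value of a chosen column using Python's standard `csv` module. The column layout of the operator report is not described here, so the sketch assumes the file starts with a header row, and the column names `Type`, `Message`, and `Filename` in the sample are hypothetical placeholders, as is the function name `count_by_column`.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_text, column):
    """Count rows in CSV text grouped by the value of one column.

    Assumption: the first row of the CSV is a header row.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column not found: {column}")
    return Counter(row[column] for row in reader)


# Hypothetical rows; real operator reports may use other columns.
sample_csv = (
    "Type,Message,Filename\n"
    "Error,Missing table,a.csv\n"
    "Warning,Extra whitespace,a.csv\n"
    "Error,Bad date,b.csv\n"
)

print(count_by_column(sample_csv, "Type"))
print(count_by_column(sample_csv, "Filename"))
```

With a real report, the text would come from reading the operator-report-<date>.csv file in the reports directory, and the column name would be taken from its actual header.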
Sending Emails to Contributors
To generate emails for contributors:
```bash
woudc-data-registry data generate-emails /path/to/dir
```
Publishing Notifications to MQTT Server
```bash
woudc-data-registry publish publish-notification --hours number_of_hours
```
Delete Record
```bash
woudc-data-registry data delete-record /path/to/bad/file/
```
If a bad file was previously ingested, it can be removed using this command. This removes the file from the registry and the WAF.
Development
```bash
# install dev requirements
pip install -r requirements-dev.txt
```
Building the Documentation
```bash
# build local copy of https://woudc.github.io/woudc-data-registry
cd docs
make html
python3 -m http.server  # view on http://localhost:8000/
```
Running Tests
```bash
# run tests like this:
cd woudc_data_registry/tests
python3 test_data_registry.py
python3 test_delete_record.py

# or this:
python3 setup.py test

# measure code coverage
coverage run --source=woudc_data_registry -m unittest woudc_data_registry.tests.test_data_registry
coverage report -m
```
Code Conventions
Bugs and Issues
All bugs, enhancements and issues are managed on GitHub.
Contact
Owner
- Name: World Ozone and Ultraviolet Radiation Data Centre
- Login: woudc
- Kind: organization
- Location: Canada
- Website: https://woudc.org
- Repositories: 10
- Profile: https://github.com/woudc
Collaborative software, issue tracker and wiki for WOUDC, one of six World Data Centres as part of the Global Atmosphere Watch programme of the WMO.
GitHub Events
Total
- Watch event: 1
- Issue comment event: 12
- Push event: 45
- Pull request review event: 15
- Pull request review comment event: 30
- Pull request event: 38
- Fork event: 1
- Create event: 1
Last Year
- Watch event: 1
- Issue comment event: 12
- Push event: 45
- Pull request review event: 15
- Pull request review comment event: 30
- Pull request event: 38
- Fork event: 1
- Create event: 1
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Alex Hurka | a****a@c****a | 115 |
| Tom Kralidis | t****s@h****m | 49 |
| Kevin Ngai | k****i | 28 |
| danielwaiforssell | 5****l | 22 |
| Victoria Rose Spada | 8****a | 22 |
| ahurka | 3****a | 18 |
| Simran Mattu | m****s@w****a | 18 |
| Kevin Ngai | n****k@w****a | 10 |
| Bob Du | b****u@c****a | 3 |
| Simran Mattu | 7****4 | 1 |
| Victoria Spada | v****a@e****a | 1 |
| Noor Al-Duhaidahawi | a****n@w****a | 1 |
| Kevin Ngai | n****k@w****a | 1 |
| BobMDu | n****e@g****m | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 139
- Average time to close issues: N/A
- Average time to close pull requests: 9 days
- Total issue authors: 0
- Total pull request authors: 8
- Average comments per issue: 0
- Average comments per pull request: 0.53
- Merged pull requests: 121
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 44
- Average time to close issues: N/A
- Average time to close pull requests: 4 days
- Issue authors: 0
- Pull request authors: 4
- Average comments per issue: 0
- Average comments per pull request: 0.52
- Merged pull requests: 33
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- simranmattu14 (27)
- victoriarspada (25)
- tomkralidis (22)
- danielwaiforssell (22)
- ahurka (22)
- kngai (9)
- BobMDu (6)
- nalduu (6)
Dependencies
- alembic * development
- coverage * development
- flake8 * development
- sphinx * development
- wheel * development
- sphinx *
- sphinx-click *
- psycopg2 *
- click *
- elasticsearch <8
- jsonschema <4.4.0
- pyyaml *
- requests *
- sqlalchemy *
- woudc-extcsv >=0.5.0
- actions/checkout v2 composite
- actions/setup-python v2 composite