woudc-data-registry

WOUDC Data Registry is a platform that manages Ozone and Ultraviolet Radiation data in support of the World Ozone and Ultraviolet Radiation Data Centre (WOUDC), one of six World Data Centres as part of the Global Atmosphere Watch programme of the WMO.

https://github.com/woudc/woudc-data-registry

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.1%) to scientific vocabulary

Keywords

gaw ozone ozonesonde spectral totalozone ultraviolet umkehr uv wmo
Last synced: 6 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: woudc
  • License: other
  • Language: Python
  • Default Branch: master
  • Homepage: https://woudc.org
  • Size: 1 MB
Statistics
  • Stars: 4
  • Watchers: 2
  • Forks: 10
  • Open Issues: 1
  • Releases: 0
Topics
gaw ozone ozonesonde spectral totalozone ultraviolet umkehr uv wmo
Created over 8 years ago · Last pushed 6 months ago
Metadata Files
Readme License Support

README.md

WOUDC Data Registry


Overview

WOUDC Data Registry is a platform that manages ozone and ultraviolet radiation data in support of the World Ozone and Ultraviolet Radiation Data Centre (WOUDC), one of six World Data Centres as part of the Global Atmosphere Watch programme of the WMO.

Installation

Requirements

Dependencies

Dependencies are listed in requirements.txt and are installed automatically during installation.

Installing woudc-data-registry

```bash
# setup virtualenv
python3 -m venv woudc-data-registry_env
cd woudc-data-registry_env
source bin/activate

# clone woudc-extcsv and install
git clone https://github.com/woudc/woudc-extcsv.git
cd woudc-extcsv
pip install -r requirements.txt
pip install .
cd ..

# clone codebase and install
git clone https://github.com/woudc/woudc-data-registry.git
cd woudc-data-registry
pip install .

# optional: for PostgreSQL backends
pip install -r requirements-pg.txt

# set system environment variables
cp default.env foo.env
vi foo.env  # edit database connection parameters, etc.
. foo.env
```

Initializing the Database

```bash
# NOTE: -v/--verbosity option applies to all CLI commands

# create database
make ENV=foo.env createdb

# drop database
make ENV=foo.env dropdb

# show configuration
woudc-data-registry admin config

# show configuration and set output verbosity
woudc-data-registry admin config --verbosity DEBUG

# initialize model (database tables)
woudc-data-registry admin registry setup

# initialize search engine
woudc-data-registry admin search setup

# load core metadata
woudc-data-registry admin init -d data/

# cleanups

# re-initialize model (database tables)
woudc-data-registry admin registry teardown
woudc-data-registry admin registry setup

# re-initialize search engine
woudc-data-registry admin search teardown
woudc-data-registry admin search setup

# if required, reinitialize the StationDobsonCorrections table and index
woudc-data-registry admin setup-dobson-correction -d data/
```
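The initialization steps above can also be driven from a script. The following is a minimal sketch, not part of the project: `init_commands` and `run_all` are hypothetical helper names, and the only CLI invocations used are the ones listed in this section. It assumes `woudc-data-registry` is on the `PATH` and that the environment file has already been sourced.

```python
import subprocess


def init_commands(data_dir="data/"):
    """Return the initialization commands from this section, in order."""
    return [
        # initialize model (database tables)
        ["woudc-data-registry", "admin", "registry", "setup"],
        # initialize search engine
        ["woudc-data-registry", "admin", "search", "setup"],
        # load core metadata from the given directory
        ["woudc-data-registry", "admin", "init", "-d", data_dir],
    ]


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        # check=True raises CalledProcessError on a non-zero exit status
        subprocess.run(cmd, check=True)


# Usage (requires an installed and configured woudc-data-registry):
#     run_all(init_commands("data/"))
```

Passing each command as a list avoids shell quoting issues, and `check=True` keeps a failed step from being silently followed by the next one.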

Running woudc-data-registry

TIP: autocompletion can be made available in some shells via:

eval "$(_WOUDC_DATA_REGISTRY_COMPLETE=source woudc-data-registry)"

Core Metadata Management

```bash
# list all instances of foo (where foo is one of:
# project|dataset|contributor|country|station|instrument|deployment)
woudc-data-registry <foo> list
# e.g.
woudc-data-registry contributor list

# show a specific instance of foo with a given registry identifier
woudc-data-registry <foo> show <identifier>
# e.g.
woudc-data-registry station show 023
woudc-data-registry instrument show ECC:2Z:4052:002:OzoneSonde

# add a new instance of foo (contributor|country|station|instrument|deployment)
woudc-data-registry <foo> add <options>
# e.g.
woudc-data-registry deployment add -s 001 -c MSC:WOUDC
woudc-data-registry contributor add -id foo -n "Contributor name" -c Canada -w IV -u https://example.org -e you@example.org -f foouser -g -75,45

# update an existing instance of foo with a given registry identifier
woudc-data-registry <foo> update -id <identifier> <options>
# e.g.
woudc-data-registry station update -n "New station name"
woudc-data-registry deployment update --end-date 'Deployment end date'

# delete an instance of foo with a given registry identifier
woudc-data-registry <foo> delete <identifier>
# e.g.
woudc-data-registry deployment delete 018:MSC:WOUDC

# for more information about options on operation (add|update):
woudc-data-registry <foo> <operation> --help
# e.g.
woudc-data-registry instrument update --help
```

Data Processing

```bash
# gather the files from the FTP account
woudc-data-registry data gather /path/to/dir

# ingest directory of files (walks directory recursively)
woudc-data-registry data ingest /path/to/dir

# ingest single file
woudc-data-registry data ingest foo.dat

# ingest without prompting for permission checks
woudc-data-registry data ingest foo.dat -y

# verify directory of files (walks directory recursively)
woudc-data-registry data verify /path/to/dir

# verify single file
woudc-data-registry data verify foo.dat

# verify core metadata only
woudc-data-registry data verify foo.dat -l

# ingest with only core metadata checks
woudc-data-registry data ingest /path/to/dir -l
```
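To preview which files a recursive `ingest` or `verify` run over a directory might touch, the directory can be walked beforehand. This is only an illustrative sketch using the standard library: `list_candidate_files` is a hypothetical helper, and the registry's own traversal order and file filtering may differ.

```python
import os


def list_candidate_files(root):
    """Return every file path under root, walking subdirectories recursively."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    # Sort so the listing is stable across runs and platforms
    return sorted(paths)


# Usage:
#     for path in list_candidate_files("/path/to/dir"):
#         print(path)
```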

Dobson Section Corrections

```bash
# correct both AD and CD data from TotalOzone Dobson data
woudc-data-registry correction dobson-correction /path/to/dir --mode [test|ops]

# --code restricts the correction to a specific code
woudc-data-registry correction dobson-correction /path/to/dir --code [AD|CD] --mode [test|ops]

# --weeklyingest outputs the files in a specific folder structure, similar to incoming folders
woudc-data-registry correction dobson-correction /path/to/dir --mode [test|ops] --weeklyingest
```
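When the correction is launched from a script, it can help to validate the option values before calling the CLI. The sketch below only assembles the argument list. `dobson_correction_command` is a hypothetical helper, and the accepted values (`test`/`ops`, `AD`/`CD`) are taken from the usage lines above rather than from the CLI source.

```python
def dobson_correction_command(path, mode, code=None, weekly_ingest=False):
    """Build the argument list for a dobson-correction run."""
    if mode not in ("test", "ops"):
        raise ValueError("mode must be 'test' or 'ops'")
    if code is not None and code not in ("AD", "CD"):
        raise ValueError("code must be 'AD' or 'CD'")

    cmd = ["woudc-data-registry", "correction", "dobson-correction", path]
    if code is not None:
        cmd += ["--code", code]
    cmd += ["--mode", mode]
    if weekly_ingest:
        cmd.append("--weeklyingest")
    return cmd


# Usage (requires an installed woudc-data-registry):
#     import subprocess
#     subprocess.run(dobson_correction_command("/path/to/dir", "test"), check=True)
```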

Search Index Generation

```bash
# sync all data and metadata tables (except data product tables) to Elasticsearch
woudc-data-registry admin search sync

# sync the data product tables (uvindexhourly, totalozone, and ozonesonde) to Elasticsearch
woudc-data-registry admin search product-sync
```

UV Index Generation

```bash
# teardown and generate entire uvindexhourly table
woudc-data-registry product uv-index generate /path/to/archive/root

# only generate uvindexhourly records within year range
woudc-data-registry product uv-index update -sy start-year -ey end-year /path/to/archive/root
```

Total Ozone Generation

```bash
# teardown and generate entire totalozone table
woudc-data-registry product totalozone generate /path/to/archive/root
```

OzoneSonde Generation

```bash
# teardown and generate entire ozonesonde table
woudc-data-registry product ozonesonde generate /path/to/archive/root
```

Report Generation

The woudc-data-registry data ingest command accepts a -r/--report option, which takes the path to a directory. When that option is provided, an operator report and a run report are automatically written to that directory while the files are being processed.

woudc-data-registry data ingest /path/to/dir -r /path/to/reports/location

The run report is written to a file named run_report. It contains a series of blocks, one per contributor in the processing run, in the following format:

    <contributor acronym>
    <status>: <filepath>
    <status>: <filepath>
    <status>: <filepath>
    ...

Where <status> is either Pass or Fail, depending on how the file reported in that line fared in processing.
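A run report in this format is straightforward to post-process. The following is a minimal sketch, assuming the contributor acronym and each `<status>: <filepath>` entry sit on their own lines and that blank lines, if any, only separate blocks. `parse_run_report` is a hypothetical helper, not part of woudc-data-registry, and the acronyms and filenames in the usage comment are made up.

```python
def parse_run_report(text):
    """Parse run-report text into {contributor: [(status, filepath), ...]}."""
    results = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # tolerate blank lines between blocks
        status, sep, filepath = line.partition(": ")
        if sep and status in ("Pass", "Fail"):
            if current is None:
                raise ValueError("status line found before any contributor line")
            results[current].append((status, filepath))
        else:
            # Any other non-empty line is treated as a contributor acronym
            current = line
            results.setdefault(current, [])
    return results


# Usage:
#     with open("/path/to/reports/location/run_report") as f:
#         report = parse_run_report(f.read())
#     failed = [path for entries in report.values()
#               for status, path in entries if status == "Fail"]
```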

The operator report is a more in-depth error log in CSV format, with a filename like operator-report-<date>.csv. Operator reports contain one line per error or warning that happened during the processing run. The operator report is meant to be a human-readable log which makes specific errors easy to find and diagnose.
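Because the operator report is plain CSV, it can also be loaded with standard tooling. The sketch below finds operator reports in a report directory and reads one into a list of rows. The column layout is not described here, so the code assumes nothing about it beyond the first row being a header, which is itself an assumption. `find_operator_reports` and `read_rows` are hypothetical helper names.

```python
import csv
import glob
import os


def find_operator_reports(report_dir):
    """Return operator report paths in report_dir, sorted by filename."""
    pattern = os.path.join(report_dir, "operator-report-*.csv")
    return sorted(glob.glob(pattern))


def read_rows(path):
    """Read a CSV report into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# Usage:
#     for path in find_operator_reports("/path/to/reports/location"):
#         print(path, len(read_rows(path)), "rows")
```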

Sending Emails to Contributors

To generate emails for contributors:

woudc-data-registry data generate-emails /path/to/dir

Publishing Notifications to MQTT Server

woudc-data-registry publish publish-notification --hours number_of_hours

Delete Record

woudc-data-registry data delete-record /path/to/bad/file/

If a bad file was previously ingested, it can be removed using this command. This removes the file from the registry and the WAF.

Development

```bash
# install dev requirements
pip install -r requirements-dev.txt
```

Building the Documentation

```bash
# build local copy of https://woudc.github.io/woudc-data-registry
cd docs
make html
python3 -m http.server  # view on http://localhost:8000/
```

Running Tests

```bash
# run tests like this:
cd woudc_data_registry/tests
python3 test_data_registry.py
python3 test_delete_record.py

# or this:
python3 setup.py test

# measure code coverage
coverage run --source=woudc_data_registry -m unittest woudc_data_registry.tests.test_data_registry
coverage report -m
```

Code Conventions

Bugs and Issues

All bugs, enhancements and issues are managed on GitHub.

Contact

Owner

  • Name: World Ozone and Ultraviolet Radiation Data Centre
  • Login: woudc
  • Kind: organization
  • Location: Canada

Collaborative software, issue tracker and wiki for WOUDC, one of six World Data Centres as part of the Global Atmosphere Watch programme of the WMO.

GitHub Events

Total
  • Watch event: 1
  • Issue comment event: 12
  • Push event: 45
  • Pull request review event: 15
  • Pull request review comment event: 30
  • Pull request event: 38
  • Fork event: 1
  • Create event: 1
Last Year
  • Watch event: 1
  • Issue comment event: 12
  • Push event: 45
  • Pull request review event: 15
  • Pull request review comment event: 30
  • Pull request event: 38
  • Fork event: 1
  • Create event: 1

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 290
  • Total Committers: 14
  • Avg Commits per committer: 20.714
  • Development Distribution Score (DDS): 0.603
Past Year
  • Commits: 56
  • Committers: 7
  • Avg Commits per committer: 8.0
  • Development Distribution Score (DDS): 0.5
Top Committers
Name Email Commits
Alex Hurka a****a@c****a 115
Tom Kralidis t****s@h****m 49
Kevin Ngai k****i 28
danielwaiforssell 5****l 22
Victoria Rose Spada 8****a 22
ahurka 3****a 18
Simran Mattu m****s@w****a 18
Kevin Ngai n****k@w****a 10
Bob Du b****u@c****a 3
Simran Mattu 7****4 1
Victoria Spada v****a@e****a 1
Noor Al-Duhaidahawi a****n@w****a 1
Kevin Ngai n****k@w****a 1
BobMDu n****e@g****m 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 139
  • Average time to close issues: N/A
  • Average time to close pull requests: 9 days
  • Total issue authors: 0
  • Total pull request authors: 8
  • Average comments per issue: 0
  • Average comments per pull request: 0.53
  • Merged pull requests: 121
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 44
  • Average time to close issues: N/A
  • Average time to close pull requests: 4 days
  • Issue authors: 0
  • Pull request authors: 4
  • Average comments per issue: 0
  • Average comments per pull request: 0.52
  • Merged pull requests: 33
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • simranmattu14 (27)
  • victoriarspada (25)
  • tomkralidis (22)
  • danielwaiforssell (22)
  • ahurka (22)
  • kngai (9)
  • BobMDu (6)
  • nalduu (6)
Top Labels
Issue Labels
Pull Request Labels
bug (2) enhancement (1)

Dependencies

requirements-dev.txt pypi
  • alembic * development
  • coverage * development
  • flake8 * development
  • sphinx * development
  • wheel * development
requirements-docs.txt pypi
  • sphinx *
  • sphinx-click *
requirements-pg.txt pypi
  • psycopg2 *
requirements.txt pypi
  • click *
  • elasticsearch <8
  • jsonschema <4.4.0
  • pyyaml *
  • requests *
  • sqlalchemy *
  • woudc-extcsv >=0.5.0
.github/workflows/main.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
setup.py pypi