ncompare

ncompare: A Python package for comparing netCDF structures - Published in JOSS (2024)

https://github.com/nasa/ncompare

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
    1 of 8 committers (12.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

data-comparison hierarchical-data netcdf

Keywords from Contributors

mesh hydrology energy-system exoplanet hydraulic-modelling polygon gravitational-lensing geoscience ode chemical-bonding

Scientific Fields

Earth and Environmental Sciences Physical Sciences - 38% confidence
Last synced: 4 months ago

Repository

Compare the structure of two netCDF (or HDF5) files

Basic Info
Statistics
  • Stars: 29
  • Watchers: 4
  • Forks: 10
  • Open Issues: 5
  • Releases: 10
Topics
data-comparison hierarchical-data netcdf
Created over 2 years ago · Last pushed 5 months ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation

README.md

ncompare


Project Status: Active – The project has reached a stable, usable state and is being actively developed.

Compare the structure of two netCDF files at the command line or via Python. ncompare generates a view of the matching and non-matching groups and variables between two netCDF datasets.

Although tailored for netCDF files, ncompare also works with some HDF5 files (see notes and known limitations).

Installing

The latest release of ncompare can be installed with mamba, conda or pip:

```bash
mamba install -c conda-forge ncompare
```

```bash
conda install -c conda-forge ncompare
```

```bash
pip install ncompare
```

Usage Examples

At a command line:

To compare two netCDF files, pass the filepaths for each of the two netCDF files directly to ncompare, as follows:

```console
ncompare <netcdf file #1> <netcdf file #2>
```

With an additional --file-text argument specified, a common use of ncompare may look like this example:

```console
ncompare S001G01.nc S001G01_SUBSET.nc --file-text subset_comparison.txt
```
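For scripted workflows, the same command-line call can be assembled and launched from Python. The sketch below is a minimal illustration and not part of ncompare itself: the helper names `build_ncompare_command` and `run_ncompare` are hypothetical, and it assumes the `ncompare` executable is available on the PATH. It uses only the two positional file arguments and the `--file-text` option shown above.

```python
import shutil
import subprocess
from typing import List, Optional


def build_ncompare_command(
    file_a: str, file_b: str, file_text: Optional[str] = None
) -> List[str]:
    """Assemble the argument list for an ncompare call (hypothetical helper)."""
    command = ["ncompare", file_a, file_b]
    if file_text is not None:
        # Optional flag taken from the example above; assumed to name a text report file.
        command += ["--file-text", file_text]
    return command


def run_ncompare(file_a: str, file_b: str, file_text: Optional[str] = None) -> int:
    """Run ncompare as a subprocess and return its exit code (hypothetical helper)."""
    if shutil.which("ncompare") is None:
        raise FileNotFoundError("ncompare executable not found on PATH")
    completed = subprocess.run(
        build_ncompare_command(file_a, file_b, file_text), check=False
    )
    return completed.returncode


# Only builds and prints the command; nothing is executed here.
print(build_ncompare_command("S001G01.nc", "S001G01_SUBSET.nc", "subset_comparison.txt"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file paths contain spaces.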

In a Python kernel:

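```python
from ncompare import compare

# Replace the placeholders with the paths to the two files being compared.
total_number_of_differences = compare(
    "<netcdf file #1>",
    "<netcdf file #2>",
    only_diffs=True,
    show_chunks=True,
    show_attributes=True,
)
```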

More complete usage demonstrations, with example output, are shown in this example notebook.

Contributing

Contributions are welcome! For more information, see CONTRIBUTING.md. ncompare is licensed under the Apache License 2.0, which is included in the LICENSE file.

Developing

Development within this repository should occur on a feature branch. Pull Requests (PRs) are created with a target of the develop branch before being reviewed and merged.

Installing locally

For local development, one can clone the repository and then use poetry or pip from the local directory:

```console
git clone https://github.com/nasa/ncompare.git
```

(Option A) using poetry:

ii) Follow the instructions for installing poetry here.

iii) Run poetry install from the repository directory.

(Option B) using pip:

ii) Run pip install . from the repository directory.

Testing locally

If installed using a poetry environment, the tests can be run with:

```console
poetry run pytest tests
```

Or from another virtual environment, one can use:

```console
pytest tests
```

To run as a locally installed poetry module

```console
poetry run ncompare <netcdf file #1> <netcdf file #2>
```

Why ncompare?

The cdo (Climate Data Operators) tool does not support netCDF4 groups. Moreover, the ncdiff operator in nco computes value differences but, as far as the developers of this tool are aware, nco does not have a simple function to show structural differences between netCDF4 datasets.

Note that h5diff, provided in the HDF5 software, can also be used to find differences. In comparison to h5diff, ncompare is written in and runnable from Python; it provides an aligned and colorized difference report for quicker assessment of groups, variable names, types, shapes, and attributes; and it can generate report files formatted for other applications. However, h5diff compares some otherwise "hidden" HDF5 properties, such as _Netcdf4Dimid or _Netcdf4Coordinates, which are not currently assessed by ncompare.
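To make "structural differences" concrete, the sketch below compares two hand-written descriptions of a dataset layout (variable path mapped to data type and shape) without looking at any data values. It is a conceptual illustration in plain Python, not ncompare's implementation; the `structure_diff` function and the sample layouts are hypothetical.

```python
from typing import Dict, List, Tuple

# A layout maps "/group/variable" paths to (dtype, shape) pairs.
Layout = Dict[str, Tuple[str, Tuple[int, ...]]]


def structure_diff(a: Layout, b: Layout) -> List[str]:
    """List structural differences between two layouts (hypothetical helper)."""
    differences = []
    for path in sorted(set(a) | set(b)):
        if path not in b:
            differences.append(f"{path}: only in A")
        elif path not in a:
            differences.append(f"{path}: only in B")
        elif a[path] != b[path]:
            # Same variable in both files, but dtype or shape differs.
            differences.append(f"{path}: {a[path]} != {b[path]}")
    return differences


# Made-up layouts standing in for an original file and a spatial subset of it.
original = {
    "/lat": ("float32", (180,)),
    "/product/no2": ("float32", (180, 360)),
    "/product/qa": ("int8", (180, 360)),
}
subset = {
    "/lat": ("float32", (90,)),
    "/product/no2": ("float32", (90, 360)),
}

for line in structure_diff(original, subset):
    print(line)
```

A value-difference tool would report how the numbers in `/product/no2` changed; a structural comparison instead reports that its shape changed and that `/product/qa` is missing from the subset.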

Notes and known limitations

  • ncompare works successfully with select HDF5 files, although it has not been tested extensively; therefore, it would not be surprising to find additional limitations with other HDF files.
  • ncompare uses xarray to access the root-level dimensions. In some cases, xarray will miss dimensions whose names do not also exist as variable names in the dataset (also known as non-coordinate dimensions).
  • Some underlying HDF5 properties, such as _Netcdf4Dimid or _Netcdf4Coordinates, are not currently assessed by ncompare.
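The term "non-coordinate dimension" in the second note can be illustrated without any netCDF library. The following sketch uses made-up dimension and variable names; the helper `non_coordinate_dimensions` is hypothetical and only demonstrates the definition, not how xarray or ncompare handle such dimensions.

```python
from typing import Dict, List, Set


def non_coordinate_dimensions(dimensions: Dict[str, int], variable_names: Set[str]) -> List[str]:
    """Return the dimension names that have no variable of the same name."""
    return sorted(name for name in dimensions if name not in variable_names)


# Hypothetical root-level layout: dimension name -> length, plus the variable names.
dimensions = {"time": 24, "lat": 180, "nv": 2}
variable_names = {"time", "lat", "time_bounds"}

print(non_coordinate_dimensions(dimensions, variable_names))  # prints ['nv']
```

Here `nv` is a dimension with no matching variable, which is the kind of dimension the note says may be missed.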

Notices:

Copyright 2023 United States Government as represented by the Administrator of the National Aeronautics and Space Administration. All Rights Reserved.

This software calls the following third-party software, which is subject to the terms and conditions of its licensor, as applicable at the time of licensing. The third-party software is not bundled with this software but may be available from the licensor.

License hyperlinks are provided here for information purposes only.

| Title | license | link |
|:---------|:-------------------------------------------------------------------:|:--------------------------------------------------------------|
| colorama | BSD-3-Clause | https://opensource.org/licenses/BSD-3-Clause |
| netCDF4 | MIT License | https://opensource.org/licenses/MIT |
| numpy | BSD-3-Clause | https://opensource.org/licenses/BSD-3-Clause |
| openpyxl | MIT License | https://opensource.org/licenses/MIT |
| xarray | Apache License, version 2.0 | https://www.apache.org/licenses/LICENSE-2.0 |
| Python Standard Library | Python Software Foundation (PSF) License Agreement | https://docs.python.org/3/license.html#psf-license |

Disclaimers

The ncompare: NetCDF structural comparison tool framework is licensed under the Apache License, Version 2.0 (the "License"); you may not use this application except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


This package is NASA Software Release Authorization (SRA) # LAR-20274-1

Owner

  • Name: NASA
  • Login: nasa
  • Kind: organization
  • Email: nasa-data@lists.arc.nasa.gov
  • Location: United States of America

Read about the Open Data initiative here: https://www.nasa.gov/open/ & Instructions here: https://github.com/nasa/nasa.github.io/blob/master/docs/INSTRUCTIONS.md

JOSS Publication

ncompare: A Python package for comparing netCDF structures
Published
June 06, 2024
Volume 9, Issue 98, Page 6490
Authors
Daniel E. Kaufman ORCID
NASA Langley Research Center, Atmospheric Science Data Center, Hampton, VA, USA, Booz Allen Hamilton, Inc., McLean, VA, USA
Walter E. Baskin ORCID
NASA Langley Research Center, Atmospheric Science Data Center, Hampton, VA, USA, Adnet Systems, Inc., Bethesda, MD, USA
Editor
Arfon Smith ORCID
Tags
netCDF comparison data storage

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Kaufman
  given-names: Daniel E.
  orcid: "https://orcid.org/0000-0002-1487-7298"
- family-names: Baskin
  given-names: Walter E.
  orcid: "https://orcid.org/0000-0002-2241-3266"
doi: 10.5281/zenodo.11448464
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Kaufman
    given-names: Daniel E.
    orcid: "https://orcid.org/0000-0002-1487-7298"
  - family-names: Baskin
    given-names: Walter E.
    orcid: "https://orcid.org/0000-0002-2241-3266"
  date-published: 2024-06-06
  doi: 10.21105/joss.06490
  issn: 2475-9066
  issue: 98
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 6490
  title: "ncompare: A Python package for comparing netCDF structures"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.06490"
  volume: 9
title: "ncompare: A Python package for comparing netCDF structures"

contact:
  - name: "The NASA ncompare issues page"
    website: "https://github.com/nasa/ncompare/issues"
keywords:
  - "netcdf"
  - "data comparison"
  - "hierarchical data"
url: "https://ncompare.readthedocs.io"
repository-code: "https://github.com/nasa/ncompare"

GitHub Events

Total
  • Create event: 90
  • Release event: 3
  • Issues event: 17
  • Watch event: 4
  • Delete event: 37
  • Issue comment event: 29
  • Push event: 163
  • Pull request review event: 1
  • Pull request event: 68
  • Fork event: 1
Last Year
  • Create event: 90
  • Release event: 3
  • Issues event: 17
  • Watch event: 4
  • Delete event: 37
  • Issue comment event: 29
  • Push event: 163
  • Pull request review event: 1
  • Pull request event: 68
  • Fork event: 1

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 889
  • Total Committers: 8
  • Avg Commits per committer: 111.125
  • Development Distribution Score (DDS): 0.447
Past Year
  • Commits: 142
  • Committers: 5
  • Avg Commits per committer: 28.4
  • Development Distribution Score (DDS): 0.479
Top Committers
Name Email Commits
danielfromearth d****n@n****v 492
ncompare bot n****e@n****m 270
dependabot[bot] 4****] 109
Shiv Kokroo k****o 9
pre-commit-ci[bot] 6****] 3
Eric Berquist e****t@g****m 3
Nic Annau n****u@e****a 2
TKantz 1****z 1
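The all-time summary figures above can be reproduced from the commit counts in this table. The sketch assumes a common definition of the Development Distribution Score, namely one minus the top committer's share of all commits; that definition is not stated on this page, so treat it as an assumption.

```python
# Commit counts copied from the "Top Committers" table.
commits = {
    "danielfromearth": 492,
    "ncompare bot": 270,
    "dependabot[bot]": 109,
    "Shiv Kokroo": 9,
    "pre-commit-ci[bot]": 3,
    "Eric Berquist": 3,
    "Nic Annau": 2,
    "TKantz": 1,
}

total = sum(commits.values())
average = total / len(commits)
# Assumed definition: 1 - (commits by the most active committer / total commits).
dds = 1 - max(commits.values()) / total

print(f"Total commits: {total}")
print(f"Avg commits per committer: {average}")
print(f"DDS: {dds:.3f}")
```

Under that assumption the script yields 889 total commits, an average of 111.125 per committer, and a DDS of 0.447, matching the values listed under "All Time".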

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 50
  • Total pull requests: 336
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 6 days
  • Total issue authors: 8
  • Total pull request authors: 7
  • Average comments per issue: 0.74
  • Average comments per pull request: 0.42
  • Merged pull requests: 268
  • Bot issues: 1
  • Bot pull requests: 205
Past Year
  • Issues: 8
  • Pull requests: 76
  • Average time to close issues: 16 days
  • Average time to close pull requests: 6 days
  • Issue authors: 1
  • Pull request authors: 4
  • Average comments per issue: 0.13
  • Average comments per pull request: 0.53
  • Merged pull requests: 56
  • Bot issues: 0
  • Bot pull requests: 53
Top Authors
Issue Authors
  • danielfromearth (43)
  • sc0tts (1)
  • nannau (1)
  • jhkennedy (1)
  • kokroo (1)
  • dependabot[bot] (1)
  • nlenssen2013 (1)
  • hazemmahmoud88 (1)
Pull Request Authors
  • dependabot[bot] (191)
  • danielfromearth (126)
  • pre-commit-ci[bot] (16)
  • kokroo (8)
  • nannau (2)
  • berquist (2)
  • arfon (1)
Top Labels
Issue Labels
enhancement (27) documentation (12) bug (8) testing (2) dependencies (1) python (1)
Pull Request Labels
dependencies (195) python (174) github_actions (17) documentation (17) enhancement (15) bug (7) testing (2) good first issue (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 1,169 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 22
  • Total maintainers: 1
pypi.org: ncompare

Compare the structure of two NetCDF files at the command line

  • Versions: 22
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,169 Last month
Rankings
Dependent packages count: 7.4%
Forks count: 15.5%
Stargazers count: 21.6%
Average: 28.4%
Dependent repos count: 69.2%
Maintainers (1)
Last synced: 4 months ago

Dependencies

poetry.lock pypi
  • astroid 2.15.6
  • certifi 2023.7.22
  • cftime 1.6.2
  • colorama 0.4.6
  • dill 0.3.7
  • et-xmlfile 1.1.0
  • exceptiongroup 1.1.3
  • flake8 6.1.0
  • iniconfig 2.0.0
  • isort 5.12.0
  • lazy-object-proxy 1.9.0
  • mccabe 0.7.0
  • netcdf4 1.6.4
  • numpy 1.25.2
  • openpyxl 3.1.2
  • packaging 23.1
  • pandas 2.0.3
  • platformdirs 3.10.0
  • pluggy 1.2.0
  • pycodestyle 2.11.0
  • pyflakes 3.1.0
  • pylint 2.17.5
  • pytest 7.4.0
  • python-dateutil 2.8.2
  • pytz 2023.3
  • six 1.16.0
  • tomli 2.0.1
  • tomlkit 0.12.1
  • typing-extensions 4.7.1
  • tzdata 2023.3
  • wrapt 1.15.0
  • xarray 2022.12.0
pyproject.toml pypi
  • colorama ^0.4.5
  • netCDF4 ^1.6.0
  • numpy ^1.23.2
  • openpyxl ^3.0.10
  • python ^3.9
  • xarray ^2022.6.0
.github/workflows/pull_request.yml actions
  • abatilo/actions-poetry v2.3.0 composite
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/push.yml actions
  • abatilo/actions-poetry v2.3.0 composite
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v2 composite
.github/workflows/release_created.yml actions
  • abatilo/actions-poetry v2.3.0 composite
  • actions/checkout v3 composite
  • actions/setup-python v4 composite