NCDatasets.jl

NCDatasets.jl: a Julia package for manipulating netCDF data sets - Published in JOSS (2024)

https://github.com/JuliaGeo/NCDatasets.jl

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    3 of 24 committers (12.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.6%) to scientific vocabulary

Keywords

climate-and-forecast-conventions climatology earth-observation julia meteorology netcdf oceanography opendap

Keywords from Contributors

climate ocean climate-change data-assimilation fluid-dynamics graphics fluxes surrogate ocean-modelling nonlinear-dynamics
Last synced: 6 months ago

Repository

Load and create NetCDF files in Julia

Statistics
  • Stars: 167
  • Watchers: 9
  • Forks: 32
  • Open Issues: 28
  • Releases: 71
Created over 8 years ago · Last pushed 7 months ago

README.md

NCDatasets


NCDatasets allows one to read and create netCDF files. NetCDF data sets and attribute lists behave like Julia dictionaries, and variables behave like Julia arrays. This package implements the CommonDataModel.jl interface, which means that the datasets can be accessed in the same way as GRIB files (GRIBDatasets.jl) and Zarr files (ZarrDatasets.jl).

The module NCDatasets provides support for the following netCDF CF conventions:

* _FillValue will be returned as missing
* scale_factor and add_offset are applied if present
* time variables (recognized by the units attribute) are returned as DateTime objects
* support of the CF calendars (standard, gregorian, proleptic gregorian, julian, all leap, no leap, 360 day) using CFTime
* the raw data can also be accessed (without the transformations above)
* contiguous ragged array representation
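The following is a minimal sketch of how the _FillValue and scale_factor/add_offset handling can be observed, not an excerpt from the package documentation. The scratch file name, the variable name, the fill value -9999 and the packing parameters are assumptions chosen for illustration.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical scratch file

ds = NCDataset(fname, "c")
defDim(ds, "time", 3)
# Packed storage: Int16 on disk, with -9999 marking missing values
v = defVar(ds, "temperature", Int16, ("time",),
           fillvalue = Int16(-9999),
           attrib = ["scale_factor" => 0.5, "add_offset" => 0.0])
# Values are packed on write; missing is stored as the fill value
v[:] = [20.5, missing, 22.0]
close(ds)

ds = NCDataset(fname)
data = ds["temperature"][:]      # CF transformations applied
raw = ds["temperature"].var[:]   # values as stored on disk
close(ds)
```

Here `data` should hold the unpacked values with a `missing` in the second position, while `raw` holds the stored integers including the fill value.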

Other features include:

* Support for NetCDF 4 compression and variable-length arrays (i.e. arrays of vectors where each vector can potentially have a different length)
* The module also includes a utility function ncgen which generates the Julia code that would produce a netCDF file with the same metadata as a template netCDF file.
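As a rough illustration of these two features, the sketch below writes a compressed variable and then calls ncgen on the resulting file. The file name and the compression settings (deflatelevel = 9, shuffle = true) are assumptions for the example rather than recommended values.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical scratch file
data = [Float32(i + j) for i = 1:100, j = 1:110]

NCDataset(fname, "c") do ds
    # deflatelevel ranges from 0 (no compression) to 9 (maximum compression);
    # shuffle reorders bytes, which often improves the compression ratio
    defVar(ds, "temperature", data, ("lon", "lat"),
           deflatelevel = 9, shuffle = true,
           attrib = ["units" => "degree Celsius"])
end

# Print Julia code that would recreate a file with the same metadata
ncgen(fname)

# Compression is lossless, so the data read back equals the original
roundtrip = NCDataset(fname) do ds
    ds["temperature"][:, :]
end
```

The generated code is printed to standard output and can serve as a starting template for a script that creates similar files.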

Installation

Inside the Julia shell, you can download and install the package by issuing:

```julia
using Pkg
Pkg.add("NCDatasets")
```

Manual

This manual is a quick introduction to using NCDatasets.jl. For more details, see the stable or dev documentation.

Create a netCDF file

The following gives an example of how to create a netCDF file by defining dimensions, variables and attributes.

```julia
using NCDatasets
using DataStructures: OrderedDict

# This creates a new NetCDF file called file.nc.
# The mode "c" stands for creating a new file (clobber)
ds = NCDataset("file.nc","c")

# Define the dimension "lon" and "lat" with the size 100 and 110 resp.
defDim(ds,"lon",100)
defDim(ds,"lat",110)

# Define a global attribute
ds.attrib["title"] = "this is a test file"

# Define the variable temperature with the attribute units
v = defVar(ds,"temperature",Float32,("lon","lat"), attrib = OrderedDict(
    "units" => "degree Celsius",
    "scale_factor" => 10,
))

# Add additional attributes
v.attrib["comments"] = "this is a string attribute with Unicode Ω ∈ ∑ ∫ f(x) dx"

# Generate some example data
data = [Float32(i+j) for i = 1:100, j = 1:110];

# Write a single column
v[:,1] = data[:,1];

# Write the complete data set
v[:,:] = data;

close(ds)
```

It is also possible to create the dimensions, define the variable and set its value with a single call to defVar:

```julia
using NCDatasets
ds = NCDataset("/tmp/test2.nc","c")
data = [Float32(i+j) for i = 1:100, j = 1:110]
v = defVar(ds,"temperature",data,("lon","lat"))
close(ds)
```

Explore the content of a netCDF file

Before reading the data from a netCDF file, it is often useful to explore the list of variables and attributes defined in it.

For interactive use, the following commands (without ending semicolon) display the content of the file similarly to ncdump -h file.nc:

```julia
using NCDatasets
ds = NCDataset("file.nc")
```

This creates the central structure of NCDatasets.jl, NCDataset, which represents the contents of the netCDF file (without immediately loading everything into memory). NCDataset is an alias for Dataset.

The following displays the information just for the variable varname:

```julia
ds["varname"]
```

To get the global attributes, use:

```julia
ds.attrib
```

NCDataset("file.nc") produces a listing like:

```
Dataset: file.nc
Group: /

Dimensions
   lon = 100
   lat = 110

Variables
  temperature   (100 × 110)
    Datatype:    Float32 (Float32)
    Dimensions:  lon × lat
    Attributes:
     units                = degree Celsius
     scale_factor         = 10
     comments             = this is a string attribute with Unicode Ω ∈ ∑ ∫ f(x) dx

Global attributes
  title                = this is a test file
```
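For non-interactive use, the same information can be queried programmatically. The sketch below is an illustration under assumptions: it writes its own small scratch file (the file name and contents are hypothetical) and then lists variables, dimensions and attributes through the dictionary-like interface.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical scratch file
NCDataset(fname, "c") do ds
    ds.attrib["title"] = "this is a test file"
    defVar(ds, "temperature", zeros(Float32, 100, 110), ("lon", "lat"),
           attrib = ["units" => "degree Celsius"])
end

varnames, lon_size, var_dims = NCDataset(fname) do ds
    # Size of each dimension
    for dimname in keys(ds.dim)
        println(dimname, " = ", ds.dim[dimname])
    end
    # Check that a variable exists before accessing it
    if haskey(ds, "temperature")
        v = ds["temperature"]
        println(dimnames(v), " ", size(v))
        for attname in keys(v.attrib)
            println(attname, " = ", v.attrib[attname])
        end
    end
    # Collect the names of all variables, one dimension size and the
    # dimension names of the variable for later use
    (collect(keys(ds)), ds.dim["lon"], dimnames(ds["temperature"]))
end
```

Using keys and haskey avoids an error when a file does not contain the expected variable.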

Load a netCDF file

Loading a variable with known structure can be achieved by accessing the variables and attributes directly by their name.

```julia
# The mode "r" stands for read-only. The mode "r" is the default mode and the parameter can be omitted.
ds = NCDataset("file.nc","r")
v = ds["temperature"]

# Load a subset
subdata = v[10:30,30:5:end]

# Load all data
data = v[:,:]

# Load all data ignoring attributes like scale_factor, add_offset, _FillValue and time units
data2 = v.var[:,:];

# Load an attribute
unit = v.attrib["units"]
close(ds)
```

In the example above, the subset can also be loaded with:

```julia
subdata = NCDataset("file.nc")["temperature"][10:30,30:5:end]
```

This might be useful in an interactive session. However, the file file.nc is not closed immediately (closing is only triggered by Julia's garbage collector), which can be a problem if you open many files. On Linux, the number of open files is often limited to 1024 (soft limit). If you write to a file, you should also always close it to make sure that the data is properly written to disk.

An alternative way to ensure the file has been closed is to use a do block: the file will be closed automatically when leaving the block.

```julia
filename = "file.nc"
data = NCDataset(filename,"r") do ds
    ds["temperature"][:,:]
end # ds is closed
```

Edit an existing netCDF file

When you need to modify variables or attributes in a netCDF file, you have to open it with the "a" option. Here, for example, we add a global attribute creator to the file created in the previous step.

```julia
ds = NCDataset("file.nc","a")
ds.attrib["creator"] = "your name"
close(ds);
```
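The "a" mode can also be combined with a do block, and it allows changing variable attributes and data as well. The following is a minimal sketch under assumptions: the scratch file, its 4 × 3 variable and the values written are made up for the example.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical scratch file
NCDataset(fname, "c") do ds
    defVar(ds, "temperature", zeros(Float32, 4, 3), ("lon", "lat"))
end

# "a" opens the existing file for modification without discarding its content
NCDataset(fname, "a") do ds
    ds.attrib["creator"] = "your name"
    ds["temperature"].attrib["units"] = "degree Celsius"
    # Overwrite the first column only; the rest of the data is kept
    ds["temperature"][:, 1] = ones(Float32, 4)
end

# Reopen read-only to check what was written
creator, values = NCDataset(fname) do ds
    (ds.attrib["creator"], ds["temperature"][:, :])
end
```

Because the do block closes the file on exit, the changes are flushed to disk before the file is reopened.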

Benchmark

The benchmark loads a variable of the size 1000x500x100 in slices of 1000x500 (applying the scaling of the CF conventions) and computes the maximum of each slice and the average of each maximum over all slices. This operation is repeated 100 times. The code is available at https://github.com/Alexander-Barth/NCDatasets.jl/tree/master/test/perf .

| Module           | median | minimum |  mean | std. dev. |
|:---------------- | ------:| -------:| -----:| ---------:|
| R-ncdf4          |  0.407 |   0.384 | 0.407 |     0.010 |
| python-netCDF4   |  0.475 |   0.463 | 0.476 |     0.010 |
| julia-NCDatasets |  0.265 |   0.249 | 0.267 |     0.011 |

All runtimes are in seconds. We use Julia 1.10.0 (with NCDatasets 0.14.0), R 4.1.2 (with ncdf4 1.22) and Python 3.10.12 (with netCDF4 1.6.5) on an i5-1135G7 CPU and NVMe SSD (WDC WDS100T2B0C).
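The operation being timed can be sketched as follows. This is not the benchmark code from the repository: the function name mean_of_slice_maxima, the reduced array size (100x50x10 instead of 1000x500x100), the variable name and the scale_factor value are assumptions chosen to keep the example small.

```julia
using NCDatasets

fname = tempname() * ".nc"   # hypothetical scratch file
sz = (100, 50, 10)           # much smaller than the benchmark's 1000x500x100
field = [Float32(i + j + k) for i = 1:sz[1], j = 1:sz[2], k = 1:sz[3]]

NCDataset(fname, "c") do ds
    defDim(ds, "x", sz[1])
    defDim(ds, "y", sz[2])
    defDim(ds, "time", sz[3])
    v = defVar(ds, "v", Float32, ("x", "y", "time"),
               attrib = ["scale_factor" => 0.5f0])
    v[:, :, :] = field
end

# Load the variable slice by slice, take the maximum of each slice
# and average the maxima over all slices
function mean_of_slice_maxima(fname, varname)
    NCDataset(fname) do ds
        v = ds[varname]
        nslices = size(v, 3)
        total = 0.0
        for k in 1:nslices
            slice = v[:, :, k]   # scale_factor is applied on load
            total += maximum(slice)
        end
        total / nslices
    end
end

result = mean_of_slice_maxima(fname, "v")
```

Timing repeated calls of such a function, for instance with the BenchmarkTools package, would give statistics comparable in kind to those in the table, though the numbers depend on the hardware and array size.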

Filing an issue

When you file an issue, please include sufficient information that would allow somebody else to reproduce the issue, in particular:

1. Provide the code that generates the issue.
2. If necessary to run your code, provide the used netCDF file(s).
3. Make your code and netCDF file(s) as simple as possible (while still showing the error and being runnable). A big thank you for the 5-star-premium-gold users who do not forget this point! 👍🏅🏆
4. The full error message that you are seeing (in particular file names and line numbers of the stack-trace).
5. Which version of Julia and NCDatasets are you using? Please include the output of:

   ```julia
   versioninfo()
   using Pkg
   Pkg.status("NCDatasets")
   ```

6. Does NCDatasets pass its test suite? Please include the output of:

   ```julia
   using Pkg
   Pkg.test("NCDatasets")
   ```

Alternative

The package NetCDF.jl from Fabian Gans and contributors is an alternative to this package which supports a more Matlab/Octave-like interface for reading and writing NetCDF files.

Credits

netcdf_c.jl and the error handling code of the NetCDF C API are from NetCDF.jl by Fabian Gans (Max-Planck-Institut für Biogeochemie, Jena, Germany) released under the MIT license.

Owner

  • Name: JuliaGeo
  • Login: JuliaGeo
  • Kind: organization

Geospatial packages for Julia

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Barth
  given-names: Alexander
  orcid: "https://orcid.org/0000-0003-2952-5997"
doi: 10.5281/zenodo.11067062
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Barth
    given-names: Alexander
    orcid: "https://orcid.org/0000-0003-2952-5997"
  date-published: 2024-05-29
  doi: 10.21105/joss.06504
  issn: 2475-9066
  issue: 97
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 6504
  title: "NCDatasets.jl: a Julia package for manipulating netCDF data
    sets"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.06504"
  volume: 9
title: "NCDatasets.jl: a Julia package for manipulating netCDF data
  sets"

GitHub Events

Total
  • Create event: 5
  • Commit comment event: 6
  • Release event: 1
  • Issues event: 6
  • Watch event: 12
  • Delete event: 1
  • Issue comment event: 43
  • Push event: 17
  • Pull request review event: 1
  • Pull request review comment event: 1
  • Pull request event: 7
  • Fork event: 2
Last Year
  • Create event: 5
  • Commit comment event: 6
  • Release event: 1
  • Issues event: 6
  • Watch event: 12
  • Delete event: 1
  • Issue comment event: 43
  • Push event: 17
  • Pull request review event: 1
  • Pull request review comment event: 1
  • Pull request event: 7
  • Fork event: 2

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 1,137
  • Total Committers: 24
  • Avg Commits per committer: 47.375
  • Development Distribution Score (DDS): 0.146
Past Year
  • Commits: 32
  • Committers: 7
  • Avg Commits per committer: 4.571
  • Development Distribution Score (DDS): 0.25
Top Committers
Name Email Commits
Alexander Barth b****r@g****m 971
Datseris d****e@g****m 44
Tristan Carion t****n@g****m 32
Martijn Visser m****r@g****m 22
ctroupin c****n@g****m 11
Rafael Schouten r****n@g****m 6
adigitoleo a****o@d****m 6
Argel Ramírez Reyes a****z@g****m 5
github-actions[bot] 4****] 5
Gael Forget g****t@m****u 5
Philippe Roy b****r@y****a 4
Kene Uba j****a@p****h 4
Gregory L. Wagner w****g@g****m 4
Alexander Barth e****l@e****m 3
Fabian Gans f****s@b****e 3
Peter Shin p****h@g****m 3
Gabriele Bozzola s****r@g****m 2
tcarion t****n@g****m 1
Benoit Pasquier 4****c 1
Charles Kawczynski k****s@g****m 1
Julia TagBot 5****t 1
Navid C. Constantinou n****y 1
Steven G. Johnson s****j@m****u 1
Kosuke Sando k****o@g****m 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 149
  • Total pull requests: 52
  • Average time to close issues: 3 months
  • Average time to close pull requests: 21 days
  • Total issue authors: 59
  • Total pull request authors: 22
  • Average comments per issue: 5.26
  • Average comments per pull request: 4.98
  • Merged pull requests: 36
  • Bot issues: 0
  • Bot pull requests: 7
Past Year
  • Issues: 9
  • Pull requests: 8
  • Average time to close issues: 8 days
  • Average time to close pull requests: about 24 hours
  • Issue authors: 7
  • Pull request authors: 5
  • Average comments per issue: 2.67
  • Average comments per pull request: 0.38
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 1
Top Authors
Issue Authors
  • Datseris (18)
  • Balinus (9)
  • rafaqz (9)
  • haakon-e (8)
  • ryofurue (8)
  • natgeo-wong (7)
  • Alexander-Barth (6)
  • sjdaines (4)
  • ctroupin (4)
  • brynpickering (3)
  • Yixiao-Zhang (3)
  • tiemvanderdeure (3)
  • kongdd (3)
  • gvali (3)
  • aramirezreyes (3)
Pull Request Authors
  • Datseris (9)
  • github-actions[bot] (7)
  • visr (6)
  • Balinus (3)
  • gaelforget (3)
  • Alexander-Barth (3)
  • Sbozzolo (2)
  • keduba (2)
  • rafaqz (2)
  • lupemba (2)
  • glwagner (2)
  • JuliaTagBot (1)
  • charleskawczynski (1)
  • briochemc (1)
  • ctroupin (1)
Top Labels
Issue Labels
feature request (5) upstream bug (3) help wanted (3) announcement (1) enhancement (1) bug (1) question (1) upstream feature request (1)

Packages

  • Total packages: 1
  • Total downloads:
    • julia 874 total
  • Total dependent packages: 69
  • Total dependent repositories: 9
  • Total versions: 60
juliahub.com: NCDatasets

Load and create NetCDF files in Julia

  • Versions: 60
  • Dependent Packages: 69
  • Dependent Repositories: 9
  • Downloads: 874 Total
Rankings
Dependent packages count: 1.3%
Dependent repos count: 3.4%
Average: 4.1%
Stargazers count: 5.7%
Forks count: 6.0%
Last synced: 6 months ago

Dependencies

.github/workflows/CompatHelper.yml actions
  • julia-actions/setup-julia latest composite
.github/workflows/TagBot.yml actions
  • JuliaRegistries/TagBot v1 composite
.github/workflows/ci-binary-builder.yml actions
  • actions/cache v1 composite
  • actions/checkout v2 composite
  • julia-actions/julia-buildpkg latest composite
  • julia-actions/setup-julia v1 composite
.github/workflows/ci.yml actions
  • actions/cache v1 composite
  • actions/checkout v2 composite
  • codecov/codecov-action v1 composite
  • julia-actions/julia-buildpkg latest composite
  • julia-actions/julia-processcoverage v1 composite
  • julia-actions/julia-runtest latest composite
  • julia-actions/setup-julia v1 composite