xarray-dataclass

:zap: xarray data creation by data classes

https://github.com/xarray-contrib/xarray-dataclass

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
    2 of 6 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.1%) to scientific vocabulary

Keywords from Contributors

mesh

Scientific Fields

Earth and Environmental Sciences Physical Sciences - 83% confidence
Last synced: 4 months ago

Repository

Basic Info
Statistics
  • Stars: 8
  • Watchers: 1
  • Forks: 1
  • Open Issues: 4
  • Releases: 2
Created 10 months ago · Last pushed 4 months ago
Metadata Files
Readme License Citation

README.md

xarray-dataclass

xarray data creation by data classes

This repository is adapted from here. We are grateful to the developer of that repository for their work. Because the original repository is inactive, a fork was moved here to allow for more visibility and community maintenance.

Overview

xarray-dataclass is a Python package that makes it easy to create xarray DataArray and Dataset objects that are "typed" (i.e. with fixed dimensions, data type, coordinates, attributes, and name) using Python's dataclass:

```python
from dataclasses import dataclass
from typing import Literal
from xarray_dataclass import AsDataArray, Coord, Data


X = Literal["x"]
Y = Literal["y"]


@dataclass
class Image(AsDataArray):
    """2D image as DataArray."""

    data: Data[tuple[X, Y], float]
    x: Coord[X, int] = 0
    y: Coord[Y, int] = 0
```

Features

  • Typed DataArray or Dataset objects can easily be created: `image = Image.new([[0, 1], [2, 3]], [0, 1], [0, 1])`
  • NumPy-like filled-data creation is also available: `image = Image.zeros([2, 2], x=[0, 1], y=[0, 1])`
  • Support for the features of Python's dataclass.
  • Support for static type checking with Pyright.

Installation

There are multiple ways to install xarray-dataclass, depending on which dependency manager you use.

    pip install xarray-dataclass
    pixi add --pypi xarray-dataclass

Basic usage

xarray-dataclass uses Python's dataclass. The data (or data variables), coordinates, attributes, and name of a DataArray or Dataset object are defined as dataclass fields with the special type hints Data, Coord, Attr, and Name, respectively. The examples below assume the following imports and definitions.

```python
from dataclasses import dataclass
from typing import Literal
from xarray_dataclass import AsDataArray, AsDataset
from xarray_dataclass import Attr, Coord, Data, Name


X = Literal["x"]
Y = Literal["y"]
```

Data field

A data field is a field whose value becomes the data of a DataArray object or a data variable of a Dataset object. The type hint Data[TDims, TDtype] fixes the dimensions and the data type of the object. Here are some examples of how to specify them.

Type hint | Inferred dimensions
--- | ---
Data[tuple[()], ...] | ()
Data[Literal["x"], ...] | ("x",)
Data[tuple[Literal["x"]], ...] | ("x",)
Data[tuple[Literal["x"], Literal["y"]], ...] | ("x", "y")
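
As a rough illustration of the dimension rules in the table above, the following sketch reads dimension names out of Literal type hints using only the standard library. The function name `infer_dims` is hypothetical, and this is not the library's implementation; it only mirrors the four rows shown.

```python
from typing import Literal, get_args, get_origin


def infer_dims(hint) -> tuple:
    """Return the dimension names encoded in a Literal or tuple-of-Literal hint."""
    origin = get_origin(hint)
    if origin is Literal:
        # Literal["x"] carries its names as type arguments.
        return tuple(str(arg) for arg in get_args(hint))
    if origin is tuple:
        dims = []
        for arg in get_args(hint):
            if arg == ():
                # Some Python versions report tuple[()] with a single () argument.
                continue
            dims.extend(infer_dims(arg))
        return tuple(dims)
    raise TypeError(f"Unsupported dimension hint: {hint!r}")


print(infer_dims(tuple[()]))                             # ()
print(infer_dims(Literal["x"]))                          # ('x',)
print(infer_dims(tuple[Literal["x"]]))                   # ('x',)
print(infer_dims(tuple[Literal["x"], Literal["y"]]))     # ('x', 'y')
```

Encoding dimension names as Literal types keeps them visible to a static type checker while still letting runtime code recover them as plain strings.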

Type hint | Inferred data type
--- | ---
Data[..., Any] | None
Data[..., None] | None
Data[..., float] | numpy.dtype("float64")
Data[..., numpy.float128] | numpy.dtype("float128")
Data[..., Literal["datetime64[ns]"]] | numpy.dtype("<M8[ns]")
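
The data type rules in the table above can be approximated in a few lines of NumPy. This is a sketch under assumptions, not the library's code: `infer_dtype` is a hypothetical helper, and the numpy.float128 row is left out because that type is not available on every platform.

```python
from typing import Any, Literal, get_args, get_origin

import numpy as np


def infer_dtype(hint):
    """Return a numpy.dtype for a type hint, or None when the type is left open."""
    # Any and None mean "no fixed data type". None must be handled before
    # calling np.dtype, because np.dtype(None) would default to float64.
    if hint is Any or hint is None or hint is type(None):
        return None
    if get_origin(hint) is Literal:
        # Literal["datetime64[ns]"] carries the dtype string as its argument.
        return np.dtype(get_args(hint)[0])
    return np.dtype(hint)


print(infer_dtype(Any))                          # None
print(infer_dtype(float))                        # float64
print(infer_dtype(Literal["datetime64[ns]"]))    # datetime64[ns]
```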

Coordinate field

A coordinate field is a field whose value becomes a coordinate of a DataArray or a Dataset object. The type hint Coord[TDims, TDtype] fixes the dimensions and the data type of the coordinate.

Attribute field

An attribute field is a field whose value becomes an attribute of a DataArray or a Dataset object. The type hint Attr[TAttr] specifies the type of the value, which is used only for static type checking.

Name field

A name field is a field whose value becomes the name of a DataArray object. The type hint Name[TName] specifies the type of the value, which is used only for static type checking.
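
One way to picture the four field kinds above is as roles attached to ordinary type hints. The sketch below uses typing.Annotated from the standard library to tag fields and read the tags back. The aliases `SketchData`, `SketchCoord`, `SketchAttr`, `SketchName` and the helper `field_roles` are hypothetical stand-ins chosen for illustration; they are not the package's Data, Coord, Attr, or Name, and nothing here claims to be how the package is implemented.

```python
from dataclasses import dataclass, fields
from typing import Annotated, TypeVar, get_args, get_origin, get_type_hints

T = TypeVar("T")

# Hypothetical stand-ins for the special type hints: each alias tags a
# plain type with a role string.
SketchData = Annotated[T, "data"]
SketchCoord = Annotated[T, "coord"]
SketchAttr = Annotated[T, "attr"]
SketchName = Annotated[T, "name"]


@dataclass
class Pixel:
    data: SketchData[float]
    x: SketchCoord[int] = 0
    units: SketchAttr[str] = "cd / m^2"
    name: SketchName[str] = "luminance"


def field_roles(cls) -> dict:
    """Map each dataclass field name to the role tagged on its type hint."""
    hints = get_type_hints(cls, include_extras=True)
    roles = {}
    for field in fields(cls):
        hint = hints[field.name]
        if get_origin(hint) is Annotated:
            # get_args returns (underlying type, *metadata); the role is first.
            roles[field.name] = get_args(hint)[1]
    return roles


print(field_roles(Pixel))
# {'data': 'data', 'x': 'coord', 'units': 'attr', 'name': 'name'}

# Type hints are not enforced at runtime: this is accepted by Python,
# although a static type checker would flag the int passed for a str.
pixel = Pixel(1.0, units=5)
```

The last line shows what "used only for static type checking" means in practice: a plain dataclass stores whatever value it is given, and only a type checker objects.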

DataArray class

A DataArray class is a dataclass that defines typed DataArray specifications. A DataArray class uses exactly one data field: the first. Any second or subsequent data fields are ignored in DataArray creation.

```python
@dataclass
class Image(AsDataArray):
    """2D image as DataArray."""

    data: Data[tuple[X, Y], float]
    x: Coord[X, int] = 0
    y: Coord[Y, int] = 0
    units: Attr[str] = "cd / m^2"
    name: Name[str] = "luminance"
```

A DataArray object is created by the class method new():

```python
Image.new([[0, 1], [2, 3]], x=[0, 1], y=[0, 1])

<xarray.DataArray 'luminance' (x: 2, y: 2)>
array([[0., 1.],
       [2., 3.]])
Coordinates:
  * x        (x) int64 0 1
  * y        (y) int64 0 1
Attributes:
    units:    cd / m^2
```

NumPy-like class methods (zeros(), ones(), ...) are also available:

```python
Image.ones((3, 3))

<xarray.DataArray 'luminance' (x: 3, y: 3)>
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
Coordinates:
  * x        (x) int64 0 0 0
  * y        (y) int64 0 0 0
Attributes:
    units:    cd / m^2
```

Dataset class

A Dataset class is a dataclass that defines typed Dataset specifications. Multiple data fields are allowed; each defines one data variable of the object.

```python
@dataclass
class ColorImage(AsDataset):
    """2D color image as Dataset."""

    red: Data[tuple[X, Y], float]
    green: Data[tuple[X, Y], float]
    blue: Data[tuple[X, Y], float]
    x: Coord[X, int] = 0
    y: Coord[Y, int] = 0
    units: Attr[str] = "cd / m^2"
```

A Dataset object is created by the class method new():

```python
ColorImage.new(
    [[0, 0], [0, 0]],  # red
    [[1, 1], [1, 1]],  # green
    [[2, 2], [2, 2]],  # blue
)

<xarray.Dataset>
Dimensions:  (x: 2, y: 2)
Coordinates:
  * x        (x) int64 0 0
  * y        (y) int64 0 0
Data variables:
    red      (x, y) float64 0.0 0.0 0.0 0.0
    green    (x, y) float64 1.0 1.0 1.0 1.0
    blue     (x, y) float64 2.0 2.0 2.0 2.0
Attributes:
    units:    cd / m^2
```

Advanced usage

Coordof and Dataof type hints

xarray-dataclass provides advanced type hints, Coordof and Dataof. Unlike Data and Coord, they specify a dataclass that defines a DataArray class. This is useful when users want to add metadata to dimensions for plotting. For example:

```python
from xarray_dataclass import Coordof


@dataclass
class XAxis:
    data: Data[X, int]
    long_name: Attr[str] = "x axis"
    units: Attr[str] = "pixel"


@dataclass
class YAxis:
    data: Data[Y, int]
    long_name: Attr[str] = "y axis"
    units: Attr[str] = "pixel"


@dataclass
class Image(AsDataArray):
    """2D image as DataArray."""

    data: Data[tuple[X, Y], float]
    x: Coordof[XAxis] = 0
    y: Coordof[YAxis] = 0
```

General data variable names in Dataset creation

Because dataclass field names must be valid Python identifiers, it is not possible to define data variable names that contain, for example, white space. In such cases, define a DataArray class for each data variable so that it has a name field, and specify these classes with Dataof in a Dataset class. The values of the name fields are then used as the data variable names. For example:

```python
from xarray_dataclass import Dataof


@dataclass
class Red:
    data: Data[tuple[X, Y], float]
    name: Name[str] = "Red image"


@dataclass
class Green:
    data: Data[tuple[X, Y], float]
    name: Name[str] = "Green image"


@dataclass
class Blue:
    data: Data[tuple[X, Y], float]
    name: Name[str] = "Blue image"


@dataclass
class ColorImage(AsDataset):
    """2D color image as Dataset."""

    red: Dataof[Red]
    green: Dataof[Green]
    blue: Dataof[Blue]
```

```python
ColorImage.new(
    [[0, 0], [0, 0]],
    [[1, 1], [1, 1]],
    [[2, 2], [2, 2]],
)

<xarray.Dataset>
Dimensions:      (x: 2, y: 2)
Dimensions without coordinates: x, y
Data variables:
    Red image    (x, y) float64 0.0 0.0 0.0 0.0
    Green image  (x, y) float64 1.0 1.0 1.0 1.0
    Blue image   (x, y) float64 2.0 2.0 2.0 2.0
```
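
The identifier restriction described in this section can be checked with plain dataclasses from the standard library. The sketch below makes no xarray-dataclass calls; `can_be_field_name`, `Probe`, and `RedLabel` are hypothetical names used only to show why a free-form label has to live in a field's value rather than in the field's name.

```python
from dataclasses import dataclass, make_dataclass


def can_be_field_name(name: str) -> bool:
    """Return True if a dataclass accepts `name` as a field name."""
    try:
        make_dataclass("Probe", [name])
    except TypeError:
        # make_dataclass rejects names that are not valid identifiers.
        return False
    return True


@dataclass
class RedLabel:
    data: list
    # The free-form label is stored as a value, so it may contain spaces.
    name: str = "Red image"


print(can_be_field_name("red"))          # True
print(can_be_field_name("Red image"))    # False
print(RedLabel([[0, 0], [0, 0]]).name)   # Red image
```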

Customization of DataArray or Dataset creation

For customization, users can add a special class attribute, __dataoptions__, to a DataArray or Dataset class. In the current implementation, the only supported option is a custom factory for DataArray or Dataset creation.

```python
import xarray as xr
from xarray_dataclass import DataOptions


class Custom(xr.DataArray):
    """Custom DataArray."""

    __slots__ = ()

    def custom_method(self) -> bool:
        """Custom method."""
        return True


@dataclass
class Image(AsDataArray):
    """2D image as DataArray."""

    data: Data[tuple[X, Y], float]
    x: Coord[X, int] = 0
    y: Coord[Y, int] = 0

    __dataoptions__ = DataOptions(Custom)


image = Image.ones([3, 3])
isinstance(image, Custom)  # True
image.custom_method()  # True
```

DataArray and Dataset creation without shorthands

xarray-dataclass provides two functions, asdataarray and asdataset. They are useful when users do not want a DataArray or Dataset dataclass to inherit the mix-in class (AsDataArray or AsDataset). For example:

```python
from xarray_dataclass import asdataarray


@dataclass
class Image:
    """2D image as DataArray."""

    data: Data[tuple[X, Y], float]
    x: Coord[X, int] = 0
    y: Coord[Y, int] = 0


image = asdataarray(Image([[0, 1], [2, 3]], [0, 1], [0, 1]))
```

How to contribute

Thank you for being willing to contribute! If you have some ideas to propose, please open an issue. We use GitHub flow for developing and managing the project. The first section describes how to contribute with it. The second and third sections explain how to prepare a local development environment and our automated workflows in GitHub Actions, respectively.

Get the source code

    git clone https://github.com/xarray-contrib/xarray-dataclass
    cd xarray-dataclass

Install dependencies

First install pixi. Then, install project dependencies:

    pixi install -a
    pixi run -e dev pre-commit install

Testing, linting, and formatting

We have a test workflow for testing and a pre-commit workflow for static type checking, linting, and formatting the code. Both run when a pull request is created against main. To run the same checks locally, use the following commands; they are almost equivalent, the difference being that the test workflow runs under multiple Python versions. Because these tasks are defined only in the dev environment, Pixi does not require you to specify the environment.

    pixi run tests
    pixi run precommit  # This runs pre-commit on all files.

Creating documentation

We also have a documentation workflow. To build the documentation locally, run the following:

    pixi run doc_build  # this just creates the build
    pixi run doc_serve  # build and serve at http://localhost:8000/

Create a release

This section is relevant only for maintainers.

  1. Pull git's main branch.
  2. pixi install -a
  3. pixi run -e dev pre-commit install
  4. pixi run tests
  5. pixi shell
  6. hatch version <new-version>
  7. git add .
  8. git commit -m "ENH: Bump version to <new-version>"
  9. hatch build
  10. hatch publish
  11. git push upstream main
  12. Create a new tag and Release via the GitHub UI. Auto-generate release notes and add additional notes as needed.

Owner

  • Name: xarray-contrib
  • Login: xarray-contrib
  • Kind: organization

xarray compatible projects

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."

title: "xarray-dataclass"
abstract: "xarray data creation by data classes"
version: 3.0.0
date-released: 2025-07-30
license: "MIT"
doi: "10.5281/zenodo.16604747"
url: "https://github.com/xarray-contrib/xarray-dataclass"
authors:
  - given-names: "Akio"
    family-names: "Taniguchi"
    affiliation: "Nagoya University"
    orcid: "https://orcid.org/0000-0002-9695-6183"
  - given-names: "Wouter-Michiel"
    family-names: "Vierdag"
    affiliation: "European Molecular Biology Laboratory"
    orcid: "https://orcid.org/0000-0003-1666-5421"
  - given-names: "Matthew"
    family-names: "McCormick"
    affiliation: "Fideus Labs"
    orcid: "https://orcid.org/0000-0001-9475-3756"

GitHub Events

Total
  • Create event: 6
  • Release event: 2
  • Issues event: 7
  • Watch event: 4
  • Member event: 1
  • Issue comment event: 9
  • Push event: 38
  • Pull request event: 8
  • Pull request review event: 2
  • Fork event: 1
Last Year
  • Create event: 6
  • Release event: 2
  • Issues event: 7
  • Watch event: 4
  • Member event: 1
  • Issue comment event: 9
  • Push event: 38
  • Pull request event: 8
  • Pull request review event: 2
  • Fork event: 1

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 666
  • Total Committers: 6
  • Avg Commits per committer: 111.0
  • Development Distribution Score (DDS): 0.032
Past Year
  • Commits: 32
  • Committers: 2
  • Avg Commits per committer: 16.0
  • Development Distribution Score (DDS): 0.438
Top Committers
Name Email Commits
Akio Taniguchi t****i@a****p 645
Wouter-Michiel Vierdag m****g@e****e 14
Sohum Banerjea s****b@g****m 3
Shaun Cutts s****n@c****t 2
dependabot[bot] 4****] 1
Matt McCormick m****t@m****m 1
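
The all-time averages and the Development Distribution Score (DDS) listed above can be reproduced from the committer table. The DDS definition used here, one minus the top committer's share of all commits, is an assumption: the page does not state it, although it does reproduce the all-time value. The function names are hypothetical.

```python
def average_commits(total_commits: int, committers: int) -> float:
    """Mean number of commits per committer."""
    return total_commits / committers


def development_distribution_score(top_commits: int, total_commits: int) -> float:
    """Assumed definition: 1 - (top committer's commits / total commits)."""
    return 1 - top_commits / total_commits


# All-time figures: 666 commits, 6 committers, top committer with 645 commits.
print(round(average_commits(666, 6), 1))                     # 111.0
print(round(development_distribution_score(645, 666), 3))    # 0.032

# Past-year average: 32 commits by 2 committers.
print(round(average_commits(32, 2), 1))                      # 16.0
```

A score near 0 means one person wrote almost all commits, which matches the all-time table, where a single committer accounts for 645 of 666 commits.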
Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 6
  • Total pull requests: 14
  • Average time to close issues: 4 days
  • Average time to close pull requests: about 12 hours
  • Total issue authors: 2
  • Total pull request authors: 3
  • Average comments per issue: 0.33
  • Average comments per pull request: 0.71
  • Merged pull requests: 10
  • Bot issues: 0
  • Bot pull requests: 2
Past Year
  • Issues: 6
  • Pull requests: 14
  • Average time to close issues: 4 days
  • Average time to close pull requests: about 12 hours
  • Issue authors: 2
  • Pull request authors: 3
  • Average comments per issue: 0.33
  • Average comments per pull request: 0.71
  • Merged pull requests: 10
  • Bot issues: 0
  • Bot pull requests: 2
Top Authors
Issue Authors
  • melonora (5)
  • thewtex (1)
Pull Request Authors
  • melonora (7)
  • thewtex (5)
  • pre-commit-ci[bot] (2)
Top Labels
Issue Labels
documentation (1)
Pull Request Labels

Dependencies

.github/workflows/gh-pages.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
  • peaceiris/actions-gh-pages v3 composite
.github/workflows/tests.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
pyproject.toml pypi
  • numpy >=2.0.0,<3
  • typing-extensions >=4.10.0,<5
  • xarray >=2022.3,<2026