Keywords
ome-zarr
on-disk
out-of-memory
r
r-package
zarr
Keywords from Contributors
bioconductor-packages
Repository
A simple native R reader for Zarr Arrays
Basic Info
- Host: GitHub
- Owner: Huber-group-EMBL
- License: other
- Language: R
- Default Branch: devel
- Homepage: https://bioconductor.org/packages/Rarr/
- Size: 1.17 MB
Statistics
- Stars: 45
- Watchers: 7
- Forks: 7
- Open Issues: 12
- Releases: 0
Created about 3 years ago
· Last pushed 6 months ago
README.Rmd
---
title: "Zarr arrays with Rarr"
author: "Mike L. Smith"
output:
  github_document:
    toc: true
    toc_depth: 2
---
| GitHub Actions | Bioconductor Build System | Test Coverage |
|:--------------:|:-------------:|:-----:|
| [workflow status](https://github.com/grimbough/Rarr/actions/workflows/main.yml) | [build results](https://bioconductor.org/checkResults/devel/bioc-LATEST/Rarr/) | [coverage report](https://app.codecov.io/gh/grimbough/Rarr?branch=devel) |
```{r, config, echo = FALSE}
knitr::opts_chunk$set(fig.path = "inst/rmd/imgs/", dev = "jpeg")
```
# Introduction to Rarr
The Zarr specification defines a format for chunked, compressed, N-dimensional
arrays. Its design allows efficient access to subsets of the stored array, and
supports both local and cloud storage systems. Zarr is seeing increasing
adoption in a number of scientific fields where multi-dimensional data are
prevalent.
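
To make the idea of chunked access concrete, here is a minimal sketch of how an element's position maps to the chunk that stores it. It assumes the Zarr version 2 convention of zero-based chunk indices joined by a `.` separator; `chunk_key()` is a hypothetical helper written for this illustration and is not part of **Rarr**.

```r
# Hypothetical helper (not a Rarr function): find the Zarr v2 chunk key
# that holds the element at a given 1-based R index.
chunk_key <- function(index, chunk_dim, separator = ".") {
  stopifnot(length(index) == length(chunk_dim))
  # Convert the 1-based R index into a zero-based chunk index per dimension.
  chunk_index <- (index - 1) %/% chunk_dim
  paste(chunk_index, collapse = separator)
}

# A 2-d array stored in chunks of 10 rows by 5 columns.
key_a <- chunk_key(c(1, 1), chunk_dim = c(10, 5))
key_b <- chunk_key(c(25, 7), chunk_dim = c(10, 5))
print(c(key_a, key_b))
```

Only the chunks whose keys cover the requested indices need to be read and decompressed, which is what makes subsetting a large array cheap.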
**Rarr** is intended to be a simple interface for reading and writing individual
Zarr arrays. It is developed in R and C, with no reliance on external libraries
or APIs for interfacing with the Zarr arrays. Additional compression libraries
(e.g. blosc) are bundled with **Rarr** to provide support for datasets
compressed using these tools.
## Limitations with **Rarr**
**Rarr currently only works with [Zarr specification version 2](https://zarr-specs.readthedocs.io/en/latest/v2/v2.0.html).
Support for [version 3](https://zarr-specs.readthedocs.io/en/latest/v3/core/index.html) is actively being worked on.**
If you know about Zarr arrays already, you'll probably be aware they can be
stored in hierarchical groups, where additional metadata can explain the
relationship between the arrays. Currently, **Rarr** is not designed to be
aware of these hierarchical Zarr array collections. However, the component
arrays can be read individually by providing the path to them directly.
Currently, there are also limitations on the Zarr datatypes that can be accessed
using **Rarr**. For now, most numeric types can be read into R, although in some
instances (e.g. 64-bit integers) there is potential for loss of information.
Writing is more limited, with support only for datatypes that are supported
natively in R and only using the column-first representation.
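
As a reminder of what "column-first" means, the short base R sketch below compares the order in which R stores a matrix with the row-first order used by some other tools. It only illustrates R's own memory layout and makes no claim about how **Rarr** writes data internally.

```r
# R stores arrays column-first: the first index varies fastest.
x <- matrix(1:6, nrow = 2)

# Flattening a matrix returns the values in R's native column-first order.
column_first <- as.vector(x)

# Transposing first gives the order a row-first layout would use.
row_first <- as.vector(t(x))

print(column_first)
print(row_first)
```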
# Quick start guide
```{r, child=c('inst/rmd/quick_start.Rmd')}
```
# Current Status
## Reading and Writing
Reading Zarr arrays is reasonably well supported. Writing is available, but is more limited. Both aspects are under active development.
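
A minimal round-trip sketch is given below. It assumes that **Rarr** is installed and that `write_zarr_array()` and `read_zarr_array()` accept the arguments shown; please check the package documentation for the authoritative signatures. The array contents, chunk shape, and temporary path are arbitrary choices for illustration.

```r
# Sketch only: skipped entirely when Rarr is not installed.
rarr_available <- requireNamespace("Rarr", quietly = TRUE)

if (rarr_available) {
  # A small integer array; int32 is listed as supported for writing.
  x <- array(1:24, dim = c(4, 3, 2))
  path <- tempfile(fileext = ".zarr")

  # Write the array in chunks of 2 x 3 x 1 elements (arbitrary choice).
  Rarr::write_zarr_array(x = x, zarr_array_path = path, chunk_dim = c(2, 3, 1))

  # Read back rows 1-2, all columns, and the first slice.
  part <- Rarr::read_zarr_array(path, index = list(1:2, NULL, 1))
  print(dim(part))
}
```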
### Data Types
Currently there is only support for reading and writing a subset of the possible datatypes
that can be found in a Zarr array. In some instances there are also limitations on the
datatypes natively supported by R, requiring conversion from the Zarr datatype. The table below summarises the current status of
datatype support. It will be updated as progress is made.
| Zarr Data Type | Status (reading / writing) | Notes |
|-----------|:--------------:|-------|
|`boolean` | ✔ / ❌ | |
|`int8` | ✔ / ❌ | |
|`uint8` | ✔ / ❌ | |
|`int16` | ✔ / ❌ | |
|`uint16`| ✔ / ❌ | |
|`int32` | ✔ / ✔ | |
|`uint32`| ✔ / ❌ | Values outside the range of `int32` are converted to `NA`. Future plan is to allow conversion to `double` or use the [bit64](https://cran.r-project.org/package=bit64) package. |
|`int64` | ✔ / ❌ | Values outside the range of `int32` are converted to `NA`. Future plan is to allow conversion to `double` or use the [bit64](https://cran.r-project.org/package=bit64) package. |
|`uint64`| ✔ / ❌ | Values outside the range of `int32` are converted to `NA`. Future plan is to allow conversion to `double` or use the [bit64](https://cran.r-project.org/package=bit64) package. |
|`half` / `float16` | ✔ / ❌ | Converted to `double` in R. No effort is made to assess loss of precision due to conversion. |
|`single` / `float32` | ✔ / ❌ | Converted to `double` in R. No effort is made to assess loss of precision due to conversion. |
|`double` / `float64` | ✔ / ✔ | |
|`complex` | ❌ / ❌ | |
|`timedelta` | ❌ / ❌ | |
|`datetime` | ❌ / ❌ | |
|`string` | ✔ / ✔ | |
|`Unicode` | ✔ / ✔ | |
|`void *` | ❌ / ❌ | |
| Structured data types | ❌ / ❌ | |
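
The `NA` behaviour noted for `uint32`, `int64` and `uint64` follows from R's integer type being a signed 32-bit value. The base R sketch below shows that limit, and why `double` is a plausible fallback for moderately large values. It demonstrates R's own conversion rules only and is not a description of the code path **Rarr** uses.

```r
# R integers are signed 32-bit values.
int32_max <- .Machine$integer.max

# The largest uint32 value fits in a double but not in an R integer.
uint32_max <- 2^32 - 1
as_integer <- suppressWarnings(as.integer(uint32_max))
print(is.na(as_integer))

# A double represents whole numbers exactly only up to 2^53, so very
# large 64-bit integers would still lose information.
still_exact <- (2^53 - 1) != 2^53
collides <- (2^53 + 1) == 2^53
print(c(still_exact, collides))
```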
### Compression Tools
| Compression Tool | Status (reading / writing) | Notes |
|-------------|:-------------------:|-------|
|`zlib / gzip`| ✔ / ✔ | Only system default compression level (normally 6) is enabled for writing. |
|`bzip2` | ✔ / ✔ | Only compression level 9 is enabled for writing. |
|`blosc` | ✔ / ✔ | Only `lz4` compression level 5 is enabled for writing. |
|`LZMA` | ✔ / ✔ | |
|`LZ4` | ✔ / ✔ | |
|`Zstd` | ✔ / ✔ | |
Please open an [issue](https://github.com/grimbough/Rarr/issues) if support for a required compression tool is missing.
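
To illustrate what a compressor does to a single chunk, the sketch below compresses and restores a block of bytes with base R's `memCompress()` and `memDecompress()`. This is a stand-in for the general idea; it is not the bundled code that **Rarr** uses, and the chunk contents are made up for the example.

```r
# A made-up chunk of 1000 highly repetitive integers.
chunk <- rep(1:4, times = 250L)

# Serialise to raw bytes (4 bytes per integer), then compress with gzip.
raw_chunk <- writeBin(chunk, raw())
compressed <- memCompress(raw_chunk, type = "gzip")

# Decompress and convert the bytes back into integers.
restored <- readBin(
  memDecompress(compressed, type = "gzip"),
  what = "integer",
  n = length(chunk)
)

print(c(uncompressed = length(raw_chunk), compressed = length(compressed)))
```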
### Filters
There is currently no support for additional filters. Please open an [issue](https://github.com/grimbough/Rarr/issues) if you require filter support.
# Required system libraries
To provide support for the BLOSC and zstd compression tools, Rarr links against libraries providing these tools. If you have them installed on your system, Rarr will attempt to use those versions. If they are not detected, Rarr will compile and use the versions that are distributed with the package. Either way the functionality will be available; however, if you are using the system libraries and later remove them, Rarr may fail to work correctly.
This only concerns users installing the package from source. If you are using the pre-built binaries for Windows or macOS distributed by Bioconductor, this should not be an issue for you.
Owner
- Name: EMBL Huber Group
- Login: Huber-group-EMBL
- Kind: organization
- Website: www.huber.embl.de
- Repositories: 4
- Profile: https://github.com/Huber-group-EMBL
Repository for joint projects by members and ex-members of the Huber group at EMBL. Please see also individuals' pages for further repositories.
GitHub Events
Total
- Issues event: 5
- Watch event: 3
- Delete event: 9
- Issue comment event: 26
- Push event: 22
- Pull request review event: 7
- Pull request event: 20
- Pull request review comment event: 7
- Fork event: 1
- Create event: 8
Committers
Top Committers
| Name | Email | Commits |
|---|---|---|
| Mike Smith | g****h@g****m | 295 |
| J Wokaty | j****y | 8 |
| Hugo Gruson | 1****o | 1 |
Issues and Pull Requests
All Time
- Total issues: 5
- Total pull requests: 12
- Average time to close issues: about 2 years
- Average time to close pull requests: about 4 hours
- Total issue authors: 5
- Total pull request authors: 2
- Average comments per issue: 1.2
- Average comments per pull request: 0.0
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 3
- Pull requests: 12
- Average time to close issues: N/A
- Average time to close pull requests: about 4 hours
- Issue authors: 3
- Pull request authors: 2
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- DyfanJones (1)
- Bisaloo (1)
- Artur-man (1)
- giovp (1)
- jkh1 (1)
Pull Request Authors
- Bisaloo (11)
- sharlagelfand (1)
Dependencies
.github/workflows/main.yml
actions
- actions/checkout v3 composite
- actions/upload-artifact v3 composite
- grimbough/bioc-actions/build-install-check v1 composite
- grimbough/bioc-actions/run-BiocCheck v1 composite
- grimbough/bioc-actions/setup-bioc v1 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml
actions
- actions/checkout v3 composite
- actions/upload-artifact v3 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION
cran
- R.utils * imports
- httr * imports
- jsonlite * imports
- methods * imports
- paws.storage * imports
- stringr * imports
- utils * imports
- BiocStyle * suggests
- covr * suggests
- knitr * suggests
- tinytest * suggests
inst/scripts/Dockerfile
docker
- python 3.10 build