dnaio

Efficiently read and write sequencing data from Python

https://github.com/marcelm/dnaio

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.9%) to scientific vocabulary

Keywords

bioinformatics python
Last synced: 6 months ago

Repository


Basic Info
Statistics
  • Stars: 63
  • Watchers: 5
  • Forks: 9
  • Open Issues: 8
  • Releases: 0
Created over 7 years ago · Last pushed 11 months ago
Metadata Files
Readme Changelog License Citation

README.rst

.. image:: https://zenodo.org/badge/DOI/10.5281/zenodo.10548864.svg
  :target: https://doi.org/10.5281/zenodo.10548864

.. image:: https://github.com/marcelm/dnaio/workflows/CI/badge.svg
    :alt: GitHub Actions badge

.. image:: https://img.shields.io/pypi/v/dnaio.svg?branch=main
    :target: https://pypi.python.org/pypi/dnaio
    :alt: PyPI badge

.. image:: https://codecov.io/gh/marcelm/dnaio/branch/master/graph/badge.svg
    :target: https://codecov.io/gh/marcelm/dnaio
    :alt: Codecov badge

===========================================
dnaio processes FASTQ, FASTA and uBAM files
===========================================

``dnaio`` is a Python 3.9+ library for efficiently parsing and writing FASTQ and FASTA files.
Since version 1.1.0, ``dnaio`` can also parse unaligned BAM (uBAM) files efficiently.
This allows reading Oxford Nanopore Technologies (ONT) files from the dorado
basecaller directly.

The code was previously part of the
Cutadapt tool and has been improved significantly since it was split out.

Example usage
=============

The main interface is the ``dnaio.open`` function::

    import dnaio

    with dnaio.open("reads.fastq.gz") as f:
        bp = 0
        for record in f:
            bp += len(record)
    print(f"The input file contains {bp/1E6:.1f} Mbp")

For more, see the tutorial and API documentation at
https://dnaio.readthedocs.io/.

Installation
============

Using pip::

    pip install dnaio zstandard

``zstandard`` can be omitted if support for Zstandard (``.zst``) files is not required.

Features and supported file types
=================================

- FASTQ input and output
- FASTA input and output
- BAM input
- Compressed input and output (``.gz``, ``.bz2``, ``.xz`` and ``.zst`` are detected automatically)
- Paired-end data in two files
- Interleaved paired-end data in a single file
- Files with DOS/Windows linebreaks can be read
- FASTQ files with a second header line (after the ``+``) are supported
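
To illustrate three of the features above (compression detected from the file
extension, DOS/Windows linebreaks, and a repeated header after the ``+``), here
is a minimal standard-library sketch. It is not how ``dnaio`` is implemented:
``open_by_extension`` and ``read_fastq`` are hypothetical helper names, and
``.zst`` is left out because it needs the third-party ``zstandard`` package::

    import bz2
    import gzip
    import lzma
    from pathlib import Path

    # Hypothetical mapping from file extension to a stdlib opener.
    _OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


    def open_by_extension(path, mode="rt"):
        """Open a plain or compressed text file, chosen by its last extension."""
        opener = _OPENERS.get(Path(path).suffix, open)
        # newline="" disables newline translation, so "\r\n" is stripped explicitly below.
        return opener(path, mode, newline="")


    def read_fastq(path):
        """Yield (name, sequence, qualities) from a strict four-line FASTQ file."""
        with open_by_extension(path) as f:
            while True:
                header = f.readline()
                if not header:
                    return  # end of file
                lines = [header] + [f.readline() for _ in range(3)]
                # Removing "\r\n" handles both Unix and DOS/Windows linebreaks.
                name, seq, plus, qual = (line.rstrip("\r\n") for line in lines)
                # The third line may be a bare "+" or "+" followed by a second header.
                if not name.startswith("@") or not plus.startswith("+"):
                    raise ValueError(f"not a four-line FASTQ record: {name!r}")
                if len(seq) != len(qual):
                    raise ValueError(f"length mismatch in record {name!r}")
                yield name[1:], seq, qual

With ``dnaio`` itself none of this is necessary, since ``dnaio.open`` covers
these cases; the sketch only makes the record layout explicit.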

Limitations
===========

- Multi-line FASTQ files are not supported
- The library focuses on FASTQ and uBAM parsing; the FASTA parser is less optimized
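
Multi-line (wrapped) FASTQ is awkward to parse because a quality line may
itself start with ``@``, so record boundaries can only be found by counting
quality characters against the sequence length. If such a file has to be
processed, one option is to rewrite it into the four-line layout first. The
following is a hedged standard-library sketch of that step; ``unwrap_fastq`` is
a hypothetical helper and not part of ``dnaio``::

    def unwrap_fastq(lines):
        """Yield (header, sequence, "+", qualities) tuples from wrapped FASTQ lines."""
        it = (line.rstrip("\r\n") for line in lines)
        for header in it:
            if not header:
                continue  # tolerate blank lines between records
            if not header.startswith("@"):
                raise ValueError(f"expected '@' header, got {header!r}")
            # Collect sequence lines until the '+' separator line.
            seq_parts = []
            for line in it:
                if line.startswith("+"):
                    break
                seq_parts.append(line)
            else:
                raise ValueError(f"missing '+' line in record {header!r}")
            seq = "".join(seq_parts)
            # Collect quality lines until they cover the whole sequence.
            qual = ""
            while len(qual) < len(seq):
                try:
                    qual += next(it)
                except StopIteration:
                    raise ValueError(f"truncated qualities in {header!r}") from None
            if len(qual) != len(seq):
                raise ValueError(f"length mismatch in record {header!r}")
            yield header, seq, "+", qual

Each yielded tuple can be joined with newlines and written out, giving a file
in the four-line layout.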

Links
=====

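* `Documentation <https://dnaio.readthedocs.io/>`_
* `Source code <https://github.com/marcelm/dnaio/>`_
* `Report an issue <https://github.com/marcelm/dnaio/issues>`_
* `Project page on PyPI <https://pypi.python.org/pypi/dnaio>`_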

Owner

  • Name: Marcel Martin
  • Login: marcelm
  • Kind: user
  • Location: Stockholm

Citation (CITATION.cff)

cff-version: 1.2.0
title: dnaio
type: software
authors:
  - given-names: Marcel
    family-names: Martin
    orcid: 'https://orcid.org/0000-0002-0680-200X'
  - given-names: Ruben Harmen Paul
    family-names: Vorderman
    orcid: 'https://orcid.org/0000-0002-8813-1528'
identifiers:
  - type: doi
    value: 10.5281/zenodo.10548864
repository-code: 'https://github.com/marcelm/dnaio/'
url: 'https://dnaio.readthedocs.io/'
license: MIT

GitHub Events

Total
  • Issues event: 1
  • Watch event: 8
  • Delete event: 5
  • Issue comment event: 33
  • Push event: 13
  • Pull request review comment event: 2
  • Pull request review event: 2
  • Pull request event: 12
  • Create event: 6
Last Year
  • Issues event: 1
  • Watch event: 8
  • Delete event: 5
  • Issue comment event: 33
  • Push event: 13
  • Pull request review comment event: 2
  • Pull request review event: 2
  • Pull request event: 12
  • Create event: 6

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 693
  • Total Committers: 6
  • Avg Commits per committer: 115.5
  • Development Distribution Score (DDS): 0.486
Past Year
  • Commits: 27
  • Committers: 3
  • Avg Commits per committer: 9.0
  • Development Distribution Score (DDS): 0.407
Top Committers
Name                 Email          Commits
Ruben Vorderman      r****n@l****l  356
Marcel Martin        m****n@s****e  329
Bede Constantinides  b****c@g****m  5
Étienne Mollier      e****r@d****g  1
Luchao Qi            4****i         1
Black Robot                         1

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 44,634 last-month
  • Total docker downloads: 132,207
  • Total dependent packages: 16
    (may contain duplicates)
  • Total dependent repositories: 27
    (may contain duplicates)
  • Total versions: 32
  • Total maintainers: 3
pypi.org: dnaio

Read and write FASTA and FASTQ files efficiently

  • Versions: 27
  • Dependent Packages: 15
  • Dependent Repositories: 27
  • Downloads: 44,634 Last month
  • Docker Downloads: 132,207
Rankings
Docker downloads count: 0.9%
Dependent packages count: 1.3%
Average: 1.8%
Downloads: 2.0%
Dependent repos count: 2.8%
Maintainers (2)
Last synced: 6 months ago
spack.io: py-dnaio

Read and write FASTQ and FASTA

  • Versions: 5
  • Dependent Packages: 1
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Average: 19.5%
Stargazers count: 23.2%
Forks count: 26.7%
Dependent packages count: 28.1%
Maintainers (1)
Last synced: 6 months ago

Dependencies

doc/requirements.txt pypi
  • furo *
  • sphinx_issues *
.github/workflows/ci.yml actions
  • actions/checkout v3 composite
  • actions/download-artifact v2 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v2 composite
  • codecov/codecov-action v3 composite
  • pypa/cibuildwheel v2.11.2 composite
  • pypa/gh-action-pypi-publish v1.5.1 composite
pyproject.toml pypi
  • xopen >= 1.4.0
setup.py pypi