https://github.com/biojulia/readdatastores.jl

Datastores for reads, not your papa's FASTQ files.

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.1%) to scientific vocabulary

Keywords

bam-files bam-format bioinformatics bioinformatics-data biojulia biology fastq fastq-files fastq-format files format genomics genomics-data sam-bam sam-file sam-files sequencing storage
Last synced: 5 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: BioJulia
  • License: mit
  • Language: Julia
  • Default Branch: master
  • Homepage:
  • Size: 618 KB
Statistics
  • Stars: 11
  • Watchers: 6
  • Forks: 4
  • Open Issues: 2
  • Releases: 5
Created over 6 years ago · Last pushed over 2 years ago
Metadata Files
Readme Funding License Codeowners

README.md

ReadDatastores

Description

Not your papa's FASTQ files.

ReadDatastores provides a set of datastore types for storing read datasets on disk and randomly accessing their sequences. Each datastore type is optimised for the kind of read data it stores.

Using these datastores gives better performance than using text files that store reads (see FASTX.jl, XAM.jl, etc.), because the sequences are already stored in the succinct bit encodings of BioSequences.jl, and the preset formats and layouts of the binary files mean there is no need to constantly validate the input.
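To illustrate why a succinct bit encoding saves space compared with text, here is a minimal sketch in plain Julia that packs DNA bases into 2 bits each. It does not use BioSequences.jl or ReadDatastores: the names `encode_base`, `pack2bit` and `unpack2bit`, and the A/C/G/T to 0/1/2/3 mapping, are assumptions made for illustration and are not the package's actual encoding or on-disk format.

```julia
# Hypothetical 2-bit code for unambiguous DNA: A=0, C=1, G=2, T=3.
const BASES = ['A', 'C', 'G', 'T']

function encode_base(b::Char)
    i = findfirst(==(b), BASES)
    i === nothing && throw(ArgumentError("unsupported base: $b"))
    return UInt64(i - 1)
end

# Pack a DNA string into 64-bit words, 32 bases per word.
function pack2bit(seq::AbstractString)
    words = zeros(UInt64, cld(length(seq), 32))
    for (i, b) in enumerate(seq)
        word = (i - 1) ÷ 32 + 1        # which word holds base i
        shift = 2 * ((i - 1) % 32)     # bit offset inside that word
        words[word] |= encode_base(b) << shift
    end
    return words
end

# Recover the original string from the packed words and the base count.
function unpack2bit(words::Vector{UInt64}, n::Integer)
    chars = Vector{Char}(undef, n)
    for i in 1:n
        word = (i - 1) ÷ 32 + 1
        shift = 2 * ((i - 1) % 32)
        chars[i] = BASES[((words[word] >> shift) & 0x03) + 1]
    end
    return join(chars)
end

seq = "ACGT"^25                  # 100 bases, 100 bytes as ASCII text
packed = pack2bit(seq)           # 4 words of 8 bytes each
@assert unpack2bit(packed, length(seq)) == seq
```

In this sketch the 100-base sequence occupies 32 bytes instead of 100, and decoding needs no parsing or validation because every 2-bit pattern is a valid base.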

  • A paired read datastore is provided for paired-end reads and long mate-pairs (Illumina MiSeq etc.).
  • A long read datastore is provided for long reads (Nanopore, PacBio etc.).
  • A linked read datastore is provided for shorter reads that are linked or grouped using some additional (typically proximity-based) tag (e.g. 10x).

These datastores can also be buffered, trading some RAM for faster iteration and sequential access to the reads in the datastore.
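The trade-off between random access and buffered sequential access can be sketched with plain Julia IO. This is only a sketch under stated assumptions: the fixed 8-byte record layout and the names `read_record` and `each_record_buffered` are hypothetical, and they do not reflect the package's real file format or API.

```julia
const RECORD_BYTES = 8  # hypothetical fixed record width

# Random access: one seek and one small read per record.
function read_record(io::IO, i::Integer)
    seek(io, (i - 1) * RECORD_BYTES)   # stream positions are zero-based
    return read(io, RECORD_BYTES)
end

# Buffered sequential access: pull up to `bufrecords` records per read call,
# then hand each record to `f` from memory.
function each_record_buffered(f, io::IO, nrecords::Integer; bufrecords::Integer = 1024)
    seekstart(io)
    done = 0
    while done < nrecords
        k = min(bufrecords, nrecords - done)
        chunk = read(io, k * RECORD_BYTES)
        for j in 1:k
            f(view(chunk, ((j - 1) * RECORD_BYTES + 1):(j * RECORD_BYTES)))
        end
        done += k
    end
end

# Demo on an in-memory stream holding 5 records, each filled with its index.
io = IOBuffer(reduce(vcat, [fill(UInt8(i), RECORD_BYTES) for i in 1:5]))
@assert read_record(io, 3) == fill(0x03, RECORD_BYTES)

firsts = UInt8[]
each_record_buffered(r -> push!(firsts, r[1]), io, 5; bufrecords = 2)
@assert firsts == UInt8[1, 2, 3, 4, 5]
```

A larger `bufrecords` means fewer read calls at the cost of a bigger in-memory chunk, which is the RAM-for-speed exchange described above.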

Installation

You can install ReadDatastores from the Julia REPL. Press ] to enter pkg mode, and enter the following:

    add ReadDatastores

If you are interested in the cutting edge of development, check out the master branch to try new features before release.

Testing

ReadDatastores is tested against Julia 1.X on Linux, OS X, and Windows.

Contributing

We appreciate contributions from users including reporting bugs, fixing issues, improving performance and adding new features.

Take a look at the contributing files for detailed contributor and maintainer guidelines, and the code of conduct.

Financial contributions

We also welcome financial contributions in full transparency on our open collective. Anyone can file an expense. If the expense makes sense for the development of the community, it will be "merged" in the ledger of our open collective by the core contributors and the person who filed the expense will be reimbursed.

Backers & Sponsors

Thank you to all our backers and sponsors!

Love our work and community? Become a backer.

Does your company use BioJulia? Help keep BioJulia feature rich and healthy by sponsoring the project. Your logo will show up here with a link to your website.

Questions?

If you have a question about contributing or using BioJulia software, come on over and chat to us on Gitter, or you can try the Bio category of the Julia discourse site.

Owner

  • Name: BioJulia
  • Login: BioJulia
  • Kind: organization

Bioinformatics and Computational Biology in Julia

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 41
  • Total Committers: 3
  • Avg Commits per committer: 13.667
  • Development Distribution Score (DDS): 0.073
Past Year
  • Commits: 3
  • Committers: 2
  • Avg Commits per committer: 1.5
  • Development Distribution Score (DDS): 0.333
Top Committers
Name | Email | Commits
Ben J. Ward | b****d@p****m | 38
Hiroki Ban | 6****o | 2
Michael Persico | m****o@g****m | 1

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 5
  • Total pull requests: 15
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 6 days
  • Total issue authors: 4
  • Total pull request authors: 4
  • Average comments per issue: 1.0
  • Average comments per pull request: 1.07
  • Merged pull requests: 13
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • SabrinaJaye (2)
  • Zymergen-phaverty (1)
  • AntonOresten (1)
  • jakobnissen (1)
Pull Request Authors
  • SabrinaJaye (11)
  • banhbio (2)
  • gitter-badger (1)
  • M-PERSIC (1)
Top Labels
Issue Labels
Pull Request Labels
  • enhancement (3)

Packages

  • Total packages: 1
  • Total downloads:
    • julia 1 total
  • Total dependent packages: 2
  • Total dependent repositories: 0
  • Total versions: 5
juliahub.com: ReadDatastores

  • Versions: 5
  • Dependent Packages: 2
  • Dependent Repositories: 0
  • Downloads: 1 Total
Rankings
Dependent repos count: 9.9%
Dependent packages count: 16.6%
Average: 22.6%
Forks count: 28.1%
Stargazers count: 35.6%
Last synced: 6 months ago