covtobed

covtobed: a simple and fast tool to extract coverage tracks from BAM files - Published in JOSS (2020)

https://github.com/telatin/covtobed

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

alignments bam-files bed bioconda bioinformatics bioinformatics-tool sequence-coverage
Last synced: 6 months ago

Repository

⛰ covtobed | Convert the coverage track from a BAM file into a BED file

Basic Info
  • Host: GitHub
  • Owner: telatin
  • License: mit
  • Language: Makefile
  • Default Branch: master
  • Homepage:
  • Size: 48.1 MB
Statistics
  • Stars: 45
  • Watchers: 1
  • Forks: 3
  • Open Issues: 0
  • Releases: 15
Topics
alignments bam-files bed bioconda bioinformatics bioinformatics-tool sequence-coverage
Created about 6 years ago · Last pushed 8 months ago
Metadata Files
Readme Contributing License

README.md

covtobed


a tool to generate BED coverage tracks from BAM files

Reads one or more alignment files (sorted BAM) and prints the coverage as a BED file. Consecutive bases with the same coverage are merged into a single interval, and the output can be restricted to regions whose coverage falls within a given range.
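The merging of consecutive bases can be pictured as a run-length encoding of the per-base depth. The following is a minimal Python sketch of that idea, not the tool's actual C++ implementation; the function name `coverage_to_bed` and the input list are hypothetical and used only for illustration.

```python
from itertools import groupby


def coverage_to_bed(chrom, depths):
    """Collapse per-base depths into BED-like intervals.

    Returns (chrom, start, end, depth) tuples using 0-based,
    half-open coordinates, as in the BED format.
    """
    intervals = []
    pos = 0
    for depth, run in groupby(depths):
        length = sum(1 for _ in run)  # number of consecutive bases at this depth
        intervals.append((chrom, pos, pos + length, depth))
        pos += length
    return intervals


# Hypothetical per-base depths for the first 12 bases of a reference
depths = [0, 0, 1, 1, 1, 1, 2, 3, 3, 3, 3, 3]
for chrom, start, end, depth in coverage_to_bed("NC_001416.1", depths):
    print(chrom, start, end, depth, sep="\t")
```

Each printed line describes one run of bases that share the same depth, which is the shape of the records covtobed writes.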

Read more in the wiki, which is the main documentation source.

Features:

  • Can read (sorted) BAMs from a stream (like bwa mem .. | samtools view -b | samtools sort - | covtobed)
  • Can print strand-specific coverage to check for strand imbalance
  • Can print the physical coverage (with paired-end or mate-paired libraries)
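Physical coverage counts the whole fragment spanned by a read pair, including the unsequenced insert between the mates, rather than only the aligned bases. The sketch below illustrates the concept in Python with a difference array; it is an assumption-laden illustration, not covtobed's code, and the function name `physical_coverage` and the fragment coordinates are hypothetical.

```python
def physical_coverage(fragments, ref_len):
    """Per-base physical coverage from fragment spans.

    `fragments` holds (start, end) pairs in 0-based, half-open
    coordinates, each spanning from the leftmost base of one mate
    to the rightmost base of the other. Assumes 0 <= start < end <= ref_len.
    """
    diff = [0] * (ref_len + 1)
    for start, end in fragments:
        diff[start] += 1  # fragment begins covering here
        diff[end] -= 1    # fragment stops covering here
    depths = []
    running = 0
    for delta in diff[:ref_len]:
        running += delta
        depths.append(running)
    return depths


# Two hypothetical overlapping fragments on a 12 bp reference
print(physical_coverage([(0, 10), (5, 12)], 12))
```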

For more features, check the BamToCov suite.

Usage

The complete documentation is available in the GitHub wiki.

Synopsis:

```
Usage: covtobed [options] [BAM]...

Computes coverage from alignments

Options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  --physical-coverage   compute physical coverage (needs paired alignments in
                        input)
  -q MINQ, --min-mapq=MINQ
                        skip alignments whose mapping quality is less than
                        MINQ (default: 0)
  -m MINCOV, --min-cov=MINCOV
                        print BED feature only if the coverage is bigger than
                        (or equal to) MINCOV (default: 0)
  -x MAXCOV, --max-cov=MAXCOV
                        print BED feature only if the coverage is lower than
                        MAXCOV (default: 100000)
  -l MINLEN, --min-len=MINLEN
                        print BED feature only if its length is bigger (or
                        equal to) than MINLELN (default: 1)
  -z MINCTG, --min-ctg-len=MINCTG
                        skip reference sequences having size less or equal to
                        MINCTG
  -d, --discard-invalid-alignments
                        skip duplicates, failed QC, and non primary alignment,
                        minq>0 (or user-defined if higher) (default: enabled)
  --keep-invalid-alignments
                        Keep duplicates, failed QC, and non primary alignment,
                        min=0 (or user-defined if higher) - reverts to legacy
                        behavior
  --output-strands      output coverage and stats separately for each strand
  --format=CHOICE       output format
```
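Read literally, the help text implies that the minimum coverage bound is inclusive, the maximum coverage bound is exclusive, and the minimum length bound is inclusive. The Python predicate below is a sketch of that reading only; the function name `keep_interval` is hypothetical, and how the tool combines these filters internally is an assumption.

```python
def keep_interval(start, end, cov, min_cov=0, max_cov=100000, min_len=1):
    """Return True if an interval passes the -m / -x / -l filters.

    Mirrors the wording of the help text:
      coverage >= MINCOV, coverage < MAXCOV, length >= MINLEN.
    Defaults match the documented defaults.
    """
    length = end - start
    return min_cov <= cov < max_cov and length >= min_len


# With -m 0 -x 5, coverage 4 is kept and coverage 5 is dropped
print(keep_interval(12, 18, 4, min_cov=0, max_cov=5))
print(keep_interval(18, 20, 5, min_cov=0, max_cov=5))
```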

Example

Command (with new default filtering):

```bash
covtobed -m 0 -x 5 test/demo.bam
```

To use legacy behavior (no filtering):

```bash
covtobed --keep-invalid-alignments -m 0 -x 5 test/demo.bam
```

Output:

```text
[...]
NC_001416.1	0	2	0
NC_001416.1	2	6	1
NC_001416.1	6	7	2
NC_001416.1	7	12	3
NC_001416.1	12	18	4
NC_001416.1	169	170	4
NC_001416.1	201	206	4
[...]
```
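Because the output is plain four-column BED (reference, start, end, coverage), it is easy to post-process. The following Python sketch tallies how many bases fall at each coverage level; the function name `bases_per_coverage` is hypothetical, and the sample text simply reuses the excerpt shown in this example.

```python
from collections import Counter


def bases_per_coverage(bed_text):
    """Sum interval lengths per coverage value from covtobed-style BED text."""
    totals = Counter()
    for line in bed_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank lines and elision markers such as "[...]"
        start, end, cov = int(fields[1]), int(fields[2]), int(fields[3])
        totals[cov] += end - start
    return dict(totals)


sample = """\
NC_001416.1\t0\t2\t0
NC_001416.1\t2\t6\t1
NC_001416.1\t6\t7\t2
NC_001416.1\t7\t12\t3
NC_001416.1\t12\t18\t4
NC_001416.1\t169\t170\t4
NC_001416.1\t201\t206\t4
"""
print(bases_per_coverage(sample))
```

Note that the sample is only an excerpt, so the totals describe the listed intervals rather than the whole reference.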

See the full example output from different tools here.

Install

  • To install with Miniconda:

```bash
conda install -c bioconda covtobed
```

  • Both covtobed and the legacy program coverage are available in a single Docker container from Docker Hub:

```bash
sudo docker pull andreatelatin/covtobed
sudo docker run --rm -ti andreatelatin/covtobed coverage -h
```

  • To use Singularity, download the image with singularity pull docker://andreatelatin/covtobed, then:

```bash
singularity exec covtobed.simg coverage -h
```

Important Changes in v1.4.0

Default Behavior Change: Starting with version 1.4.0, covtobed filters invalid alignments (duplicates, failed QC, non-primary alignments) by default, so results exclude these reads unless you opt out.

  • New default: Invalid alignments are discarded (equivalent to using --discard-invalid-alignments)
  • Legacy behavior: Use --keep-invalid-alignments to revert to the old behavior
  • Conflicting flags: Using both --discard-invalid-alignments and --keep-invalid-alignments will result in an error
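When covtobed is called from a script, the flag conflict can be caught before the process is launched. The helper below is a hypothetical Python wrapper (the name `build_covtobed_cmd` and its parameters are not part of covtobed); it only assembles an argument list using the two flags documented above.

```python
def build_covtobed_cmd(bam, keep_invalid=False, discard_invalid=False, extra=()):
    """Assemble a covtobed argument list, rejecting the conflicting flag pair."""
    if keep_invalid and discard_invalid:
        raise ValueError(
            "--keep-invalid-alignments and --discard-invalid-alignments "
            "cannot be combined"
        )
    cmd = ["covtobed"]
    if keep_invalid:
        cmd.append("--keep-invalid-alignments")   # legacy behavior
    if discard_invalid:
        cmd.append("--discard-invalid-alignments")  # explicit form of the default
    cmd.extend(extra)
    cmd.append(bam)
    return cmd


print(build_covtobed_cmd("test/demo.bam", keep_invalid=True, extra=["-m", "0", "-x", "5"]))
```

The resulting list can be passed to `subprocess.run` on a system where covtobed is installed.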

Startup message

When invoked without arguments, covtobed will print a message to inform the user that it is waiting for input from STDIN. To suppress this message, set the environment variable COVTOBED_QUIET to 1.
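In a pipeline driven from Python, the variable can be set for the child process only, leaving the parent environment untouched. This is a small sketch; the helper name `quiet_env` is hypothetical, and only the variable name COVTOBED_QUIET and the value 1 come from the text above.

```python
import os


def quiet_env(base=None):
    """Return a copy of the environment with COVTOBED_QUIET set to "1"."""
    env = dict(os.environ if base is None else base)
    env["COVTOBED_QUIET"] = "1"
    return env


# Show only the variable that was added
print(quiet_env({})["COVTOBED_QUIET"])
```

The returned mapping can be passed as the `env` argument of `subprocess.run` or `subprocess.Popen` when covtobed reads a BAM stream from STDIN.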

Performance

covtobed is generally faster than bedtools. More details are in the benchmark page.

Requirements and compiling

This tool requires libbamtools and zlib.

To manually compile:

```bash
c++ -std=c++17 *.cpp -I/path/to/bamtools/ -L${HOME}/path/to/lib/ -lbamtools -o covtobed
```

Issues, Limitations and how to contribute

  • This program computes coverage from sorted BAM files. The CRAM format is not supported at the moment.
  • If you find a problem, feel free to raise an issue; we will try to address it as soon as possible.
  • Contributions are welcome via pull request.

Acknowledgements

This tool uses libbamtools by Derek Barnett, Erik Garrison, Gabor Marth and Michael Stromberg, and cpp-optparse by Johannes Weißl. Both libraries and this program are released under the MIT license.

Authors

Giovanni Birolo (@gbirolo), University of Turin, and Andrea Telatin (@telatin), Quadram Institute Bioscience.

This program was finalized with a Flexible Talent Mobility Award funded by BBSRC through the Quadram Institute.

Citation

If you use this tool, we would appreciate it if you cited the relevant paper:

Releases from 1.3 onward:

Giovanni Birolo, Andrea Telatin, BamToCov: an efficient toolkit for sequence coverage calculations, Bioinformatics, 2022

Releases up to 1.2:

Birolo et al., (2020). covtobed: a simple and fast tool to extract coverage tracks from BAM files. Journal of Open Source Software, 5(47), 2119, https://doi.org/10.21105/joss.02119

Owner

  • Name: Andrea Telatin
  • Login: telatin
  • Kind: user
  • Location: Norwich, UK
  • Company: Quadram Institute Bioscience

Bioinformatician @quadram-institute-bioscience

JOSS Publication

covtobed: a simple and fast tool to extract coverage tracks from BAM files
Published
March 06, 2020
Volume 5, Issue 47, Page 2119
Authors
Giovanni Birolo
Dept. Medical Sciences, University of Turin, ITALY
Andrea Telatin ORCID
Gut Microbes and Health Programme, Quadram Institute Bioscience, Norwich, UK
Editor
Will Rowe ORCID
Tags
bedtools bamtools genomics bioinformatics target enrichment sequence coverage

GitHub Events

Total
  • Issues event: 1
  • Watch event: 1
  • Push event: 6
  • Create event: 2
Last Year
  • Issues event: 1
  • Watch event: 1
  • Push event: 6
  • Create event: 2

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 289
  • Total Committers: 2
  • Avg Commits per committer: 144.5
  • Development Distribution Score (DDS): 0.028
Past Year
  • Commits: 11
  • Committers: 1
  • Avg Commits per committer: 11.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Andrea Telatin a****a@t****m 281
gbirolo 4****o 8
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 13
  • Total pull requests: 2
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 2 hours
  • Total issue authors: 5
  • Total pull request authors: 2
  • Average comments per issue: 1.69
  • Average comments per pull request: 0.5
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: about 4 hours
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • telatin (8)
  • jdeligt (2)
  • bilgehannevruz (1)
  • meganpartridge (1)
  • brentp (1)
Pull Request Authors
  • codacy-badger (1)
  • telatin (1)
Top Labels
Issue Labels
enhancement (4) joss (3) bug (2)
Pull Request Labels

Dependencies

.github/workflows/c-cpp.yml actions
  • actions/checkout v2 composite
Dockerfile docker
  • ubuntu 14.04 build
.github/workflows/cmake-test.yml actions
  • actions/checkout v4 composite
  • actions/upload-artifact v4 composite