best

Bam Error Stats Tool (best): analysis of error types in aligned reads.

https://github.com/google/best

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.5%) to scientific vocabulary

Keywords

bioinformatics, sequencing
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: google
  • License: MIT
  • Language: Rust
  • Default Branch: main
  • Size: 132 KB
Statistics
  • Stars: 137
  • Watchers: 9
  • Forks: 12
  • Open Issues: 5
  • Releases: 1
Topics
bioinformatics, sequencing
Created about 3 years ago · Last pushed about 1 year ago
Metadata Files
Readme, Contributing, License, Citation

README.md

best

Bam Error Stats Tool (best): analysis of error types in aligned reads.

best is used to assess the quality of reads after aligning them to a reference assembly.

Features

  • Collect overall and per-alignment stats
  • Distribution of indel lengths
  • Yield at different empirical Q-value thresholds
  • Bin per-read stats to easily examine the distribution of errors for certain types of reads
  • Stats for regions specified by intervals (BED file, homopolymer regions, windows, etc.)
  • Stats for quality scores vs empirical Q-values
  • Multithreading for speed
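
As a rough illustration of the empirical Q-value features listed above, the sketch below uses the conventional Phred scaling, Q = -10 * log10(errors / aligned bases). It is an assumption that best counts errors and yield in exactly this way; the function names `empirical_q` and `yield_at_threshold`, the per-read `(errors, bases)` input, and the choice to treat error-free reads as passing every threshold are all hypothetical and not taken from the best source code.

```rust
/// Phred-scaled empirical quality: Q = -10 * log10(errors / bases).
/// Returns None when the value is undefined: no aligned bases, or zero
/// errors (which would give an infinite Q).
fn empirical_q(errors: u64, bases: u64) -> Option<f64> {
    if bases == 0 || errors == 0 {
        return None;
    }
    let error_rate = errors as f64 / bases as f64;
    Some(-10.0 * error_rate.log10())
}

/// Total aligned bases contributed by reads whose empirical Q is at least
/// `min_q`. Each read is given as (errors, bases).
/// Assumption: reads with zero errors pass every threshold.
fn yield_at_threshold(reads: &[(u64, u64)], min_q: f64) -> u64 {
    reads
        .iter()
        .filter(|&&(errors, bases)| {
            bases > 0
                && match empirical_q(errors, bases) {
                    Some(q) => q >= min_q,
                    None => true, // zero errors: treated as passing
                }
        })
        .map(|&(_, bases)| bases)
        .sum()
}
```

For example, a read with 1 error in 1000 aligned bases has an empirical Q of about 30, while a read with 10 errors in 1000 bases has a Q of about 20, so only the first would count toward yield at a threshold of 25.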

Usage

The best Usage Guide gives an overview of how to use best.

Installing

  1. Install Rust.
  2. Clone this repository and navigate into the directory of this repository.
  3. Run cargo install --locked --path .
  4. Run best input.bam reference.fasta prefix/path

This generates stats files whose names begin with prefix/path.

Development

Running

  1. Install Rust.
  2. Clone this repository and navigate into the directory of this repository.
  3. Run cargo build --release
  4. Run cargo run --release -- input.bam reference.fasta prefix/path, or run the built binary directly: target/release/best input.bam reference.fasta prefix/path

This generates stats files whose names begin with prefix/path.

The built binary is located at target/release/best.

Formatting

cargo fmt

Comparing

Pass the -t 1 option so that only one thread is used when comparing outputs during testing. Although best generally tries to keep the order of outputs deterministic with multiple threads, the order of per-alignment stats is arbitrary unless only one thread is used.
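
If outputs from multithreaded runs have to be compared anyway, one possible workaround is to compare the files as unordered collections of lines. The sketch below is not part of best; the helper name `same_lines_ignoring_order` is hypothetical, and it assumes each stats record occupies exactly one line.

```rust
/// Returns true when both texts contain the same lines (including
/// duplicates), regardless of the order in which the lines appear.
fn same_lines_ignoring_order(a: &str, b: &str) -> bool {
    let mut left: Vec<&str> = a.lines().collect();
    let mut right: Vec<&str> = b.lines().collect();
    // Sorting puts both line lists into a canonical order before comparing.
    left.sort_unstable();
    right.sort_unstable();
    left == right
}
```

The file contents could be loaded with `std::fs::read_to_string` and passed to this helper. Running with -t 1 and comparing the files directly remains the simpler option, since it also catches unintended changes in ordering.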

Disclaimer

This is not an official Google product.

The code is not intended for use in any clinical settings. It is not intended to be a medical device and is not intended for clinical use of any kind, including but not limited to diagnosis or prognosis.

No representations or warranties are made with regard to the accuracy of results generated. The user or licensee is responsible for verifying and validating accuracy when using this tool.

Owner

  • Name: Google
  • Login: google
  • Kind: organization
  • Email: opensource@google.com
  • Location: United States of America

Google ❤️ Open Source

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Liu"
  given-names: "Daniel"
  orcid: "https://orcid.org/0000-0002-2385-2957"
- family-names: "Belyaeva"
  given-names: "Anastasiya"
- family-names: "Shafin"
  given-names: "Kishwar"
  orcid: "https://orcid.org/0000-0001-5252-3434"
- family-names: "Chang"
  given-names: "Pi-Chuan"
  orcid: "https://orcid.org/0000-0003-3021-6446"
- family-names: "Carroll"
  given-names: "Andrew"
  orcid: "https://orcid.org/0000-0002-4824-6689"
- family-names: "Cook"
  given-names: "Daniel"
  orcid: "https://orcid.org/0000-0003-3347-562X"
title: "Best: A Tool for Characterizing Sequencing Errors"
version: 0.1.0
doi: 10.1101/2022.12.22.521488
date-released: 2020-12-09
url: "https://github.com/google/best"

GitHub Events

Total
  • Issues event: 1
  • Watch event: 12
  • Issue comment event: 8
  • Push event: 1
  • Pull request event: 2
  • Fork event: 3
Last Year
  • Issues event: 1
  • Watch event: 12
  • Issue comment event: 8
  • Push event: 1
  • Pull request event: 2
  • Fork event: 3

Committers

All Time
  • Total Commits: 120
  • Total Committers: 5
  • Avg Commits per committer: 24.0
  • Development Distribution Score (DDS): 0.133
Past Year
  • Commits: 1
  • Committers: 1
  • Avg Commits per committer: 1.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name                     Email            Commits
Daniel Liu               d****u@g****m    104
danielecook              d****k@g****m    13
Armin Töpfer             a****r@g****m    1
Xuyang Zhou              n****y@g****m    1
Himadri Bhattacharjee    h****5@g****m    1
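
The all-time figures above are consistent with a common definition of the Development Distribution Score: 1 minus the top committer's share of all commits, which gives 1 - 104/120 ≈ 0.133 here. This page does not state the formula, so the definition is an assumption, and the function name below is hypothetical.

```rust
/// Development Distribution Score under the assumed definition:
/// DDS = 1 - (commits by the top committer / total commits).
/// Returns None when there are no committers or no commits.
fn development_distribution_score(commits_per_committer: &[u64]) -> Option<f64> {
    let top = *commits_per_committer.iter().max()?;
    let total: u64 = commits_per_committer.iter().sum();
    if total == 0 {
        return None;
    }
    Some(1.0 - top as f64 / total as f64)
}
```

Under this definition a single-committer period scores 0.0, which matches the Past Year value listed above.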

Issues and Pull Requests

All Time
  • Total issues: 7
  • Total pull requests: 8
  • Average time to close issues: 1 day
  • Average time to close pull requests: 1 day
  • Total issue authors: 7
  • Total pull request authors: 5
  • Average comments per issue: 3.57
  • Average comments per pull request: 1.25
  • Merged pull requests: 8
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 10 days
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 5.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • touala (1)
  • huishao007 (1)
  • avril-m-harder (1)
  • johnchamberlin (1)
  • colindaven (1)
  • angelovangel (1)
  • AntonioBaeza (1)
  • Daniel-Liu-c0deb0t (1)
Pull Request Authors
  • Daniel-Liu-c0deb0t (4)
  • ykozxy (1)
  • lavafroth (1)
  • armintoepfer (1)
  • xianyu0623 (1)
  • danielecook (1)

Dependencies

Cargo.lock cargo
  • adler 1.0.2
  • atty 0.2.14
  • autocfg 1.1.0
  • bit-vec 0.6.3
  • bitflags 1.3.2
  • byteorder 1.4.3
  • bytes 1.1.0
  • cfg-if 1.0.0
  • clap 3.2.7
  • clap_derive 3.2.7
  • clap_lex 0.2.4
  • crc32fast 1.3.2
  • crossbeam-channel 0.5.6
  • crossbeam-deque 0.8.1
  • crossbeam-epoch 0.9.9
  • crossbeam-utils 0.8.10
  • either 1.6.1
  • flate2 1.0.24
  • fxhash 0.2.1
  • hashbrown 0.12.1
  • heck 0.4.0
  • hermit-abi 0.1.19
  • indexmap 1.9.1
  • lexical-core 0.8.5
  • lexical-parse-float 0.8.5
  • lexical-parse-integer 0.8.6
  • lexical-util 0.8.5
  • lexical-write-float 0.8.5
  • lexical-write-integer 0.8.5
  • libc 0.2.126
  • memchr 2.5.0
  • memoffset 0.6.5
  • miniz_oxide 0.5.3
  • noodles 0.26.0
  • noodles-bam 0.21.0
  • noodles-bed 0.4.0
  • noodles-bgzf 0.14.0
  • noodles-core 0.8.0
  • noodles-csi 0.9.0
  • noodles-fasta 0.13.0
  • noodles-sam 0.18.0
  • num-traits 0.2.15
  • num_cpus 1.13.1
  • once_cell 1.12.0
  • ordered-float 3.1.0
  • os_str_bytes 6.1.0
  • proc-macro-error 1.0.4
  • proc-macro-error-attr 1.0.4
  • proc-macro2 1.0.40
  • quote 1.0.20
  • rayon 1.5.3
  • rayon-core 1.9.3
  • rust-lapper 1.0.1
  • rustc-hash 1.1.0
  • scopeguard 1.1.0
  • static_assertions 1.1.0
  • strsim 0.10.0
  • syn 1.0.98
  • termcolor 1.1.3
  • textwrap 0.15.0
  • unicode-ident 1.0.1
  • version_check 0.9.4
  • winapi 0.3.9
  • winapi-i686-pc-windows-gnu 0.4.0
  • winapi-util 0.1.5
  • winapi-x86_64-pc-windows-gnu 0.4.0
Cargo.toml cargo