phredsort

`phredsort` is a CLI tool for sorting sequences in a FASTQ file by their quality scores.

https://github.com/vmikk/phredsort

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.3%) to scientific vocabulary

Keywords

bash bioinformatics cli fastq phred-quality-scores sequence-quality
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: vmikk
  • License: MIT
  • Language: Go
  • Default Branch: main
  • Homepage:
  • Size: 480 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Topics
bash bioinformatics cli fastq phred-quality-scores sequence-quality
Created over 1 year ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

phredsort

phredsort is a command-line tool for sorting sequences in FASTQ files by their quality scores.

Usage

Basic usage:

```bash
# Read from input.fastq.gz and write to output.fastq.gz
phredsort -i input.fastq.gz -o output.fastq.gz

# Read from stdin and write to stdout
zcat input.fastq.gz | phredsort --in - --out - | less -S
```

(Image: phredsort help message)

Installation

Download compiled binary (for Linux)

```bash
wget https://github.com/vmikk/phredsort/releases/download/1.3.0/phredsort
chmod +x phredsort
./phredsort --help
```

Build from source

```bash
git clone --depth 1 https://github.com/vmikk/phredsort
cd phredsort
go build -ldflags="-s -w" phredsort.go
./phredsort --help
```

Quality metrics

phredsort supports several metrics (`--metric` parameter) to assess sequence quality:

1. (Back-transformed) average Phred score (avgphred)

  • Properly calculated mean quality score that accounts for the logarithmic nature of Phred scores
  • Converts Phred scores to error probabilities, calculates their arithmetic mean, then converts back to Phred scale
  • Formula: -10 * log10(mean(10^(-Q/10)))
  • More accurate than simple arithmetic mean of Phred scores, which would overestimate quality
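
The back-transformation above can be illustrated with a minimal Go sketch. It is not taken from the phredsort source: the helper names `avgPhred` and `naiveMean` are hypothetical, and the quality string is assumed to use Phred+33 encoding.

```go
package main

import (
	"fmt"
	"math"
)

// phredOffset is the ASCII offset of the quality string.
// Assumption: Phred+33 (Sanger / Illumina 1.8+) encoding.
const phredOffset = 33

// avgPhred is a hypothetical helper, not phredsort's own code.
// It converts each Phred score to an error probability, averages the
// probabilities, and converts the mean back to the Phred scale:
// -10 * log10(mean(10^(-Q/10))).
func avgPhred(qual string) float64 {
	if len(qual) == 0 {
		return math.NaN() // undefined for an empty read
	}
	sum := 0.0
	for i := 0; i < len(qual); i++ {
		q := float64(qual[i]) - phredOffset
		sum += math.Pow(10, -q/10)
	}
	return -10 * math.Log10(sum/float64(len(qual)))
}

// naiveMean is the plain arithmetic mean of the Phred scores,
// included only for comparison.
func naiveMean(qual string) float64 {
	if len(qual) == 0 {
		return math.NaN()
	}
	sum := 0.0
	for i := 0; i < len(qual); i++ {
		sum += float64(qual[i]) - phredOffset
	}
	return sum / float64(len(qual))
}

func main() {
	qual := "II5+" // Q40, Q40, Q20, Q10
	fmt.Printf("back-transformed mean: %.2f\n", avgPhred(qual))
	fmt.Printf("arithmetic mean:       %.2f\n", naiveMean(qual))
}
```

For this read the arithmetic mean is 27.5, while the back-transformed value is about 15.6, because the single Q10 base dominates the mean error probability.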

2. Maximum expected error (maxee) (as per Edgar & Flyvbjerg, 2014)

  • Sum of error probabilities for all bases in a sequence
  • Formula: sum(10^(-Q/10))
  • Higher values indicate lower quality
  • Depends on sequence length (longer sequences tend to have higher MaxEE)

3. Maximum expected error percentage (meep)

  • MaxEE standardized by sequence length
  • Represents expected number of errors per 100 bases
  • Formula: (MaxEE * 100) / sequence_length
  • Higher values indicate lower quality
  • Allows fair comparison between sequences of different lengths
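
As a rough illustration of how `maxee` and `meep` relate, the following Go sketch applies both formulas to two reads of different lengths. The function names `maxEE` and `meep` are hypothetical (not phredsort's API), and Phred+33 encoding is assumed.

```go
package main

import (
	"fmt"
	"math"
)

// Assumption: Phred+33 (Sanger / Illumina 1.8+) encoding.
const phredOffset = 33

// maxEE is a hypothetical helper: the sum of per-base error
// probabilities, sum(10^(-Q/10)).
func maxEE(qual string) float64 {
	ee := 0.0
	for i := 0; i < len(qual); i++ {
		q := float64(qual[i]) - phredOffset
		ee += math.Pow(10, -q/10)
	}
	return ee
}

// meep is a hypothetical helper: expected errors per 100 bases,
// (MaxEE * 100) / sequence_length.
func meep(qual string) float64 {
	if len(qual) == 0 {
		return math.NaN() // undefined for an empty read
	}
	return maxEE(qual) * 100 / float64(len(qual))
}

func main() {
	short := "IIII+"     // 4 x Q40, 1 x Q10
	long := "IIIIIIII++" // 8 x Q40, 2 x Q10

	for _, q := range []string{short, long} {
		fmt.Printf("len=%2d  maxee=%.4f  meep=%.3f\n", len(q), maxEE(q), meep(q))
	}
}
```

The longer read has twice the MaxEE of the shorter one (about 0.20 versus 0.10), yet both have the same MEEP (about 2.0), which is why the length-standardized metric is easier to compare across reads.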

4. Low quality base count (lqcount)

  • Number of bases below specified quality threshold
  • Useful for binned quality scores (e.g., data from Illumina NovaSeq platform)
  • Counts bases with Phred score < threshold (default: 15)
  • Higher values indicate lower quality

5. Low quality base percentage (lqpercent)

  • Percentage of bases below quality threshold
  • Formula: (lqcount * 100) / sequence_length
  • Higher values indicate lower quality
  • Normalizes low-quality base count by sequence length
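
The two count-based metrics can be sketched in Go as follows. This is an illustration under stated assumptions rather than phredsort's implementation: `lqCount` and `lqPercent` are hypothetical names, Phred+33 encoding is assumed, and returning 0 for an empty read is a choice made only for this sketch.

```go
package main

import "fmt"

// Assumption: Phred+33 (Sanger / Illumina 1.8+) encoding.
const phredOffset = 33

// lqCount is a hypothetical helper: the number of bases whose
// Phred score is strictly below the threshold.
func lqCount(qual string, threshold int) int {
	n := 0
	for i := 0; i < len(qual); i++ {
		if int(qual[i])-phredOffset < threshold {
			n++
		}
	}
	return n
}

// lqPercent is a hypothetical helper: (lqcount * 100) / sequence_length.
// An empty read yields 0 here, which is a choice made for this sketch.
func lqPercent(qual string, threshold int) float64 {
	if len(qual) == 0 {
		return 0
	}
	return float64(lqCount(qual, threshold)) * 100 / float64(len(qual))
}

func main() {
	// Example string with few distinct values, as in binned data:
	// 'F' = Q37, ':' = Q25, ',' = Q11, '#' = Q2.
	qual := "FFFF:F,F#F"
	threshold := 15 // default threshold stated above

	fmt.Println("lqcount:  ", lqCount(qual, threshold))
	fmt.Println("lqpercent:", lqPercent(qual, threshold))
}
```

With the threshold of 15, only the Q11 and Q2 bases are counted, giving 2 low-quality bases out of 10 (20%).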

Owner

  • Name: Vladimir Mikryukov
  • Login: vmikk
  • Kind: user
  • Location: Tartu, Estonia
  • Company: The University of Tartu

GitHub Events

Total
  • Release event: 1
  • Push event: 43
Last Year
  • Release event: 1
  • Push event: 43

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 98
  • Total Committers: 1
  • Avg Commits per committer: 98.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 98
  • Committers: 1
  • Avg Commits per committer: 98.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
|---|---|---|
| Vladimir Mikryukov | v****v@g****m | 98 |

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 0
proxy.golang.org: github.com/vmikk/phredsort
  • Versions: 0
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.9%
Average: 6.1%
Dependent repos count: 6.3%
Last synced: 6 months ago

Dependencies

go.mod go
  • github.com/dsnet/compress v0.0.1
  • github.com/edsrzf/mmap-go v1.0.0
  • github.com/elliotwutingfeng/asciiset v0.0.0-20230602022725-51bbb787efab
  • github.com/inconshreveable/mousetrap v1.1.0
  • github.com/klauspost/compress v1.16.3
  • github.com/klauspost/pgzip v1.2.5
  • github.com/shenwei356/bio v0.13.6
  • github.com/shenwei356/natsort v0.0.0-20220117010048-580176ad49fb
  • github.com/shenwei356/util v0.5.3
  • github.com/shenwei356/xopen v0.3.2
  • github.com/spf13/cobra v1.8.1
  • github.com/spf13/pflag v1.0.5
  • github.com/ulikunitz/xz v0.5.11
  • golang.org/x/sys v0.0.0-20220412211240-33da011f77ad
go.sum go
  • github.com/cpuguy83/go-md2man/v2 v2.0.4
  • github.com/cznic/sortutil v0.0.0-20181122101858-f5f958428db8
  • github.com/dsnet/compress v0.0.1
  • github.com/dsnet/golib v0.0.0-20171103203638-1ea166775780
  • github.com/edsrzf/mmap-go v1.0.0
  • github.com/elliotwutingfeng/asciiset v0.0.0-20230602022725-51bbb787efab
  • github.com/inconshreveable/mousetrap v1.1.0
  • github.com/klauspost/compress v1.4.1
  • github.com/klauspost/compress v1.16.3
  • github.com/klauspost/cpuid v1.2.0
  • github.com/klauspost/pgzip v1.2.5
  • github.com/russross/blackfriday/v2 v2.1.0
  • github.com/shenwei356/bio v0.13.6
  • github.com/shenwei356/natsort v0.0.0-20220117010048-580176ad49fb
  • github.com/shenwei356/util v0.5.3
  • github.com/shenwei356/xopen v0.3.2
  • github.com/spf13/cobra v1.8.1
  • github.com/spf13/pflag v1.0.5
  • github.com/ulikunitz/xz v0.5.6
  • github.com/ulikunitz/xz v0.5.11
  • golang.org/x/sys v0.0.0-20220412211240-33da011f77ad
  • gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405
  • gopkg.in/yaml.v3 v3.0.1