filtersam

Tools to filter SAM/BAM files by percent identity and percent of matched sequence

https://github.com/robaina/filtersam

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: ncbi.nlm.nih.gov, zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.3%) to scientific vocabulary

Keywords

alignment bioinformatics computational-biology genomics python samtools sequence-alignment
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: Robaina
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 313 KB
Statistics
  • Stars: 5
  • Watchers: 1
  • Forks: 0
  • Open Issues: 2
  • Releases: 1
Created over 4 years ago · Last pushed over 2 years ago
Metadata Files
Readme License Code of conduct Citation

README.md

A Python tool to filter sam/bam files by percent identity or percent of matched sequence


Percent identity is computed as:

$$PI = 100 \frac{N_m}{N_m + N_i}$$

where $N_m$ is the number of matches and $N_i$ is the number of mismatches.
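
As an illustration of the formula above, the following minimal sketch computes percent identity from the value of a single MD tag. It is not taken from the filtersam source: the function name `percent_identity_from_md` is hypothetical, and the choice to count deletions (`^` runs) as neither matches nor mismatches is an assumption.

```python
import re

# One MD token is a run of matches (digits), a deletion from the reference
# ("^" followed by letters), or a single mismatched reference base (a letter).
MD_TOKEN = re.compile(r"(\d+)|(\^[A-Za-z]+)|([A-Za-z])")


def percent_identity_from_md(md: str) -> float:
    """Return 100 * N_m / (N_m + N_i) for one MD tag value (hypothetical helper)."""
    n_match = 0
    n_mismatch = 0
    for run, deletion, base in MD_TOKEN.findall(md):
        if run:
            n_match += int(run)
        elif base:
            n_mismatch += 1
        # Assumption: deletions are skipped, so they do not affect the ratio.
    total = n_match + n_mismatch
    return 100.0 * n_match / total if total else 0.0


# 21 matches, 1 mismatch, and a 2-base deletion that is ignored
print(percent_identity_from_md("10A5^AC6"))
```

For a read with no mismatches, such as an MD value of `36`, the function returns 100.0.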

Percent of matched sequences is computed as:

$$PM = 100 \frac{N_m}{L}$$

where $L$ corresponds to query sequence length.
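
The percent of matched sequence can be sketched in the same spirit. The example below is an illustration under stated assumptions rather than the package's implementation: the helper names are hypothetical, $N_m$ is taken as the sum of the match runs in the MD tag, and $L$ is taken as the number of query bases implied by the CIGAR string (operations `M`, `I`, `S`, `=` and `X`), so soft-clipped bases count toward the length.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that consume bases of the query sequence
QUERY_OPS = set("MIS=X")


def query_length(cigar: str) -> int:
    """Query length L implied by a CIGAR string (hypothetical helper)."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in QUERY_OPS)


def matched_bases(md: str) -> int:
    """Number of matches N_m: the sum of the numeric runs in an MD tag value."""
    return sum(int(run) for run in re.findall(r"\d+", md))


def percent_matched(cigar: str, md: str) -> float:
    """Return 100 * N_m / L for one alignment."""
    length = query_length(cigar)
    return 100.0 * matched_bases(md) / length if length else 0.0


# 36-base read: 5 soft-clipped, 30 aligned (1 mismatch), 1 inserted
print(percent_matched("5S20M1I10M", "15A14"))
```

In this example 29 of the 36 query bases match, so the percent of matched sequence (about 80.6) is lower than the percent identity computed over the 30 aligned positions (about 96.7), which shows how the two filters can differ for clipped or gapped reads.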

NOTES

  1. Percent of matched sequence is also an alternative definition of percent identity used in some cases, for instance, in BLAST.

  2. BAM/SAM files must contain MD tags to be able to filter by percent identity. Aligners such as BWA add an MD tag to each aligned read in a BAM file. MD tags can also be generated with samtools (the `samtools calmd` command).
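
Before filtering by percent identity, it can be useful to confirm that the alignment records actually carry MD tags. The sketch below checks a single text SAM record using only the standard library. The function name `has_md_tag` is hypothetical and is not part of filtersam. It assumes tab-separated SAM text, where optional fields follow the 11 mandatory columns.

```python
def has_md_tag(sam_line: str) -> bool:
    """Return True if a SAM alignment line carries an MD:Z: optional field."""
    fields = sam_line.rstrip("\n").split("\t")
    # Columns 1-11 are mandatory; optional TAG:TYPE:VALUE fields come after.
    return any(field.startswith("MD:Z:") for field in fields[11:])


mandatory = ["read1", "0", "chr1", "100", "60", "10M", "*", "0", "0",
             "ACGTACGTAC", "IIIIIIIIII"]
with_md = "\t".join(mandatory + ["NM:i:0", "MD:Z:10"])
without_md = "\t".join(mandatory + ["NM:i:0"])

print(has_md_tag(with_md), has_md_tag(without_md))
```

If records lack the tag, a command such as `samtools calmd -b aln.bam ref.fa > aln_md.bam` can add it, where the file names are placeholders.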

Installation

pip install filtersam

Usage

You can find a Jupyter notebook with usage examples here.

Citation

If you use this software, please cite it as below:

Robaina-Estévez, S. (2022). filterSAM: filter sam/bam files by percent identity or percent of matched sequence (Version 0.0.11) [Computer software]. https://doi.org/10.5281/zenodo.7056278

Owner

  • Name: Semidán Robaina
  • Login: Robaina
  • Kind: user
  • Location: Atlantic Ocean
  • Company: Hapdera

Computational Biology | Data Science | Python Dev. | Ph.D. Systems Biology

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Robaina-Estévez"
    given-names: "Semidán"
    orcid: "https://orcid.org/0000-0003-0781-1677"
title: "filterSAM: filter sam/bam files by percent identity or percent of matched sequence"
version: 0.0.11
doi: 10.5281/zenodo.7056278
date-released: 2022-09-07
url: "https://github.com/Robaina/filterSAM"

GitHub Events

Total
  • Issues event: 1
  • Watch event: 2
Last Year
  • Issues event: 1
  • Watch event: 2

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 35
  • Total Committers: 2
  • Avg Commits per committer: 17.5
  • Development Distribution Score (DDS): 0.114
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Semidán s****a@g****m 31
Semidán Robaina Estévez h****o@s****m 4
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 3
  • Total pull requests: 0
  • Average time to close issues: 5 days
  • Average time to close pull requests: N/A
  • Total issue authors: 3
  • Total pull request authors: 0
  • Average comments per issue: 0.33
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • mcmahon-uw (1)
  • Robaina (1)
  • njohner (1)
Pull Request Authors
Top Labels
Issue Labels
enhancement (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 37 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 11
  • Total maintainers: 1
pypi.org: filtersam

Tools to filter sam or bam files by percent identity or percent of matched sequence

  • Versions: 11
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 37 Last month
Rankings
Dependent packages count: 10.1%
Dependent repos count: 21.5%
Average: 26.3%
Stargazers count: 27.8%
Forks count: 29.8%
Downloads: 42.3%
Maintainers (1)
Last synced: 6 months ago

Dependencies

requirements.txt pypi
  • numpy ==1.21.2
  • parallelbam ==0.0.12
  • pysam ==0.16.0.1
setup.py pypi
  • numpy *
  • parallelbam *
  • pysam *