cstag and cstag-cli

cstag and cstag-cli: tools for manipulating and visualizing cs tags - Published in JOSS (2024)

https://github.com/akikuno/cstag

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 6 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

bioinformatics cstag minimap2 python sam sequence
Last synced: 4 months ago

Repository

Python module to manipulate and visualize minimap2's cs tag

Statistics
  • Stars: 10
  • Watchers: 1
  • Forks: 0
  • Open Issues: 1
  • Releases: 23
Created almost 4 years ago · Last pushed over 1 year ago
Metadata Files
Readme Contributing License Code of conduct

README.md


cstag

cstag is a Python library tailored for manipulating and visualizing minimap2's cs tags.

Note: To add cs tags to SAM/BAM files, check out cstag-cli.

🌟 Features

  • cstag.call(): Generate a cs tag
  • cstag.shorten(): Convert a cs tag from its long to short format
  • cstag.lengthen(): Convert a cs tag from its short to long format
  • cstag.consensus(): Create a consensus cs tag from multiple cs tags
  • cstag.mask(): Mask low-quality bases within a cs tag
  • cstag.split(): Break down a cs tag into its constituent parts
  • cstag.revcomp(): Convert a cs tag to its reverse complement
  • cstag.to_sequence(): Reconstruct a reference subsequence from the alignment
  • cstag.to_vcf(): Generate a VCF representation
  • cstag.to_html(): Generate an HTML representation

For comprehensive documentation, please visit our docs.

🛠 Installation

Using PyPI:

```bash
pip install cstag
```

Using Bioconda:

```bash
conda install -c bioconda cstag
```

💡 Usage

Generating cs tags

```python
import cstag

cigar = "8M2D4M2I3N1M"
md = "2A5^AG7"
seq = "ACGTACGTACGTACG"

print(cstag.call(cigar, md, seq))
# :2*ag:5-ag:4+ac~nn3nn:1

print(cstag.call(cigar, md, seq, long=True))
# =AC*ag=TACGT-ag=ACGT+ac~nn3nn=G

```

Shortening or Lengthening cs tags

```python
import cstag

# Convert a cs tag from long to short
cs_tag = "=ACGT*ag=CGT"

print(cstag.shorten(cs_tag))
# :4*ag:3

# Convert a cs tag from short to long
cs_tag = ":4*ag:3"
cigar = "8M"
seq = "ACGTACGT"

print(cstag.lengthen(cs_tag, cigar, seq))
# =ACGT*ag=CGT

```
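
To make the relationship between the two formats concrete, here is a minimal sketch of the long-to-short direction using only the standard library. It is not cstag's implementation, and the helper name `shorten_sketch` is hypothetical. It relies on the fact that a long-form match block such as `=ACGT` becomes `:4` once its bases are dropped, which is also why the reverse direction needs `cigar` and `seq` to restore them.

```python
import re


def shorten_sketch(cs_tag_long: str) -> str:
    """Replace each long-form match block (=BASES) with its short form (:LENGTH)."""
    return re.sub(r"=([ACGTN]+)", lambda m: f":{len(m.group(1))}", cs_tag_long)


print(shorten_sketch("=ACGT*ag=CGT"))
# :4*ag:3
```

Substitutions, insertions, and deletions are written the same way in both formats, so only the match blocks change.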

Creating a Consensus

```python
import cstag

cs_tags = ["=ACGT", "=AC*gt=T", "=C*gt=T", "=C*gt=T", "=ACT+ccc=T"]
positions = [1, 1, 2, 2, 1]

print(cstag.consensus(cs_tags, positions))
# =AC*gt=T

```

Masking Low-Quality Bases

```python
import cstag

cs_tag = "=ACGT*ac+gg-cc=T"
cigar = "5M2I2D1M"
qual = "AA!!!!AA"
phred_threshold = 10
print(cstag.mask(cs_tag, cigar, qual, phred_threshold))
# =ACNN*an+ng-cc=T

```
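
The masking result is easier to follow once the quality string is decoded. The sketch below assumes Phred+33 encoding, the convention for SAM QUAL strings, and uses a hypothetical helper `low_quality_positions`. It is not part of cstag. Whether the library treats a score equal to the threshold as low quality is not stated here, so the strict comparison is also an assumption.

```python
PHRED_OFFSET = 33  # assumption: Phred+33 encoding, as used in SAM QUAL strings


def low_quality_positions(qual: str, phred_threshold: int) -> list[int]:
    """Return 0-based read positions whose Phred score is below the threshold."""
    return [
        i for i, ch in enumerate(qual)
        if ord(ch) - PHRED_OFFSET < phred_threshold  # assumption: strict "<"
    ]


print(low_quality_positions("AA!!!!AA", 10))
# [2, 3, 4, 5]
```

Here `A` decodes to 32 and `!` to 0. The four low-quality positions are the last two bases of the first match block, the query base of `*ac`, and the first inserted base of `+gg`. These are the bases shown as `N` or `n` in the masked tag above. Deleted bases are absent from the read and have no quality score, so `-cc` is unchanged.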

Splitting a cs tag

```python
import cstag

cs_tag = "=ACGT*ac+gg-cc=T"
print(cstag.split(cs_tag))
# ['=ACGT', '*ac', '+gg', '-cc', '=T']

```
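
For readers who want to see what the constituent parts are, the same split can be approximated with a single regular expression that follows the cs tag grammar from the minimap2 manual. This is a standard-library sketch, not cstag's implementation. The names `CS_OP` and `split_sketch` are hypothetical.

```python
import re

# One alternative per cs tag operation:
#   =ACGT (long-form match), :4 (short-form match), *ac (substitution),
#   +gg (insertion), -cc (deletion), ~gt10ag (intron)
CS_OP = re.compile(
    r"=[ACGTN]+|:[0-9]+|\*[acgtn]{2}|[+-][acgtn]+|~[acgtn]{2}[0-9]+[acgtn]{2}"
)


def split_sketch(cs_tag: str) -> list[str]:
    """Return the operations of a cs tag in order of appearance."""
    return CS_OP.findall(cs_tag)


print(split_sketch("=ACGT*ac+gg-cc=T"))
# ['=ACGT', '*ac', '+gg', '-cc', '=T']
```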

Reverse Complement of a cs tag

```python
import cstag

cs_tag = "=ACGT*ac+gg-cc=T"
print(cstag.revcomp(cs_tag))
# =A-gg+cc*tg=ACGT

```
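
The reverse complement follows from two rules: the operations are read in reverse order, and the bases inside each operation are complemented. The sketch below illustrates this with the standard library only. It is not cstag's implementation, the names `CS_OP`, `COMPLEMENT`, and `revcomp_sketch` are hypothetical, and the handling of introns (`~`) is an assumption about what a strand flip should produce.

```python
import re

CS_OP = re.compile(
    r"=[ACGTN]+|:[0-9]+|\*[acgtn]{2}|[+-][acgtn]+|~[acgtn]{2}[0-9]+[acgtn]{2}"
)
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp_sketch(cs_tag: str) -> str:
    out = []
    for op in reversed(CS_OP.findall(cs_tag)):
        kind, body = op[0], op[1:]
        if kind == ":":
            # A match length does not depend on the strand.
            out.append(op)
        elif kind == "*":
            # Complement both bases but keep the reference/query order.
            out.append(kind + body.translate(COMPLEMENT))
        elif kind == "~":
            # Assumption: swap and reverse-complement the two splice signals.
            left, length, right = body[:2], body[2:-2], body[-2:]
            new_left = right.translate(COMPLEMENT)[::-1]
            new_right = left.translate(COMPLEMENT)[::-1]
            out.append(kind + new_left + length + new_right)
        else:
            # "=", "+", "-": reverse-complement the bases.
            out.append(kind + body.translate(COMPLEMENT)[::-1])
    return "".join(out)


print(revcomp_sketch("=ACGT*ac+gg-cc=T"))
# =A-gg+cc*tg=ACGT
```

Applying the function twice returns the original tag, which is a quick sanity check for any reverse-complement routine.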

Reconstructing the Reference Subsequence

```python
import cstag

cs_tag = "=AC*gt=T-gg=C+tt=A"
print(cstag.to_sequence(cs_tag))
# ACTTCTTA

```
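
The output above can be reproduced by walking the long-form tag one operation at a time: matched bases are kept, a substitution contributes its second (query) base, inserted bases are kept, and deleted bases are skipped. The sketch below implements that walk with the standard library. It is not cstag's implementation, the names `CS_OP` and `to_sequence_sketch` are hypothetical, and it handles long-form tags without introns only.

```python
import re

CS_OP = re.compile(r"=[ACGTN]+|\*[acgtn]{2}|[+-][acgtn]+")


def to_sequence_sketch(cs_tag_long: str) -> str:
    bases = []
    for op in CS_OP.findall(cs_tag_long):
        kind, body = op[0], op[1:]
        if kind == "=":
            bases.append(body)     # matched bases
        elif kind == "*":
            bases.append(body[1])  # second letter is the query base
        elif kind == "+":
            bases.append(body)     # inserted bases
        # "-" (deletion): the bases are absent from the read, so nothing is added
    return "".join(bases).upper()


print(to_sequence_sketch("=AC*gt=T-gg=C+tt=A"))
# ACTTCTTA
```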

Generating a VCF Report

```python
import cstag

cs_tag = "=AC*gt=T-gg=C+tt=A"
chrom = "chr1"
pos = 1
print(cstag.to_vcf(cs_tag, chrom, pos))
"""
##fileformat=VCFv4.2
#CHROM POS ID REF ALT QUAL FILTER INFO
chr1 3 . G T . . .
chr1 4 . TGG T . . .
chr1 5 . C CTT . . .
"""
```

Passing multiple cs tags enables reporting of the variant allele frequency (VAF).

```python
import cstag

cs_tags = ["=ACGT", "=AC*gt=T", "=C*gt=T", "=ACGT", "=AC*gt=T"]
chroms = ["chr1", "chr1", "chr1", "chr2", "chr2"]
positions = [2, 2, 3, 10, 100]
print(cstag.to_vcf(cs_tags, chroms, positions))
"""
##fileformat=VCFv4.2
##INFO=
##INFO=
##INFO=
##INFO=
#CHROM POS ID REF ALT QUAL FILTER INFO
chr1 4 . G T . . DP=3;RD=1;AD=2;VAF=0.667
chr2 102 . G T . . DP=1;RD=0;AD=1;VAF=1.0
"""
```
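
The INFO values in the two records are consistent with simple arithmetic: the total depth (DP) is the sum of the reference-supporting (RD) and alternate-supporting (AD) reads, and VAF is AD divided by DP, rounded to three decimals. The snippet below only checks that arithmetic. The function name `vaf` is hypothetical, and the rounding rule is inferred from the printed values rather than from the library.

```python
def vaf(ad: int, dp: int) -> float:
    """Variant allele frequency: alt-supporting reads over total depth."""
    return round(ad / dp, 3)  # assumption: three-decimal rounding


print(vaf(ad=2, dp=3))  # record at chr1, position 4
# 0.667
print(vaf(ad=1, dp=1))  # record at chr2, position 102
# 1.0
```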

Generating an HTML Report

```python
import cstag
from pathlib import Path

cs_tag = "=AC+ggg=T-acgt*at~gt10ag=GNNN"
description = "Example"

cs_tag_html = cstag.to_html(cs_tag, description)
Path("report.html").write_text(cs_tag_html)
# Output "report.html"

```

You can visualize the mutations indicated by the cs tag by opening the generated report.html file in a browser.

📣 Feedback and Support

For questions, bug reports, or other forms of feedback, we'd love to hear from you!
Please use GitHub Issues for all reporting purposes.

Please refer to CONTRIBUTING for how to contribute and how to verify your contributions.

🤝 Code of Conduct

Please note that this project is released with a Contributor Code of Conduct.
By participating in this project you agree to abide by its terms.

📄 Citation

  • Kuno, A., (2024). cstag and cstag-cli: tools for manipulating and visualizing cs tags. Journal of Open Source Software, 9(93), 6066, https://doi.org/10.21105/joss.06066

Owner

  • Name: Akihiro Kuno
  • Login: akikuno
  • Kind: user
  • Location: Tsukuba, Ibaraki, Japan
  • Company: University of Tsukuba

Bioinformatician working at the Laboratory Animal Resource Center

JOSS Publication

cstag and cstag-cli: tools for manipulating and visualizing cs tags
Published
January 22, 2024
Volume 9, Issue 93, Page 6066
Authors
Akihiro Kuno ORCID
Department of Anatomy and Embryology, University of Tsukuba, Tsukuba, Ibaraki, Japan; Laboratory Animal Resource Center, Trans-border Medical Research Center, University of Tsukuba, Tsukuba, Ibaraki, Japan.
Editor
Jacob Schreiber ORCID
Tags
python genomics sequencing bioinformatics

GitHub Events

Total
  • Issues event: 2
  • Watch event: 2
  • Issue comment event: 3
Last Year
  • Issues event: 2
  • Watch event: 2
  • Issue comment event: 3

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 248
  • Total Committers: 2
  • Avg Commits per committer: 124.0
  • Development Distribution Score (DDS): 0.077
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Akihiro Kuno a****o 229
Akihiro Kuno a****o@g****m 19

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 12
  • Total pull requests: 16
  • Average time to close issues: 8 months
  • Average time to close pull requests: 2 minutes
  • Total issue authors: 5
  • Total pull request authors: 1
  • Average comments per issue: 1.17
  • Average comments per pull request: 0.0
  • Merged pull requests: 15
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: 25 days
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 5.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • akikuno (7)
  • dduchen (1)
  • aistBMRG (1)
  • betteridiot (1)
Pull Request Authors
  • akikuno (15)
Top Labels
Issue Labels
  • Priority: High (3)
  • Priority: Moderate (3)
  • enhancement (1)
  • Priority: Low (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 3,226 last-month
  • Total dependent packages: 2
  • Total dependent repositories: 1
  • Total versions: 26
  • Total maintainers: 1
pypi.org: cstag

Python library tailored for manipulating and visualizing minimap2's cs tags

  • Versions: 26
  • Dependent Packages: 2
  • Dependent Repositories: 1
  • Downloads: 3,226 Last month
Rankings
Dependent packages count: 3.1%
Downloads: 11.1%
Average: 18.1%
Dependent repos count: 21.7%
Stargazers count: 25.0%
Forks count: 29.8%
Maintainers (1)
Last synced: 4 months ago