pyusm

Package implementing the universal sequence mapping tools created by S. Vinga and J. Almeida for generating chaos game representations of sequences of arbitrary alphabet size.

https://github.com/katherine983/pyusm

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 6 DOI reference(s) in README
  • Academic publication links
    Links to: ncbi.nlm.nih.gov, zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.3%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: katherine983
  • License: other
  • Language: Python
  • Default Branch: main
  • Size: 56.7 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created over 2 years ago · Last pushed over 2 years ago
Metadata Files
Readme License Citation

README.md

pyusm

Python package implementing and expanding on the universal sequence mapping tools created by S. Vinga and J. Almeida and referenced in [1], [2], and [3].

For further documentation of the package functions including theoretical background and proofs, visit https://katherine983.github.io/pyusm/intro.html
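As a rough illustration of the chaos game iteration that underlies these maps, the sketch below moves a point halfway toward the vertex assigned to each successive symbol. It is not pyusm's implementation: the function name `chaos_game`, the one-hot vertex assignment (one unit-hypercube corner per symbol), and the starting point at the hypercube centre are all assumptions made for the example.

```python
import numpy as np

def chaos_game(seq, alphabet=None):
    """Hypothetical helper: iterate the chaos game over seq.

    Returns the coordinate array (one row per symbol in seq) and a
    dict mapping each alphabet symbol to its vertex.
    """
    symbols = sorted(set(seq)) if alphabet is None else list(alphabet)
    index = {s: i for i, s in enumerate(symbols)}
    # Assumption: one dimension per symbol, each symbol on a one-hot corner.
    vertices = np.eye(len(symbols))
    # Assumption: start at the centre of the unit hypercube.
    point = np.full(len(symbols), 0.5)
    coords = np.empty((len(seq), len(symbols)))
    for k, s in enumerate(seq):
        # Move halfway from the current point toward the symbol's vertex.
        point = point + 0.5 * (vertices[index[s]] - point)
        coords[k] = point
    return coords, dict(zip(symbols, vertices))

coords, vertices = chaos_game(['a', 'b', 'c'])
print(coords)
```

Each row of `coords` encodes the sequence read so far, with the most recent symbol contributing the largest displacement, which is why nearby points correspond to sequences sharing a recent suffix.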

Examples:

```python
import pyusm

data = ['a', 'b', 'c']

# produces an instance of the USM class with form 'USM'
datausm = pyusm.USM.make_usm(data)

# produces an instance of the USM class with form '2DCGR'
datacgr = pyusm.USM.cgr2d(data)
```

```python
from pyusm import usm_entropy

# computes the quadratic renyi entropy values from the forward USM map coordinates in datausm.fw
# renyi entropy estimates output as a dictionary with kernel variance values as keys
rn2dict = usm_entropy.renyi2usm(datausm.fw)
```

```python
# generate a 2D cgr plot and animation
import pyusm

data = ['a', 'b', 'c']

# produces an instance of the USM class with form '2DCGR'
datacgr = pyusm.USM.cgr2d(data)

# initiate figure
cgrfig = pyusm.cgrplot(datacgr.fw, datacgr.coorddict)
cgrfig.plot()

# animate plot figure
cgrfig.animate()

# save figure (alias for matplotlib .savefig() method; its keyword arguments can be passed as **kwargs)
# an image extension is used here because matplotlib cannot write a '.txt' file
cgrfig.savefig('cgrfig.png')
```
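To make the entropy example more concrete, here is a minimal sketch of a quadratic Rényi entropy estimate based on a Gaussian Parzen window, the general approach described in [1]. It is not the code behind `usm_entropy.renyi2usm`: the function name `renyi2_parzen`, its signature, and the use of a single kernel variance per call are assumptions for illustration only.

```python
import numpy as np

def renyi2_parzen(points, sigma2):
    """Hypothetical helper: quadratic Renyi entropy of a point cloud.

    points : array-like of shape (n, d), e.g. map coordinates
    sigma2 : variance of the spherical Gaussian kernel on each point
    """
    pts = np.asarray(points, dtype=float)
    n, d = pts.shape
    # Squared Euclidean distance between every pair of points.
    diffs = pts[:, None, :] - pts[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=-1)
    # Convolving two Gaussians of variance sigma2 gives variance 2 * sigma2.
    var = 2.0 * sigma2
    kernel = np.exp(-sq_dist / (2.0 * var)) / (2.0 * np.pi * var) ** (d / 2.0)
    # Entropy is minus the log of the mean pairwise kernel value.
    return -np.log(kernel.mean())

sample = np.array([[0.75, 0.25], [0.375, 0.625], [0.1875, 0.3125]])
for s2 in (0.01, 0.1, 1.0):
    print(s2, renyi2_parzen(sample, s2))
```

Looping over several kernel variances mirrors the README's description of a result keyed by kernel variance, though the values pyusm uses by default are not assumed here.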

Testing

The test suite is built with pytest. For now, the expected result is 1 failed, 26 passed, and 2 xfailed. The failure should be for string input in the testusmseq_iterables() test. Test data include the sequence of Es promoter regions in B. subtilis used in the original study [1], which can also be found here. The source for the HUMHBB sequence data can be found here.

Bibliography

[1] Vinga, S., & Almeida, J. S. (2004). Rényi continuous entropy of DNA sequences. Journal of Theoretical Biology, 231(3), 377–388. https://doi.org/10.1016/j.jtbi.2004.06.030
[2] Almeida, J. S., & Vinga, S. (2002). Universal sequence map (USM) of arbitrary discrete sequences. BMC Bioinformatics, 3. https://doi.org/10.1186/1471-2105-3-6
[3] Almeida, J. S., & Vinga, S. (2009). Biological sequences as pictures: A generic two dimensional solution for iterated maps. BMC Bioinformatics, 10(100), 1–7. https://doi.org/10.1186/1471-2105-10-100

Owner

  • Name: Katherine Wuestney
  • Login: katherine983
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.1.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Wuestney
    given-names: Katherine
    orcid: https://orcid.org/0000-0002-5691-0041
title: katherine983/pyusm: pyusm v1.0.1
version: v1.0.1
date-released: 2023-07-25

Dependencies

setup.py pypi
  • matplotlib >=3.3.4
  • numexpr >=2.7
  • numpy >=1.20
  • pytest >=6.2.3
  • scipy >=1.6