https://github.com/althonos/uniprot.rs

Rust data structures and parser for the Uniprot database(s).

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 2 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.5%) to scientific vocabulary

Keywords

bioinformatics rust-library swissprot uniprot uniprotkb uniref

Keywords from Contributors

genome metagenomes
Last synced: 5 months ago · JSON representation

Repository

Rust data structures and parser for the Uniprot database(s).

Basic Info
  • Host: GitHub
  • Owner: althonos
  • License: mit
  • Language: Rust
  • Default Branch: master
  • Homepage: https://docs.rs/uniprot
  • Size: 2.31 MB
Statistics
  • Stars: 9
  • Watchers: 3
  • Forks: 1
  • Open Issues: 1
  • Releases: 11
Topics
bioinformatics rust-library swissprot uniprot uniprotkb uniref
Created about 6 years ago · Last pushed over 2 years ago
Metadata Files
Readme Changelog License

README.md

uniprot.rs

Rust data structures and parser for the UniprotKB database(s).

🔌 Usage

The uniprot::uniprot::parse function can be used to obtain an iterator over the entries of a UniprotKB database in XML format (either SwissProt or TrEMBL). XML files for UniRef and UniParc can also be parsed, with uniprot::uniref::parse and uniprot::uniparc::parse, respectively.

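```rust
extern crate uniprot;

// Open the XML file and wrap it in a buffered reader.
let f = std::fs::File::open("tests/uniprot.xml")
    .map(std::io::BufReader::new)
    .unwrap();

// Each item yielded by the parser is a `Result` wrapping one entry.
for r in uniprot::uniprot::parse(f) {
    let entry = r.unwrap();
    // ... process the Uniprot entry ...
}
```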

Any BufRead implementor can be used as an input, so the database files can be streamed directly from their online location with the help of an HTTP library such as reqwest, or using the ftp library.
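To illustrate what "any BufRead implementor" means in practice, here is a minimal sketch that uses only the standard library. The `count_entries` function is a hypothetical stand-in for a streaming parser, not part of the uniprot crate; it only shows how one function that is generic over `BufRead` accepts an in-memory buffer as readily as a wrapped `Read` source such as a file or an HTTP response body.

```rust
use std::io::{BufRead, BufReader, Cursor};

/// Hypothetical stand-in for a streaming parser: counts the lines that open
/// an `<entry>` element. It is NOT the uniprot crate's parser; it only shows
/// how a function generic over `BufRead` accepts different input sources.
fn count_entries<B: BufRead>(reader: B) -> std::io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        if line?.trim_start().starts_with("<entry") {
            count += 1;
        }
    }
    Ok(count)
}

fn main() -> std::io::Result<()> {
    // Toy input made up for this sketch; real database files are far larger.
    let xml = "<uniprot>\n<entry dataset=\"Swiss-Prot\">\n</entry>\n<entry dataset=\"TrEMBL\">\n</entry>\n</uniprot>\n";

    // An in-memory buffer: `Cursor<&[u8]>` implements `BufRead` directly.
    let from_memory = count_entries(Cursor::new(xml.as_bytes()))?;

    // Any `Read` value (a file, a socket, an HTTP body) can be wrapped
    // in a `BufReader` to obtain a `BufRead` implementor.
    let from_reader = count_entries(BufReader::new(xml.as_bytes()))?;

    println!("{} {}", from_memory, from_reader);
    Ok(())
}
```

The same pattern applies when the source is a network stream: wrap whatever `Read` value the HTTP or FTP client returns in a `BufReader` before handing it to the parser.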

The XML format is the same for the EBI REST API and for the UniProt API, so this library can also be used to read single entries or larger queries. For instance, you can search UniProt for a keyword and retrieve all the matching entries:

```rust
extern crate ureq;
extern crate libflate;
extern crate uniprot;

// Build the search URL, asking for gzip-compressed XML.
let query = "bacteriorhodopsin";
let query_url = format!("https://www.uniprot.org/uniprot/?query={}&format=xml&compress=yes", query);

// Send the request and decompress the response body on the fly.
let req = ureq::get(&query_url).set("Accept", "application/xml");
let reader = libflate::gzip::Decoder::new(req.call().unwrap().into_reader()).unwrap();

for r in uniprot::uniprot::parse(std::io::BufReader::new(reader)) {
    let entry = r.unwrap();
    // ... process the Uniprot entry ...
}
```

See the online documentation at docs.rs for more examples, and some details about the different features available.

📝 Features

  • threading (enabled by default): compiles the multithreaded parser that offers a 90% speed increase when processing XML files.
  • url-links (disabled by default): exposes the links in OnlineInformation as an url::Url.
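
As a sketch of how these features might be selected, the `Cargo.toml` fragment below turns off the default `threading` feature and opts into `url-links`. The feature names come from the list above; the `"*"` version requirement is only a placeholder and should be replaced with the release you actually depend on.

```toml
[dependencies]
# "*" is a placeholder version requirement; pin a real release instead.
# `default-features = false` drops the default `threading` feature,
# and `url-links` is then enabled explicitly.
uniprot = { version = "*", default-features = false, features = ["url-links"] }
```

To keep the multithreaded parser and still get `url::Url` links, leave out `default-features = false` and keep only the `features` entry.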

🔍 See Also

If you're a bioinformatician and a Rustacean, you may be interested in these other libraries:

  • pubchem.rs: Rust data structures and API client for the PubChem API.
  • obofoundry.rs: Rust data structures for the OBO Foundry.
  • fastobo: Rust parser and abstract syntax tree for Open Biomedical Ontologies.
  • proteinogenic: Chemical structure generation for protein sequences as SMILES strings.

📋 Changelog

This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.

📜 License

This library is provided under the open-source MIT license.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the UniProt Consortium. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.

Owner

  • Name: Martin Larralde
  • Login: althonos
  • Kind: user
  • Location: Heidelberg, Germany
  • Company: EMBL / LUMC, @zellerlab

PhD candidate in Bioinformatics, passionate about programming, SIMD-enthusiast, Pythonista, Rustacean. I write poems, and sometimes they are executable.

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 140
  • Total Committers: 2
  • Avg Commits per committer: 70.0
  • Development Distribution Score (DDS): 0.021
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Martin Larralde m****e@e****e 137
dependabot-preview[bot] 2****] 3
Committer Domains (Top 20 + Academic)
embl.de: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 2
  • Total pull requests: 20
  • Average time to close issues: 10 days
  • Average time to close pull requests: about 2 months
  • Total issue authors: 2
  • Total pull request authors: 2
  • Average comments per issue: 6.5
  • Average comments per pull request: 1.1
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 20
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • rustrust (1)
  • stijndcl (1)
Pull Request Authors
  • dependabot-preview[bot] (11)
  • dependabot[bot] (9)
Top Labels
Issue Labels
enhancement (2)
Pull Request Labels
dependencies (20)

Packages

  • Total packages: 1
  • Total downloads:
    • cargo 14,497 total
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 11
  • Total maintainers: 1
crates.io: uniprot

Rust data structures and parser for the Uniprot database(s).

  • Versions: 11
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 14,497 Total
Rankings
Forks count: 29.1%
Dependent repos count: 29.3%
Stargazers count: 30.7%
Average: 30.8%
Downloads: 30.8%
Dependent packages count: 33.8%
Maintainers (1)
Last synced: 6 months ago

Dependencies

Cargo.toml cargo
  • ftp 3.0.1 development
  • libflate 1.0.0 development
  • ureq 2.4.0 development
  • chrono 0.4.19
  • crossbeam-channel 0.5.0
  • fnv 1.0.6
  • lazy_static 1.4.0
  • memchr 2.4.0
  • num_cpus 1.12.0
  • quick-xml 0.22.0
  • url 2.1.1
.github/workflows/test.yml actions
  • actions-rs/cargo v1 composite
  • actions-rs/tarpaulin v0.1 composite
  • actions-rs/toolchain v1 composite
  • actions/cache v2 composite
  • actions/checkout v1 composite
  • codecov/codecov-action v2 composite
  • rasmus-saks/release-a-changelog-action v1.0.1 composite