eutils

simplified searching, fetching, and parsing records from NCBI using their E-utilities interface

https://github.com/biocommons/eutils

Science Score: 33.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: ncbi.nlm.nih.gov
  • Committers with academic emails
    2 of 17 committers (11.8%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.3%) to scientific vocabulary

Keywords

bioinformatics genomics ncbi

Keywords from Contributors

genome-analysis sequencing variant-analysis variation
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: biocommons
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 2.03 MB
Statistics
  • Stars: 62
  • Watchers: 8
  • Forks: 27
  • Open Issues: 5
  • Releases: 0
Topics
bioinformatics genomics ncbi
Created almost 10 years ago · Last pushed over 1 year ago
Metadata Files
Readme Contributing License Codeowners

README.rst

eutils -- a simplified interface to NCBI E-Utilities
====================================================

|pypi_badge| |build_status| |issues_badge| |contributors| |license| |docs| |changelog|

eutils is a Python package to simplify searching, fetching, and
parsing records from NCBI using their E-utilities_ interface.

News
----

* 0.6.0 was released on 2019-12-17. Support for Python 2.7 has been
  dropped. See the 0.6 ChangeLog.


Documentation
--------------
See https://eutils.readthedocs.io/en/stable/


Features
--------
* simple Pythonic interface for searching and fetching
* automatic query rate throttling per NCBI guidelines
* optional sqlite-based caching of compressed replies
* "façades" that facilitate access to essential attributes in replies
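
To illustrate the caching feature listed above, here is a minimal sketch of an sqlite-backed cache that stores zlib-compressed replies. It is not the package's implementation: the ``ReplyCache`` class, its table layout, and the key format are all hypothetical and use only the Python standard library.

```python
import sqlite3
import zlib


class ReplyCache:
    """Hypothetical cache mapping a request key to a zlib-compressed reply."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS replies (key TEXT PRIMARY KEY, body BLOB)"
        )

    def get(self, key):
        """Return the cached reply as bytes, or None on a cache miss."""
        row = self._db.execute(
            "SELECT body FROM replies WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else zlib.decompress(row[0])

    def put(self, key, reply):
        """Compress and store a reply (bytes), replacing any previous entry."""
        with self._db:  # the connection context manager commits on success
            self._db.execute(
                "INSERT OR REPLACE INTO replies (key, body) VALUES (?, ?)",
                (key, zlib.compress(reply)),
            )


cache = ReplyCache()
key = "efetch?db=gene&id=7157"  # illustrative key format only (assumption)
if cache.get(key) is None:
    # Placeholder bytes standing in for a real reply from NCBI.
    cache.put(key, b"<placeholder-reply/>")
print(cache.get(key))
```

Compressing before storage keeps the database small at the cost of a little CPU time per lookup, which is usually a good trade for verbose XML replies.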



A Quick Example
---------------

As of May 1, 2018, NCBI throttles requests based on whether a client
is registered. Unregistered clients are limited to 3 requests/second;
registered clients are granted 10 requests/second, and may request
more. See the NCBI Announcement for more information.

The eutils package will automatically throttle requests according to
NCBI guidelines (3 or 10 requests/second without or with an API key,
respectively).
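
The throttling rule described above can be sketched as a limiter that spaces consecutive requests at least 1/rate seconds apart. This is an illustration under stated assumptions, not the package's code: the ``Throttle`` class and its ``wait`` method are hypothetical names, and only the 3 and 10 requests/second figures come from the text.

```python
import time


class Throttle:
    """Hypothetical limiter that spaces calls at least 1/rate seconds apart."""

    def __init__(self, api_key=None, clock=time.monotonic, sleep=time.sleep):
        # Rates quoted in the text: 3 requests/second without an API key,
        # 10 requests/second with one.
        self.requests_per_second = 10 if api_key else 3
        self._min_interval = 1.0 / self.requests_per_second
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self._last_call + self._min_interval - now)
        if delay > 0:
            self._sleep(delay)
        # Record when this request is considered to have been issued.
        self._last_call = now + delay
        return delay


throttle = Throttle(api_key=None)
for _ in range(3):
    throttle.wait()  # call this immediately before each request
```

Even spacing is stricter than a "no more than N per second" window, so it never exceeds the limit, but it does not allow short bursts.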

::

  $ pip install biocommons.eutils
  $ ipython

  >>> import os
  >>> from biocommons.eutils import Client

  # Initialize a client. This client handles all caching and query
  # throttling.  For example:
  >>> ec = Client(api_key=os.environ.get("NCBI_API_KEY", None))

  # search for tumor necrosis factor genes
  # any valid NCBI query may be used
  >>> esr = ec.esearch(db='gene',term='tumor necrosis factor')

  # esearch returns a list of entity IDs associated with your search. Preview some of them:
  >>> esr.ids[:5]
  [136114222, 136113226, 136112112, 136111930, 136111620]

  # fetch data for an ID (gene id 7157 is human TP53)
  >>> egs = ec.efetch(db='gene', id=7157)

  # One may fetch multiple genes at a time. These are returned as an
  # EntrezgeneSet. We'll grab the first (and only) child, which returns
  # an instance of the Entrezgene class.
  >>> eg = egs.entrezgenes[0]

  # Easily access some basic information about the gene
  >>> eg.hgnc, eg.maploc, eg.description, eg.type, eg.genus_species
  ('TP53', '17p13.1', 'tumor protein p53', 'protein-coding', 'Homo sapiens')

  # get a list of genomic references
  >>> sorted([(r.acv, r.label) for r in eg.references])
  [('NC_000017.11', 'Chromosome 17 Reference GRCh38...'),
   ('NC_018928.2', 'Chromosome 17 Alternate ...'),
   ('NG_017013.2', 'RefSeqGene')]

  # Get the first three products defined on GRCh38
  >>> [p.acv for p in eg.references[0].products][:3]
  ['NM_001126112.2', 'NM_001276761.1', 'NM_000546.5']

  # As a sample, grab the first mRNA product defined on this reference (order is arbitrary)
  >>> mrna = [i for i in eg.references[0].products if i.type == "mRNA"][0]
  >>> str(mrna)
  'GeneCommentary(acv=NM_001126112.2,type=mRNA,heading=Reference,label=transcript variant 2)'

  # mrna.genomic_coords provides access to the exon definitions on this reference
  >>> mrna.genomic_coords.gi, mrna.genomic_coords.strand
  ('568815581', -1)

  >>> mrna.genomic_coords.intervals
  [(7687376, 7687549), (7676520, 7676618), (7676381, 7676402),
  (7675993, 7676271), (7675052, 7675235), (7674858, 7674970),
  (7674180, 7674289), (7673700, 7673836), (7673534, 7673607),
  (7670608, 7670714), (7668401, 7669689)]

  # and if the mrna has a product, the resulting protein:
  >>> str(mrna.products[0])
  'GeneCommentary(acv=NP_001119584.1,type=peptide,heading=Reference,label=isoform a)'
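
In the output above, the exon intervals for this minus-strand transcript appear in descending genomic position. The sketch below copies those values and re-sorts them into ascending genomic order. Whether the package guarantees this ordering is an assumption, and nothing here depends on whether the coordinates are 0-based or 1-based.

```python
# Strand and intervals copied from the example output above.
strand = -1
intervals = [
    (7687376, 7687549), (7676520, 7676618), (7676381, 7676402),
    (7675993, 7676271), (7675052, 7675235), (7674858, 7674970),
    (7674180, 7674289), (7673700, 7673836), (7673534, 7673607),
    (7670608, 7670714), (7668401, 7669689),
]

# Sort by start coordinate to walk the exons left to right on the chromosome.
genomic_order = sorted(intervals)

# On the minus strand, transcript order is the reverse of genomic order.
transcript_order = genomic_order[::-1] if strand == -1 else genomic_order

# Sanity check: adjacent exons must not overlap.
for (_, prev_end), (next_start, _) in zip(genomic_order, genomic_order[1:]):
    assert prev_end < next_start

print(len(intervals), genomic_order[0], genomic_order[-1])
```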



Important Notes
---------------

* **You are encouraged to** browse issues. Please report any
  issues you find.
* **Use a pip package specification to stay within a minor
  release series for API stability.** For example, ``eutils >=0.6,<0.7``.


Developing and Contributing
---------------------------

Contributions of bug reports, code patches, and documentation are
welcome!

Development occurs in the default branch. Please work in feature
branches or bookmarks from the default branch. Feature branches should
be named for the eutils issue they fix, as in
``121-update-xml-facades``.  When merging, use a commit message like
"closes #121: update xml facades to new-style interface". ("closes #n"
is recognized automatically and closes the ticket upon pushing.)
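
As an illustration of the naming convention above, the following sketch checks a branch name and derives the matching commit-message prefix. The exact pattern (issue number, hyphen, lowercase hyphen-separated words) is an assumption generalized from the single example in the text, and ``commit_prefix`` is a hypothetical helper, not part of the project's tooling.

```python
import re

# Assumed pattern: "<issue number>-<lowercase-words>", as in 121-update-xml-facades.
BRANCH_RE = re.compile(r"^(?P<issue>\d+)-(?P<slug>[a-z0-9]+(?:-[a-z0-9]+)*)$")


def commit_prefix(branch):
    """Return 'closes #<n>: ' for a conforming branch name, else None."""
    match = BRANCH_RE.match(branch)
    return f"closes #{match['issue']}: " if match else None


print(repr(commit_prefix("121-update-xml-facades")))
print(repr(commit_prefix("update-xml-facades")))
```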

The included Makefile automates many tasks.  In particular,
``make develop`` prepares a development environment and ``make test``
runs unit tests. (Please run tests before committing!)

Again, thanks for your contributions.


.. _E-utilities: http://www.ncbi.nlm.nih.gov/books/NBK25499/


.. |build_status| image:: https://travis-ci.org/biocommons/eutils.svg?branch=master
  :target: https://travis-ci.org/biocommons/eutils

.. |changelog| image:: https://img.shields.io/badge/docs-changelog-green.svg
   :target: https://eutils.readthedocs.io/en/stable/changelog/

.. |contributors| image:: https://img.shields.io/github/contributors/biocommons/eutils.svg
  :target: https://github.com/biocommons/eutils

.. |docs| image:: https://img.shields.io/badge/docs-readthedocs-green.svg
   :target: http://eutils.readthedocs.io/

.. |issues_badge| image:: https://img.shields.io/github/issues/biocommons/eutils.png
  :target: https://github.com/biocommons/eutils/issues
  :align: middle

.. |license| image:: https://img.shields.io/github/license/biocommons/eutils.svg
  :target: https://github.com/biocommons/eutils/blob/master/LICENSE

.. |pypi_badge| image:: https://img.shields.io/pypi/v/eutils.svg
  :target: https://pypi.org/project/eutils/

Owner

  • Name: biocommons
  • Login: biocommons
  • Kind: organization

a collection of open source bioinformatics tools

GitHub Events

Total
  • Issues event: 1
  • Watch event: 3
  • Issue comment event: 1
  • Fork event: 3
Last Year
  • Issues event: 1
  • Watch event: 3
  • Issue comment event: 1
  • Fork event: 3

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 238
  • Total Committers: 17
  • Avg Commits per committer: 14.0
  • Development Distribution Score (DDS): 0.521
Past Year
  • Commits: 1
  • Committers: 1
  • Avg Commits per committer: 1.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name             Email          Commits
Reece Hart       r****e@i****m  114
Reece Hart       r****t@g****m  73
nthmost          n****t         24
cariaso          c****o@g****m  5
Mark Diekhans    m****d@s****u  4
Lawrence Lee     l****e@i****m  3
Paige Newman     3****3         2
khyox            k****x         2
shouldsee        s****m@g****m  2
ShriramK         k****8@g****m  2
bryan brancotte  b****e@p****r  1
Vincent Matthys  v****s@s****r  1
Andreas Prlic    a****c@g****m  1
Timothy Laurent  t****t@i****m  1
Jeff van Santen  j****6@g****m  1
Timothy Laurent  t****t@g****m  1
Mohit Solanki    m****9@g****m  1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 89
  • Total pull requests: 22
  • Average time to close issues: 10 months
  • Average time to close pull requests: 19 days
  • Total issue authors: 15
  • Total pull request authors: 14
  • Average comments per issue: 0.8
  • Average comments per pull request: 0.91
  • Merged pull requests: 20
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • reece (70)
  • jsstevenson (4)
  • pmartin23 (2)
  • diekhans (1)
  • shouldsee (1)
  • LLCampos (1)
  • rly (1)
  • baoilleach (1)
  • naomifox (1)
  • timothyjlaurent (1)
  • jinseonyou (1)
  • jvansan (1)
  • safay (1)
  • andreasprlic (1)
  • nthmost (1)
Pull Request Authors
  • jsstevenson (9)
  • ShriramK (2)
  • diekhans (2)
  • shouldsee (2)
  • cariaso (2)
  • timothyjlaurent (2)
  • bpeterman (1)
  • bryan-brancotte (1)
  • jinseonyou (1)
  • andreasprlic (1)
  • jvansan (1)
  • VincentMatthys (1)
  • pmartin23 (1)
  • mohi7solanki (1)
Top Labels
Issue Labels
enhancement (12) bug (8) minor (7) stale (6) closed-by-stale (6) major (5) querying (3) critical (2) xmlfacades (2) not-a-bug (1) docs (1) trivial (1) proposal (1) keep alive (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 32,394 last-month
  • Total docker downloads: 8,679
  • Total dependent packages: 6
  • Total dependent repositories: 24
  • Total versions: 27
  • Total maintainers: 3
pypi.org: eutils

"Python interface to NCBI's eutilities API"

  • Versions: 27
  • Dependent Packages: 6
  • Dependent Repositories: 24
  • Downloads: 32,394 Last month
  • Docker Downloads: 8,679
Rankings
Docker downloads count: 1.2%
Dependent packages count: 1.6%
Downloads: 2.8%
Dependent repos count: 3.0%
Average: 4.2%
Forks count: 7.5%
Stargazers count: 9.1%
Maintainers (3)
Last synced: 7 months ago

Dependencies

setup.py pypi
.github/workflows/stale.yml actions
  • actions/stale v8 composite
.github/workflows/labels.yml actions
  • EndBug/label-sync v2 composite
  • actions/checkout v4 composite