bioregistry

📮 An integrative registry of biological databases, ontologies, and nomenclatures.

https://github.com/biopragmatics/bioregistry

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 4 DOI reference(s) in README
✓ Academic publication links: Links to zenodo.org
✓ Committers with academic emails: 3 of 35 committers (8.6%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (14.0%) to scientific vocabulary

Keywords

biocuration biopragmatics bioregistry

Keywords from Contributors

spatial-analysis biology standardization

Scientific Fields

Earth and Environmental Sciences Physical Sciences - 40% confidence

Biology Life Sciences - 40% confidence

Engineering Computer Science - 40% confidence

Last synced: 6 months ago

Repository

📮 An integrative registry of biological databases, ontologies, and nomenclatures.

Basic Info

Host: GitHub
Owner: biopragmatics
License: mit
Language: Python
Default Branch: main
Homepage: https://bioregistry.io
Size: 1.16 GB

Statistics

Stars: 133
Watchers: 6
Forks: 59
Open Issues: 164
Releases: 79

Topics

biocuration biopragmatics bioregistry

Created about 5 years ago · Last pushed 6 months ago

Metadata Files

Readme Contributing License Code of conduct Citation Governance

Bioregistry

A community-driven integrative meta-registry of life science databases, ontologies, and other resources.
More information here.

The Bioregistry can be accessed, searched, and queried through its associated website at https://bioregistry.io.

📥 Download

The underlying data of the Bioregistry can be downloaded (or edited) directly from here. Several exports to YAML, TSV, and RDF, including consensus views over the registry, are built on a weekly basis and can be downloaded via the exports/ directory.

The manually curated portions of these data are available under the CC0 1.0 Universal License. Aggregated data are redistributed under their original licenses.

🙏 Contributing

Contributions are both welcomed and encouraged. Contribution guidelines for new prefix requests, record edits, record removals, and code updates are available in CONTRIBUTING.md.

The simplest contribution is to submit an issue:

- Submit a new prefix using the issue template. A new pull request will be generated automatically for you.
- Update an existing record using one of the existing issue templates (e.g., for updating a record's regular expression or merging two prefixes).
- For any updates that don't have a corresponding template, feel free to start with a blank issue.

If you want to make a direct contribution, feel free to make edits directly to the bioregistry.json file either through the GitHub interface or locally by forking the repository.

If you want to make a contribution but don't know where to start, you can check the list of curation to-dos, which is automatically generated weekly and includes more detailed information on how to contribute.

⚖️ Governance

The Bioregistry is maintained by a Review Team and a Core Development Team whose memberships and duties are described in the Project Governance.

🧹 Maintenance

🫀 Health Report

The Bioregistry runs some automated tests weekly to check that various metadata haven't gone stale. For example, it checks that the homepages are still available and that each provider URL is still able to resolve.

It has a dedicated dashboard that is not part of the main Bioregistry site.
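The kind of availability check described above can be sketched with the standard library alone. This is a minimal illustration, not the Bioregistry's actual health-check code: the names `probe` and `find_stale`, the timeout, and the "status below 400 counts as alive" rule are all assumptions made for the example.

```python
import urllib.request


def probe(url, timeout=10.0):
    """Return True if the URL answers with a non-error HTTP status (assumed rule)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # OSError covers network and HTTP errors; ValueError covers malformed URLs.
        return False


def find_stale(homepages, check=probe):
    """Return the sorted prefixes whose homepage fails the check.

    `homepages` is a hypothetical mapping from prefix to homepage URL.
    """
    return sorted(prefix for prefix, url in homepages.items() if not check(url))
```

Passing the check as a parameter keeps the looping logic testable without network access; a real report would also need retries and rate limiting.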

♻️ Update

The database is automatically updated daily thanks to scheduled workflows in GitHub Actions. The workflow's configuration can be found here and the last run can be seen here. Further, a changelog can be recapitulated from the commits of the GitHub Actions bot.

If you want to manually update the database, run the following:

    $ tox -e update

Make sure that you have valid environment variables or pystow configurations for BIOPORTAL_API_KEY, ECOPORTAL_API_KEY, AGROPORTAL_API_KEY, FAIRSHARING_LOGIN, and FAIRSHARING_PASSWORD.
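Before running the update, it can help to confirm that the credentials are visible to the process. The following is a minimal sketch that checks environment variables only (it does not inspect pystow configuration files); the helper name `missing_update_variables` is hypothetical and not part of the Bioregistry.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARIABLES = (
    "BIOPORTAL_API_KEY",
    "ECOPORTAL_API_KEY",
    "AGROPORTAL_API_KEY",
    "FAIRSHARING_LOGIN",
    "FAIRSHARING_PASSWORD",
)


def missing_update_variables(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]
```

Calling `missing_update_variables()` with no arguments inspects the current process environment; an empty list means every variable is set.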

🚀 Installation

The Bioregistry can be installed from PyPI with:

    $ pip install bioregistry

It can be installed in development mode for local curation with:

    $ git clone https://github.com/biopragmatics/bioregistry.git
    $ cd bioregistry
    $ pip install --editable .

Build the docs locally with tox -e docs then view by opening docs/build/html/index.html.

💪 Usage

Normalizing Prefixes

The Bioregistry can be used to normalize prefixes across MIRIAM and all the (very plentiful) variants that pop up in ontologies in OBO Foundry and the OLS with the normalize_prefix() function.

```python
from bioregistry import normalize_prefix

# Doesn't affect canonical prefixes
assert 'ncbitaxon' == normalize_prefix('ncbitaxon')

# This works for uppercased prefixes, like:
assert 'chebi' == normalize_prefix("CHEBI")

# This works for mixed case prefixes, like:
assert 'fbbt' == normalize_prefix("FBbt")

# This works for synonym prefixes, like:
assert 'ncbitaxon' == normalize_prefix('taxonomy')

# This works for common mistaken prefixes, like:
assert 'pubchem.compound' == normalize_prefix('pubchem')

# This works for prefixes that are often written many ways, like:
assert 'ec' == normalize_prefix('ec-code')
assert 'ec' == normalize_prefix('EC_CODE')

# If a prefix is not registered, it gives back `None`
assert normalize_prefix('not a real key') is None
```
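To make the idea concrete, here is a toy sketch of how this kind of normalization can work in principle: lowercase the input, drop separators, and look the result up in a synonym table. The `TOY_SYNONYMS` table and the `toy_normalize_prefix` function are hypothetical illustrations built from the examples above; they are not the Bioregistry's actual implementation or data.

```python
import re

# Hypothetical, hand-written table mapping canonical prefixes to synonyms.
TOY_SYNONYMS = {
    "ncbitaxon": {"taxonomy"},
    "chebi": set(),
    "fbbt": set(),
    "pubchem.compound": {"pubchem"},
    "ec": {"ec-code"},
}


def _squash(text):
    """Lowercase and drop separators so 'EC_CODE' and 'ec-code' compare equal."""
    return re.sub(r"[\s_\-.]", "", text.lower())


# Reverse index from every squashed spelling to its canonical prefix.
_INDEX = {}
for canonical, synonyms in TOY_SYNONYMS.items():
    for spelling in {canonical, *synonyms}:
        _INDEX[_squash(spelling)] = canonical


def toy_normalize_prefix(prefix):
    """Return the canonical prefix, or None if the prefix is unknown."""
    return _INDEX.get(_squash(prefix))
```

Building the reverse index once keeps each lookup to a single dictionary access, which matters when normalizing many prefixes in bulk.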

Parsing CURIEs

The Bioregistry supports parsing a CURIE into a pair of normalized prefix and identifier using the parse_curie() function:

```python
from bioregistry import parse_curie

# Obvious for canonical CURIEs
assert ('chebi', '1234') == parse_curie('chebi:1234')

# Normalize mixed case prefixes
assert ('fbbt', '00007294') == parse_curie('FBbt:00007294')

# Normalize common mistaken prefixes
assert ('pubchem.compound', '1234') == parse_curie('pubchem:1234')

# Remove the redundant prefix and normalize
assert ('go', '1234') == parse_curie('GO:GO:1234')
```

Parsing applies the same prefix normalization rules described in the previous section, so the other kinds of variants shown there (e.g., synonyms and alternate spellings) are handled as well.
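The splitting and redundant-prefix removal can be sketched in a few lines. This is a toy version under stated assumptions: `toy_parse_curie` and its `known_prefixes` argument are hypothetical, prefix handling is reduced to lowercasing, and returning `(None, None)` for unrecognized input is a choice made for this example rather than a statement about the library.

```python
def toy_parse_curie(curie, known_prefixes):
    """Split a CURIE into (prefix, identifier), or (None, None) if unrecognized.

    `known_prefixes` is a hypothetical stand-in for the registry: a set of
    canonical, lowercase prefixes.
    """
    prefix, separator, identifier = curie.partition(":")
    if not separator:
        return None, None
    prefix = prefix.lower()
    if prefix not in known_prefixes:
        return None, None
    # Drop a redundant prefix embedded in the identifier, e.g. GO:GO:1234.
    redundant = prefix + ":"
    if identifier.lower().startswith(redundant):
        identifier = identifier[len(redundant):]
    return prefix, identifier
```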

Normalizing CURIEs

The Bioregistry supports converting a CURIE to a canonical CURIE by normalizing the prefix and removing redundant namespaces embedded in LUIs with the normalize_curie() function.

```python
from bioregistry import normalize_curie

# Idempotent to canonical CURIEs
assert 'chebi:1234' == normalize_curie('chebi:1234')

# Normalize common mistaken prefixes
assert 'pubchem.compound:1234' == normalize_curie('pubchem:1234')

# Normalize mixed case prefixes
assert 'fbbt:1234' == normalize_curie('FBbt:1234')

# Remove the redundant prefix and normalize
assert 'go:1234' == normalize_curie('GO:GO:1234')
```

Parsing IRIs

The Bioregistry can be used to parse CURIEs from IRIs due to its vast registry of provider URL strings and additional programmatic logic implemented with Python. It can parse OBO Library PURLs, IRIs from the OLS and identifiers.org, IRIs from the Bioregistry website, and any other IRIs from well-formed providers registered in the Bioregistry. The parse_iri() function gets a pre-parsed CURIE, while the curie_from_iri() function makes a canonical CURIE from the pre-parsed CURIE.

```python
from bioregistry import curie_from_iri, parse_iri

# First-party IRI
assert ('chebi', '24867') == parse_iri('https://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:24867')
assert 'chebi:24867' == curie_from_iri('https://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:24867')

# OBO Library PURL
assert ('chebi', '24867') == parse_iri('http://purl.obolibrary.org/obo/CHEBI_24867')
assert 'chebi:24867' == curie_from_iri('http://purl.obolibrary.org/obo/CHEBI_24867')

# OLS IRI
assert ('chebi', '24867') == parse_iri('https://www.ebi.ac.uk/ols/ontologies/chebi/terms?iri=http://purl.obolibrary.org/obo/CHEBI_24867')
assert 'chebi:24867' == curie_from_iri('https://www.ebi.ac.uk/ols/ontologies/chebi/terms?iri=http://purl.obolibrary.org/obo/CHEBI_24867')

# Identifiers.org IRIs (with varying usage of HTTP(s) and colon/slash separator)
assert ('chebi', '24867') == parse_iri('https://identifiers.org/CHEBI:24867')
assert ('chebi', '24867') == parse_iri('http://identifiers.org/CHEBI:24867')
assert ('chebi', '24867') == parse_iri('https://identifiers.org/CHEBI/24867')
assert ('chebi', '24867') == parse_iri('http://identifiers.org/CHEBI/24867')

# Bioregistry IRI
assert ('chebi', '24867') == parse_iri('https://bioregistry.io/chebi:24867')
```

In general, the Bioregistry knows how to parse both the http and https variants of any given URI:

```python
from bioregistry import parse_iri

assert ('neuronames', '268') == parse_iri("http://braininfo.rprc.washington.edu/centraldirectory.aspx?ID=268")
assert ('neuronames', '268') == parse_iri("https://braininfo.rprc.washington.edu/centraldirectory.aspx?ID=268")
```
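One simple way to get this behavior is to strip the scheme and then match the remainder against known URI prefixes, longest first. The sketch below illustrates that approach only; `toy_parse_iri` and the three-entry `TOY_URI_PREFIXES` table are hypothetical, and the Bioregistry's real logic covers many more cases.

```python
# Hypothetical table from scheme-less URI prefixes to canonical prefixes.
TOY_URI_PREFIXES = {
    "purl.obolibrary.org/obo/CHEBI_": "chebi",
    "identifiers.org/CHEBI:": "chebi",
    "braininfo.rprc.washington.edu/centraldirectory.aspx?ID=": "neuronames",
}


def toy_parse_iri(iri, uri_prefixes):
    """Return (prefix, identifier) for the longest matching URI prefix, else (None, None)."""
    # Treat http and https variants as equivalent by stripping the scheme.
    for scheme in ("https://", "http://"):
        if iri.startswith(scheme):
            rest = iri[len(scheme):]
            break
    else:
        return None, None
    # Try longer URI prefixes first so the most specific provider wins.
    for uri_prefix in sorted(uri_prefixes, key=len, reverse=True):
        if rest.startswith(uri_prefix):
            return uri_prefixes[uri_prefix], rest[len(uri_prefix):]
    return None, None
```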

Generating IRIs

You can generate an IRI from either a CURIE or a pre-parsed CURIE (i.e., a 2-tuple of a prefix and identifier) with the get_iri() function. By default, it uses the following priorities:

1. Custom prefix map (custom)
2. First-party IRI (default)
3. Identifiers.org / MIRIAM (miriam)
4. Ontology Lookup Service (ols)
5. OBO PURL (obofoundry)
6. Name-to-Thing (n2t)
7. BioPortal (bioportal)

```python
from bioregistry import get_iri

assert get_iri("chebi", "24867") == 'https://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:24867'
assert get_iri("chebi:24867") == 'https://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:24867'
```

It's possible to change the default priority list by passing an alternate sequence of metaprefixes to the priority keyword (see above). For example, if you're working with OBO ontologies, you might want to make OBO PURLs the highest priority and when OBO PURLs can't be generated, default to something else:

```python
from bioregistry import get_iri

priority = ["obofoundry", "default", "miriam", "ols", "n2t", "bioportal"]
assert get_iri("chebi:24867", priority=priority) == 'http://purl.obolibrary.org/obo/CHEBI_24867'
assert get_iri("hgnc:1234", priority=priority) == 'https://bioregistry.io/hgnc:1234'
```

Even deeper, you can add to (or override entries in) the Bioregistry's default prefix map with the prefix_map keyword:

```python
from bioregistry import get_iri

prefix_map = {
    "myprefix": "https://example.org/myprefix/",
    "chebi": "https://example.org/chebi/",
}
assert get_iri("chebi:24867", prefix_map=prefix_map) == 'https://example.org/chebi/24867'
assert get_iri("myprefix:1234", prefix_map=prefix_map) == 'https://example.org/myprefix/1234'
```

A custom prefix map can be supplied in combination with a priority list, using the "custom" key for changing the priority of the custom prefix map.

```python
from bioregistry import get_iri

prefix_map = {"lipidmaps": "https://example.org/lipidmaps/"}
priority = ["obofoundry", "custom", "default", "bioregistry"]
assert get_iri("chebi:24867", prefix_map=prefix_map, priority=priority) == \
    'http://purl.obolibrary.org/obo/CHEBI_24867'
assert get_iri("lipidmaps:1234", prefix_map=prefix_map, priority=priority) == \
    'https://example.org/lipidmaps/1234'
```
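Conceptually, a priority list is walked in order and the first source that knows the prefix wins. The following toy sketch shows that resolution order in isolation; `toy_get_iri` and the `TOY_PROVIDERS` table are hypothetical stand-ins, and the assumption that every provider is a plain URI prefix to which the identifier is appended is a simplification.

```python
# Hypothetical data: metaprefix -> {prefix: URI prefix}.
TOY_PROVIDERS = {
    "obofoundry": {"chebi": "http://purl.obolibrary.org/obo/CHEBI_"},
    "default": {"chebi": "https://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:"},
    "custom": {"lipidmaps": "https://example.org/lipidmaps/"},
}


def toy_get_iri(prefix, identifier, providers, priority):
    """Return the first IRI that can be built by walking `priority`, else None."""
    for metaprefix in priority:
        uri_prefix = providers.get(metaprefix, {}).get(prefix)
        if uri_prefix is not None:
            return uri_prefix + identifier
    return None
```

Reordering `priority` changes which IRI is produced for a prefix that several sources know, while prefixes known to only one source are unaffected.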

Alternatively, there are direct functions for generating IRIs for different registries:

```python
import bioregistry as br

# Bioregistry IRI
assert br.get_bioregistry_iri('chebi', '24867') == 'https://bioregistry.io/chebi:24867'

# Default Provider
assert br.get_default_iri('chebi', '24867') == 'https://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:24867'

# OBO Library
assert br.get_obofoundry_iri('chebi', '24867') == 'http://purl.obolibrary.org/obo/CHEBI_24867'

# OLS IRI
assert br.get_ols_iri('chebi', '24867') == 'https://www.ebi.ac.uk/ols/ontologies/chebi/terms?iri=http://purl.obolibrary.org/obo/CHEBI_24867'

# Bioportal IRI
assert br.get_bioportal_iri('chebi', '24867') == \
    'https://bioportal.bioontology.org/ontologies/CHEBI/?p=classes&conceptid=http://purl.obolibrary.org/obo/CHEBI_24867'

# Identifiers.org IRI
assert br.get_identifiers_org_iri('chebi', '24867') == 'https://identifiers.org/CHEBI:24867'

# Name-to-Thing IRI
assert br.get_n2t_iri('chebi', '24867') == 'https://n2t.net/chebi:24867'
```

Each of these functions could also return None if there isn't a provider available or if the prefix can't be mapped to the various resources.

Prefix Map

The Bioregistry can be used to generate prefix maps with various flavors depending on your context. Prioritization works the same way as when generating IRIs.

```python
from bioregistry import get_prefix_map

# Standard
prefix_map = get_prefix_map()

# Prioritize OBO prefixes over bioregistry
priority = ["obofoundry", "default", "miriam", "ols", "n2t", "bioportal"]
prefix_map = get_prefix_map(uri_prefix_priority=priority)

# Provide custom remapping that doesn't have prioritization logic
remapping = {"chebi": "CHEBI"}
prefix_map = get_prefix_map(remapping=remapping)
```
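Once you have a prefix map (a dictionary from prefix to URI prefix), expanding CURIEs and compressing IRIs takes only a few lines. The sketch below uses a small hand-written `TOY_PREFIX_MAP` rather than real registry output, and the `expand` and `compress` helpers are hypothetical names for this example.

```python
# Hypothetical two-entry prefix map in the shape described above.
TOY_PREFIX_MAP = {
    "CHEBI": "http://purl.obolibrary.org/obo/CHEBI_",
    "GO": "http://purl.obolibrary.org/obo/GO_",
}


def expand(curie, prefix_map):
    """Turn a CURIE into an IRI, or return None if the prefix is unknown."""
    prefix, separator, identifier = curie.partition(":")
    if not separator or prefix not in prefix_map:
        return None
    return prefix_map[prefix] + identifier


def compress(iri, prefix_map):
    """Turn an IRI into a CURIE, or return None if no URI prefix matches."""
    # Check longer URI prefixes first so overlapping entries resolve correctly.
    by_length = sorted(prefix_map.items(), key=lambda item: len(item[1]), reverse=True)
    for prefix, uri_prefix in by_length:
        if iri.startswith(uri_prefix):
            return f"{prefix}:{iri[len(uri_prefix):]}"
    return None
```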

Getting Metadata

The pattern for an entry in the Bioregistry can be looked up quickly with get_pattern() if it exists. It prefers the custom curated, then MIRIAM, then Wikidata pattern.

```python
import bioregistry

assert '^GO:\\d{7}$' == bioregistry.get_pattern('go')
```
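A pattern retrieved this way can be used to validate identifiers with the standard `re` module. The sketch below hard-codes the pattern string quoted in the example above so that it runs without the registry; treat that string as illustrative, and note that `matches_pattern` is a hypothetical helper.

```python
import re

# Pattern string as quoted in the example above (illustrative only).
GO_PATTERN = r"^GO:\d{7}$"


def matches_pattern(identifier, pattern):
    """Return True if the whole identifier conforms to the pattern."""
    return re.fullmatch(pattern, identifier) is not None
```

Using `re.fullmatch` guards against partial matches even when a pattern lacks its own `^` and `$` anchors.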

Entries in the Bioregistry can be checked for deprecation with the is_deprecated() function. MIRIAM and OBO Foundry often disagree; in that case, OBO Foundry takes precedence since it seems to be updated more often.

```python
import bioregistry

assert bioregistry.is_deprecated('nmr')
assert not bioregistry.is_deprecated('efo')
```
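The precedence rule described above (OBO Foundry over MIRIAM) amounts to taking the first source that has an opinion. This toy sketch shows only that rule; `toy_is_deprecated` is hypothetical, and defaulting to `False` when neither source says anything is an assumption made for this example.

```python
def toy_is_deprecated(obofoundry=None, miriam=None):
    """Combine per-source deprecation flags, where None means 'no information'."""
    # Sources are listed in precedence order: OBO Foundry first, then MIRIAM.
    for flag in (obofoundry, miriam):
        if flag is not None:
            return flag
    return False
```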

Entries in the Bioregistry can be looked up with the get_resource() function.

```python
import bioregistry

entry = bioregistry.get_resource('taxonomy')
# there are lots of mysteries to discover in this dictionary!
```

The full Bioregistry can be read in a Python project using:

```python
import bioregistry

registry = bioregistry.read_registry()
```

🕸️ Resolver App

After installation with the [web] extras, the Bioregistry web application can be run with the following commands:

    $ python -m pip install bioregistry[web]
    $ bioregistry web

This runs a web app that functions like Identifiers.org, but is backed by the Bioregistry. A public instance of this app is hosted by the Gyori Lab for Computational Biomedicine at https://bioregistry.io.
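Resolver links follow the `<base>/<prefix>:<identifier>` shape seen in the Bioregistry IRIs earlier in this document (e.g., https://bioregistry.io/chebi:24867). A minimal sketch for building such links against either the public instance or a local one is shown below; `resolver_url` is a hypothetical helper, and the local address used in the usage note is an assumption that depends on how you start the app.

```python
def resolver_url(prefix, identifier, base="https://bioregistry.io"):
    """Build a resolver link of the form <base>/<prefix>:<identifier>."""
    return f"{base.rstrip('/')}/{prefix}:{identifier}"
```

For example, `resolver_url("chebi", "24867", base="http://localhost:5000/")` targets a local instance, assuming it listens on that host and port.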

👋 Attribution

⚖️ License

The code in this repository is licensed under the MIT License.

📛 Badge

If you use the Bioregistry in your code, support us by including our badge in your project's README.md:

markdown [![Powered by the Bioregistry](https://img.shields.io/static/v1?label=Powered%20by&message=Bioregistry&color=BA274A&style=flat&logo=image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACgAAAAoCAYAAACM/rhtAAAACXBIWXMAAAEnAAABJwGNvPDMAAAAGXRFWHRTb2Z0d2FyZQB3d3cuaW5rc2NhcGUub3Jnm+48GgAACi9JREFUWIWtmXl41MUZxz/z291sstmQO9mQG0ISwHBtOOSwgpUQhApWgUfEowKigKI81actypaqFbWPVkGFFKU0Vgs+YgvhEAoqEUESrnDlEEhCbkLYJtlkk9399Y/N/rKbzQXt96+Zed+Z9/t7Z+adeecnuA1s5yFVSGrLOAf2qTiEEYlUZKIAfYdKE7KoBLkQSc4XgkPfXxz/owmT41ZtiVtR3j94eqxQq5aDeASIvkVb12RBtt0mb5xZsvfa/5XgnqTMcI3Eq7IQjwM+7jJJo8YvNhK/qDBUOl8A7JZWWqqu01Jeg6Pd1nW4NuBjjax6eWrRruv/M8EDqTMflmXeB0Jcbb6RIRhmTCJ0ymgC0wYjadTd9nW0tWMu+In63NNU7c3FWtvgJpXrZVlakVGU8/ltEcwzGjU3miI/ABa72vwTB5K45AEi7x2PUEl9fZsHZLuDmgPHuLJpJ82lle6iTSH6mpXp+fnt/Sa4yzhbp22yfwFkgnMaBy17kPhFmQh1997qLxztNkq35XB505fINtf0iz1WvfTQ7Pxdlj4Jdnjuny5yvpEhjHh7FQOGD/YyZi4owS86HJ+QQMDpJaBf3jUXlHD21+8q0y4LDppV/vfNO7+jzV3Pa6SOac0E8I8fSPonpm7JAVR+eRhzwU/Ofj+e49tpT/HdtGXcyLvQJ8HAtCTGfmJCF2dwfpTMz4NszX/uqqdyr+xPyVwoEK+C03PGrDX4GkJ7NBJ+txH/hCgAit7cRlNxOY62dmzmZgwzJvZJUh2gI/xnRmoOHsfe3AqQ/kho0qXs+pLzLh3FgwdT54YKxLsAQq0mbf1zHuTsltZejemHJSrlgGGDPGTXc09zdM5qTi59jZbKOg+Zb1QYI95+XokEQogPDifPDnPJFQ8uCkl8FyGmACQtn4dhxp3KINX7jnHi0ZeJnT8dla8Plbu+48zzfyJ08kh8ggIACB4zlIAhsURm3EnML6eB6Fzep1a+SUt5DS2VddTs+4GQccPRhgV1kowIQRaChhMXAPxkIev/Vl+8R/HgnqTMmI4gjH/iQOIXZSqdzQUlXDB9RPyi+1DrdVx67WMursvCkDERXYxB0ROSIOKecURMG+tBzkXAhbYbZk6teNPLkwmPzUIX71wuMiw+MHx2nEJQrWIFHSdE4pIHlFDisLZxYe1HhIwfTtLK+RSu30rVnlxGvrOapOcW9DsW3vH6CgKS4zxIXlz3Fw8dSaMmcfEcV9XHYbc/DSCZMEkgFoJzY0TeO17pVL7jANbaBoauWUJlTi4VOw+T9sazBKYl0ZB/qV/kALThQRi3vOJB0lpzw0vPMONOtOHOqRcyi7bzkEqanJo3HogBMGROUrziaGundGsOsQsyUPn6UPx2NvELZxIybhinn3uLyx9uVwaW7XbqjxdQmr2X0uy93Dh+Dtlu9zCu9vdj1PsvEWwcii7OwJAXFnoRFCoVhoxJrmr0gOQWo9qBfaorXodOHq0o1x8roN3cSMyC6ZT942uQBIlL53Jl804sV6oY9/fXAGg4WcjFdZuxlFV7GNPFRzFs7VKCRiV7ejJrTa/eDr1rFKXZOQCocEyTgHQAyUdD4B2d4cF8pohg4zC0YUFU7z5C9Jy7sVvbKPtsH6GT0tCGBtFwspBTz/zRixyApbSKk8te5+aZ4l4JdUVQWpIScmQhjGocUjJCRhcTieSjURQTF89FtttpuVaLpa
ya8Knp1B3OQ5Zlag/nU//9cmScS6EnONrauWjazIQv3kCoVD3quUPS+uAXHU7z1SpATpEQchSA78AwD0WVnxa1XkdjURlCJRGQHMfN/EuEjk9jyr4NRN47Hltjc58Gm0sraTjZ/w3l5BLuKkZJdFzT1f5+3Sq3NZjRDNAjaX1orb2BX2wEmkA9fvGGbvW7Q+OlUu+2wlIqdx+h3dzkJVPrda5iQJ93p+DRqcQ/PhsAw8xJ6AfHdkhuIVvoEribLl/jxKOv4Gi34T8omgnb1yOk7sdTA01AiK3J6yoGgP+gaPwHOdOP6LlTlXb3mNYXAlI8da9/e0pJBZovV2BrakYzQK/I3bg0SsiiCqClqs/0wAPB6UOVo6k3+CdEETwm1aPtP+dLlLJPSKAHOYDWCoVLlYTkKAKcCU4vO7IrhErFsLVLPXZ+V0haDcN+v8xjB9strdQfPavUA0ckefRxWNuwVNS6rBRKQB44r+Lmc5f7TRAgaFQyYzb9Dv/4gd18ASQ8/gsC0zwJNJVcw97aeWmOcDtaAW6eLXZLBchTC8EhWXbW6o+cInhMipetuu9OUvTWNnwNodzx+krlvAQIGjmECV+spyH/Ak3F5QDok+OoPXicip2HiJiWTuH6rQx6eh7BxlT0STH4xUbSUl6Df/xAIqaO9bBVn3taKUuy/ZAwYZImpvx4FYjVRgQzOec9r1vK0TmrldMiIDkO45ZXegxLLrRW13P0/heQHQ4CUhIYvfElNIHOtWaztNJ4qZQBqfFKLg3OMz135rNY624ClB0tHJcomTA5ZMGnANbaBmoOHPMy5hvZebNuLCoj71frXIN0i9pDJzj24IsIlUTCo7NI3/KyQg5ArfMleEyKBzmA6r1HO8eV+dSEySEB2G3yRpwZP1c2f+n1GjB07RIlcwNoKi7j3G839EhQF2cg6fmHmbznPRKevJ/GorIedV1wtLVzJesrV9WqQtoIHRfWjreSjwGar1ZRui3Ho7PfwHBGb3jRg6S1roGeoIuNJGBIPKV/zSF31irOrn4HXAu9B1zduhtLecelQxZZ9xTtrgC342Df8IwQyaYqBMKEWo0xaw1BI4d4DNJSWcfF32fRWnuD5NWPEDZ5lIe8NDuHq1v+ha2xGdkho4szYJg1hbj501EH6OgJ5oIS8hf/oWPm5HqNrE51vdt4nC/7k+9bIIT8GYA2Ipixn5jwjQrrZsju0XT5GubTRfiEBqFPisUvOrzPPi0VdeQ9YcJ63bWmxbzphTk7XHKvA/DrlJkfAU+Bcy2N+fA3vZK0WVoxny4idOKIfn+IO7lTz7zRObWCjdMv7VnhruOV9dws9F8u4CsAS1k1J54wYS4o6arWaaS8hvLP998yuZtnisl7wuROLkdjsKzqqtfL45FjB8gzwZnIJy6dS8Jjs3p8ausvHG3tXN26mytZO5W8Rcjsbg1Qze/X45ELHY9I7wHLXG26+CgSl8zFkDGh3zdkF2S7nep9PzhzmnK3FEGwUWOwrJr6zTdeL529EnRhf3LmfCHEBkBZiNrwIAwZkwi9a5Qzh9D6dNvXYW3jZkEJ9UdOOYPwdY/gXgdiufuGuC2C4Hy3kWXrOhmeBLQeA6jV6GLC8Y0KR613Hn+2phZaK69jqah1P/hdsCKLLIfGtnbG+f3eyfHtEHTh38mzom2SY4WQWQjE9tnBE+XIZKuQNrqCcH9wSwRdMGGSJiTnpatwTJOFMIKcgvPVX/kNIcM1gSgC8iTZfii3aEL+7fyG+C+6O8izl1GE5gAAAABJRU5ErkJggg==)](https://github.com/biopragmatics/bioregistry)

If your README uses reStructuredText (.rst), use this instead:

.. image:: https://img.shields.io/static/v1?label=Powered%20by&message=Bioregistry&color=BA274A&style=flat&logo=image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACgAAAAoCAYAAACM/rhtAAAACXBIWXMAAAEnAAABJwGNvPDMAAAAGXRFWHRTb2Z0d2FyZQB3d3cuaW5rc2NhcGUub3Jnm+48GgAACi9JREFUWIWtmXl41MUZxz/z291sstmQO9mQG0ISwHBtOOSwgpUQhApWgUfEowKigKI81actypaqFbWPVkGFFKU0Vgs+YgvhEAoqEUESrnDlEEhCbkLYJtlkk9399Y/N/rKbzQXt96+Zed+Z9/t7Z+adeecnuA1s5yFVSGrLOAf2qTiEEYlUZKIAfYdKE7KoBLkQSc4XgkPfXxz/owmT41ZtiVtR3j94eqxQq5aDeASIvkVb12RBtt0mb5xZsvfa/5XgnqTMcI3Eq7IQjwM+7jJJo8YvNhK/qDBUOl8A7JZWWqqu01Jeg6Pd1nW4NuBjjax6eWrRruv/M8EDqTMflmXeB0Jcbb6RIRhmTCJ0ymgC0wYjadTd9nW0tWMu+In63NNU7c3FWtvgJpXrZVlakVGU8/ltEcwzGjU3miI/ABa72vwTB5K45AEi7x2PUEl9fZsHZLuDmgPHuLJpJ82lle6iTSH6mpXp+fnt/Sa4yzhbp22yfwFkgnMaBy17kPhFmQh1997qLxztNkq35XB505fINtf0iz1WvfTQ7Pxdlj4Jdnjuny5yvpEhjHh7FQOGD/YyZi4owS86HJ+QQMDpJaBf3jUXlHD21+8q0y4LDppV/vfNO7+jzV3Pa6SOac0E8I8fSPonpm7JAVR+eRhzwU/Ofj+e49tpT/HdtGXcyLvQJ8HAtCTGfmJCF2dwfpTMz4NszX/uqqdyr+xPyVwoEK+C03PGrDX4GkJ7NBJ+txH/hCgAit7cRlNxOY62dmzmZgwzJvZJUh2gI/xnRmoOHsfe3AqQ/kho0qXs+pLzLh3FgwdT54YKxLsAQq0mbf1zHuTsltZejemHJSrlgGGDPGTXc09zdM5qTi59jZbKOg+Zb1QYI95+XokEQogPDifPDnPJFQ8uCkl8FyGmACQtn4dhxp3KINX7jnHi0ZeJnT8dla8Plbu+48zzfyJ08kh8ggIACB4zlIAhsURm3EnML6eB6Fzep1a+SUt5DS2VddTs+4GQccPRhgV1kowIQRaChhMXAPxkIev/Vl+8R/HgnqTMmI4gjH/iQOIXZSqdzQUlXDB9RPyi+1DrdVx67WMursvCkDERXYxB0ROSIOKecURMG+tBzkXAhbYbZk6teNPLkwmPzUIX71wuMiw+MHx2nEJQrWIFHSdE4pIHlFDisLZxYe1HhIwfTtLK+RSu30rVnlxGvrOapOcW9DsW3vH6CgKS4zxIXlz3Fw8dSaMmcfEcV9XHYbc/DSCZMEkgFoJzY0TeO17pVL7jANbaBoauWUJlTi4VOw+T9sazBKYl0ZB/qV/kALThQRi3vOJB0lpzw0vPMONOtOHOqRcyi7bzkEqanJo3HogBMGROUrziaGundGsOsQsyUPn6UPx2NvELZxIybhinn3uLyx9uVwaW7XbqjxdQmr2X0uy93Dh+Dtlu9zCu9vdj1PsvEWwcii7OwJAXFnoRFCoVhoxJrmr0gOQWo9qBfaorXodOHq0o1x8roN3cSMyC6ZT942uQBIlL53Jl804sV6oY9/fXAGg4WcjFdZuxlFV7GNPFRzFs7VKCRiV7ejJrTa/eDr1rFKXZOQCocEyTgHQAyUdD4B2d4cF8pohg4zC0YUFU7z5C9Jy7sVvbKPtsH6GT0tCGBtFwspBTz/zRixyApbSKk8te5+aZ4l4JdUVQWpIScmQhjGocUjJCRhcTieSjURQTF89FtttpuVaLpaya8Knp1B3OQ5Zlag/nU//9cmScS6E
nONrauWjazIQv3kCoVD3quUPS+uAXHU7z1SpATpEQchSA78AwD0WVnxa1XkdjURlCJRGQHMfN/EuEjk9jyr4NRN47Hltjc58Gm0sraTjZ/w3l5BLuKkZJdFzT1f5+3Sq3NZjRDNAjaX1orb2BX2wEmkA9fvGGbvW7Q+OlUu+2wlIqdx+h3dzkJVPrda5iQJ93p+DRqcQ/PhsAw8xJ6AfHdkhuIVvoEribLl/jxKOv4Gi34T8omgnb1yOk7sdTA01AiK3J6yoGgP+gaPwHOdOP6LlTlXb3mNYXAlI8da9/e0pJBZovV2BrakYzQK/I3bg0SsiiCqClqs/0wAPB6UOVo6k3+CdEETwm1aPtP+dLlLJPSKAHOYDWCoVLlYTkKAKcCU4vO7IrhErFsLVLPXZ+V0haDcN+v8xjB9strdQfPavUA0ckefRxWNuwVNS6rBRKQB44r+Lmc5f7TRAgaFQyYzb9Dv/4gd18ASQ8/gsC0zwJNJVcw97aeWmOcDtaAW6eLXZLBchTC8EhWXbW6o+cInhMipetuu9OUvTWNnwNodzx+krlvAQIGjmECV+spyH/Ak3F5QDok+OoPXicip2HiJiWTuH6rQx6eh7BxlT0STH4xUbSUl6Df/xAIqaO9bBVn3taKUuy/ZAwYZImpvx4FYjVRgQzOec9r1vK0TmrldMiIDkO45ZXegxLLrRW13P0/heQHQ4CUhIYvfElNIHOtWaztNJ4qZQBqfFKLg3OMz135rNY624ClB0tHJcomTA5ZMGnANbaBmoOHPMy5hvZebNuLCoj71frXIN0i9pDJzj24IsIlUTCo7NI3/KyQg5ArfMleEyKBzmA6r1HO8eV+dSEySEB2G3yRpwZP1c2f+n1GjB07RIlcwNoKi7j3G839EhQF2cg6fmHmbznPRKevJ/GorIedV1wtLVzJesrV9WqQtoIHRfWjreSjwGar1ZRui3Ho7PfwHBGb3jRg6S1roGeoIuNJGBIPKV/zSF31irOrn4HXAu9B1zduhtLecelQxZZ9xTtrgC342Df8IwQyaYqBMKEWo0xaw1BI4d4DNJSWcfF32fRWnuD5NWPEDZ5lIe8NDuHq1v+ha2xGdkho4szYJg1hbj501EH6OgJ5oIS8hf/oWPm5HqNrE51vdt4nC/7k+9bIIT8GYA2Ipixn5jwjQrrZsju0XT5GubTRfiEBqFPisUvOrzPPi0VdeQ9YcJ63bWmxbzphTk7XHKvA/DrlJkfAU+Bcy2N+fA3vZK0WVoxny4idOKIfn+IO7lTz7zRObWCjdMv7VnhruOV9dws9F8u4CsAS1k1J54wYS4o6arWaaS8hvLP998yuZtnisl7wuROLkdjsKzqqtfL45FjB8gzwZnIJy6dS8Jjs3p8ausvHG3tXN26mytZO5W8Rcjsbg1Qze/X45ELHY9I7wHLXG26+CgSl8zFkDGh3zdkF2S7nep9PzhzmnK3FEGwUWOwrJr6zTdeL529EnRhf3LmfCHEBkBZiNrwIAwZkwi9a5Qzh9D6dNvXYW3jZkEJ9UdOOYPwdY/gXgdiufuGuC2C4Hy3kWXrOhmeBLQeA6jV6GLC8Y0KR613Hn+2phZaK69jqah1P/hdsCKLLIfGtnbG+f3eyfHtEHTh38mzom2SY4WQWQjE9tnBE+XIZKuQNrqCcH9wSwRdMGGSJiTnpatwTJOFMIKcgvPVX/kNIcM1gSgC8iTZfii3aEL+7fyG+C+6O8izl1GE5gAAAABJRU5ErkJggg== :target: https://github.com/biopragmatics/bioregistry :alt: Powered by the Bioregistry

📖 Citation

Unifying the identification of biomedical entities with the Bioregistry
Hoyt, C. T., Balk, M., Callahan, T. J., Domingo-Fernandez, D., Haendel, M. A., Hegde, H. B., Himmelstein, D. S., Karis, K., Kunze, J., Lubiana, T., Matentzoglu, N., McMurry, J., Moxon, S., Mungall, C. J., Rutz, A., Unni, D. R., Willighagen, E., Winston, D., and Gyori, B. M. (2022)
Scientific Data, s41597-022-01807-3

    @article{Hoyt2022Bioregistry,
        author = {Hoyt, Charles Tapley and Balk, Meghan and Callahan, Tiffany J and Domingo-Fern{\'{a}}ndez, Daniel and Haendel, Melissa A and Hegde, Harshad B and Himmelstein, Daniel S and Karis, Klas and Kunze, John and Lubiana, Tiago and Matentzoglu, Nicolas and McMurry, Julie and Moxon, Sierra and Mungall, Christopher J and Rutz, Adriano and Unni, Deepak R and Willighagen, Egon and Winston, Donald and Gyori, Benjamin M},
        doi = {10.1038/s41597-022-01807-3},
        issn = {2052-4463},
        journal = {Sci. Data},
        number = {1},
        pages = {714},
        title = {{Unifying the identification of biomedical entities with the Bioregistry}},
        url = {https://doi.org/10.1038/s41597-022-01807-3},
        volume = {9},
        year = {2022}
    }

Talks on the Bioregistry:

Future Curation in the Bioregistry (WPCI, December 2022)
The Bioregistry - Governance and Review Team (WPCI, December 2022)
Development, Maintenance, and Expansion of the Bioregistry (Sorger Lab Meeting, October 2022)
The Bioregistry, CURIEs, and OBO Community Health (ICBO, September 2022)
Introduction to the Bioregistry (Sorger Lab Meeting, July 2021)

🎁 Support

The Bioregistry was primarily developed by the Gyori Lab for Computational Biomedicine at Northeastern University, which was previously a part of the Laboratory of Systems Pharmacology in the Harvard Program in Therapeutic Science (HiTS) at Harvard Medical School.

💰 Funding

Chan Zuckerberg Initiative (CZI) 2023-329850
DARPA Automating Scientific Knowledge Extraction and Modeling (ASKEM) HR00112220036
DARPA Young Faculty Award W911NF2010255 (PI: Benjamin M. Gyori).

Owner

Name: Biopragmatics Stack
Login: biopragmatics
Kind: organization

Website: https://biopragmatics.github.io
Twitter: biopragmatics
Repositories: 9
Profile: https://github.com/biopragmatics

Software supporting biomedical semantics and pragmatics

Citation (CITATION.cff)

cff-version: 1.2.0
message: Please cite the Bioregistry manuscript when using this software.
type: article
authors:
- family-names: "Hoyt"
  given-names: "Charles Tapley"
  orcid: "https://orcid.org/0000-0003-4423-4370"
- family-names: "Balk"
  given-names: "Meghan"
  orcid: "https://orcid.org/0000-0003-2699-3066"
- family-names: "Callahan"
  given-names: "Tiffany J."
  orcid: "https://orcid.org/0000-0002-8169-9049"
- family-names: "Domingo-Fernandez"
  given-names: "Daniel"
  orcid: "https://orcid.org/0000-0002-2046-6145"
- family-names: "Haendel"
  given-names: "Melissa A."
  orcid: "https://orcid.org/0000-0001-9114-8737"
- family-names: "Hegde"
  given-names: "Harshad B."
  orcid: "https://orcid.org/0000-0002-2411-565X"
- family-names: "Himmelstein"
  given-names: "Daniel S."
  orcid: "https://orcid.org/0000-0002-3012-7446"
- family-names: "Karis"
  given-names: "Klas"
  orcid: "https://orcid.org/0000-0003-1699-7776"
- family-names: "Kunze"
  given-names: "John"
  orcid: "https://orcid.org/0000-0001-7604-8041"
- family-names: "Lubiana"
  given-names: "Tiago"
  orcid: "https://orcid.org/0000-0003-2473-2313"
- family-names: "Matentzoglu"
  given-names: "Nicolas"
  orcid: "https://orcid.org/0000-0002-7356-1779"
- family-names: "McMurry"
  given-names: "Julie"
  orcid: "https://orcid.org/0000-0002-9353-5498"
- family-names: "Moxon"
  given-names: "Sierra"
  orcid: "https://orcid.org/0000-0002-8719-7760"
- family-names: "Mungall"
  given-names: "Christopher J."
  orcid: "https://orcid.org/0000-0002-6601-2165"
- family-names: "Rutz"
  given-names: "Adriano"
  orcid: "https://orcid.org/0000-0003-0443-9902"
- family-names: "Unni"
  given-names: "Deepak R."
  orcid: "https://orcid.org/0000-0002-8424-0604"
- family-names: "Willighagen"
  given-names: "Egon"
  orcid: "https://orcid.org/0000-0001-7542-0286"
- family-names: "Winston"
  given-names: "Donald"
  orcid: "https://orcid.org/0000-0002-8424-0604"
- family-names: "Gyori"
  given-names: "Benjamin M."
  orcid: "https://orcid.org/0000-0001-9439-5346"
doi: 10.1038/s41597-022-01807-3
identifiers:
  - type: doi
    value: 10.1038/s41597-022-01807-3
keywords:
  - The Bioregistry
  - biocuration
  - biosemantics
  - bioinformatics
  - clinical informatics
  - systems biology
  - semantics
  - semantic web
  - CURIE
  - URI
  - identifiers
  - chemistry
  - cheminformatics
  - agriculture
  - molecular biology
title: >-
  Unifying the Identification of Biomedical Entities with the Bioregistry

Committers

Last synced: 11 months ago

All Time

Total Commits: 4,661
Total Committers: 35
Avg Commits per committer: 133.171
Development Distribution Score (DDS): 0.459

Past Year

Commits: 558
Committers: 17
Avg Commits per committer: 32.824
Development Distribution Score (DDS): 0.638

Top Committers

Name	Email	Commits
GitHub Action	a**n@g**m	2,521
Charles Tapley Hoyt	c**t@g**m	1,816
github-actions[bot]	4****]	79
Mufaddal Naguthanawala	1****m	64
Benjamin M. Gyori	b**i@g**m	58
tanayshah2	t**0@g**m	43
Nalika Palayoor	p**n@n**u	17
Egon Willighagen	e**n@g**m	13
Nico Matentzoglu	n**u@g**m	8
Daniel Himmelstein	d**n@g**m	6
Adriano Rutz	4****e	5
CondeArcana333	o**a@c**x	2
Chris Mungall	c**m@b**g	2
Philip Strömert	P**t@t**u	2
SumirHPandit	4****p	2
Tomohiro Oga	o**t@n**u	2
Vincent Emonet	v**t@g**m	2
marius-mather	6****r	2
svempati21	1****1	1
rombaum	4****m	1
kkaris	k**s@g**m	1
Victorino Lavida	7****a	1
Tiago Lubiana	t**s@u**r	1
Scott Colby	s****3	1
Nisha Sharma	5****4	1
Kevin Schaper	k**r@g**m	1
John A. Kunze	j**l@g**m	1
Jeet Vora	v**t@g**m	1
Giacomo Lanza	3****3	1
Donny Winston	d**n@a**u	1
and 5 more...

Committer Domains (Top 20 + Academic)

northeastern.edu: 2, incenp.org: 1, alum.mit.edu: 1, usp.br: 1, tib.eu: 1, berkeleybop.org: 1, ciencias.unam.mx: 1, github.com: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 320
Total pull requests: 931
Average time to close issues: 3 months
Average time to close pull requests: 7 days
Total issue authors: 67
Total pull request authors: 31
Average comments per issue: 1.8
Average comments per pull request: 1.0
Merged pull requests: 753
Bot issues: 0
Bot pull requests: 85

Past Year

Issues: 125
Pull requests: 630
Average time to close issues: 6 days
Average time to close pull requests: 5 days
Issue authors: 34
Pull request authors: 16
Average comments per issue: 0.94
Average comments per pull request: 0.98
Merged pull requests: 510
Bot issues: 0
Bot pull requests: 51

Top Authors

Issue Authors

cthoyt (89)
bgyori (41)
cmungall (34)
matentzn (21)
sierra-moxon (9)
egonw (8)
lnanderson (7)
JervenBolleman (6)
StroemPhi (5)
dhimmel (5)
hrshdhgd (5)
hkir-dev (4)
nagutm (4)
pfabry (4)
allaway (4)

Pull Request Authors

cthoyt (417)
nagutm (137)
bgyori (115)
github-actions[bot] (85)
nalikapalayoor (73)
tanayshah2 (29)
matentzn (16)
egonw (5)
maayamanoj466 (5)
sumirp (4)
allaway (4)
vemonet (4)
marius-mather (4)
bact (4)
tomo-oga (3)

Top Labels

Issue Labels

New (74), Prefix (66), Update (53), Curation (29), bug (8), Provider (7), website (7), External Registry (6), Policy (4), Regex (4), good first issue (4), documentation (4), Collection (3), Merge (2), wontfix (1), question (1), enhancement (1), Parking (1), waiting-for-submitter (1), manuscript (1)

Pull Request Labels

Prefix (173), New (173), Collection (11), publication-curation (9), waiting-for-submitter (8), Provider (4), blocked (3), documentation (2), Split (1), bug (1), Merge (1)

Packages

Total packages: 2
Total downloads:
- pypi 39,324 last-month
Total docker downloads: 164

Total dependent packages: 24
(may contain duplicates)
Total dependent repositories: 50
(may contain duplicates)
Total versions: 1,452
Total maintainers: 1

pypi.org: bioregistry

Integrated registry of biological databases and nomenclatures

Homepage: https://github.com/biopragmatics/bioregistry
Documentation: https://bioregistry.readthedocs.io
License: MIT License, Copyright (c) 2020-2024 Charles Tapley Hoyt
Latest release: 0.12.36
published 6 months ago

Versions: 852
Dependent Packages: 24
Dependent Repositories: 50
Downloads: 39,324 Last month
Docker Downloads: 164

Rankings

Dependent packages count: 0.5%

Downloads: 1.6%

Dependent repos count: 2.1%

Docker downloads count: 3.0%

Average: 3.6%

Forks count: 6.7%

Stargazers count: 7.5%

Maintainers (1)

cthoyt

Last synced: 6 months ago

proxy.golang.org: github.com/biopragmatics/bioregistry

Documentation: https://pkg.go.dev/github.com/biopragmatics/bioregistry#section-documentation
License: mit
Latest release: v0.12.36
published 6 months ago

Versions: 600
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 5.4%

Average: 5.6%

Dependent repos count: 5.8%

Last synced: 6 months ago

Dependencies

.github/workflows/docker.yml actions

docker/build-push-action v2 composite
docker/login-action v1 composite
docker/setup-buildx-action v1 composite
docker/setup-qemu-action v1 composite

.github/workflows/health.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite
ad-m/github-push-action master composite

.github/workflows/new_prefix_pr.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite
peter-evans/create-pull-request v3 composite

.github/workflows/release.yml actions

marvinpinto/action-automatic-releases v1.2.1 composite

.github/workflows/tests.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite
codecov/codecov-action v1 composite

.github/workflows/update.yml actions

actions/checkout master composite
actions/setup-python v2 composite
ad-m/github-push-action master composite
docker/build-push-action v2 composite
docker/login-action v1 composite
docker/setup-buildx-action v1 composite
docker/setup-qemu-action v1 composite

Dockerfile docker

python 3.11-alpine build

pyproject.toml pypi
