https://github.com/cthoyt/umls_downloader
Don't worry about UMLS, RxNorm, SNOMED, or SemMedDB licensing - write code that knows how to download it automatically
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: cthoyt
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://umls-downloader.readthedocs.io
- Size: 54.7 KB
Statistics
- Stars: 40
- Watchers: 2
- Forks: 4
- Open Issues: 0
- Releases: 7
Topics
Metadata Files
README.md
UMLS Downloader
Don't worry about UMLS Terminology Services (UTS)
licensing and distribution rules - just use
umls_downloader to write code that knows how to download content and use it
automatically from resources such as UMLS, RxNorm, SNOMED, and SemMedDB (a non-exhaustive list),
or any other content that can be downloaded through the UTS ticket granting system. There's no centralized list of content available through the UTS, so suggestions for additional resources are welcome through the issue tracker.
Full documentation is available at umls-downloader.readthedocs.io.
Installation
```bash
$ pip install umls_downloader
```
Download A Specific Version of UMLS
```python
import os

from umls_downloader import download_umls

# Get this from https://uts.nlm.nih.gov/uts/edit-profile
api_key = ...

path = download_umls(version="2021AB", api_key=api_key)

# This is where it gets downloaded: ~/.data/bio/umls/2021AB/umls-2021AB-mrconso.zip
expected_path = os.path.join(
    os.path.expanduser("~"), ".data", "bio", "umls", "2021AB", "umls-2021AB-mrconso.zip",
)
assert expected_path == path.as_posix()
```
After the file has been downloaded once, it is cached and does not need to be downloaded again.
It is stored automatically using pystow
in the ~/.data/bio/umls directory.
A full list of functions is available in the documentation.
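The download-once behavior described above can be sketched as a generic "fetch only if missing" helper. This is only an illustration of the caching pattern, not how pystow or umls_downloader implement it; `ensure_file` and its `fetch` callable are hypothetical names introduced here.

```python
from pathlib import Path
from typing import Callable


def ensure_file(path: Path, fetch: Callable[[], bytes]) -> Path:
    """Return ``path``, calling ``fetch`` only if the file is not cached yet.

    Hypothetical helper for illustration; not part of umls_downloader or pystow.
    """
    if not path.is_file():
        # Create the versioned cache directory on first use
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch())
    return path
```

Calling the helper twice with the same path triggers `fetch` only the first time; the second call returns the cached file directly.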
Automating Configuration of UTS Credentials
There are two ways to set the API key automatically so you don't have to worry about getting it and passing it around in your Python code:
- Set `UMLS_API_KEY` in the environment
- Create `~/.config/umls.ini` and set an `api_key` key in the `[umls]` section.
```python
from umls_downloader import download_umls

# Same path as before
path = download_umls(version="2021AB")
```
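As a rough sketch of the file-based option, the snippet below writes an INI file with a `[umls]` section and an `api_key` key using the standard library, then reads the key back. The helper names `write_umls_config` and `read_umls_api_key` are hypothetical, and the lookup order (environment variable first, then file) is an assumption made for illustration rather than a statement of how umls_downloader resolves credentials.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def write_umls_config(api_key: str, config_path: Path) -> Path:
    """Write ``api_key`` into the ``[umls]`` section of an INI file (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser["umls"] = {"api_key": api_key}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    with config_path.open("w") as file:
        parser.write(file)
    return config_path


def read_umls_api_key(config_path: Path) -> Optional[str]:
    """Look up the key: UMLS_API_KEY first, then the INI file (assumed order)."""
    env_value = os.environ.get("UMLS_API_KEY")
    if env_value:
        return env_value
    parser = configparser.ConfigParser()
    parser.read(config_path)  # a missing file is silently ignored
    return parser.get("umls", "api_key", fallback=None)
```

To target the location named above, one could call `write_umls_config("my-key", Path.home() / ".config" / "umls.ini")`.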
Download the Latest Version
First, you'll have to
install bioversions
with pip install bioversions, whose job it is to look up the latest version of
many databases. Then, you can modify the previous code slightly by omitting
the version keyword argument:
```python
from umls_downloader import download_umls

# Same path as before (as of November 21st, 2021)
path = download_umls()
```
Download and open the file
The UMLS file is zipped, so it's usually accompanied with the following boilerplate code:
```python
import zipfile

from umls_downloader import download_umls

path = download_umls()
with zipfile.ZipFile(path) as zip_file:
    with zip_file.open("MRCONSO.RRF", mode="r") as file:
        for line in file:
            ...
```
This exact code is wrapped by the open_umls() context manager,
so it can be written more simply as:
```python
from umls_downloader import open_umls

with open_umls() as file:
    for line in file:
        ...
```
The version and api_key arguments also apply here.
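What to do inside the `for line in file` loop is left open above. A minimal sketch follows, assuming each MRCONSO.RRF line is pipe-delimited with a trailing pipe and that the columns follow the order documented in the UMLS reference manual (worth verifying against your release). The helper `iter_mrconso_rows` is hypothetical, and the example row is made up so the snippet runs without a UTS download.

```python
import io
import zipfile
from typing import Dict, IO, Iterator

# Assumed MRCONSO.RRF column order; check against the UMLS reference manual
MRCONSO_COLUMNS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]


def iter_mrconso_rows(file: IO[bytes]) -> Iterator[Dict[str, str]]:
    """Yield one dict per line of a binary MRCONSO.RRF file handle."""
    for raw_line in file:
        line = raw_line.decode("utf-8").rstrip("\r\n")
        if not line:
            continue
        # The trailing pipe yields one extra empty field, which is dropped here
        values = line.split("|")[: len(MRCONSO_COLUMNS)]
        yield dict(zip(MRCONSO_COLUMNS, values))


# Made-up row in an in-memory zip, standing in for the real download
fake_line = "C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001||||EXAMPLE|PT|X1|example term|0|N||\n"
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as zip_file:
    zip_file.writestr("MRCONSO.RRF", fake_line)

with zipfile.ZipFile(buffer) as zip_file:
    with zip_file.open("MRCONSO.RRF", mode="r") as file:
        rows = list(iter_mrconso_rows(file))
```

Because the generator reads line by line, it can be applied to the full file without loading everything into memory; only the `list(...)` call in this toy example materializes all rows.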
Why not an API?
The UMLS provides an API
for access to tiny bits of data at a time. There are even two recent (last 5
years) packages, umls-api and
connect-umls, that provide wrappers
around it. However, API access is generally rate limited, difficult to use in
bulk, and slow. For working with UMLS (or any other database, for that matter) in
bulk, it's necessary to download full database dumps.
👋 Attribution
⚖️ License
The code in this package is licensed under the MIT License.
🍪 Cookiecutter
This package was created with @audreyfeldroy's cookiecutter package using @cthoyt's cookiecutter-snekpack template.
Owner
- Name: Charles Tapley Hoyt
- Login: cthoyt
- Kind: user
- Location: Bonn, Germany
- Company: RWTH Aachen University
- Website: https://cthoyt.com
- Repositories: 489
- Profile: https://github.com/cthoyt
GitHub Events
Total
- Watch event: 8
Last Year
- Watch event: 8
Committers
Last synced: over 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Charles Tapley Hoyt | c****t@g****m | 40 |
| Benjamin M. Gyori | b****i@g****m | 1 |
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 3
- Total pull requests: 6
- Average time to close issues: 14 days
- Average time to close pull requests: about 1 hour
- Total issue authors: 3
- Total pull request authors: 2
- Average comments per issue: 2.0
- Average comments per pull request: 0.17
- Merged pull requests: 6
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- cthoyt (1)
- Minitour (1)
- bgyori (1)
Pull Request Authors
- cthoyt (6)
- bgyori (1)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - pypi: 26,088 last month
- Total dependent packages: 2
- Total dependent repositories: 3
- Total versions: 7
- Total maintainers: 1
pypi.org: umls-downloader
Automate downloading UMLS data.
- Homepage: https://github.com/cthoyt/umls_downloader
- Documentation: https://umls-downloader.readthedocs.io/
- License: MIT
- Latest release: 0.1.3 (published almost 2 years ago)
Rankings
Maintainers (1)
Dependencies
- actions/checkout v2 composite
- actions/setup-python v2 composite