create-bridgedb-diseases
Code to create local mapping files for disease IDs
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 7 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: DeniseSl22
- License: other
- Language: Groovy
- Default Branch: master
- Size: 2.47 MB
Statistics
- Stars: 0
- Watchers: 2
- Forks: 3
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Create BridgeDb Identity Mapping files
This Groovy script creates a Derby file for BridgeDb [1,2] for use in PathVisio, etc.
The script will be tested with Wikidata [3,4] from November 2019, and is based on the createbridgedbmetabolites repository.
We're grateful to everyone who worked on identifier mappings in these projects:
- http://wikidata.org/
Everyone can contribute ID mappings to Wikidata.
Releases
The files are released via the BridgeDb Website: https://www.bridgedb.org/data/gene_database/
The mapping files are also archived on Figshare: https://figshare.com/search?q=diseases+bridgedb+mapping+database&quick=1
License
This repository: New BSD.
Derby License -> http://db.apache.org/derby/license.html
BridgeDb License -> http://www.bridgedb.org/browser/trunk/LICENSE-2.0.txt
Run the script and test the results
- add the jars to your classpath, e.g. on Linux with:
export CLASSPATH=`ls -1 *.jar | tr '\n' ':'`
- make sure the Wikidata mapping and label files have been downloaded and saved locally, as described below
2.1 ID mappings
A set of SPARQL queries has been compiled and saved in the wikidata/ folder. These queries can be executed manually at http://query.wikidata.org/. They download mappings from Wikidata for OMIM (omim.rq), Disease Ontology (do.rq), UMLS CUI (cui.rq), Orphanet (orpha.rq), and MeSH descriptor IDs (mesh.rq, coming soon).
Alternatively, you can run the curl commands below.
curl -H "Accept: text/csv" --data-urlencode query@wikidata/omim.rq -G https://query.wikidata.org/bigdata/namespace/wdq/sparql -o omim2wikidata.csv
curl -H "Accept: text/csv" --data-urlencode query@wikidata/do.rq -G https://query.wikidata.org/bigdata/namespace/wdq/sparql -o do2wikidata.csv
curl -H "Accept: text/csv" --data-urlencode query@wikidata/cui.rq -G https://query.wikidata.org/bigdata/namespace/wdq/sparql -o cui2wikidata.csv
curl -H "Accept: text/csv" --data-urlencode query@wikidata/orpha.rq -G https://query.wikidata.org/bigdata/namespace/wdq/sparql -o orpha2wikidata.csv
curl -H "Accept: text/csv" --data-urlencode query@wikidata/mesh.rq -G https://query.wikidata.org/bigdata/namespace/wdq/sparql -o mesh2wikidata.csv
curl -H "Accept: text/csv" --data-urlencode query@wikidata/icd9.rq -G https://query.wikidata.org/bigdata/namespace/wdq/sparql -o icd92wikidata.csv
curl -H "Accept: text/csv" --data-urlencode query@wikidata/icd10.rq -G https://query.wikidata.org/bigdata/namespace/wdq/sparql -o icd102wikidata.csv
curl -H "Accept: text/csv" --data-urlencode query@wikidata/icd11.rq -G https://query.wikidata.org/bigdata/namespace/wdq/sparql -o icd112wikidata.csv
curl -H "Accept: text/csv" --data-urlencode query@wikidata/mondo.rq -G https://query.wikidata.org/bigdata/namespace/wdq/sparql -o mondo2wikidata.csv
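The nine curl calls above differ only in the query name, so they can be scripted as a loop. This is a minimal sketch, assuming the query files sit in the wikidata/ folder as described; the helper names `out_name` and `fetch_mappings` are hypothetical and not part of the repository.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the repository.
ENDPOINT="https://query.wikidata.org/bigdata/namespace/wdq/sparql"

# Derive the output file name used in the commands above,
# e.g. "omim" becomes "omim2wikidata.csv".
out_name() {
  printf '%s2wikidata.csv\n' "$1"
}

# Download one CSV file per query name given as an argument.
# Stops at the first failing download.
fetch_mappings() {
  for name in "$@"; do
    curl -H "Accept: text/csv" \
      --data-urlencode "query@wikidata/${name}.rq" \
      -G "$ENDPOINT" -o "$(out_name "$name")" || return 1
  done
}
```

It could be called as `fetch_mappings omim do cui orpha mesh icd9 icd10 icd11 mondo`, which should produce the same files as the individual commands.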
2.2 Get Disease Labels
With a similar SPARQL query (names.rq) the disease labels (English only) can be downloaded as simple TSV and saved as "names2wikidata.tsv" (note that this file is TAB separated):
curl -H "Accept: text/tab-separated-values" --data-urlencode query@wikidata/names.rq -G https://query.wikidata.org/bigdata/namespace/wdq/sparql -o names2wikidata.tsv
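Because the labels file must be tab separated, a quick structural check after the download can catch a wrong Accept header or a truncated response. This is a minimal sketch; the function name `check_tsv` is hypothetical, and the requirement of at least two columns (an identifier and a label) is an assumption about what names.rq returns.

```shell
# Hypothetical helper, not part of the repository.
# Succeeds if the file is non-empty, every line has the same number of
# tab-separated fields as the first line, and that number is at least 2
# (assumption: one identifier column and one label column).
check_tsv() {
  awk -F '\t' '
    NR == 1 { n = NF }
    NF != n { bad = 1 }
    END { if (NR == 0 || n < 2 || bad) exit 1 }
  ' "$1"
}
```

For example, `check_tsv names2wikidata.tsv || echo "names2wikidata.tsv does not look like TSV"`.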
- Update the createDerby.groovy file with the new version numbers ("DATASOURCEVERSION" field) and run the script with Groovy: #Update line
export CLASSPATH=`ls -1 *.jar | tr '\n' ':'`
groovy createDerby.groovy
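Before running the script, it can help to confirm that the downloaded files are present and non-empty. This is a minimal sketch; the function name `missing_inputs` is hypothetical, and which files createDerby.groovy actually reads is an assumption based on the output names in the curl commands above.

```shell
# Hypothetical helper, not part of the repository.
# Print every given file that is missing or empty;
# return non-zero if at least one such file was found.
missing_inputs() {
  status=0
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      printf 'missing or empty: %s\n' "$f"
      status=1
    fi
  done
  return "$status"
}
```

One possible use is `missing_inputs omim2wikidata.csv do2wikidata.csv names2wikidata.tsv && groovy createDerby.groovy`, extended with the remaining mapping files as needed.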
Test the resulting Derby file by opening it in PathVisio
Use the BridgeDb QC tool to compare it with the previous mapping file
The BridgeDb repository has a tool to perform quality control (qc) on ID mapping files:
sh qc.sh old.bridge new.bridge
- Upload the data to Figshare and update the following pages:
- http://www.bridgedb.org/mapping-databases/hmdb-metabolite-mappings/ #Update link
- http://bridgedb.org/data/gene_database/ #Update link
- Tag this repository with the DOI of the latest release.
To ensure we know exactly which repository version was used to generate a specific release, the latest commit used for that release is tagged with the DOI on Figshare. To list all current tags:
git tag
To make a new tag, run:
git tag $DOI
where $DOI is replaced with the DOI of the release.
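A DOI contains dots and a slash, so it can be worth checking that git accepts it as a tag name before tagging. This is a minimal sketch using `git check-ref-format`; the function names `valid_tag_name` and `tag_release` are hypothetical, and the DOI in the usage note is a placeholder, not a real release.

```shell
# Hypothetical helpers, not part of the repository.

# Succeeds if git accepts the given string as a tag name.
valid_tag_name() {
  git check-ref-format "refs/tags/$1"
}

# Tag the current commit with the release DOI,
# refusing names that git would reject.
tag_release() {
  doi=$1
  if ! valid_tag_name "$doi"; then
    printf 'not a valid tag name: %s\n' "$doi" >&2
    return 1
  fi
  git tag "$doi"
}
```

For example, `tag_release 10.6084/m9.figshare.0000000` would tag the current commit with that placeholder DOI.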
- Inform downstream projects
At least the following projects need to be informed about the availability of the new mapping database:
- BridgeDb webservice
- WikiPathways RDF generation team (Jenkins server)
- WikiPathways indexer (supporting the WikiPathways web service)
References
1. http://bridgedb.org/
2. Van Iersel, M. P., Pico, A. R., Kelder, T., Gao, J., Ho, I., Hanspers, K., Conklin, B. R., Evelo, C. T., Jan. 2010. The BridgeDb framework: standardized access to gene, protein and metabolite identifier mapping services. BMC Bioinformatics 11 (1), 5+. http://dx.doi.org/10.1186/1471-2105-11-5
3. Vrandečić, Denny. "Wikidata: a new platform for collaborative data collection." Proceedings of the 21st International Conference on World Wide Web. ACM, 2012. https://doi.org/10.1145/2187980.2188242
4. Mietchen D, Hagedorn G, Willighagen E, Rico M, Gómez-Pérez A, Aibar E, Rafes K, Germain C, Dunning A, Pintscher L, Kinzler D (2015) Enabling Open Science: Wikidata for Research (Wiki4R). Research Ideas and Outcomes 1: e7573. https://doi.org/10.3897/rio.1.e7573
Owner
- Name: De
- Login: DeniseSl22
- Kind: user
- Location: Maastricht
- Company: UM @BiGCAT-UM
- Website: https://orcid.org/0000-0001-8449-1318
- Twitter: smallcat4sci
- Repositories: 10
- Profile: https://github.com/DeniseSl22
PhD candidate in chem/bio/informatics @UM/BiGCaT
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: BridgeDb
message: 'If you use this software, please cite it as below.'
type: software
authors:
  - given-names: 'Martijn P.'
    family-names: Van Iersel
  - given-names: Alexander R.
    family-names: Pico
  - given-names: Thomas
    family-names: Kelder
  - given-names: Jianjiong
    family-names: Gao
  - given-names: Ho
    family-names: Isaac
  - given-names: Kristina
    family-names: Hanspers
  - given-names: Bruce R.
    family-names: Conklin
  - given-names: Chris T.
    family-names: Evelo
identifiers:
  - type: url
    value: 'https://github.com/bridgedb/BridgeDb'
    description: Source code repository for BridgeDb
  - type: doi
    value: 10.1186/1471-2105-11-5
    description: >-
      Article: The BridgeDb framework: standardized
      access to gene, protein and metabolite
      identifier mapping services
repository-code: 'https://github.com/bridgedb/BridgeDb'
url: 'https://bridgedb.github.io/'
preferred-citation:
  type: article
  authors:
    - given-names: 'Martijn P.'
      family-names: Van Iersel
    - given-names: Alexander R.
      family-names: Pico
    - given-names: Thomas
      family-names: Kelder
    - given-names: Jianjiong
      family-names: Gao
    - given-names: Ho
      family-names: Isaac
    - given-names: Kristina
      family-names: Hanspers
    - given-names: Bruce R.
      family-names: Conklin
    - given-names: Chris T.
      family-names: Evelo
  doi: "10.1186/1471-2105-11-5"
  journal: "BMC Bioinformatics"
  title: "The BridgeDb framework: standardized access to gene, protein and metabolite identifier mapping services"
  issue: 5
  volume: 11
  year: 2010
keywords:
  - identifier mapping
  - Genes
  - Proteins
  - Metabolites
  - Biological data
license: Apache-2.0
version: 3.0.13
date-released: '2022-01-14'
Issues and Pull Requests
Last synced: 12 months ago
All Time
- Total issues: 3
- Total pull requests: 3
- Average time to close issues: over 1 year
- Average time to close pull requests: 13 days
- Total issue authors: 2
- Total pull request authors: 2
- Average comments per issue: 1.33
- Average comments per pull request: 1.33
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- DeniseSl22 (2)
- egonw (1)
Pull Request Authors
- egonw (1)
- hbasaric (1)