https://github.com/bridgedb/create-bridgedb-genedb
Example of script to query BioMart, parse and create a BridgeDb database
Science Score: 18.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: bridgedb
- License: apache-2.0
- Language: Java
- Default Branch: master
- Homepage: https://bridgedb.github.io/data/gene_database/
- Size: 21.2 MB
Statistics
- Stars: 4
- Watchers: 8
- Forks: 3
- Open Issues: 5
- Releases: 0
Metadata Files
README.md
BridgeDb database building for gene databases
Introduction
A script to create a gene-focussed BridgeDb database based on Ensembl BioMart.
Installation
Java 8 is required for PathVisio 3.x support.
Compile the code with:
```shell
mvn clean install
cp target/org.bridgedb.genedb-jar-with-dependencies.jar BioMart2BridgeDb.jar
```
Run
In your terminal:
```shell
java -jar BioMart2BridgeDb.jar $DATASOURCENAME $VERSION $CONFIG_FILE $PATH_FOR_NEW_DB
```
- <DATASOURCENAME>: Database name (e.g. Ensembl, EnsemblGenomes)
- <VERSION>: Database version (e.g. 111)
- <CONFIG_FILE>: Configuration file
- <PATH_FOR_NEW_DB>: Path for the new database
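For readers who wrap this command in their own tooling, the four positional arguments can be modelled as shown here. This is a minimal sketch, not the project's actual entry point: the class `RunArgs`, its field names, and the sample file names are hypothetical, and only the argument order is taken from the command above.

```java
// Hypothetical helper; the real BioMart2BridgeDb main class may handle arguments differently.
final class RunArgs {
    final String dataSourceName; // e.g. "Ensembl" or "EnsemblGenomes"
    final String version;        // e.g. "111"
    final String configFile;     // path to the configuration file
    final String outputPath;     // path for the new database

    private RunArgs(String dataSourceName, String version, String configFile, String outputPath) {
        this.dataSourceName = dataSourceName;
        this.version = version;
        this.configFile = configFile;
        this.outputPath = outputPath;
    }

    /** Maps the four positional arguments onto named fields, in the documented order. */
    static RunArgs parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                "Usage: java -jar BioMart2BridgeDb.jar "
                + "<DATASOURCENAME> <VERSION> <CONFIG_FILE> <PATH_FOR_NEW_DB>");
        }
        return new RunArgs(args[0], args[1], args[2], args[3]);
    }

    public static void main(String[] args) {
        // Sample values only; the config file name and output path are placeholders.
        RunArgs parsed = parse(new String[] {"Ensembl", "111", "my.config", "output/"});
        System.out.println(parsed.dataSourceName + " " + parsed.version
                + " -> " + parsed.outputPath + " (config: " + parsed.configFile + ")");
    }
}
```

Failing fast with a usage message when the argument count is wrong keeps a missing or misplaced argument from surfacing later as a confusing file or network error.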
List of default config files:
Configuration files can be found in https://github.com/bridgedb/create-bridgedb-genedb-config/tree/master/configFiles.
Example: Arabidopsis thaliana config file
How to create your own config file
Give the version of Ensembl BioMart to query:
e.g: http://www.ensembl.org/biomart/, http://oct2014.archive.ensembl.org/biomart/, http://nov2020-metazoa.ensembl.org/biomart/
endpoint=https://nov2020-plants.ensembl.org/biomart/
You can find an overview of releases in the Ensembl Archive, Metazoa Archive, Plants Archive, and Fungi Archive.
MartRegistry for plants v49 can be found here:
https://nov2020-plants.ensembl.org/biomart/martservice?type=registry
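The registry URL above is the endpoint followed by `martservice?type=registry`, and the dataset, attribute, and filter listings referenced in this guide follow the same pattern with a different `type` value. The sketch below builds those URLs from an endpoint. The class `MartServiceUrls` and its method names are hypothetical and are not part of this repository's code; the mart and dataset names in `main` are the Arabidopsis thaliana example values.

```java
// Hypothetical helper for composing BioMart martservice listing URLs.
final class MartServiceUrls {
    private final String endpoint;

    MartServiceUrls(String endpoint) {
        // Append a trailing slash if the endpoint lacks one.
        this.endpoint = endpoint.endsWith("/") ? endpoint : endpoint + "/";
    }

    /** Lists the marts (schemas) available at this endpoint. */
    String registry() {
        return endpoint + "martservice?type=registry";
    }

    /** Lists the species datasets of one mart. */
    String datasets(String mart) {
        return endpoint + "martservice?type=datasets&mart=" + mart;
    }

    /** Lists the attributes (data source code names) of one dataset. */
    String attributes(String mart, String dataset) {
        return endpoint + "martservice?type=attributes&mart=" + mart + "&dataset=" + dataset;
    }

    /** Lists the filters (for example chromosome names) of one dataset. */
    String filters(String dataset) {
        return endpoint + "martservice?type=filters&dataset=" + dataset;
    }

    public static void main(String[] args) {
        MartServiceUrls urls = new MartServiceUrls("https://nov2020-plants.ensembl.org/biomart/");
        System.out.println(urls.registry());
        System.out.println(urls.datasets("plants_mart"));
        System.out.println(urls.attributes("plants_mart", "athaliana_eg_gene"));
        System.out.println(urls.filters("athaliana_eg_gene"));
    }
}
```

Opening the printed URLs in a browser is a quick way to look up the values for `schema`, `species`, and the data source lists before writing a config file.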
e.g: plants_mart, metazoa_mart, default
schema=plants_mart
Code name of the species: http://www.ensembl.org/biomart/martservice?type=datasets&mart=ENSEMBL_MART_ENSEMBL (Ensembl), Metazoa v49, Plants v49, and Fungi v49
species=athaliana_eg_gene
The name of the bridge database
database_name=Arabidopsis thaliana genes and proteins
The name of the .bridge file to create
file_name=AtDerbyEnsemblPlant49
The data source code names for Arabidopsis thaliana can be found here:
https://nov2020-plants.ensembl.org/biomart/martservice?type=attributes&mart=plants_mart&dataset=athaliana_eg_gene
probe_datasource=Affy,Agilent
probe_set=affyaragene,affyath1121501,agilentg2519f015059,agilentg2519f021169,agilentg4136a011839,agilentg4136b013324,agilentg4142a012600
gene_datasource=entrezgeneid,goid,mirbaseaccession,mirbaseid,pdb,refseqdna,refseqpeptide,uniprotsptrembl,uniprotswissprot,tairlocus,nascgene_id
Optional filters (chromosome list) for Arabidopsis thaliana can be found here: https://nov2020-plants.ensembl.org/biomart/martservice?type=filters&dataset=athaliana_eg_gene
e.g: chromosome_name=1,2,3,4,5,Pt,Mt
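Because the config file is a list of `key=value` lines with comma-separated values, it can be read with `java.util.Properties`. The following is a minimal sketch of that idea and not necessarily how this tool parses its configuration: the class `GeneDbConfig` and its methods are hypothetical, and treating a key as required is an assumption made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical reader for a key=value config file like the one described above.
final class GeneDbConfig {
    private final Properties props = new Properties();

    GeneDbConfig(Reader source) {
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the trimmed value of a key, or fails if it is absent or blank. */
    String required(String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required key: " + key);
        }
        return value.trim();
    }

    /** Splits a comma-separated value into trimmed items; absent keys give an empty list. */
    List<String> list(String key) {
        List<String> items = new ArrayList<>();
        for (String part : props.getProperty(key, "").split(",")) {
            String item = part.trim();
            if (!item.isEmpty()) {
                items.add(item);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        // A shortened version of the Arabidopsis thaliana example.
        String sample =
              "endpoint=https://nov2020-plants.ensembl.org/biomart/\n"
            + "schema=plants_mart\n"
            + "database_name=Arabidopsis thaliana genes and proteins\n"
            + "probe_datasource=Affy,Agilent\n"
            + "chromosome_name=1,2,3,4,5,Pt,Mt\n";
        GeneDbConfig config = new GeneDbConfig(new StringReader(sample));
        System.out.println(config.required("endpoint"));
        System.out.println(config.required("database_name"));
        System.out.println(config.list("probe_datasource"));
        System.out.println(config.list("chromosome_name"));
    }
}
```

Returning an empty list for an absent key fits optional settings such as the chromosome filter, while `required` reports a missing key by name.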
Owner
- Name: BridgeDb
- Login: bridgedb
- Kind: organization
- Location: Netherlands
- Website: https://www.bridgedb.org/
- Repositories: 37
- Profile: https://github.com/bridgedb
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: BridgeDb
message: 'If you use this software, please cite it as below.'
type: software
authors:
  - given-names: 'Martijn P.'
    family-names: Van Iersel
  - given-names: Alexander R.
    family-names: Pico
  - given-names: Thomas
    family-names: Kelder
  - given-names: Jianjiong
    family-names: Gao
  - given-names: Ho
    family-names: Isaac
  - given-names: Kristina
    family-names: Hanspers
  - given-names: Bruce R.
    family-names: Conklin
  - given-names: Chris T.
    family-names: Evelo
identifiers:
  - type: url
    value: 'https://github.com/bridgedb/BridgeDb'
    description: Source code repository for BridgeDb
  - type: doi
    value: 10.1186/1471-2105-11-5
    description: >-
      Article: The BridgeDb framework: standardized
      access to gene, protein and metabolite
      identifier mapping services
repository-code: 'https://github.com/bridgedb/BridgeDb'
url: 'https://bridgedb.github.io/'
preferred-citation:
  type: article
  authors:
    - given-names: 'Martijn P.'
      family-names: Van Iersel
    - given-names: Alexander R.
      family-names: Pico
    - given-names: Thomas
      family-names: Kelder
    - given-names: Jianjiong
      family-names: Gao
    - given-names: Ho
      family-names: Isaac
    - given-names: Kristina
      family-names: Hanspers
    - given-names: Bruce R.
      family-names: Conklin
    - given-names: Chris T.
      family-names: Evelo
  doi: "10.1186/1471-2105-11-5"
  journal: "BMC Bioinformatics"
  title: "The BridgeDb framework: standardized access to gene, protein and metabolite identifier mapping services"
  issue: 5
  volume: 11
  year: 2010
keywords:
  - identifier mapping
  - Genes
  - Proteins
  - Metabolites
  - Biological data
license: Apache-2.0
version: 3.0.13
date-released: '2022-01-14'
GitHub Events
Total
- Issues event: 1
Last Year
- Issues event: 1