https://github.com/bridgedb/create-bridgedb-genedb

Example of script to query BioMart, parse and create a BridgeDb database

Science Score: 18.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.7%) to scientific vocabulary

Keywords

ensembl gene identifier-mapping
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 4
  • Watchers: 8
  • Forks: 3
  • Open Issues: 5
  • Releases: 0
Fork of JonathanMELIUS/BioMartScript
Created about 8 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

BridgeDb database building for gene databases

Introduction

A script to create a gene-focussed BridgeDb database based on Ensembl BioMart.

Installation

Java 8 is required for PathVisio 3.x support.

Compile the code with:

    mvn clean install
    cp target/org.bridgedb.genedb-jar-with-dependencies.jar BioMart2BridgeDb.jar

Run

In your terminal:

    java -jar BioMart2BridgeDb.jar $DATASOURCENAME $VERSION $CONFIG_FILE $PATH_FOR_NEW_DB

  • <DATASOURCENAME>: Database name (e.g. Ensembl, EnsemblGenomes)
  • <VERSION>: Database version (e.g. 111)
  • <CONFIG_FILE>: Configuration file
  • <PATH_FOR_NEW_DB>: Path for the new database
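A minimal sketch of a concrete invocation. All four values are illustrative assumptions: `EnsemblGenomes` is one of the example database names above, `49` matches the Plants v49 release used in the config example of this README, and the config file path and output path are hypothetical. The guard only runs the tool when the jar and the config file are both present.

```shell
#!/bin/sh
# Illustrative values only; adjust them to your own setup.
DATASOURCENAME="EnsemblGenomes"             # example database name from the argument list
VERSION="49"                                # assumed: release matching the Plants v49 config
CONFIG_FILE="configFiles/athaliana.config"  # hypothetical path to a config file
PATH_FOR_NEW_DB="output"                    # hypothetical path for the new database

JAR="BioMart2BridgeDb.jar"

# Assemble the command as text so it can be inspected before anything runs.
CMD="java -jar $JAR $DATASOURCENAME $VERSION $CONFIG_FILE $PATH_FOR_NEW_DB"
echo "$CMD"

# Run only when both the jar and the config file exist.
if [ -f "$JAR" ] && [ -f "$CONFIG_FILE" ]; then
    java -jar "$JAR" "$DATASOURCENAME" "$VERSION" "$CONFIG_FILE" "$PATH_FOR_NEW_DB"
else
    echo "Skipping run: $JAR or $CONFIG_FILE not found" >&2
fi
```

Printing the command first makes it easy to check the argument order against the list above.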

List of default config files:

Configuration files can be found in https://github.com/bridgedb/create-bridgedb-genedb-config/tree/master/configFiles.

Example: Arabidopsis thaliana config file

How to create your own config file

  • Give the version of Ensembl BioMart to query:

    e.g: http://www.ensembl.org/biomart/, http://oct2014.archive.ensembl.org/biomart/, http://nov2020-metazoa.ensembl.org/biomart/

    endpoint=https://nov2020-plants.ensembl.org/biomart/

    You can find an overview of releases in the Ensembl Archive, Metazoa Archive, Plants Archive, Fungi Archive.

  • The name of the mart schema to query. The MartRegistry for Plants v49 can be found here:

    https://nov2020-plants.ensembl.org/biomart/martservice?type=registry

    e.g: plants_mart, metazoa_mart, default

    schema=plants_mart

  • Code name of the species dataset. The available datasets are listed at http://www.ensembl.org/biomart/martservice?type=datasets&mart=ENSEMBL_MART_ENSEMBL and in the corresponding lists for Metazoa v49, Plants v49, and Fungi v49.

    species=athaliana_eg_gene

  • The name of the bridge database

    database_name=Arabidopsis thaliana genes and proteins

  • The name of the .bridge file to create

    file_name=AtDerbyEnsemblPlant49

  • The data source code names for Arabidopsis thaliana can be found here:

    https://nov2020-plants.ensembl.org/biomart/martservice?type=attributes&mart=plants_mart&dataset=athaliana_eg_gene

    probe_datasource=Affy,Agilent
    probe_set=affyaragene,affyath1121501,agilentg2519f015059,agilentg2519f021169,agilentg4136a011839,agilentg4136b013324,agilentg4142a012600
    gene_datasource=entrezgeneid,goid,mirbaseaccession,mirbaseid,pdb,refseqdna,refseqpeptide,uniprotsptrembl,uniprotswissprot,tairlocus,nascgene_id

  • Optional filters (chromosome list) for Arabidopsis thaliana can be found here: https://nov2020-plants.ensembl.org/biomart/martservice?type=filters&dataset=athaliana_eg_gene

    e.g: chromosome_name=1,2,3,4,5,Pt,Mt
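The steps above can be sketched as a small script that derives the lookup URLs from the endpoint and writes the resulting key=value lines to a config file. Several things here are assumptions rather than documented behaviour: the file name `athaliana.config` is hypothetical, the order of the keys is arbitrary, the dataset name is taken to be `athaliana_eg_gene`, and reusing the `type=datasets` URL pattern on the plants endpoint is an extrapolation from the Ensembl example. The `probe_datasource`, `probe_set`, and `gene_datasource` lines are left out for brevity and would be added in the same form.

```shell
#!/bin/sh
# Values taken from the Arabidopsis thaliana example in this README.
ENDPOINT="https://nov2020-plants.ensembl.org/biomart/"
SCHEMA="plants_mart"
SPECIES="athaliana_eg_gene"      # assumed spelling of the dataset name
CONFIG_FILE="athaliana.config"   # hypothetical file name

# Lookup URLs for choosing the schema, dataset, data sources and filters.
REGISTRY_URL="${ENDPOINT}martservice?type=registry"
DATASETS_URL="${ENDPOINT}martservice?type=datasets&mart=${SCHEMA}"
ATTRIBUTES_URL="${ENDPOINT}martservice?type=attributes&mart=${SCHEMA}&dataset=${SPECIES}"
FILTERS_URL="${ENDPOINT}martservice?type=filters&dataset=${SPECIES}"
printf '%s\n' "$REGISTRY_URL" "$DATASETS_URL" "$ATTRIBUTES_URL" "$FILTERS_URL"

# Write the config file; the data source lines are omitted in this sketch.
cat > "$CONFIG_FILE" <<EOF
endpoint=${ENDPOINT}
schema=${SCHEMA}
species=${SPECIES}
database_name=Arabidopsis thaliana genes and proteins
file_name=AtDerbyEnsemblPlant49
chromosome_name=1,2,3,4,5,Pt,Mt
EOF
```

Keeping the endpoint, schema, and species in variables means the lookup URLs and the config file cannot drift apart when switching to another species or release.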

Owner

  • Name: BridgeDb
  • Login: bridgedb
  • Kind: organization
  • Location: Netherlands

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: BridgeDb
message: 'If you use this software, please cite it as below.'
type: software
authors:
  - given-names: 'Martijn P.'
    family-names: Van Iersel
  - given-names: Alexander R.
    family-names: Pico
  - given-names: Thomas
    family-names: Kelder
  - given-names: Jianjiong
    family-names: Gao
  - given-names: Ho
    family-names: Isaac
  - given-names: Kristina
    family-names: Hanspers
  - given-names: Bruce R.
    family-names: Conklin
  - given-names: Chris T.
    family-names: Evelo
identifiers:
  - type: url
    value: 'https://github.com/bridgedb/BridgeDb'
    description: Source code repository for BridgeDb
  - type: doi
    value: 10.1186/1471-2105-11-5
    description: >-
      Article: The BridgeDb framework: standardized
      access to gene, protein and metabolite
      identifier mapping services
repository-code: 'https://github.com/bridgedb/BridgeDb'
url: 'https://bridgedb.github.io/'
preferred-citation:
  type: article
  authors:
  - given-names: 'Martijn P.'
    family-names: Van Iersel
  - given-names: Alexander R.
    family-names: Pico
  - given-names: Thomas
    family-names: Kelder
  - given-names: Jianjiong
    family-names: Gao
  - given-names: Ho
    family-names: Isaac
  - given-names: Kristina
    family-names: Hanspers
  - given-names: Bruce R.
    family-names: Conklin
  - given-names: Chris T.
    family-names: Evelo
  doi: "10.1186/1471-2105-11-5"
  journal: "BMC Bioinformatics"
  title: "The BridgeDb framework: standardized access to gene, protein and metabolite identifier mapping services"
  issue: 5
  volume: 11
  year: 2010
keywords:
  - identifier mapping
  - Genes
  - Proteins
  - Metabolites
  - Biological data
license: Apache-2.0
version: 3.0.13
date-released: '2022-01-14'

GitHub Events

Total
  • Issues event: 1
Last Year
  • Issues event: 1