https://github.com/bgeedb/wikidata_bgeedb-bot
The BgeeDB Wikidata bot
Statistics
- Stars: 1
- Watchers: 5
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Wikidata BgeeDB-bot
This software tool is a bot that inserts gene expression data from the Bgee database into Wikidata. Currently, only existing Wikidata gene entries from Ensembl and Wikidata anatomical entities (e.g. stomach) with a cross-reference to the UBERON ontology (including the Cell Ontology) are considered. The bot adds "expressed in" statements to Wikidata gene pages. For example, see the "expressed in" statement on the BRAF gene Wikidata page: https://www.wikidata.org/wiki/Q17853226.
Note that at most 10 "expressed in" statements are included per gene page, one for each of up to 10 distinct anatomical entities (terms prefixed with UBERON or CL) in which the gene is expressed.
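The per-gene limit can be illustrated with a minimal Python sketch. It is not the bot's actual code: the function name select_statements and the constant MAX_STATEMENTS_PER_GENE are hypothetical, and the sketch assumes the input rows are already ordered by descending expression score, as the input TSV file is described to be.

```python
from collections import defaultdict
from typing import Iterable

# Limit stated in this README; the constant name itself is hypothetical.
MAX_STATEMENTS_PER_GENE = 10


def select_statements(
    rows: Iterable[tuple[str, str]],
    limit: int = MAX_STATEMENTS_PER_GENE,
) -> dict[str, list[str]]:
    """Keep, per gene, the first `limit` distinct anatomical entity ids.

    `rows` yields (gene_id, anatomical_entity_id) pairs and is assumed to be
    ordered by descending expression score, so "first" means "highest score".
    """
    selected: dict[str, list[str]] = defaultdict(list)
    for gene_id, entity_id in rows:
        kept = selected[gene_id]
        # Skip duplicates and anything beyond the per-gene limit.
        if len(kept) < limit and entity_id not in kept:
            kept.append(entity_id)
    return dict(selected)
```

With this approach, a gene that has more than 10 rows in the input contributes only its first 10 distinct entities, and repeated (gene, entity) pairs are ignored.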
Editing and generating configuration file
The properties.template file contains all the variables that must be set to run this bot. Variables:
- WDUSER: the Wikidata username.
- WDPASS: the Wikidata password.
- EXPRESSION_CALLS_FILE: the path of the TSV file containing the "is expressed in" relations to insert into Wikidata, ordered by descending gene expression score. For EasyBgee v14.2, the SQL query getorderedisexpressedin can be executed over the EasyBgee MySQL database to generate the "is expressed in" relations as a TSV file with the following heading:
gene_id	uberon_id
where UBERON ids are written without their UBERON: prefix when it exists (e.g. UBERON:0002369 => 0002369). For the other ids, which are not prefixed with UBERON:, the : is replaced with _ (e.g. CL:0000711 becomes CL_0000711). For example, an INPUT_BGEE_DATA_TSV file with two entries is shown below.
gene_id	uberon_id
ENSMUSG00000000001	0002369
ENSMUSG00000000001	CL_0000711
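The id rewriting rule above can be sketched in Python as follows. This is only an illustration of the rule as described here, not code taken from the bot; the function names to_tsv_id and to_tsv_line are hypothetical.

```python
UBERON_PREFIX = "UBERON:"


def to_tsv_id(term_id: str) -> str:
    """Rewrite an ontology term id into the form expected in the TSV file."""
    if term_id.startswith(UBERON_PREFIX):
        # UBERON terms keep only the part after the prefix.
        return term_id[len(UBERON_PREFIX):]
    # Other terms (e.g. Cell Ontology) keep their prefix, with ':' -> '_'.
    return term_id.replace(":", "_")


def to_tsv_line(gene_id: str, term_id: str) -> str:
    """Build one tab-separated data line: gene id, rewritten term id."""
    return f"{gene_id}\t{to_tsv_id(term_id)}"
```

For instance, to_tsv_id("UBERON:0002369") gives 0002369 and to_tsv_id("CL:0000711") gives CL_0000711, matching the two example entries.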
For further information about the variables to set, refer to the properties.template.
Before executing any make command this file must be renamed from properties.template to properties.
After editing the properties file, if pipenv is not installed for your Python 3.10 (or later) interpreter,
first run the make command below in the current project directory.
make install_pipenv
If pipenv is already installed, run the make command below in the current project directory:
make
REMARK: A temporary file called count.tmp is generated so that execution can restart from the last successful
Wikidata insertion. To rerun the bot from the beginning, this file must be removed.
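The restart behaviour can be pictured with the following Python sketch. It is an assumption-laden illustration, not the bot's implementation: the real content and format of count.tmp are not documented here, so the sketch simply assumes the file stores the number of rows already inserted. The functions load_count and run, and the insert callback, are hypothetical.

```python
from pathlib import Path
from typing import Callable, Sequence


def load_count(checkpoint: Path) -> int:
    """Return the number of rows already processed, or 0 without a checkpoint."""
    if not checkpoint.exists():
        return 0
    text = checkpoint.read_text().strip()
    return int(text) if text else 0


def run(rows: Sequence[str], insert: Callable[[str], None], checkpoint: Path) -> int:
    """Insert rows in order, skipping those recorded as done in the checkpoint."""
    done = load_count(checkpoint)
    for index, row in enumerate(rows):
        if index < done:
            continue  # already inserted during a previous run
        insert(row)  # may raise; the checkpoint then still points at this row
        done = index + 1
        checkpoint.write_text(str(done))  # record progress after each success
    return done
```

Under this assumption, deleting the checkpoint file resets the counter to 0, which corresponds to rerunning the bot from the beginning.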
DEPRECATED: Execute the make command below to generate the "is expressed in" relations for human and mouse genes
from EasyBgee v14.2:
make get_input_expression_data
The output file is placed at the file path defined in the EXPRESSION_CALLS_FILE variable.
Owner
- Name: BgeeDB
- Login: BgeeDB
- Kind: organization
- Website: https://bgee.org/
- Twitter: Bgeedb
- Repositories: 10
- Profile: https://github.com/BgeeDB
GitHub Events
Total
- Push event: 2
Last Year
- Push event: 2