aopwikirdf
This repository contains code for the AOP-Wiki XML-to-RDF conversion and guidance to deploy a Virtuoso SPARQL endpoint Docker image that is loaded with the AOP-Wiki RDF
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
✓Academic publication links
Links to: science.org, zenodo.org -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (12.9%) to scientific vocabulary
Repository
This repository contains code for the AOP-Wiki XML-to-RDF conversion and guidance to deploy a Virtuoso SPARQL endpoint Docker image that is loaded with the AOP-Wiki RDF
Basic Info
Statistics
- Stars: 9
- Watchers: 3
- Forks: 1
- Open Issues: 11
- Releases: 4
Metadata Files
README.md
AOP-Wiki XML to RDF Conversion Tool
A tool for converting AOP-Wiki XML data exports into RDF (Resource Description Framework) format. This repository contains the conversion notebook that transforms Adverse Outcome Pathway data into semantic web formats, along with automated workflows for data generation and quality control.
Features
- Automated XML-to-RDF conversion from AOP-Wiki XML exports
- Gene mapping integration using HGNC data and BridgeDb services
- Weekly automated updates via GitHub Actions
- Quality control validation of generated RDF files
- URI resolvability monitoring with automated badge generation
- SPARQL endpoint setup instructions for local data exploration
Generated Data
The conversion process produces three main RDF files:
- AOPWikiRDF.ttl: Main dataset containing AOP-Wiki data (AOPs, Key Events, Key Event Relationships, Chemical Stressors)
- AOPWikiRDF-Genes.ttl: Gene mapping extensions with approved HGNC symbols and database identifiers
- AOPWikiRDF-Void.ttl: VoID metadata describing the datasets
Quick Start
Prerequisites
bash
pip install -r requirements.txt
Run Conversion
bash
jupyter execute AOP-Wiki_XML_to_RDF_conversion.ipynb
Generate Statistics
bash
jupyter execute AOP-Wiki_stats.ipynb
Set up a Virtuoso SPARQL endpoint with AOP-Wiki RDF (on linux):
Step 1 - Create folder to mount
Enter the terminal and create a local folder to map to the docker container. Note the path to the folder to enter it at step 3. In this example, the folder '/aopwikirdf' was created and entered it by using:
mkdir -p aopwikirdf
Step 2 - Move the RDF (.ttl) files into the newly created folder
Step 3 - Run the Docker image
Be sure to use ports 8890:8890 and 1111:1111. In this case, the container was named "AOPWiki". Also, this step configures the mapped local folder with the data, which is in this example "/aopwikirdf". The Docker image used is openlink/virtuoso-opensource-7. Run the Docker image by entering:
sudo docker run -d --env DBA_PASSWORD=dba -p 8890:8890 -p 1111:1111 --name AOPWiki --volume `pwd`/aopwikirdf/:/database/data/ openlink/virtuoso-opensource-7
Step 4 - Enter the running container
The SPARQL endpoint should already be accessible through localhost:8890/sparql/. However, while the Docker image is running, the data is not yet loaded. Therefore you need to enter the it by using:
sudo docker exec -it AOPWiki bash
Step 5 - Move the .ttl files
First, enter the "/data" folder and move the Turtle file(s) to the folder upstream by using:
mv data/AOPWikiRDF.ttl .
mv data/AOPWikiRDF-Void.ttl .
mv data/AOPWikiRDF-Genes.ttl .
exit
Step 6 - Enter the container SQL and reset
Enter the running docker container SQL by using:
sudo docker exec -i AOPWiki isql 1111
In case the service is already active and contains older RDF, be sure to perform a global reset and delete the old RDF files from the loadlist, using the following commands:
```
RDFGLOBALRESET();
DELETE FROM loadlist WHERE llgraph = 'http://aopwiki.org/';
```
The presence of files in the loadlist can be viewed using the following command:
select * from DB.DBA.load_list;
Step 7 - Load the RDF
Use the following commands to complete the loading of RDF. If errors occur, try again within a few seconds (which often works), or look at http://docs.openlinksw.com/virtuoso/errorcodes/ to find out what they mean.
log_enable(2);
DB.DBA.XML_SET_NS_DECL ('dc', 'http://purl.org/dc/elements/1.1/',2);
DB.DBA.XML_SET_NS_DECL ('dcterms', 'http://purl.org/dc/terms/',2);
DB.DBA.XML_SET_NS_DECL ('rdfs', 'http://www.w3.org/2000/01/rdf-schema#',2);
DB.DBA.XML_SET_NS_DECL ('foaf', 'http://xmlns.com/foaf/0.1/',2);
DB.DBA.XML_SET_NS_DECL ('aop', 'https://identifiers.org/aop/',2);
DB.DBA.XML_SET_NS_DECL ('aop.events', 'https://identifiers.org/aop.events/',2);
DB.DBA.XML_SET_NS_DECL ('aop.relationships', 'https://identifiers.org/aop.relationships/',2);
DB.DBA.XML_SET_NS_DECL ('aop.stressor', 'https://identifiers.org/aop.stressor/',2);
DB.DBA.XML_SET_NS_DECL ('aopo', 'http://aopkb.org/aop_ontology#',2);
DB.DBA.XML_SET_NS_DECL ('cas', 'https://identifiers.org/cas/',2);
DB.DBA.XML_SET_NS_DECL ('inchikey', 'https://identifiers.org/inchikey/',2);
DB.DBA.XML_SET_NS_DECL ('pato', 'http://purl.obolibrary.org/obo/PATO_',2);
DB.DBA.XML_SET_NS_DECL ('ncbitaxon', 'http://purl.bioontology.org/ontology/NCBITAXON/',2);
DB.DBA.XML_SET_NS_DECL ('cl', 'http://purl.obolibrary.org/obo/CL_',2);
DB.DBA.XML_SET_NS_DECL ('uberon', 'http://purl.obolibrary.org/obo/UBERON_',2);
DB.DBA.XML_SET_NS_DECL ('go', 'http://purl.org/obo/owl/GO#',2);
DB.DBA.XML_SET_NS_DECL ('mi', 'http://purl.obolibrary.org/obo/MI_',2);
DB.DBA.XML_SET_NS_DECL ('mp', 'http://purl.obolibrary.org/obo/MP_',2);
DB.DBA.XML_SET_NS_DECL ('hp', 'http://purl.obolibrary.org/obo/HP_',2);
DB.DBA.XML_SET_NS_DECL ('pco', 'http://purl.obolibrary.org/obo/PCO_',2);
DB.DBA.XML_SET_NS_DECL ('nbo', 'http://purl.obolibrary.org/obo/NBO_',2);
DB.DBA.XML_SET_NS_DECL ('vt', 'http://purl.obolibrary.org/obo/VT_',2);
DB.DBA.XML_SET_NS_DECL ('pr', 'http://purl.obolibrary.org/obo/PR_',2);
DB.DBA.XML_SET_NS_DECL ('chebio', 'http://purl.obolibrary.org/obo/CHEBI_',2);
DB.DBA.XML_SET_NS_DECL ('fma', 'http://purl.org/sig/ont/fma/fma',2);
DB.DBA.XML_SET_NS_DECL ('cheminf', 'http://semanticscience.org/resource/CHEMINF_',2);
DB.DBA.XML_SET_NS_DECL ('ncit', 'http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#',2);
DB.DBA.XML_SET_NS_DECL ('comptox', 'https://comptox.epa.gov/dashboard/',2);
DB.DBA.XML_SET_NS_DECL ('mmo', 'http://purl.obolibrary.org/obo/MMO_',2);
DB.DBA.XML_SET_NS_DECL ('chebi', 'https://identifiers.org/chebi/',2);
DB.DBA.XML_SET_NS_DECL ('chemspider', 'https://identifiers.org/chemspider/',2);
DB.DBA.XML_SET_NS_DECL ('wikidata', 'https://identifiers.org/wikidata/',2);
DB.DBA.XML_SET_NS_DECL ('chembl.compound', 'https://identifiers.org/chembl.compound/',2);
DB.DBA.XML_SET_NS_DECL ('pubchem.compound', 'https://identifiers.org/pubchem.compound/',2);
DB.DBA.XML_SET_NS_DECL ('drugbank', 'https://identifiers.org/drugbank/',2);
DB.DBA.XML_SET_NS_DECL ('kegg.compound', 'https://identifiers.org/kegg.compound/',2);
DB.DBA.XML_SET_NS_DECL ('lipidmaps', 'https://identifiers.org/lipidmaps/',2);
DB.DBA.XML_SET_NS_DECL ('hmdb', 'https://identifiers.org/hmdb/',2);
DB.DBA.XML_SET_NS_DECL ('ensembl', 'https://identifiers.org/ensembl/',2);
DB.DBA.XML_SET_NS_DECL ('edam', 'http://edamontology.org/',2);
DB.DBA.XML_SET_NS_DECL ('hgnc', 'https://identifiers.org/hgnc/',2);
DB.DBA.XML_SET_NS_DECL ('ncbigene', 'https://identifiers.org/ncbigene/',2);
DB.DBA.XML_SET_NS_DECL ('uniprot', 'https://identifiers.org/uniprot/',2);
DB.DBA.XML_SET_NS_DECL ('void', 'http://rdfs.org/ns/void#',2);
DB.DBA.XML_SET_NS_DECL ('pav', 'http://purl.org/pav/',2);
DB.DBA.XML_SET_NS_DECL ('dcat', 'http://www.w3.org/ns/dcat#',2);
log_enable(1);
grant select on "DB.DBA.SPARQL_SINV_2" to "SPARQL";
grant execute on "DB.DBA.SPARQL_SINV_IMP" to "SPARQL";
ld_dir('data', 'AOPWikiRDF.ttl', 'http://aopwiki.org/');
ld_dir('data', 'AOPWikiRDF-Void.ttl', 'http://aopwiki.org/');
ld_dir('data', 'AOPWikiRDF-Genes.ttl', 'http://aopwiki.org/');
ld_dir('data', 'ServiceDescription.ttl', 'servicedescription');
To finalize the loading of data, use:
rdf_loader_run();
Check the status and look if the all.ttl file is loaded by entering:
select * from DB.DBA.load_list;
If the "il_state" = 2, the loading is complete. If issues occurred in this step, have a look at http://vos.openlinksw.com/owiki/wiki/VOS/VirtBulkRDFLoader.
Quit the SQL by entering:
quit;
Step 8 - Enter the Virtuoso service with loaded AOP-Wiki RDF
The container is running with loaded RDF, available through http://localhost:8890, or enter the SPARQL endpoint directly through http://localhost:8890/sparql/.
Owner
- Name: Marvin Martens
- Login: marvinm2
- Kind: user
- Company: Maastricht University
- Repositories: 7
- Profile: https://github.com/marvinm2
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: Martens
given-names: Marvin
title: AOP-Wiki XML to RDF conversion
version: v1.3
date-released: 2020-11-26
url: "https://github.com/marvinm2/AOPWikiRDF"
preferred-citation:
type: article
authors:
- family-names: Martens
given-names: M.
- family-names: Evelo
given-names: Chris T.
- family-names: Willighagen
given-names: Egon L.
title: "Providing Adverse Outcome Pathways from the AOP-Wiki in a Semantic Web Format to Increase Usability and Accessibility of the Content"
year: 2022
month: 2
day: 25
journal: Applied In Vitro Toxicology
doi: 10.1089/aivt.2021.0010
url: https://doi.org/10.1089/aivt.2021.0010
GitHub Events
Total
- Issues event: 36
- Watch event: 2
- Issue comment event: 31
- Push event: 92
- Pull request event: 2
- Create event: 1
Last Year
- Issues event: 36
- Watch event: 2
- Issue comment event: 31
- Push event: 92
- Pull request event: 2
- Create event: 1
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 29
- Total pull requests: 1
- Average time to close issues: 6 months
- Average time to close pull requests: less than a minute
- Total issue authors: 2
- Total pull request authors: 1
- Average comments per issue: 0.66
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 26
- Pull requests: 1
- Average time to close issues: 1 day
- Average time to close pull requests: less than a minute
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 0.5
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- marvinm2 (31)
- jamesamcl (3)
- egonw (1)
- egmnb (1)
Pull Request Authors
- marvinm2 (2)