Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.1%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: sib-swiss
  • License: other
  • Language: Java
  • Default Branch: master
  • Size: 1.21 MB
Statistics
  • Stars: 0
  • Watchers: 3
  • Forks: 1
  • Open Issues: 1
  • Releases: 19
Created 10 months ago · Last pushed 6 months ago
Metadata Files
Readme License Citation

README.md

SPARQL examples testing and conversion utilities

This is a set of tools to convert, test and help maintain SPARQL query examples.

The code comes from the SIB SPARQL Examples project.

For this code to work, each SPARQL query must be in its own Turtle file. We use the following ontologies for the basic concepts:

  • SHACL for the relation to the text of the Select/Ask queries, and declaring prefixes
  • RDFS for comments and labels as shown in the user interfaces
  • RDF for basic type relations
  • schema.org for the target SPARQL endpoint and tagging relevant keywords

The following example retrieves all taxa from the UniProt SPARQL endpoint.

```sparql
prefix ex: <https://sparql.uniprot.org/.well-known/sparql-examples/> # <!-- change per dataset
prefix sh: <http://www.w3.org/ns/shacl#>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix schema: <https://schema.org/>
ex:1 # <!-- UniProt, Rhea and Swiss-Lipids are numbered but this can be anything.
    a sh:SPARQLSelectExecutable, sh:SPARQLExecutable ;
    sh:prefixes _:sparql_examples_prefixes ; # <!-- required for the import of the prefix declarations. Note the blank node
    rdfs:comment """A comment
    May have HTML in them. Example: Select all taxa from the UniProt taxonomy"""^^rdf:HTML ;
    sh:select """PREFIX up: <http://purl.uniprot.org/core/>

SELECT ?taxon
FROM <http://sparql.uniprot.org/taxonomy>
WHERE
{
    ?taxon a up:Taxon .
}""" ;
    schema:target <https://sparql.uniprot.org/sparql/> ;
    schema:keywords "taxa".
```

To add a label to a query, use the schema.org keywords property (schema:keywords).
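As an illustration of the layout described above, the following Python sketch renders one SELECT example as Turtle text. It is not part of these utilities: the function names `turtle_string` and `example_turtle` are hypothetical, and the `schema:` prefix IRI and the blank node label `_:sparql_examples_prefixes` are assumptions. It also assumes at least one keyword is given.

```python
def turtle_string(text: str) -> str:
    """Quote text as a Turtle long string, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"""{escaped}"""'


def example_turtle(base_iri, local_id, comment, select_query, target, keywords):
    """Render one SELECT example as Turtle (illustrative sketch only)."""
    # Assumes a non-empty list of keywords.
    keyword_list = ", ".join(turtle_string(k) for k in keywords)
    lines = [
        f"prefix ex: <{base_iri}>",
        "prefix sh: <http://www.w3.org/ns/shacl#>",
        "prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>",
        "prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "prefix schema: <https://schema.org/>",  # assumed prefix IRI
        f"ex:{local_id} a sh:SPARQLSelectExecutable, sh:SPARQLExecutable ;",
        "    sh:prefixes _:sparql_examples_prefixes ;",  # assumed blank node label
        f"    rdfs:comment {turtle_string(comment)}^^rdf:HTML ;",
        f"    sh:select {turtle_string(select_query)} ;",
        f"    schema:target <{target}> ;",
        f"    schema:keywords {keyword_list} .",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(example_turtle(
        "https://example.org/.well-known/sparql-examples/",  # placeholder base IRI
        "1",
        "Select all taxa from the UniProt taxonomy",
        "PREFIX up: <http://purl.uniprot.org/core/>\nSELECT ?taxon WHERE { ?taxon a up:Taxon . }",
        "https://sparql.uniprot.org/sparql/",
        ["taxa"],
    ))
```

Writing the file by hand works just as well; the sketch only makes the required properties explicit.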

Running the code

If using a release:

```bash
mkdir target
wget "https://github.com/sib-swiss/sparql-examples-utils/releases/download/v2.0.10/sparql-examples-utils-2.0.10-uber.jar"
mv sparql-examples-utils-2.0.10-uber.jar target
# The target directory exists only so that the commands in the examples match a local build.
# Otherwise, target/ can be removed from all examples below.
```

If building locally in this git repository:

```bash
mvn package
```

Quality Assurance (QA)

To test your examples, pass the directory containing them to the test subcommand with the --input-directory argument, e.g.:

```bash
java -jar target/sparql-examples-utils-*-uber.jar test --input-directory=../sparql-examples/examples
```

The queries can also be executed automatically on all endpoints they apply to by passing the extra argument --also-run-slow-tests to the test subcommand. This modifies the queries: a LIMIT 1 is added if no limit was set in the query.

```bash
java -jar target/sparql-examples-utils-*-uber.jar test --input-directory=../sparql-examples/examples -p MetaNetX --also-run-slow-tests
```
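To make the LIMIT 1 behaviour concrete, here is a minimal Python sketch of the idea. It is not the tool's implementation: the function name `ensure_limit` is hypothetical, and the check is a simple regular expression that only looks for a LIMIT clause at the very end of the query text, so it ignores limits inside subqueries and does not parse the query.

```python
import re

# Matches a LIMIT clause, optionally followed by OFFSET, at the very end of the query.
_TRAILING_LIMIT = re.compile(r"\bLIMIT\s+\d+(\s+OFFSET\s+\d+)?\s*$", re.IGNORECASE)


def ensure_limit(query: str, limit: int = 1) -> str:
    """Return the query unchanged if it already ends with a LIMIT, else append one."""
    stripped = query.rstrip()
    if _TRAILING_LIMIT.search(stripped):
        return stripped
    return f"{stripped}\nLIMIT {limit}"


if __name__ == "__main__":
    print(ensure_limit("SELECT ?taxon WHERE { ?taxon a ?type }"))
    print(ensure_limit("SELECT ?taxon WHERE { ?taxon a ?type } LIMIT 10"))
```

Keeping the result set to a single row is enough to check that a query parses and runs on the endpoint without paying for the full result.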

[!NOTE]

All CLI commands provided in this readme expect that you have the sparql-examples folder cloned alongside this sparql-examples-utils project folder, in the same parent directory. Feel free to change them for your own example folder and path.

Conversion for upload in SPARQL endpoint

Before loading the examples into a SPARQL endpoint they should be concatenated into one file, including the prefixes/namespaces definitions.

Compile all query files for a specific example subfolder, into a local turtle file:

```bash
java -jar target/sparql-examples-utils-*-uber.jar convert -i ../sparql-examples/examples -p Bgee -f ttl > examples_Bgee.ttl
```

Or compile all subfolders as JSON-LD to the standard output:

```bash
java -jar target/sparql-examples-utils-*-uber.jar convert -i ../sparql-examples/examples -p all -f jsonld
```

Conversion to RQ files

For easier use by other tools, we can also generate .rq files. These follow the syntax of grlc.io, which allows the queries to be used as HTTP APIs.

```bash
java -jar target/sparql-examples-utils-*-uber.jar convert -i ../sparql-examples/examples -p all -r
```

Conversion from RQ files

If you already have a set of SPARQL examples in '*.rq' files, you can try to import them with:

```bash
java -jar target/sparql-examples-utils-*-uber.jar import-rq -i ../${DIRECTORY_WITH_EXAMPLES_IN_RQ_FILES} -b ${BASE_IRI}
```

This attempts to extract metadata as expressed using the grlc.io approach. Prefixes are collected into a single map, which might lead to issues if the same prefix is bound to different namespaces within the set.

The base IRI should be the space where you will store the examples and where they can be dereferenced.
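The prefix issue mentioned above can be checked before importing. The following Python sketch is an illustration only, not part of these utilities: the function name `collect_prefixes` is hypothetical, and it finds PREFIX declarations with a regular expression instead of a SPARQL parser. It gathers all prefixes into one map, keeps the first IRI seen for each prefix, and reports the prefixes that are bound to more than one IRI.

```python
import re

# Matches "PREFIX name: <iri>" at the start of a line; the prefix name may be empty.
PREFIX_DECL = re.compile(
    r"^\s*PREFIX\s+([A-Za-z][\w.-]*)?:\s*<([^>]*)>",
    re.IGNORECASE | re.MULTILINE,
)


def collect_prefixes(queries):
    """Return (prefix map, conflicts) for an iterable of SPARQL query strings."""
    prefixes = {}
    conflicts = {}
    for query in queries:
        for name, iri in PREFIX_DECL.findall(query):
            known = prefixes.setdefault(name, iri)  # first binding wins
            if known != iri:
                conflicts.setdefault(name, {known}).add(iri)
    return prefixes, conflicts


if __name__ == "__main__":
    q1 = "PREFIX up: <http://purl.uniprot.org/core/>\nSELECT * WHERE { ?s a up:Taxon }"
    q2 = "PREFIX up: <http://example.org/up#>\nASK { ?s a up:Thing }"
    print(collect_prefixes([q1, q2])[1])
```

Any prefix that shows up in the conflicts is worth renaming in the .rq files before running the import.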

Generate markdown file

Generate markdown files with the query and a mermaid diagram of the queries, to be used to deploy a static website for the query examples.

```bash
java -jar target/sparql-examples-utils-*-uber.jar convert -i ../sparql-examples/examples -m
```

Querying for queries

As the SPARQL examples are themselves RDF, they can be queried for as soon as they are loaded in a SPARQL endpoint.

```sparql
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX spex: <https://purl.expasy.org/sparql-examples/ontology#>

SELECT DISTINCT ?sq ?comment ?query
WHERE {
  ?sq a sh:SPARQLExecutable ;
    rdfs:comment ?comment ;
    sh:select|sh:ask|sh:construct|spex:describe ?query .
}
ORDER BY ?sq
```
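A query like this can be sent to an endpoint over HTTP following the SPARQL 1.1 Protocol, which accepts the query text in a `query` parameter of a GET request. The Python sketch below only builds such a request with the standard library; the function name `build_sparql_request` and the endpoint URL are placeholders, not something these utilities provide.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    example_query = """
    PREFIX sh: <http://www.w3.org/ns/shacl#>
    SELECT DISTINCT ?sq WHERE { ?sq a sh:SPARQLExecutable } ORDER BY ?sq
    """
    # Placeholder endpoint; replace it with the endpoint holding your examples.
    request = build_sparql_request("https://example.org/sparql", example_query)
    print(request.full_url)
```

Passing the request to `urllib.request.urlopen` and reading the response with `json.load` would then give the result bindings, assuming the endpoint supports the JSON results format.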

Finding queries that run on more than one endpoint

```bash
java -jar target/sparql-examples-utils-*-uber.jar convert --input-directory ../sparql-examples/examples > examples_all.ttl

sparql --data examples_all.ttl "SELECT ?query (GROUP_CONCAT(?target ; separator=', ') AS ?targets)
WHERE {
  ?query <https://schema.org/target> ?target
}
GROUP BY ?query
HAVING (COUNT(DISTINCT ?target) > 1)
"
```
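The grouping done by this query can also be expressed outside SPARQL. The Python sketch below assumes the (query, target) pairs have already been extracted, for instance from the results of a simpler SELECT; the function name `queries_with_multiple_targets` and the sample IRIs are hypothetical.

```python
from collections import defaultdict


def queries_with_multiple_targets(pairs):
    """Map each query IRI to its sorted targets, keeping only queries with more than one."""
    targets = defaultdict(set)
    for query_iri, target in pairs:
        targets[query_iri].add(target)  # a set removes duplicate targets
    return {q: sorted(t) for q, t in targets.items() if len(t) > 1}


if __name__ == "__main__":
    sample = [
        ("https://example.org/examples/1", "https://sparql.uniprot.org/sparql/"),
        ("https://example.org/examples/1", "https://example.org/other/sparql"),
        ("https://example.org/examples/2", "https://sparql.uniprot.org/sparql/"),
    ]
    print(queries_with_multiple_targets(sample))
```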

Native executable

The project is ready to be compiled by the native-image tool of the GraalVM project.

To do so, set your JAVA_HOME to the GraalVM location and run mvn package -Pnative.

For example, on Linux with GraalVM installed in your $HOME/bin directory:

```bash
export JAVA_HOME=$HOME/bin/graalvm-jdk-${GRAALVM_VERSION}
mvn package -Pnative
# There is now a local native binary that you can use to convert all entries
./target/sparql-examples-utils converter -i ${DIRECTORY_WITH_EXAMPLES}
```

How to cite this work

If you reuse any part of this work, please cite the arXiv paper:

    @misc{largecollectionsparqlquestionquery,
      title={A large collection of bioinformatics question-query pairs over federated knowledge graphs: methodology and applications},
      author={Jerven Bolleman and Vincent Emonet and Adrian Altenhoff and Amos Bairoch and Marie-Claude Blatter and Alan Bridge and Severine Duvaud and Elisabeth Gasteiger and Dmitry Kuznetsov and Sebastien Moretti and Pierre-Andre Michel and Anne Morgat and Marco Pagni and Nicole Redaschi and Monique Zahn-Zabal and Tarcisio Mendes de Farias and Ana Claudia Sima},
      year={2024},
      doi={10.48550/arXiv.2410.06010},
      eprint={2410.06010},
      archivePrefix={arXiv},
      primaryClass={cs.DB},
      url={https://arxiv.org/abs/2410.06010},
    }

Extracting queries from wikibase

These utils have a mode to extract queries from wikibase instances.

For example, to extract the queries from FactGrid into a separate example directory:

```sh
java -jar target/sparql-examples-utils-*-uber.jar \
  wikibase \
  -e YOUR@EMAIL.real \
  -o $HOME/git/wikibase-sparql-examples/examples/FactGrid \
  -s https://database.factgrid.de/sparql \
  -u https://database.factgrid.de/
```

Next release

Make sure JavaDoc has no ERRORs:

    mvn org.apache.maven.plugins:maven-javadoc-plugin:3.11.2:jar

Prepare the release; remember that the tag starts with 'v', e.g. 'v2.0.12':

    mvn release:prepare
    mvn release:perform

Owner

  • Name: SIB Swiss Institute of Bioinformatics
  • Login: sib-swiss
  • Kind: organization
  • Location: Switzerland

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "A large collection of bioinformatics question-query pairs over federated knowledge graphs: methodology and applications"
repository-code: https://github.com/sib-swiss/sparql-examples-utils
date-released: 2024-10-08
doi: 10.1093/gigascience/giaf045
license: MIT
authors:
  - given-names: Jerven
    family-names: Bolleman
    orcid: https://orcid.org/0000-0002-7449-1266
    email: Jerven.Bolleman@sib.swiss
    affiliation: SIB Swiss Institute of Bioinformatics
  - given-names: Vincent
    family-names: Emonet
    orcid: https://orcid.org/0000-0002-1501-1082
    affiliation: SIB Swiss Institute of Bioinformatics
  - given-names: Adrian
    family-names: Altenhoff
    affiliation: SIB Swiss Institute of Bioinformatics
    orcid: https://orcid.org/0000-0001-7492-1273
  - given-names: Amos
    family-names: Bairoch
    affiliation: SIB Swiss Institute of Bioinformatics
    orcid: https://orcid.org/0000-0003-2826-6444
  - given-names: Marie-Claude
    family-names: Blatter
    affiliation: SIB Swiss Institute of Bioinformatics
    orcid: https://orcid.org/0000-0002-7474-1499 
  - given-names: Alan
    family-names: Bridge
    affiliation: SIB Swiss Institute of Bioinformatics
    orcid: https://orcid.org/0000-0003-2148-9135 
  - given-names: Severine
    family-names: Duvaud
    orcid: https://orcid.org/0000-0001-7892-9678
    affiliation: SIB Swiss Institute of Bioinformatics
  - given-names: Elisabeth
    family-names: Gasteiger
    affiliation: SIB Swiss Institute of Bioinformatics
    orcid: https://orcid.org/0000-0003-1829-162X 
  - given-names: Dmitry
    family-names: Kuznetsov
    affiliation: SIB Swiss Institute of Bioinformatics
    orcid: https://orcid.org/0000-0002-9972-947X
  - given-names: Sebastien
    family-names: Moretti
    affiliation: SIB Swiss Institute of Bioinformatics
    orcid: https://orcid.org/0000-0003-3947-488X
  - given-names: Pierre-Andre
    family-names: Michel
    affiliation: SIB Swiss Institute of Bioinformatics
    orcid: https://orcid.org/0000-0002-7023-1045 
  - given-names: Anne
    family-names: Morgat
    affiliation: SIB Swiss Institute of Bioinformatics
    orcid: https://orcid.org/0000-0002-1216-2969 
  - given-names: Marco
    family-names: Pagni
    affiliation: SIB Swiss Institute of Bioinformatics
    orcid: https://orcid.org/0000-0001-9292-9463 
  - given-names: Nicole
    family-names: Redaschi
    affiliation: SIB Swiss Institute of Bioinformatics
    orcid: https://orcid.org/0000-0001-8890-2268 
  - given-names: Monique
    family-names: Zahn-Zabal
    affiliation: SIB Swiss Institute of Bioinformatics
    orcid: https://orcid.org/0000-0001-7961-6091
  - given-names: Tarcisio
    family-names: Mendes de Farias
    orcid: https://orcid.org/0000-0002-3175-5372
    affiliation: SIB Swiss Institute of Bioinformatics
  - given-names: Ana Claudia
    family-names: Sima
    orcid: https://orcid.org/0000-0003-3213-4495
    affiliation: SIB Swiss Institute of Bioinformatics

GitHub Events

Total
  • Create event: 12
  • Release event: 18
  • Issues event: 8
  • Watch event: 3
  • Issue comment event: 13
  • Public event: 1
  • Push event: 29
  • Pull request event: 14
  • Fork event: 4
Last Year
  • Create event: 12
  • Release event: 18
  • Issues event: 8
  • Watch event: 3
  • Issue comment event: 13
  • Public event: 1
  • Push event: 29
  • Pull request event: 14
  • Fork event: 4

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 2
  • Total pull requests: 6
  • Average time to close issues: 2 months
  • Average time to close pull requests: about 2 months
  • Total issue authors: 2
  • Total pull request authors: 2
  • Average comments per issue: 1.5
  • Average comments per pull request: 1.17
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 6
  • Average time to close issues: 2 months
  • Average time to close pull requests: about 2 months
  • Issue authors: 2
  • Pull request authors: 2
  • Average comments per issue: 1.5
  • Average comments per pull request: 1.17
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • JervenBolleman (4)
  • VladimirAlexiev (4)
  • egonw (2)
  • vemonet (2)
  • Adafede (2)
Pull Request Authors
  • egonw (6)
  • JervenBolleman (4)
  • Adafede (2)
  • vemonet (2)
Top Labels
Issue Labels
query-fixing (1)
Pull Request Labels

Dependencies

.github/workflows/check.yml actions
  • actions/checkout v4 composite
  • actions/setup-java v4 composite
pom.xml maven
  • ch.qos.reload4j:reload4j 1.2.25
  • com.blazegraph:bigdata-core 2.1.4
  • info.picocli:picocli 4.7.6
  • info.picocli:picocli-codegen 4.7.6
  • org.apache.jena:jena-arq 5.0.0
  • org.eclipse.rdf4j:rdf4j-model
  • org.eclipse.rdf4j:rdf4j-model-api
  • org.eclipse.rdf4j:rdf4j-model-vocabulary
  • org.eclipse.rdf4j:rdf4j-repository-api
  • org.eclipse.rdf4j:rdf4j-repository-sail
  • org.eclipse.rdf4j:rdf4j-rio-jsonld
  • org.eclipse.rdf4j:rdf4j-rio-rdfxml
  • org.eclipse.rdf4j:rdf4j-rio-turtle
  • org.eclipse.rdf4j:rdf4j-sail-memory
  • org.eclipse.rdf4j:rdf4j-sail-model
  • org.eclipse.rdf4j:rdf4j-shacl
  • org.eclipse.rdf4j:rdf4j-sparqlbuilder
  • org.jsoup:jsoup 1.17.2
  • org.junit.jupiter:junit-jupiter
  • org.junit.platform:junit-platform-console
  • org.slf4j:slf4j-log4j12 1.7.36