oeg-software-graph
Knowledge graph containing the catalog of software from the oeg-upm organization in GitHub
Statistics
- Stars: 2
- Watchers: 5
- Forks: 0
- Open Issues: 1
- Releases: 1
Metadata Files
README.md
oeg-software-graph
Description
This repository contains the resources used to build a knowledge graph cataloguing the software of the oeg-upm organization on GitHub (last executed in June 2023). The source data is generated with the SOftware Metadata Extraction Framework (SOMEF), which extracts the relevant information about each repository from README files and saves it as JSON files. A knowledge graph based on the Software Description Ontology (SDO) is then created using RML-star mappings. Finally, the resulting knowledge graph is queried to assess the adoption of a set of representative best practices for research software publishing.
Disclaimer: this repository is a demonstration accepted at the Posters and Demo Track of the SEMANTiCS 2023 conference.
Structure
This repository is organized as follows:
* data/ contains the input JSON file aggregating the metadata extracted from all repositories of the oeg-upm GitHub organization (somef.json), along with the resulting knowledge graph (somef-kg.nq)
* mappings/ contains the RML-star mappings needed to construct the knowledge graph from the JSON file
* notebooks/ contains two notebooks: i) one for generating the JSON files and ii) one for constructing and querying the knowledge graph
* best-practices-requirements/ describes the set of representative best practices that are assessed in the repositories represented in the knowledge graph
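As a rough illustration of what can be done with data/somef.json before any graph is built, the sketch below counts how many repositories report each metadata category. The layout assumed here (a JSON array, or an object whose values are one dictionary per repository, keyed by category names) is an assumption and is not confirmed by this README; the helper names `load_records` and `category_coverage` are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path


def load_records(path):
    """Load per-repository metadata dictionaries from a JSON file.

    Assumption: the file holds either a list of dictionaries or a
    dictionary mapping repository names to dictionaries.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        data = list(data.values())
    return [entry for entry in data if isinstance(entry, dict)]


def category_coverage(records):
    """Count, per metadata category, how many repositories give a non-empty value."""
    coverage = Counter()
    for record in records:
        for category, value in record.items():
            if value:  # skip empty lists, empty strings and None
                coverage[category] += 1
    return coverage


if __name__ == "__main__":
    records = load_records("data/somef.json")
    total = len(records)
    for category, count in category_coverage(records).most_common():
        print(f"{category}: {count}/{total} repositories")
```

The notebooks in this repository perform the actual assessment through queries over the knowledge graph; this snippet only gives a quick view of the raw input.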
Installation
This pipeline has been tested with Python 3.9.
To run the pipeline, first install Jupyter Notebook:
pip install notebook
Then, install the requirements of the project. Creating two separate environments is highly recommended (one for extraction, another for construction and querying), since the libraries used for extracting metadata and for creating the knowledge graph have different dependencies. To install the extraction requirements, run:
pip install -r requirements_extraction.txt
To install the construction and querying requirements, run:
pip install -r requirements.txt
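One possible way to set up the two recommended environments is sketched below using Python's standard `venv` module. The environment directory names (`.venv-extraction`, `.venv-kg`) are hypothetical; only the requirements file names come from this README. Run it with the Python interpreter you intend to use (the pipeline was tested with Python 3.9).

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical environment names; the requirements files are the ones named above.
ENVIRONMENTS = {
    ".venv-extraction": "requirements_extraction.txt",
    ".venv-kg": "requirements.txt",
}


def env_python(env_dir):
    """Return the path of the interpreter inside a virtual environment."""
    subdir = "Scripts" if sys.platform == "win32" else "bin"
    exe = "python.exe" if sys.platform == "win32" else "python"
    return Path(env_dir) / subdir / exe


def setup_commands(env_dir, requirements):
    """Build the commands that create one environment and install into it."""
    python = str(env_python(env_dir))
    return [
        [sys.executable, "-m", "venv", str(env_dir)],
        [python, "-m", "pip", "install", "notebook"],
        [python, "-m", "pip", "install", "-r", requirements],
    ]


if __name__ == "__main__":
    for env_dir, requirements in ENVIRONMENTS.items():
        for command in setup_commands(env_dir, requirements):
            subprocess.run(command, check=True)  # stop at the first failure
```

Keeping the two installs in separate environments means that a dependency pinned for extraction cannot override one needed for construction and querying.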
Finally, start Jupyter Notebook and run the notebooks in the notebooks/ folder.
Requirements:
Our pipeline uses the somef, yatter, morph-kgc and pyoxigraph packages. For the versions used, see requirements.txt (construction and querying) and requirements_extraction.txt (which installs somef).
If you want to experiment with SPARQL queries, run the construction and querying notebook, which guides you through the knowledge graph creation and querying process.
Click the Binder button to open a pre-loaded notebook for testing (it may take a few minutes to load).
Citation
If you use this work, please cite our software as follows:
@article{iglesias2023towards,
  title = {Towards Assessing FAIR Research Software Best Practices in an Organization Using RDF-star},
  author = {Iglesias-Molina, Ana and Garijo, Daniel},
  year = {2023},
  booktitle = {Proceedings of the Posters and Demo Track of the 19th International Conference on Semantic Systems co-located with 19th International Conference on Semantic Systems (SEMANTiCS 2023)},
  publisher = {CEUR-WS.org},
  series = {{CEUR} Workshop Proceedings},
  volume = {3526},
  url = {https://ceur-ws.org/Vol-3526/paper-09.pdf}
}
Authors
Owner
- Name: Ontology Engineering Group (UPM)
- Login: oeg-upm
- Kind: organization
- Email: oeg-dev@delicias.dia.fi.upm.es
- Location: Boadilla del Monte, Madrid, Spain
- Website: https://oeg.fi.upm.es/
- Repositories: 294
- Profile: https://github.com/oeg-upm
Citation (CITATION.cff)
cff-version: 1.2.1
message: "If you use this software, please cite both the article from preferred-citation and the software itself."
title: "OEG Software Graph"
license: Apache-2.0
authors:
  - family-names: Iglesias-Molina
    given-names: Ana
    orcid: "https://orcid.org/0000-0001-5375-8024"
  - family-names: Garijo
    given-names: Daniel
    orcid: "http://orcid.org/0000-0003-0454-7145"
doi: 10.5281/zenodo.8114677
version: 1.0.0
preferred-citation:
  authors:
    - family-names: Iglesias-Molina
      given-names: Ana
      orcid: "https://orcid.org/0000-0001-5375-8024"
    - family-names: Garijo
      given-names: Daniel
      orcid: "http://orcid.org/0000-0003-0454-7145"
  title: "Towards Assessing FAIR Research Software Best Practices in an Organization Using RDF-star"
  booktitle: "Proceedings of the Posters and Demo Track of the 19th International Conference on Semantic Systems co-located with 19th International Conference on Semantic Systems (SEMANTiCS 2023)"
  publisher: "CEUR-WS.org"
  type: article
  year: 2023
  url: https://ceur-ws.org/Vol-3526/paper-09.pdf