https://github.com/arenas-guerrero-julian/semantic-python-overview

(subjective) overview of projects which are related both to python and semantic technologies (RDF, OWL, Reasoning, ...)

Basic Info
  • Host: GitHub
  • Owner: arenas-guerrero-julian
  • License: cc0-1.0
  • Default Branch: main
  • Homepage:
  • Size: 101 KB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of pysemtec/semantic-python-overview
Created about 1 year ago · Last pushed about 1 year ago

https://github.com/arenas-guerrero-julian/semantic-python-overview/blob/main/

[![join community](https://pysemtec.org/img/join-community.svg "join community")](https://pysemtec.org)
# Semantic Python Overview

This repository aims to collect and curate a list of projects which are related both to Python and semantic technologies (RDF, OWL, SPARQL, Reasoning, ...). It is inspired by collections like [awesome lists](https://github.com/sindresorhus/awesome#readme). The list might be incomplete and biased, due to the limited knowledge of its authors. Improvements are very welcome. Feel free to file an issue or a pull request. Every section is alphabetically sorted.

Furthermore, this repository might serve as a **crystallization point for a community** interested in such projects and how they might productively interact. See [this discussion](https://github.com/cknoll/semantic-python-overview/discussions/1) for more information.


## Established Projects

- [Bioregistry](https://github.com/biopragmatics/bioregistry) - The Bioregistry
  - docs: https://bioregistry.readthedocs.io
  - website: https://bioregistry.io/
  - features:
    - Open source (and CC 0) repository of prefixes, their associated metadata, and mappings to external registries' prefixes
    - Standardization of prefixes and CURIEs
    - Interconversion between CURIEs and IRIs
    - Generation of context-specific prefix maps for usage in RDF, LinkML, SSSOM, OWL, etc.
- [brickschema](https://github.com/BrickSchema/py-brickschema) - Brick Ontology Python package
    - Brick is an open-source effort to standardize semantic descriptions of the physical, logical and virtual assets in buildings and the relationships between them.
    - docs: https://brickschema.readthedocs.io/en/latest/
    - website: https://brickschema.org/
    - features:
        - basic inference with different reasoners
        - web based interaction (by means of [Yasgui](https://github.com/TriplyDB/Yasgui))
        - Translations from different formats (Haystack, VBIS)
- [Cooking with Python and KBpedia](https://www.mkbergman.com/cooking-with-python-and-kbpedia/)
    - Tutorial series on "how to pick tools and then use Python for using and manipulating the KBpedia knowledge graph"
    - [Material in form of Jupyter Notebooks](https://github.com/Cognonto/CWPK)
    - accompanying Python package [cowpoke](https://github.com/Cognonto/cowpoke)
- [CubicWeb](https://www.cubicweb.org/) - a framework to build semantic web applications
  - website: https://www.cubicweb.org
  - docs: https://cubicweb.readthedocs.io/en/latest/
  - features:
    - An engine driven by the explicit data model of the application
    - RQL, an intuitive query language close to the business vocabulary
    - An architecture that separates data selection and visualisation
    - Data security by design
    - An efficient data storage

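The CURIE/IRI interconversion listed under the Bioregistry entry can be illustrated with a minimal, standard-library-only sketch. The prefix map and the helper names `expand` and `compress` are assumptions made for illustration; they are not the Bioregistry API, and a real prefix map would come from a registry.

```python
# Hypothetical prefix map for illustration only; not loaded from any registry.
PREFIX_MAP = {
    "chebi": "http://purl.obolibrary.org/obo/CHEBI_",
    "go": "http://purl.obolibrary.org/obo/GO_",
}


def expand(curie, prefix_map=PREFIX_MAP):
    """Turn a CURIE such as 'go:0008150' into a full IRI."""
    prefix, _, local_id = curie.partition(":")
    prefix = prefix.lower()  # simplification: treat prefixes case-insensitively
    if prefix not in prefix_map or not local_id:
        raise ValueError(f"cannot expand {curie!r}")
    return prefix_map[prefix] + local_id


def compress(iri, prefix_map=PREFIX_MAP):
    """Turn a full IRI back into a CURIE, preferring the longest URI prefix."""
    matches = [(uri, prefix) for prefix, uri in prefix_map.items() if iri.startswith(uri)]
    if not matches:
        raise ValueError(f"no known prefix for {iri!r}")
    uri, prefix = max(matches, key=lambda match: len(match[0]))
    return f"{prefix}:{iri[len(uri):]}"


iri = expand("GO:0008150")
curie = compress(iri)
```

Choosing the longest matching URI prefix in `compress` keeps the result unambiguous when one namespace is nested inside another.
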
- [Eddy](https://github.com/obdasystems/eddy) - graphical ontology editor
  - website: https://www.obdasystems.com/eddy
  - features:
    - graphical ontology editing
    - uses bespoke Graphol format but has an OWL2 export
    - visualization built on PyQt5
  - literature references:
    - [*Lembo, D and Pantaleone, D and Santarelli, V and Savo, DF: **Eddy: A Graphical Editor for OWL 2 Ontologies**. IJCAI 2016; 4252-4253*](https://cs.unibg.it/savo/papers/LPSS-IJCAI-16.pdf)
- [fastobo-py](https://github.com/fastobo/fastobo-py): Python bindings for *fastobo* (rust library to parse OBO 1.4)
    - features:
        - load, edit and serialize ontologies in the OBO 1.4 format
- [FunOwl](https://github.com/hsolbrig/funowl) - functional OWL syntax for Python
  - features:
    - provide a pythonic API that follows the OWL functional model for constructing OWL
- [Gastrodon](https://github.com/paulhoule/gastrodon) - puts RDF data at your fingertips in Pandas; gateway to matplotlib, scikit-learn and other visualization tools.
  - features:
    - interpolate variables into SPARQL queries
    - access local RDFlib graphs and remote SPARQL protocol endpoints
    - convert SPARQL result set to pandas dataframes
    - understandable error messages
    - input/output graphs in Turtle form
    - conversion between RDF collections and Python collections
    - Sphinx domain to incorporate RDF data into documentation
- [gizmos](https://github.com/ontodev/gizmos) - Utilities for ontology development
    - features:
        - modules for "export", "extract", "tree"-rendering
- [Jabberwocky](https://github.com/sap218/jabberwocky) - a toolkit for ontologies
    - features:
        - text mining using an ontology's terms and synonyms
        - tf-idf for synonym curation then adding those synonyms into an ontology
- [kglab](https://github.com/DerwenAI/kglab) - Graph Data Science
    - docs: https://derwen.ai/docs/kgl/
    - tutorial: https://derwen.ai/docs/kgl/tutorial/
    - features:
        - an abstraction layer in Python for building knowledge graphs, integrated with popular graph libraries
        - perspective: there are several "camps" of graph technologies, with little discussion between them
        - focus on supporting "Hybrid AI" approaches that combine two or more graph technologies with other ML work
        - PyData stack (e.g., Pandas, scikit-learn) allows for graph work within data science workflows
        - scale-out tools (e.g., RAPIDS, Arrow/Parquet, Dask) provide for scaling graph computation (not necessarily databases)
        - graph algorithm libraries include NetworkX, iGraph, cuGraph, plus related visualization libraries in PyVis, Cairo, etc.
        - W3C libraries in Python also lacked full integration: RDFlib, pySHACL, OWL-RL, etc.
        - pslpython provides for _probabilistic soft logic_, working with uncertainty in probabilistic graphs
        - additional integration paths and examples show how to work with deep learning (PyG)
        - import paths from graph databases, such as Neo4j
        - import paths from note-taking tools, such as Roam Research
        - usage in [MkRefs](https://github.com/DerwenAI/mkrefs) to add semantic features into MkDocs so that open source projects can federate bibliographies, shared glossaries, etc.
        - kglab team provides hands-on workshops at technology conferences for people to gain experience with these different graph approaches
- [KGX](https://github.com/biolink/kgx) - Library for building and exchanging knowledge graphs
    - docs: https://kgx.readthedocs.io/
    - features:
        - Load graphs into an in-memory model to facilitate data integration, validation, and graph operations
        - Provides an easy way to bring data into Biolink Model, a high-level data model for biomedical knowledge graphs
        - The core data structure is a Property Graph (PG), represented internally using a `networkx.MultiDiGraph`
        - Supports various input and output formats, including:
            - RDF serializations
            - SPARQL endpoints
            - Neo4j endpoints
            - CSV/TSV and JSON
            - OWL
            - OBOGraph JSON format
            - SSSOM
- [LangChain](https://github.com/langchain-ai/langchain)'s GraphSparqlQAChain - A LangChain module for making RDF and OWL accessible via natural language
    - docs: https://python.langchain.com/docs/use_cases/graph/graph_sparql_qa
    - features:
        - Generates SPARQL SELECT and UPDATE queries from natural language
        - Runs the generated queries against local files, endpoints, or triple stores
        - Returns natural language responses
- [LinkML](https://github.com/linkml/linkml) - Linked Open Data Modeling Language
    - features:
        - A high level simple way of specifying data models, optionally enhanced with semantic annotations
        - A python framework for compiling these data models to json-ld, json-schema, shex, shacl, owl, sql-ddl
        - A python framework for data conversion and validation, as well as generated Python dataclasses
- [Macleod](https://github.com/thahmann/macleod) - Ontology development environment for Common Logic (CL)
    - features:
        - Translating a CLIF file to formats supported by FOL reasoners
        - Extracting an OWL approximation of a CLIF ontology
        - Verifying (non-trivial) logical consistency of a CLIF ontology
        - Proving theorems/lemmas, such as properties of concepts and relations or competency questions
        - GUI (alpha state)
- [Morph-KGC](https://github.com/oeg-upm/morph-kgc) - System to create RDF and RDF-star knowledge graphs from heterogeneous sources with R2RML, RML and RML-star
  - docs: https://morph-kgc.readthedocs.io
  - features:
    - support for relational databases, tabular files (e.g. CSV, Excel, Parquet) and hierarchical files (XML and JSON)
    - generates RDF and RDF-star knowledge graphs by running through the command line or as a library
    - integrates with RDFlib and Oxigraph to load the generated RDF directly to those libraries
- [nxontology](https://github.com/related-sciences/nxontology) - NetworkX-based library for representing ontologies
  - features:
    - load ontologies into a `networkx.DiGraph` or `MultiDiGraph` from `.obo`, `.json`, or `.owl` formats
      (powered by pronto / fastobo)
    - compute information content scores for nodes and semantic similarity scores for node pairs
- [obonet](https://github.com/dhimmel/obonet) - read OBO-formatted ontologies into NetworkX
  - features:
    - Load an `.obo` file into a `networkx.MultiDiGraph`
    - Users should try [nxontology](https://github.com/related-sciences/nxontology) first, as a more general purpose successor to this project
- [OnToology](https://github.com/OnToology/OnToology) - System for collaborative ontology development process
    - docs: http://ontoology.linkeddata.es/stepbystep
    - live version: http://ontoology.linkeddata.es/
    - citable reference: https://doi.org/10.1016/j.websem.2018.09.003
- [OntoPilot](https://github.com/stuckyb/ontopilot) - software for ontology development and deployment
  - docs: https://github.com/stuckyb/ontopilot/wiki
  - features:
    - support end users in ontology development, documentation and maintenance
    - convert spreadsheet data (one entity per row) to owl files
    - call a reasoner before triple-store insertion
- [ontospy](https://github.com/lambdamusic/Ontospy) - Python library and command-line interface for inspecting and visualizing RDF models
  - docs: http://lambdamusic.github.io/Ontospy/
  - features:
    - extract and print out any ontology-related information
    - convert different OWL syntax variants
    - generate html documentation for an ontology
- [ontor](https://github.com/felixocker/ontor) - Python library for manipulating and visualizing OWL ontologies
  - features:
    - tool set based on owlready2 and networkx
- [owlready2](https://bitbucket.org/jibalamy/owlready2/src/master/README.rst) - ontology-oriented programming in Python
  - docs: https://owlready2.readthedocs.io/en/latest/index.html
  - features:
    - parse owl files (RDF/XML or OWL/XML)
    - parse SWRL rules
    - call reasoner (via java)
  - literature references:
    - [*Lamy, JB: Owlready: **Ontology-oriented programming in Python with automatic classification and high level constructs for biomedical ontologies**. Artificial Intelligence In Medicine 2017;80:11-28*](http://www.lesfleursdunormal.fr/_downloads/article_owlready_aim_2017.pdf)
    - [*Lamy, JB: **Ontologies with Python**, Apress, 2020*](https://www.apress.com/fr/book/9781484265512)
        - accompanying material: 
- [Oxrdflib](https://github.com/oxigraph/oxrdflib) - Oxrdflib provides rdflib stores using pyoxigraph (Rust-based)
    - could be used as drop-in replacements of the rdflib default ones
- [pronto](https://github.com/althonos/pronto): library to parse, browse, create, and export ontologies
    - features:
        - supports several ontology languages and formats
    - docs: https://pronto.readthedocs.io/en/latest/api.html
- [pycottas](https://github.com/arenas-guerrero-julian/pycottas) - Library for working with compressed COTTAS files
  - docs: https://pycottas.readthedocs.io
  - features:
    - compress RDF files to COTTAS format
    - evaluate triple patterns over compressed RDF
    - integrates with RDFlib as a store backend to query COTTAS files with SPARQL
- [pyfactxx](https://github.com/tilde-lab/pyfactxx) - Python bindings for FaCT++ OWL 2 C++ reasoner
    - features:
        - well-optimized reasoner for SROIQ(D) description logic, with additional improvements
        - [rdflib](https://github.com/RDFLib/rdflib) integration
        - easy cross-platform installation
- [PyFuseki](https://github.com/yubinCloud/pyfuseki) - Library that interacts with Jena Fuseki (SPARQL server)
    - docs: https://yubincloud.github.io/pyfuseki/

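The information content and semantic similarity scores mentioned in the nxontology entry can be illustrated on a toy hierarchy. The sketch below uses only the standard library and one simple descendant-count definition of intrinsic information content combined with Resnik-style similarity. The hierarchy and function names are hypothetical, and nxontology's own metrics may be defined differently.

```python
import math

# Toy is-a hierarchy, written as node -> list of direct parents (hypothetical data).
PARENTS = {
    "entity": [],
    "animal": ["entity"],
    "plant": ["entity"],
    "dog": ["animal"],
    "cat": ["animal"],
}


def ancestors(node):
    """Return the node itself plus every node reachable via parent links."""
    seen = {node}
    stack = [node]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(node):
    """Intrinsic IC: -log of the fraction of nodes that are `node` or below it."""
    n_descendants = sum(1 for other in PARENTS if node in ancestors(other))
    return -math.log(n_descendants / len(PARENTS))


def resnik_similarity(a, b):
    """IC of the most informative ancestor shared by both nodes (0.0 if none)."""
    common = ancestors(a) & ancestors(b)
    return max((information_content(n) for n in common), default=0.0)


dog_cat = resnik_similarity("dog", "cat")
dog_plant = resnik_similarity("dog", "plant")
```

Here `dog` and `cat` share the fairly specific ancestor `animal`, so they score higher than `dog` and `plant`, whose only shared ancestor is the root.
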
- [PyKEEN](https://github.com/pykeen/pykeen) (**Py**thon **K**nowl**E**dge **E**mbeddi**N**gs) - Python package to train and evaluate knowledge graph embedding models
    - features:
        - 44 Models
        - 37 Datasets
        - 5 Inductive Datasets
        - support for multi-modal information
- [PyLD](https://github.com/digitalbazaar/pyld) - A JSON-LD processor written in Python
    - conforms:
        - JSON-LD 1.1, W3C Candidate Recommendation, 2019-12-12 or newer
        - JSON-LD 1.1 Processing Algorithms and API, W3C Candidate Recommendation, 2019-12-12 or newer
        - JSON-LD 1.1 Framing, W3C Candidate Recommendation, 2019-12-12 or newer
- [pyLoDStorage](https://github.com/WolfgangFahl/pyLoDStorage) - Python library to interchange data between SPARQL-, JSON- and SQL-endpoints
    - features:
        - Integration of [tabulate library](https://pypi.org/project/tabulate/)
        - QueryManager class for handling named queries
        - Basic data structure: **l**ists of **d**icts (thus: "LoD")
    - docs: https://wiki.bitplan.com/index.php/PyLoDStorage
- [PyOBO](https://github.com/pyobo/pyobo)
  - docs:  https://pyobo.readthedocs.io
  - features:
    - Provides unified, high-level access to names, descriptions, synonyms, xrefs, hierarchies, properties, relationships, etc. in ontologies from many sources listed in the Bioregistry
    - Converts databases into OWL and OBO ontologies
    - Wrapper around ROBOT for using Java tooling to convert between OBO and OWL
    - Internal DSL for generating OBO ontology
- [Pyoxigraph](https://oxigraph.org/pyoxigraph/stable/index.html) - Python graph database library implementing the SPARQL standard.
    - built on top of [Oxigraph](https://github.com/oxigraph/oxigraph) using [PyO3](https://pyo3.rs/)
    - docs: https://oxigraph.org/pyoxigraph/stable/index.html
    - two stores with SPARQL 1.1 capabilities: in-memory and disk-based
- [PyRes](https://github.com/eprover/PyRes)
    - resolution-based theorem provers for first-order logic
    - focus on good comprehensibility of the code
    - Literature: [Teaching Automated Theorem Proving by Example](https://link.springer.com/chapter/10.1007/978-3-030-51054-1_9)
- [pystardog](https://github.com/stardog-union/pystardog)
    - Python bindings for the [Stardog Knowledge Graph platform](https://www.stardog.com/)
- [Quit Store](https://github.com/AKSW/QuitStore) - workspace for distributed collaborative Linked Data knowledge engineering ("Quads in Git")
    - features:
        - read and write RDF Datasets
        - create multiple branches of the Dataset
    - literature references:
        - [*Decentralized Collaborative Knowledge Management using Git*](https://natanael.arndt.xyz/bib/arndt-n-2018--jws) by Natanael Arndt, Patrick Naumann, Norman Radtke, Michael Martin, and Edgard Marx in Journal of Web Semantics, 2018 [[@sciencedirect](https://www.sciencedirect.com/science/article/pii/S1570826818300416)] [[@arXiv](https://arxiv.org/abs/1805.03721)]

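The "list of dicts" (LoD) structure named in the pyLoDStorage entry is easy to move between Python and SQL. The following sketch uses only the standard-library `sqlite3` module. The helper names `store_lod` and `query_lod` and the sample records are made up for illustration and are not pyLoDStorage's API.

```python
import sqlite3

# Made-up sample records; every dict is one row, every key one column.
records = [
    {"name": "alpha", "size": 3},
    {"name": "beta", "size": 7},
]


def store_lod(conn, table, lod):
    """Create a table from the keys of the first record and insert all records."""
    columns = list(lod[0].keys())
    column_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join(f":{c}" for c in columns)
    conn.execute(f'CREATE TABLE "{table}" ({column_sql})')
    conn.executemany(
        f'INSERT INTO "{table}" ({column_sql}) VALUES ({placeholders})', lod
    )


def query_lod(conn, sql, params=()):
    """Run a query and return the result again as a list of dicts."""
    cursor = conn.execute(sql, params)
    names = [description[0] for description in cursor.description]
    return [dict(zip(names, row)) for row in cursor]


conn = sqlite3.connect(":memory:")
store_lod(conn, "item", records)
result = query_lod(conn, "SELECT name, size FROM item ORDER BY size DESC")
```

Because the same shape comes back out of `query_lod`, results from different backends can be handled by the same downstream code.
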
- [RaiseWikibase](https://github.com/UB-Mannheim/RaiseWikibase) - A tool for speeding up multilingual knowledge graph construction with Wikibase
    - fast inserts into a Wikibase instance: creates up to a million entities and wikitexts per hour
    - docs: https://ub-mannheim.github.io/RaiseWikibase/
    - ships with `docker-compose.yml` for Wikibase (Database, PHP-code)
    - publication: https://link.springer.com/chapter/10.1007%2F978-3-030-80418-3_11
- [Reasonable](https://github.com/gtfierro/reasonable) - An OWL 2 RL reasoner with reasonable performance
    - written in Rust with Python bindings (via [pyo3](https://pyo3.rs/))
- [ROBOT](https://github.com/ontodev/robot) - Java tool for automating ontology workflows with several reasoners (ELK, HermiT, ...) and a Python interface
    - General docs:  https://robot.obolibrary.org/
    - Python interfaces: https://robot.obolibrary.org/python
    - Docs on reasoning: https://robot.obolibrary.org/reason
- [rdflib](https://github.com/RDFLib/rdflib) - Python package for working with RDF
  - docs: https://rdflib.readthedocs.io/
  - graphical package overview: https://rdflib.dev/
  - features:
    - parsers and serializers for RDF/XML, NTriples, Turtle, JSON-LD and more
    - a graph interface which can be backed by any one of a number of store implementations
    - store implementations for in-memory storage and persistent storage
    - a SPARQL 1.1 implementation supporting SPARQL 1.1 Queries and Update statements
- [rdflib-endpoint](https://github.com/vemonet/rdflib-endpoint) - Python package for easily deploying SPARQL endpoints for RDFLib Graphs
  - features:
    - exposing machine learning models or any other logic implemented in Python through a SPARQL endpoint, using custom functions
    - serving local RDF files using the command line interface
- [serd](https://gitlab.com/drobilla/python-serd) - Python serd module, providing bindings for Serd, a lightweight C library for working with RDF data
  - docs: https://drobilla.gitlab.io/python-serd/singlehtml/
- [sparqlfun](https://github.com/linkml/sparqlfun)
    - LinkML based SPARQL template library and execution engine
        - modularized core library of SPARQL templates
        - Fully FAIR description of templates
        - Rich expressive language for modeling templates
            - uses [LinkML](https://linkml.io/linkml/) as base language
        - optional python bindings / [object model](https://github.com/linkml/sparqlfun/blob/main/sparqlfun/model.py) using LinkML
        - supports both SELECT and CONSTRUCT
        - optional export to TSV, JSON, YAML, RDF
        - extensive [endpoint metadata](https://github.com/linkml/sparqlfun/tree/main/sparqlfun/config)
- [SPARQL kernel](https://github.com/paulovn/sparql-kernel) for Jupyter
    - features:
        - sending queries to a SPARQL endpoint
        - fetching and presenting the results in a notebook
- [SPARQLing Unicorn QGIS Plugin](https://github.com/sparqlunicorn/sparqlunicornGoesGIS) - QGIS plugin which adds a GeoJSON layer from SPARQL endpoint queries
    - docs: https://sparqlunicorn.github.io/sparqlunicornGoesGIS/
    - QGIS plugin page: https://plugins.qgis.org/plugins/sparqlunicorn/
    - features:
        - Querying geospatial vector layers from SPARQL endpoints
        - Conversion of geoformats (GeoJSON, SHP, KML, GML, etc.) to geospatial RDF
        - Conversion of RDF geodata (GeoSPARQL-formatted) from one coordinate reference system to another
        - SHACL validation of geospatial RDF graphs including validation of geoliteral (WKT, GML) contents
- [SPARQLWrapper](https://github.com/RDFLib/sparqlwrapper) - A wrapper for a remote SPARQL endpoint
    - docs: https://sparqlwrapper.readthedocs.io/en/latest/index.html
    - features:
        - Creating a query invocation
        - Optionally converting the result into a more manageable format
- [WikidataIntegrator](https://github.com/SuLab/WikidataIntegrator) - Library for reading and writing to Wikidata/Wikibase
    - features:
        - high integration with the Wikidata SPARQL endpoint

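Several entries above (SPARQLWrapper, Gastrodon, the SPARQL kernel) convert SPARQL results into a more manageable format. As a rough illustration of what that involves, the sketch below flattens a document in the W3C SPARQL 1.1 Query Results JSON Format into a list of plain dicts using only the standard library. The sample document is hand-written and the helper name `bindings_to_rows` is hypothetical; it is not part of any listed library.

```python
import json

# Hand-written sample in the SPARQL 1.1 Query Results JSON Format (not real data).
SAMPLE = """
{
  "head": {"vars": ["city", "mayor"]},
  "results": {"bindings": [
    {"city": {"type": "uri", "value": "http://example.org/city/1"},
     "mayor": {"type": "literal", "value": "Example Mayor"}},
    {"city": {"type": "uri", "value": "http://example.org/city/2"}}
  ]}
}
"""


def bindings_to_rows(document):
    """Flatten result bindings to one dict per row; unbound variables become None."""
    variables = document["head"]["vars"]
    return [
        {v: (binding[v]["value"] if v in binding else None) for v in variables}
        for binding in document["results"]["bindings"]
    ]


rows = bindings_to_rows(json.loads(SAMPLE))
```

Unbound variables are simply absent from a binding in this format, which is why the helper fills them with `None` to keep every row the same shape.
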

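Triple stores such as those listed above (rdflib, pycottas, Pyoxigraph) are built around matching triple patterns in which some positions are left open. A minimal in-memory sketch of the idea follows, using `None` as the wildcard. The data and the function name `match` are hypothetical, and real stores use indexes rather than a linear scan.

```python
# Tiny hypothetical dataset of (subject, predicate, object) triples.
TRIPLES = {
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
}


def match(pattern, triples=TRIPLES):
    """Return the sorted triples matching the pattern; None acts as a wildcard."""
    return sorted(
        triple
        for triple in triples
        if all(p is None or p == value for p, value in zip(pattern, triple))
    )


people = match((None, "rdf:type", "ex:Person"))
about_alice = match(("ex:alice", None, None))
```
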
## Probably Stalled or Outdated Projects

- [Athene](https://github.com/dityas/Athene) - DL reasoner in pure Python
    - "[C]urrent version is a beta and only supports ALC. But it can easily be extended by adding tableau rules."
    - Last update: 2017
- [cwm](https://en.wikipedia.org/wiki/Cwm_(software))
    - Self description: "\[cwm is a\] forward chaining semantic reasoner that can be used for querying, checking, transforming and filtering information".
    - Created in 2000 by Tim Berners-Lee and Dan Connolly, see [w3.org](https://www.w3.org/2000/10/swap/doc/cwm)
- [air-reasoner](https://github.com/mit-dig/air-reasoner)
    - Self description: "Reasoner for the AIR policy language, based on cwm"
    - based on cwm
    - Last update: 2013
- [FuXi](https://pypi.org/project/FuXi/)
    - Self description: "An OWL / N3-based in-memory, logic reasoning system for RDF"
    - based on cwm
    - Last update: 2013
    - see also   (hg-repo)
- [pysumo](https://github.com/pySUMO/pysumo)
    - Ontology IDE for the Suggested Upper Merged Ontology (SUMO)
    - Docs: https://pysumo.readthedocs.io/
    - Last update: 2015

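Forward chaining, as named in the cwm self-description above, means applying rules to the known facts repeatedly until nothing new can be derived. The sketch below shows the idea with two RDFS-style rules over plain tuples. It is a toy illustration with hypothetical data, not how cwm, FuXi or any other listed reasoner is implemented.

```python
TYPE = "rdf:type"
SUB = "rdfs:subClassOf"


def forward_chain(facts):
    """Apply two RDFS-style rules until no new triples can be derived."""
    known = set(facts)
    while True:
        derived = set()
        for (a, p1, b) in known:
            for (c, p2, d) in known:
                if b != c or p2 != SUB:
                    continue
                if p1 == SUB:
                    derived.add((a, SUB, d))   # subclass relation is transitive
                elif p1 == TYPE:
                    derived.add((a, TYPE, d))  # instances inherit superclasses
        if derived <= known:
            return known  # fixpoint reached: nothing new was derived
        known |= derived


# Hypothetical input facts.
FACTS = {
    ("ex:rex", TYPE, "ex:Dog"),
    ("ex:Dog", SUB, "ex:Mammal"),
    ("ex:Mammal", SUB, "ex:Animal"),
}
closure = forward_chain(FACTS)
```

The loop stops at a fixpoint, which is guaranteed here because the rules only combine existing terms and the set of possible triples is finite.
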

## Further Projects / Links

- [ontology](https://github.com/ozekik/awesome-ontology) - A curated list of ontology things (with some Python-related entries)
- [awesome-semantic-web#python](https://github.com/semantalytics/awesome-semantic-web#python) - Python section of awesome list for semantic-web-related projects
- [github-semantic-web-python](https://github.com/topics/semantic-web?l=python) - GitHub project search with `topic=semantic-web` and `language=python`
- "Graph Thinking" - Talk by Paco Nathan ([@ceteri](https://github.com/ceteri)) at PyData Global 2021; [slides](https://derwen.ai/s/kcgh#84), [video](https://www.youtube.com/watch?v=bqku2a7ScXg)
- [Hydra Ecosystem](https://github.com/HTTP-APIs) - Semantically Linked REST APIs
    - docs: https://www.hydraecosystem.org/
    - tutorials: the stack has three major layers ([server](https://github.com/HTTP-APIs/hydrus), [client](https://github.com/HTTP-APIs/hydra-python-agent), [GUI](https://github.com/HTTP-APIs/hydra-python-agent-gui)); each repo has its own README
    - features:
        - deploy a server automatically from API Documentation (JSON-LD and W3C Hydra)
        - client automatically reads the documentation and provides access to endpoints
        - GUI allows visualization of the network generated by the servers and external resources
        - a [parser](https://github.com/HTTP-APIs/hydra-openapi-parser) for OpenAPI specs translation
    - notes:
        - under development, experimental
        - part of Google Summer of Code
- [Pywikibot](https://github.com/wikimedia/pywikibot)
    - Library to interact with Wikidata and Wikimedia API
    - see also: https://www.wikidata.org/wiki/Wikidata:Creating_a_bot#Pywikibot
- [semantic](https://github.com/crm416/semantic) - Python library for extracting semantic information from text, such as dates and numbers
- [Solving Einstein Puzzle](https://github.com/cknoll/demo-material/blob/main/expertise_system/einstein-zebra-puzzle-owlready-solution1.ipynb) - Jupyter notebook demonstrating how to use owlready2 to solve a logic puzzle
- [W3C-Link-List1](https://www.w3.org/2001/sw/wiki/SemanticWebTools#Python_Developers) - link list "SemanticWebTools", section "Python_Developers" (wiki page)
  - might be outdated
- [W3C-Link-List2](https://www.w3.org/2001/sw/wiki/Python) - list of tools usable from, or with, Python (wiki page)
- [wikidata-mayors](https://github.com/njanakiev/wikidata-mayors)
    - Python code to ask Wikidata for European mayors and where they were born
    - Article: https://towardsdatascience.com/where-do-mayors-come-from-querying-wikidata-with-python-and-sparql-91f3c0af22e2
- [yamlpyowl](https://github.com/cknoll/yamlpyowl) - read a YAML-specified ontology into Python by means of owlready2 (experimental)
- [Notebook which generates quiz questions from Wikidata](https://gist.github.com/ak314/fc6c6f911cb4f39453b575854cdc4869)
    - [related presentation slides](https://www.slideshare.net/robertoturrin/how-to-turn-wikipedia-into-a-quiz-game)

Owner

  • Name: Julián Arenas Guerrero
  • Login: arenas-guerrero-julian
  • Kind: user
  • Location: Madrid, Spain.
  • Company: @oeg-upm

PhD Student at @oeg-upm
