ontoaligner

OntoAligner: A Python Toolkit for Ontology Alignment https://pypi.org/project/OntoAligner/

https://github.com/sciknoworg/ontoaligner

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.7%) to scientific vocabulary

Keywords

large-language-models ontology ontology-alignment ontology-engineering ontology-matching python-library
Last synced: 4 months ago

Repository

Basic Info
Statistics
  • Stars: 50
  • Watchers: 4
  • Forks: 6
  • Open Issues: 12
  • Releases: 12
Topics
large-language-models ontology ontology-alignment ontology-engineering ontology-matching python-library
Created about 1 year ago · Last pushed 5 months ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation

README.md

OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment

[![PyPI version](https://badge.fury.io/py/OntoAligner.svg)](https://badge.fury.io/py/OntoAligner) [![PyPI Downloads](https://static.pepy.tech/badge/ontoaligner)](https://pepy.tech/projects/ontoaligner) ![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg) [![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit) [![Documentation Status](https://readthedocs.org/projects/ontoaligner/badge/?version=main)](https://ontoaligner.readthedocs.io/) [![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](MAINTANANCE.md)

OntoAligner is a Python library designed to simplify ontology alignment and matching for researchers, practitioners, and developers. With a modular architecture and robust features, OntoAligner provides powerful tools to bridge ontologies effectively.

🧪 Installation

You can install OntoAligner from PyPI using pip:

```bash
pip install ontoaligner
```

Alternatively, to get the latest version directly from the source, use the following commands:

```bash
git clone git@github.com:sciknoworg/OntoAligner.git
pip install ./OntoAligner
```
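To check that the installation worked, the installed version can be queried from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of OntoAligner.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# "ontoaligner" is the distribution name used in the pip command above.
version = installed_version("ontoaligner")
print("ontoaligner version:", version if version else "not installed")
```

Returning `None` instead of raising keeps the check usable in setup scripts that need to decide whether to install the package.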

📚 Documentation

Comprehensive documentation for OntoAligner, including detailed guides and examples, is available at ontoaligner.readthedocs.io. The table below lists key tutorials, each with a link to the documentation and to the corresponding example code.

| Example                        | Tutorial                          | Script  |
|:-------------------------------|:----------------------------------|:-------:|
| Lightweight                    | 📚 Fuzzy Matching                 | 📝 Code |
| Retrieval                      | 📚 Retrieval Aligner              | 📝 Code |
| Large Language Models          | 📚 LLM Aligner                    | 📝 Code |
| Retrieval Augmented Generation | 📚 RAG Aligner                    | 📝 Code |
| FewShot                        | 📚 FewShot-RAG Aligner            | 📝 Code |
| In-Context Vectors Learning    | 📚 In-Context Vectors RAG         | 📝 Code |
| Knowledge Graph Embedding      | 📚 KGE Aligner                    | 📝 Code |
| eCommerce                      | 📚 Product Alignment in eCommerce | 📝 Code |
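To give a feel for what the lightweight fuzzy-matching approach does conceptually, here is a minimal sketch that pairs concept labels by string similarity. It uses the standard library's `difflib` and is not OntoAligner's implementation; the function name `fuzzy_match`, the dictionary keys, and the default threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_match(source_labels, target_labels, threshold=0.8):
    """Pair each source label with its most similar target label.

    Similarity is difflib's ratio on lowercased labels; pairs that score
    below `threshold` are dropped.
    """
    matchings = []
    for source in source_labels:
        best_target, best_score = None, 0.0
        for target in target_labels:
            score = SequenceMatcher(None, source.lower(), target.lower()).ratio()
            if score > best_score:
                best_target, best_score = target, score
        if best_target is not None and best_score >= threshold:
            matchings.append(
                {"source": source, "target": best_target, "score": best_score}
            )
    return matchings


# Toy labels, not taken from any real ontology.
pairs = fuzzy_match(
    ["Melting Point", "Density"],
    ["melting point", "mass density", "hardness"],
)
print(pairs)
```

The threshold trades recall for precision: lowering it admits looser label matches, raising it keeps only near-identical labels.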

🚀 Quick Tour

Below is a step-by-step example of using Retrieval-Augmented Generation (RAG) for ontology matching:

```python
from ontoaligner.ontology import MaterialInformationMatOntoOMDataset
from ontoaligner.utils import metrics, xmlify
from ontoaligner.aligner import MistralLLMBERTRetrieverRAG
from ontoaligner.encoder import ConceptParentRAGEncoder
from ontoaligner.postprocess import rag_hybrid_postprocessor

# Step 1: Initialize the dataset object for the MaterialInformation-MatOnto dataset
task = MaterialInformationMatOntoOMDataset()
print("Test Task:", task)

# Step 2: Load source and target ontologies along with reference matchings
dataset = task.collect(
    source_ontology_path="assets/MI-MatOnto/mi_ontology.xml",
    target_ontology_path="assets/MI-MatOnto/matonto_ontology.xml",
    reference_matching_path="assets/MI-MatOnto/matchings.xml",
)

# Step 3: Encode the source and target ontologies
encoder_model = ConceptParentRAGEncoder()
encoded_ontology = encoder_model(source=dataset["source"], target=dataset["target"])

# Step 4: Define configuration for retriever and LLM
retriever_config = {"device": "cuda", "top_k": 5}
llm_config = {"device": "cuda", "max_length": 300, "max_new_tokens": 10, "batch_size": 15}

# Step 5: Initialize the RAG-based ontology matcher and generate predictions
model = MistralLLMBERTRetrieverRAG(retriever_config=retriever_config, llm_config=llm_config)
model.load(llm_path="mistralai/Mistral-7B-v0.3", ir_path="all-MiniLM-L6-v2")
predicts = model.generate(input_data=encoded_ontology)

# Step 6: Apply hybrid postprocessing and evaluate against the reference matchings
hybrid_matchings, hybrid_configs = rag_hybrid_postprocessor(
    predicts=predicts,
    ir_score_threshold=0.1,
    llm_confidence_th=0.8,
)

evaluation = metrics.evaluation_report(predicts=hybrid_matchings, references=dataset["reference"])
print("Hybrid Matching Evaluation Report:", evaluation)

# Step 7: Convert matchings to XML format and save the XML representation
xml_str = xmlify.xml_alignment_generator(matchings=hybrid_matchings)
with open("matchings.xml", "w", encoding="utf-8") as xml_file:
    xml_file.write(xml_str)
```
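The evaluation step above compares predicted matchings with reference matchings. As a rough illustration of what such a comparison usually involves, the sketch below computes precision, recall, and F1 over (source, target) pairs. It is a generic sketch, not the code behind `metrics.evaluation_report`; the function name `evaluate_matchings` and the `"source"`/`"target"` record keys are assumptions.

```python
def evaluate_matchings(predicted, reference):
    """Compute precision, recall, and F1 over (source, target) pairs."""
    predicted_pairs = {(m["source"], m["target"]) for m in predicted}
    reference_pairs = {(m["source"], m["target"]) for m in reference}
    true_positives = len(predicted_pairs & reference_pairs)

    precision = true_positives / len(predicted_pairs) if predicted_pairs else 0.0
    recall = true_positives / len(reference_pairs) if reference_pairs else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy data: one of the two predictions appears in the reference.
predicted = [
    {"source": "mi:Density", "target": "matonto:Density"},
    {"source": "mi:Alloy", "target": "matonto:Ceramic"},
]
reference = [
    {"source": "mi:Density", "target": "matonto:Density"},
    {"source": "mi:Alloy", "target": "matonto:Alloy"},
]
report = evaluate_matchings(predicted, reference)
print(report)
```

Treating matchings as sets of pairs means duplicates are counted once and the order of predictions does not affect the scores.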

An ontology alignment pipeline using the RAG method:

```python
import ontoaligner

pipeline = ontoaligner.OntoAlignerPipeline(
    task_class=ontoaligner.ontology.MouseHumanOMDataset,
    source_ontology_path="assets/MI-MatOnto/mi_ontology.xml",
    target_ontology_path="assets/MI-MatOnto/matonto_ontology.xml",
    reference_matching_path="assets/MI-MatOnto/matchings.xml",
)

matchings, evaluation = pipeline(
    method="rag",
    encoder_model=ontoaligner.encoder.ConceptRAGEncoder(),
    model_class=ontoaligner.aligner.MistralLLMBERTRetrieverRAG,
    postprocessor=ontoaligner.postprocess.rag_hybrid_postprocessor,
    llm_path="mistralai/Mistral-7B-v0.3",
    retriever_path="all-MiniLM-L6-v2",
    llm_threshold=0.5,
    ir_rag_threshold=0.7,
    top_k=5,
    max_length=512,
    max_new_tokens=10,
    device="cuda",
    batch_size=32,
    return_matching=True,
    evaluate=True,
)

print("Matching Evaluation Report:", evaluation)
```
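The pipeline call above passes two thresholds, one for the retriever score and one for the LLM. One plausible reading is that a candidate pair is kept only when both scores clear their thresholds. The sketch below illustrates that idea under this assumption; the record layout (`ir_score`, `llm_confidence`) and the function `filter_candidates` are hypothetical and do not describe OntoAligner's internals.

```python
def filter_candidates(candidates, ir_threshold, llm_threshold):
    """Keep candidates whose retrieval and LLM scores both clear their thresholds."""
    return [
        candidate
        for candidate in candidates
        if candidate["ir_score"] >= ir_threshold
        and candidate["llm_confidence"] >= llm_threshold
    ]


# Toy candidates with made-up scores.
candidates = [
    {"source": "s1", "target": "t1", "ir_score": 0.9, "llm_confidence": 0.8},
    {"source": "s2", "target": "t2", "ir_score": 0.6, "llm_confidence": 0.9},
    {"source": "s3", "target": "t3", "ir_score": 0.8, "llm_confidence": 0.4},
]
kept = filter_candidates(candidates, ir_threshold=0.7, llm_threshold=0.5)
print([candidate["source"] for candidate in kept])
```

Requiring both conditions favors precision; a looser variant could accept a pair when either score is high, at the cost of more false positives.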

⭐ Contribution

We welcome contributions to enhance OntoAligner and make it even better! Please review our contribution guidelines in CONTRIBUTING.md before getting started. You are also welcome to assist with the ongoing maintenance by referring to MAINTENANCE.md. Your support is greatly appreciated.

If you encounter any issues or have questions, please submit them to the GitHub issue tracker.

💡 Acknowledgements

If you use OntoAligner in your work or research, please cite the following paper:

```bibtex
@inproceedings{babaei2025ontoaligner,
  title={OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment},
  author={Babaei Giglou, Hamed and D'Souza, Jennifer and Karras, Oliver and Auer, S{\"o}ren},
  booktitle={European Semantic Web Conference},
  pages={174--191},
  year={2025},
  organization={Springer}
}
```

This software is archived on Zenodo and is licensed under the Apache-2.0 license.

Owner

  • Name: SciKnowOrg
  • Login: sciknoworg
  • Kind: organization
  • Email: sciknoworg@gmail.com
  • Location: Germany

Scientific Knowledge Organization (SKO group or SciKnowOrg group)

Citation (CITATION.cff)

cff-version: 1.1.0
message: "If you use this software, please cite it as below."
title: "OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment"
type: software
authors:
  - family-names: "Babaei Giglou"
    given-names: "Hamed"
  - family-names: "D'Souza"
    given-names: "Jennifer"
  - family-names: "Karras"
    given-names: "Oliver"
  - family-names: "Auer"
    given-names: "Sören"
url: "https://github.com/sciknoworg/OntoAligner"
keywords:
  - "Ontology Matching"
  - "Alignment"
  - "Python Library"
license: "Apache-2.0"
version: "1.5.0"
date-released: "2025-07-29"

GitHub Events

Total
  • Create event: 25
  • Release event: 15
  • Issues event: 27
  • Watch event: 30
  • Delete event: 12
  • Issue comment event: 12
  • Public event: 1
  • Push event: 218
  • Pull request event: 28
  • Fork event: 5
Last Year
  • Create event: 25
  • Release event: 15
  • Issues event: 27
  • Watch event: 30
  • Delete event: 12
  • Issue comment event: 12
  • Public event: 1
  • Push event: 218
  • Pull request event: 28
  • Fork event: 5

Issues and Pull Requests

All Time
  • Total issues: 17
  • Total pull requests: 13
  • Average time to close issues: 2 months
  • Average time to close pull requests: about 7 hours
  • Total issue authors: 5
  • Total pull request authors: 2
  • Average comments per issue: 0.18
  • Average comments per pull request: 0.0
  • Merged pull requests: 10
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 17
  • Pull requests: 13
  • Average time to close issues: 2 months
  • Average time to close pull requests: about 7 hours
  • Issue authors: 5
  • Pull request authors: 2
  • Average comments per issue: 0.18
  • Average comments per pull request: 0.0
  • Merged pull requests: 10
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • HamedBabaei (17)
  • Ell04 (1)
  • alibama (1)
  • amirrezaalasti (1)
  • HadiBayrami (1)
Pull Request Authors
  • HamedBabaei (13)
  • amirrezaalasti (3)
Top Labels
Issue Labels
enhancement (8) Aligner:[step-1]-bring-everything-to-jnb (1) bug (1) question (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 202 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 14
  • Total maintainers: 1
pypi.org: ontoaligner

OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment.

  • Versions: 14
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 202 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.9%
Dependent repos count: 57.6%
Maintainers (1)

Dependencies

docs/requirements.txt pypi
  • sphinx ==8.1.3
  • sphinx-press-theme ==0.9.1
  • sphinx_autodoc_typehints ==2.5.0
requirements.txt pypi
  • argparse *
  • contextlib *
  • datasets *
  • numpy *
  • ontospy *
  • openai *
  • owlready2 *
  • pathlib *
  • pre-commit *
  • rank_bm25 *
  • rapidfuzz *
  • rdflib *
  • ruff *
  • sentence_transformers *
  • setuptools *
  • sklearn *
  • torch *
  • tqdm *
  • transformers *
  • twine *
  • wheel *
  • xml *
setup.py pypi
  • argparse *
  • contextlib *
  • datasets *
  • numpy *
  • ontospy *
  • openai *
  • owlready2 *
  • pathlib *
  • pre-commit *
  • rank_bm25 *
  • rapidfuzz *
  • rdflib *
  • scikit-learn *
  • sentence-transformers *
  • torch *
  • tqdm *
  • transformers *
.github/workflows/python-publish.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v4 composite
pyproject.toml pypi
  • pre-commit * develop
  • ruff * develop
  • setuptools * develop
  • twine * develop
  • wheel * develop
  • argparse *
  • datasets *
  • huggingface_hub 0.23.5
  • numpy *
  • ontospy 2.1.1
  • openai 1.56.0
  • owlready2 0.44
  • pandas *
  • pathlib *
  • python >=3.10,<4.0.0
  • rank_bm25 0.2.2
  • rapidfuzz 3.5.2
  • rdflib 7.1.1
  • scikit-learn *
  • sentence-transformers 2.2.2
  • setfit 1.0.3
  • torch 2.5.0
  • tqdm 4.66.3
  • transformers 4.46.0