ontoaligner
OntoAligner: A Python Toolkit for Ontology Alignment https://pypi.org/project/OntoAligner/
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: sciknoworg
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://ontoaligner.readthedocs.io
- Size: 1.84 MB
Statistics
- Stars: 50
- Watchers: 4
- Forks: 6
- Open Issues: 12
- Releases: 12
Metadata Files
README.md
OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment
OntoAligner is a Python library designed to simplify ontology alignment and matching for researchers, practitioners, and developers. With a modular architecture and robust features, OntoAligner provides powerful tools to bridge ontologies effectively.
🧪 Installation
You can install OntoAligner from PyPI using pip:
```bash
pip install ontoaligner
```
Alternatively, to get the latest version directly from the source, use the following commands:
```bash
git clone git@github.com:sciknoworg/OntoAligner.git
pip install ./OntoAligner
```
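To confirm that the installation succeeded, a small check like the following can help. It is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of OntoAligner, and the distribution name `ontoaligner` is assumed to match the PyPI package name used above.

```python
from importlib import metadata


def installed_version(package: str = "ontoaligner"):
    """Return the installed version string of `package`, or None if it is missing.

    `installed_version` is an illustrative helper, not an OntoAligner function.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print("ontoaligner version:", version or "not installed")
```

If the script reports "not installed", the `pip install` step likely ran in a different Python environment than the one executing the check.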
📚 Documentation
Comprehensive documentation for OntoAligner, including detailed guides and examples, is available at ontoaligner.readthedocs.io. Below are some key tutorials with links to both the documentation and the corresponding example codes.
| Example                        | Tutorial                          | Script  |
|:-------------------------------|:----------------------------------|:-------:|
| Lightweight                    | 📚 Fuzzy Matching                 | 📝 Code |
| Retrieval                      | 📚 Retrieval Aligner              | 📝 Code |
| Large Language Models          | 📚 LLM Aligner                    | 📝 Code |
| Retrieval Augmented Generation | 📚 RAG Aligner                    | 📝 Code |
| FewShot                        | 📚 FewShot-RAG Aligner            | 📝 Code |
| In-Context Vectors Learning    | 📚 In-Context Vectors RAG         | 📝 Code |
| Knowledge Graph Embedding      | 📚 KGE Aligner                    | 📝 Code |
| eCommerce                      | 📚 Product Alignment in eCommerce | 📝 Code |
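To give a feel for the idea behind the "Lightweight" fuzzy matching row, the sketch below matches concept labels by string similarity. It is not OntoAligner's implementation: it uses the standard library's `difflib` rather than the toolkit's own aligners, and the function name `fuzzy_match`, the output keys, and the threshold value are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_match(source_labels, target_labels, threshold=0.8):
    """For each source label, keep the most similar target label if it scores
    at or above `threshold` (similarity is difflib's ratio on lowercased text)."""
    matchings = []
    for source in source_labels:
        best_target, best_score = None, 0.0
        for target in target_labels:
            score = SequenceMatcher(None, source.lower(), target.lower()).ratio()
            if score > best_score:
                best_target, best_score = target, score
        if best_target is not None and best_score >= threshold:
            matchings.append(
                {"source": source, "target": best_target, "score": round(best_score, 3)}
            )
    return matchings


# Illustrative labels only; they do not come from any bundled dataset.
source = ["Heart Valve", "Kidney"]
target = ["heart valves", "Liver", "kidney"]
result = fuzzy_match(source, target)
for match in result:
    print(match)
```

String similarity alone ignores the ontology hierarchy, which is one reason the table also lists retrieval, LLM, and RAG based aligners.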
🚀 Quick Tour
Below is a step-by-step example of using Retrieval-Augmented Generation (RAG) for ontology matching:
```python
from ontoaligner.ontology import MaterialInformationMatOntoOMDataset
from ontoaligner.utils import metrics, xmlify
from ontoaligner.aligner import MistralLLMBERTRetrieverRAG
from ontoaligner.encoder import ConceptParentRAGEncoder
from ontoaligner.postprocess import rag_hybrid_postprocessor

# Step 1: Initialize the dataset object for the MaterialInformation-MatOnto dataset
task = MaterialInformationMatOntoOMDataset()
print("Test Task:", task)

# Step 2: Load source and target ontologies along with reference matchings
dataset = task.collect(
    source_ontology_path="assets/MI-MatOnto/mi_ontology.xml",
    target_ontology_path="assets/MI-MatOnto/matonto_ontology.xml",
    reference_matching_path="assets/MI-MatOnto/matchings.xml"
)

# Step 3: Encode the source and target ontologies
encoder_model = ConceptParentRAGEncoder()
encoded_ontology = encoder_model(source=dataset['source'], target=dataset['target'])

# Step 4: Define configuration for retriever and LLM
retriever_config = {"device": "cuda", "top_k": 5}
llm_config = {"device": "cuda", "max_length": 300, "max_new_tokens": 10, "batch_size": 15}

# Step 5: Initialize the RAG-based ontology matcher and generate predictions
model = MistralLLMBERTRetrieverRAG(retriever_config=retriever_config, llm_config=llm_config)
model.load(llm_path="mistralai/Mistral-7B-v0.3", ir_path="all-MiniLM-L6-v2")
predicts = model.generate(input_data=encoded_ontology)

# Step 6: Apply hybrid postprocessing and evaluate against the reference matchings
hybrid_matchings, hybrid_configs = rag_hybrid_postprocessor(predicts=predicts,
                                                            ir_score_threshold=0.1,
                                                            llm_confidence_th=0.8)

evaluation = metrics.evaluation_report(predicts=hybrid_matchings, references=dataset['reference'])
print("Hybrid Matching Evaluation Report:", evaluation)

# Step 7: Convert matchings to XML format and save the XML representation
xml_str = xmlify.xml_alignment_generator(matchings=hybrid_matchings)
with open("matchings.xml", "w", encoding="utf-8") as xml_file:
    xml_file.write(xml_str)
```
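The two thresholds passed to the hybrid postprocessor in Step 6 can be read as a filter over candidate pairs. The sketch below shows one plausible reading of that idea and is not the library's implementation: the function `filter_candidates`, the record keys `ir_score` and `llm_confidence`, and the example identifiers are all hypothetical.

```python
def filter_candidates(candidates, ir_score_threshold=0.1, llm_confidence_th=0.8):
    """Keep candidate pairs whose retrieval score and LLM confidence both pass.

    The record layout is an assumption for illustration; OntoAligner's
    internal prediction format may differ.
    """
    return [
        candidate
        for candidate in candidates
        if candidate["ir_score"] >= ir_score_threshold
        and candidate["llm_confidence"] >= llm_confidence_th
    ]


# Made-up candidates: retriever score plus LLM confidence for each pair.
candidates = [
    {"source": "mi:Alloy", "target": "matonto:Alloy", "ir_score": 0.92, "llm_confidence": 0.95},
    {"source": "mi:Alloy", "target": "matonto:Metal", "ir_score": 0.41, "llm_confidence": 0.55},
    {"source": "mi:Unit", "target": "matonto:Process", "ir_score": 0.05, "llm_confidence": 0.90},
]
kept = filter_candidates(candidates)
print(kept)
```

Under this reading, a low retrieval threshold keeps recall high at the retrieval stage, while the stricter LLM confidence threshold prunes uncertain pairs.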
The same RAG method can also be run through the ontology alignment pipeline:
```python
import ontoaligner

pipeline = ontoaligner.OntoAlignerPipeline(
    task_class=ontoaligner.ontology.MouseHumanOMDataset,
    source_ontology_path="assets/MI-MatOnto/mi_ontology.xml",
    target_ontology_path="assets/MI-MatOnto/matonto_ontology.xml",
    reference_matching_path="assets/MI-MatOnto/matchings.xml",
)

matchings, evaluation = pipeline(
    method="rag",
    encoder_model=ontoaligner.encoder.ConceptRAGEncoder(),
    model_class=ontoaligner.aligner.MistralLLMBERTRetrieverRAG,
    postprocessor=ontoaligner.postprocess.rag_hybrid_postprocessor,
    llm_path='mistralai/Mistral-7B-v0.3',
    retriever_path='all-MiniLM-L6-v2',
    llm_threshold=0.5,
    ir_rag_threshold=0.7,
    top_k=5,
    max_length=512,
    max_new_tokens=10,
    device='cuda',
    batch_size=32,
    return_matching=True,
    evaluate=True
)

print("Matching Evaluation Report:", evaluation)
```
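An evaluation report of this kind typically compares predicted matchings with reference matchings. The sketch below computes precision, recall, and F1 over sets of (source, target) pairs. It is a generic illustration, not OntoAligner's `metrics` module: the function name `evaluate_matchings`, the record keys, and the returned dictionary layout are assumptions.

```python
def evaluate_matchings(predicts, references):
    """Compute precision, recall, and F1 over (source, target) pairs.

    Illustrative only; the toolkit's own report may use different keys or scales.
    """
    predicted = {(m["source"], m["target"]) for m in predicts}
    reference = {(m["source"], m["target"]) for m in references}
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up example: 3 predictions, 2 of them correct, against 4 references.
references = [
    {"source": "s1", "target": "t1"},
    {"source": "s2", "target": "t2"},
    {"source": "s3", "target": "t3"},
    {"source": "s4", "target": "t4"},
]
predicts = [
    {"source": "s1", "target": "t1"},
    {"source": "s2", "target": "t2"},
    {"source": "s3", "target": "t9"},
]
report = evaluate_matchings(predicts, references)
print(report)
```

Treating matchings as sets means duplicate predictions are counted once, which keeps precision from being skewed by repeated pairs.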
⭐ Contribution
We welcome contributions to enhance OntoAligner and make it even better! Please review our contribution guidelines in CONTRIBUTING.md before getting started. You are also welcome to assist with the ongoing maintenance by referring to MAINTENANCE.md. Your support is greatly appreciated.
If you encounter any issues or have questions, please submit them in the GitHub issues tracker.
💡 Acknowledgements
If you use OntoAligner in your work or research, please cite the following preprint:
```bibtex
@inproceedings{babaei2025ontoaligner,
  title={OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment},
  author={Babaei Giglou, Hamed and D’Souza, Jennifer and Karras, Oliver and Auer, S{\"o}ren},
  booktitle={European Semantic Web Conference},
  pages={174--191},
  year={2025},
  organization={Springer}
}
```
This software is archived in Zenodo under a DOI and is licensed under Apache-2.0.
Owner
- Name: SciKnowOrg
- Login: sciknoworg
- Kind: organization
- Email: sciknoworg@gmail.com
- Location: Germany
- Repositories: 1
- Profile: https://github.com/sciknoworg
Scientific Knowledge Organization (SKO group or SciKnowOrg group)
Citation (CITATION.cff)
cff-version: 1.1.0
message: "If you use this software, please cite it as below."
title: "OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment"
type: software
authors:
  - family-names: "Babaei Giglou"
    given-names: "Hamed"
  - family-names: "D'Souza"
    given-names: "Jennifer"
  - family-names: "Karras"
    given-names: "Oliver"
  - family-names: "Auer"
    given-names: "Sören"
url: "https://github.com/sciknoworg/OntoAligner"
keywords:
  - "Ontology Matching"
  - "Alignment"
  - "Python Library"
license: "Apache-2.0"
version: "1.5.0"
date-released: "2025-07-29"
GitHub Events
Total
- Create event: 25
- Release event: 15
- Issues event: 27
- Watch event: 30
- Delete event: 12
- Issue comment event: 12
- Public event: 1
- Push event: 218
- Pull request event: 28
- Fork event: 5
Last Year
- Create event: 25
- Release event: 15
- Issues event: 27
- Watch event: 30
- Delete event: 12
- Issue comment event: 12
- Public event: 1
- Push event: 218
- Pull request event: 28
- Fork event: 5
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 17
- Total pull requests: 13
- Average time to close issues: 2 months
- Average time to close pull requests: about 7 hours
- Total issue authors: 5
- Total pull request authors: 2
- Average comments per issue: 0.18
- Average comments per pull request: 0.0
- Merged pull requests: 10
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 17
- Pull requests: 13
- Average time to close issues: 2 months
- Average time to close pull requests: about 7 hours
- Issue authors: 5
- Pull request authors: 2
- Average comments per issue: 0.18
- Average comments per pull request: 0.0
- Merged pull requests: 10
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- HamedBabaei (17)
- Ell04 (1)
- alibama (1)
- amirrezaalasti (1)
- HadiBayrami (1)
Pull Request Authors
- HamedBabaei (13)
- amirrezaalasti (3)
Packages
- Total packages: 1
- Total downloads:
  - pypi 202 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 14
- Total maintainers: 1
pypi.org: ontoaligner
OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment.
- Homepage: https://ontoaligner.readthedocs.io/
- Documentation: https://ontoaligner.readthedocs.io/
- License: Apache-2.0
- Latest release: 1.5.0 (published 5 months ago)
Dependencies
- sphinx ==8.1.3
- sphinx-press-theme ==0.9.1
- sphinx_autodoc_typehints ==2.5.0
- argparse *
- contextlib *
- datasets *
- numpy *
- ontospy *
- openai *
- owlready2 *
- pathlib *
- pre-commit *
- rank_bm25 *
- rapidfuzz *
- rdflib *
- ruff *
- sentence_transformers *
- setuptools *
- sklearn *
- torch *
- tqdm *
- transformers *
- twine *
- wheel *
- xml *
- argparse *
- contextlib *
- datasets *
- numpy *
- ontospy *
- openai *
- owlready2 *
- pathlib *
- pre-commit *
- rank_bm25 *
- rapidfuzz *
- rdflib *
- scikit-learn *
- sentence-transformers *
- torch *
- tqdm *
- transformers *
- actions/checkout v4 composite
- actions/setup-python v4 composite
- pre-commit * develop
- ruff * develop
- setuptools * develop
- twine * develop
- wheel * develop
- argparse *
- datasets *
- huggingface_hub 0.23.5
- numpy *
- ontospy 2.1.1
- openai 1.56.0
- owlready2 0.44
- pandas *
- pathlib *
- python >=3.10,<4.0.0
- rank_bm25 0.2.2
- rapidfuzz 3.5.2
- rdflib 7.1.1
- scikit-learn *
- sentence-transformers 2.2.2
- setfit 1.0.3
- torch 2.5.0
- tqdm 4.66.3
- transformers 4.46.0