morph-kgc
Powerful RDF Knowledge Graph Generation with RML Mappings
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.4%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: morph-kgc
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://morph-kgc.readthedocs.io
- Size: 32.8 MB
Statistics
- Stars: 225
- Watchers: 12
- Forks: 41
- Open Issues: 29
- Releases: 35
Metadata Files
README.md
Morph-KGC is an engine that constructs RDF knowledge graphs from heterogeneous data sources with the R2RML and RML mapping languages. Morph-KGC is built on top of pandas and it leverages mapping partitions to significantly reduce execution times and memory consumption for large data sources.
Features :sparkles:
- User-friendly mappings with YARRRML.
- Transformation functions with RML-FNML, including Python UDFs.
- RDF-star generation with RML-star.
- RML views over tabular data sources and JSON files.
- Integration with RDFLib, Oxigraph and Kafka.
- Optimized to materialize large knowledge graphs.
- Remote data and mapping files.
- Input data formats:
- Relational databases: MySQL, PostgreSQL, Oracle, Microsoft SQL Server, MariaDB, SQLite.
- Tabular files: CSV, TSV, Excel, Parquet, Feather, ORC, Stata, SAS, SPSS, ODS.
- Hierarchical files: JSON, XML.
- In-memory data structures: Python Dictionaries, DataFrames.
- Cloud data lake solutions: Databricks, Snowflake.
- Property graph databases: Neo4j, Kùzu.
Documentation :bookmark_tabs:
Tutorial :woman_teacher:
Learn quickly with the tutorial in Google Colaboratory!
Getting Started :rocket:
PyPI is the fastest way to install Morph-KGC:
```bash
pip install morph-kgc
```
We recommend using a virtual environment to install Morph-KGC.
To run the engine via command line you just need to execute the following:
```bash
python3 -m morph_kgc config.ini
```
Check the documentation to see how to generate the configuration INI file. Here you can also see an example INI file.
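As a rough illustration of what such a configuration file can look like, the sketch below writes a single-source INI file with Python's standard `configparser`. The helper `write_config` is hypothetical, and the option names `mappings` and `db_url` are assumed to be the ones Morph-KGC expects; check them against the documentation before relying on this.

```python
import configparser
from pathlib import Path


def write_config(path, mappings, db_url=None, section="DataSource1"):
    """Write a one-source INI file and return its path.

    Hypothetical helper: the option names `mappings` and `db_url`
    are assumptions, not taken from the Morph-KGC API reference.
    """
    # Disable interpolation so that '%' characters in a database
    # password are written literally instead of raising an error.
    config = configparser.ConfigParser(interpolation=None)
    config[section] = {"mappings": mappings}
    if db_url is not None:
        config[section]["db_url"] = db_url
    with open(path, "w", encoding="utf-8") as handle:
        config.write(handle)
    return Path(path)
```

Calling `write_config("config.ini", "/path/to/mapping/mapping_file.rml.ttl")` produces a file that can then be passed to `python3 -m morph_kgc config.ini`.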
It is also possible to run Morph-KGC as a library with RDFLib and Oxigraph:

```python
import morph_kgc

# generate the triples and load them to an RDFLib graph
g_rdflib = morph_kgc.materialize('/path/to/config.ini')
# work with the RDFLib graph
q_res = g_rdflib.query('SELECT DISTINCT ?classes WHERE { ?s a ?classes }')

# generate the triples and load them to Oxigraph
g_oxigraph = morph_kgc.materialize_oxigraph('/path/to/config.ini')
# work with Oxigraph
q_res = g_oxigraph.query('SELECT DISTINCT ?classes WHERE { ?s a ?classes }')

# the methods above also accept the config as a string
config = """
[DataSource1]
mappings: /path/to/mapping/mapping_file.rml.ttl
db_url: mysql+pymysql://user:password@localhost:3306/db_name
"""
g_rdflib = morph_kgc.materialize(config)
```
License :unlock:
Morph-KGC is available under the Apache License 2.0.
Author & Contact :mailbox_with_mail:
Ontology Engineering Group, Universidad Politécnica de Madrid.
Citing :speech_balloon:
If you used Morph-KGC in your work, please cite the SWJ paper:
```bib
@article{arenas2024morph,
  title     = {{Morph-KGC: Scalable knowledge graph materialization with mapping partitions}},
  author    = {Arenas-Guerrero, Julián and Chaves-Fraga, David and Toledo, Jhon and Pérez, María S. and Corcho, Oscar},
  journal   = {Semantic Web},
  year      = {2024},
  volume    = {15},
  number    = {1},
  pages     = {1-20},
  issn      = {2210-4968},
  publisher = {IOS Press},
  doi       = {10.3233/SW-223135}
}
```
Sponsor :shield:
Owner
- Name: Morph-KGC
- Login: morph-kgc
- Kind: organization
- Location: Spain
- Website: https://morph-kgc.readthedocs.io
- Repositories: 2
- Profile: https://github.com/morph-kgc
Citation (CITATION.cff)
```yaml
title: "Morph-KGC: Scalable Knowledge Graph Materialization with Mapping Partitions"
license: Apache-2.0
authors:
  - family-names: Arenas-Guerrero
    given-names: Julián
    orcid: "http://orcid.org/0000-0002-3029-6469"
cff-version: 1.2.0
preferred-citation:
  authors:
    - family-names: Arenas-Guerrero
      given-names: Julián
    - family-names: Chaves-Fraga
      given-names: David
    - family-names: Toledo
      given-names: Jhon
    - family-names: Pérez
      given-names: María S.
    - family-names: Corcho
      given-names: Oscar
  title: "Morph-KGC: Scalable knowledge graph materialization with mapping partitions"
  type: article
  journal: Semantic Web
  doi: 10.3233/SW-223135
  year: 2024
  volume: 15
  issue: 1
identifiers:
  - description: "Collection of archived snapshots for Morph-KGC"
    type: doi
    value: 10.5281/zenodo.6524684
```
GitHub Events
Total
- Create event: 1
- Release event: 1
- Issues event: 36
- Watch event: 34
- Issue comment event: 41
- Push event: 44
- Pull request event: 31
- Fork event: 6
Last Year
- Create event: 1
- Release event: 1
- Issues event: 36
- Watch event: 34
- Issue comment event: 41
- Push event: 44
- Pull request event: 31
- Fork event: 6
Committers
Last synced: almost 3 years ago
All Time
- Total Commits: 1,050
- Total Committers: 12
- Avg Commits per committer: 87.5
- Development Distribution Score (DDS): 0.345
Top Committers
| Name | Email | Commits |
|---|---|---|
| Julián Arenas-Guerrero | j****g@h****m | 688 |
| Julián Arenas Guerrero | a****n@o****m | 108 |
| Julián Arenas-Guerrero | a****m | 97 |
| Julián Arenas Guerrero | 1****n@u****m | 79 |
| Julián Arenas Guerrero | 1****n@u****m | 24 |
| David Chaves | d****a@g****m | 18 |
| Jhon Toledo | j****7@g****m | 14 |
| Ahmad Alobaid | a****e@g****m | 11 |
| Oscar Corcho | o****o@f****s | 5 |
| Miel Vander Sande | m****e@m****e | 4 |
| Julián Arenas Guerrero | “****n@o****m@u****” | 1 |
| Dylan Van Assche | d****e@u****e | 1 |
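The summary figures listed under "All Time" can be reproduced from the per-committer counts in the table above. The sketch below assumes the common definition of the Development Distribution Score, namely one minus the share of commits made by the most active committer; the page itself does not state the formula, so treat that as an assumption.

```python
# Commit counts per committer identity, copied from the table above.
commits = [688, 108, 97, 79, 24, 18, 14, 11, 5, 4, 1, 1]

total = sum(commits)
average = total / len(commits)

# Assumed definition: DDS = 1 - (commits of top committer / total commits).
dds = 1 - max(commits) / total

print(total)          # 1050
print(average)        # 87.5
print(round(dds, 3))  # 0.345
```

The score is computed per committer identity (name and email pair). Several rows carry the same or nearly the same name, so if they belong to one person the real concentration of commits is higher than a DDS of 0.345 suggests.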
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 130
- Total pull requests: 104
- Average time to close issues: about 1 month
- Average time to close pull requests: 1 day
- Total issue authors: 72
- Total pull request authors: 18
- Average comments per issue: 3.13
- Average comments per pull request: 0.31
- Merged pull requests: 92
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 13
- Pull requests: 14
- Average time to close issues: 12 days
- Average time to close pull requests: about 8 hours
- Issue authors: 13
- Pull request authors: 4
- Average comments per issue: 0.77
- Average comments per pull request: 0.29
- Merged pull requests: 10
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- arenas-guerrero-julian (10)
- dgarijo (10)
- paoespinozarias (9)
- ramcaat (8)
- KappaGi (7)
- fcharras (7)
- david-martinez-garcia (4)
- Stiksels (3)
- IshanDindorkar (3)
- 00ade (3)
- midorna (3)
- Crispae (2)
- lambdakris (2)
- idomingu (2)
- neobernad (2)
Pull Request Authors
- arenas-guerrero-julian (75)
- LuciaCabanillasRodriguez (12)
- ahmad88me (6)
- mabounassif (4)
- Spothedog1 (4)
- TheRazorace (3)
- StephaneBranly (3)
- christophbrosch (2)
- david-martinez-garcia (2)
- dachafra (2)
- bollwyvl (2)
- mielvds (2)
- achiminator (2)
- eltociear (1)
- ershimen (1)
Dependencies
- mkdocs-material ==8.2.9
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- SQLAlchemy >=1.4.0, <2.0.0
- duckdb >=0.6.0, <2.0.0
- elementpath >=4.0.1, <5.0.0
- falcon >=3.0.0, <4.0.0
- jsonpath-python >=1.0.6, <2.0.0
- pandas >=1.4.0, <2.0.0
- pyoxigraph >=0.3.10, <1.0.0
- rdflib >=6.1.1, <7.0.0
- sql-metadata >=2.6.0, <3.0.0
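To make the version ranges above concrete, here is a minimal sketch that checks a plain dotted version against a constraint such as `>=1.4.0, <2.0.0`. The functions `parse_version` and `satisfies` are hypothetical helpers written for this illustration; they handle only numeric `X.Y.Z` versions and are not a substitute for a full PEP 440 implementation.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
_OPERATORS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(text):
    """Turn '1.4' or '1.4.0' into a three-part integer tuple."""
    parts = tuple(int(part) for part in text.strip().split("."))
    return parts + (0,) * (3 - len(parts))


def satisfies(version, constraint):
    """Return True if `version` meets every comma-separated clause."""
    candidate = parse_version(version)
    for clause in constraint.split(","):
        clause = clause.strip()
        for symbol, compare in _OPERATORS:
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not compare(candidate, bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported constraint clause: {clause!r}")
    return True


# The pandas range listed above admits 1.x releases but excludes 2.0.0.
print(satisfies("1.5.3", ">=1.4.0, <2.0.0"))  # True
print(satisfies("2.0.0", ">=1.4.0, <2.0.0"))  # False
```

Read this way, the listed constraints pin each dependency to a single major version line, for example pandas 1.x and rdflib 6.x.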