morph-kgc

Powerful RDF Knowledge Graph Generation with RML Mappings

https://github.com/morph-kgc/morph-kgc

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.4%) to scientific vocabulary

Keywords

data-engineering data-integration database etl knowledge-graph python r2rml rdf rdf-star rml

Keywords from Contributors

mapping-languages yarrrml ontology shacl
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 225
  • Watchers: 12
  • Forks: 41
  • Open Issues: 29
  • Releases: 35
Topics
data-engineering data-integration database etl knowledge-graph python r2rml rdf rdf-star rml
Created over 5 years ago · Last pushed 6 months ago
Metadata Files
Readme, Contributing, License, Code of conduct, Citation, Zenodo

README.md

Morph-KGC is an engine that constructs RDF knowledge graphs from heterogeneous data sources with the R2RML and RML mapping languages. Morph-KGC is built on top of pandas and it leverages mapping partitions to significantly reduce execution times and memory consumption for large data sources.

Features :sparkles:

Documentation :bookmark_tabs:

Read the documentation.

Tutorial :woman_teacher:

Learn quickly with the tutorial in Google Colaboratory!

Getting Started :rocket:

PyPI is the fastest way to install Morph-KGC:

```bash
pip install morph-kgc
```

We recommend installing Morph-KGC in a virtual environment.

To run the engine from the command line, execute the following:

```bash
python3 -m morph_kgc config.ini
```

Check the documentation to see how to generate the configuration INI file. Here you can also see an example INI file.
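As a minimal sketch of preparing that file, the standard-library `configparser` can render a one-section configuration. The section name `DataSource1` and the `mappings` and `db_url` keys mirror the library example in this README; the helper names `render_config` and `build_command`, the paths, and the credentials are placeholders, and the documentation remains the reference for the full set of options.

```python
import configparser
import io
import sys


def render_config(mappings, db_url=None, section="DataSource1"):
    """Return INI text with one data-source section (hypothetical helper)."""
    # interpolation=None keeps '%' characters in database URLs literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser[section] = {"mappings": mappings}
    if db_url is not None:
        # Key name assumed from the README example; check it against the docs.
        parser[section]["db_url"] = db_url
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def build_command(config_path):
    """Return the argv list equivalent to `python3 -m morph_kgc config.ini`."""
    return [sys.executable, "-m", "morph_kgc", str(config_path)]


if __name__ == "__main__":
    text = render_config(
        "/path/to/mapping/mapping_file.rml.ttl",
        "mysql+pymysql://user:password@localhost:3306/db_name",
    )
    print(text)  # save this text as config.ini
    print(" ".join(build_command("config.ini")))
```

The list returned by `build_command` can be passed to `subprocess.run` once the rendered text has been saved to disk.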

It is also possible to run Morph-KGC as a library with RDFLib and Oxigraph:

```python
import morph_kgc

# generate the triples and load them to an RDFLib graph
g_rdflib = morph_kgc.materialize('/path/to/config.ini')
# work with the RDFLib graph
q_res = g_rdflib.query('SELECT DISTINCT ?classes WHERE { ?s a ?classes }')

# generate the triples and load them to Oxigraph
g_oxigraph = morph_kgc.materialize_oxigraph('/path/to/config.ini')
# work with Oxigraph
q_res = g_oxigraph.query('SELECT DISTINCT ?classes WHERE { ?s a ?classes }')

# the methods above also accept the config as a string
config = """
[DataSource1]
mappings: /path/to/mapping/mapping_file.rml.ttl
db_url: mysql+pymysql://user:password@localhost:3306/db_name
"""
g_rdflib = morph_kgc.materialize(config)
```
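The query calls above return iterable results. As a hedged sketch of consuming them, the helper below flattens single-variable rows into plain strings. `classes_from_rows` and `list_classes` are hypothetical names, and the code assumes each row can be indexed by position, as RDFLib result rows can.

```python
def classes_from_rows(rows):
    """Return the sorted, de-duplicated string form of each row's first column."""
    return sorted({str(row[0]) for row in rows})


def list_classes(config):
    """Materialize a graph with Morph-KGC and list the classes it uses.

    `config` is either a path to an INI file or the INI content as a string.
    """
    # Third-party dependency, imported lazily so classes_from_rows stays standalone.
    import morph_kgc

    graph = morph_kgc.materialize(config)
    rows = graph.query('SELECT DISTINCT ?classes WHERE { ?s a ?classes }')
    return classes_from_rows(rows)
```

Calling `list_classes('/path/to/config.ini')` requires Morph-KGC to be installed and the configured data sources to be reachable.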

License :unlock:

Morph-KGC is available under the Apache License 2.0.

Author & Contact :mailbox_with_mail:

Ontology Engineering Group, Universidad Politécnica de Madrid.

Citing :speech_balloon:

If you use Morph-KGC in your work, please cite the Semantic Web Journal (SWJ) paper:

```bib
@article{arenas2024morph,
  title     = {{Morph-KGC: Scalable knowledge graph materialization with mapping partitions}},
  author    = {Arenas-Guerrero, Julián and Chaves-Fraga, David and Toledo, Jhon and Pérez, María S. and Corcho, Oscar},
  journal   = {Semantic Web},
  year      = {2024},
  volume    = {15},
  number    = {1},
  pages     = {1-20},
  issn      = {2210-4968},
  publisher = {IOS Press},
  doi       = {10.3233/SW-223135}
}
```

Sponsor :shield:

BASF

Owner

  • Name: Morph-KGC
  • Login: morph-kgc
  • Kind: organization
  • Location: Spain

Citation (CITATION.cff)

title: "Morph-KGC: Scalable Knowledge Graph Materialization with Mapping Partitions"
license: Apache-2.0
authors:
  - family-names: Arenas-Guerrero
    given-names: Julián
    orcid: "http://orcid.org/0000-0002-3029-6469"
cff-version: 1.2.0
preferred-citation:
  authors:
    - family-names: Arenas-Guerrero
      given-names: Julián
    - family-names: Chaves-Fraga
      given-names: David
    - family-names: Toledo
      given-names: Jhon
    - family-names: Pérez
      given-names: María S.
    - family-names: Corcho
      given-names: Oscar
  title: "Morph-KGC: Scalable knowledge graph materialization with mapping partitions"
  type: article
  journal: Semantic Web
  doi: 10.3233/SW-223135
  year: 2024
  volume: 15
  issue: 1
identifiers:
  - description: "Collection of archived snapshots for Morph-KGC"
    type: doi
    value: 10.5281/zenodo.6524684

GitHub Events

Total
  • Create event: 1
  • Release event: 1
  • Issues event: 36
  • Watch event: 34
  • Issue comment event: 41
  • Push event: 44
  • Pull request event: 31
  • Fork event: 6
Last Year
  • Create event: 1
  • Release event: 1
  • Issues event: 36
  • Watch event: 34
  • Issue comment event: 41
  • Push event: 44
  • Pull request event: 31
  • Fork event: 6

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 1,050
  • Total Committers: 12
  • Avg Commits per committer: 87.5
  • Development Distribution Score (DDS): 0.345
Top Committers
| Name | Email | Commits |
| --- | --- | --- |
| Julián Arenas-Guerrero | j****g@h****m | 688 |
| Julián Arenas Guerrero | a****n@o****m | 108 |
| Julián Arenas-Guerrero | a****m | 97 |
| Julián Arenas Guerrero | 1****n@u****m | 79 |
| Julián Arenas Guerrero | 1****n@u****m | 24 |
| David Chaves | d****a@g****m | 18 |
| Jhon Toledo | j****7@g****m | 14 |
| Ahmad Alobaid | a****e@g****m | 11 |
| Oscar Corcho | o****o@f****s | 5 |
| Miel Vander Sande | m****e@m****e | 4 |
| Julián Arenas Guerrero | “****n@o****m@u****” | 1 |
| Dylan Van Assche | d****e@u****e | 1 |
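
The average commits per committer and the Development Distribution Score (DDS) listed under All Time can be reproduced from the raw counts. This sketch assumes DDS is defined as one minus the top committer's share of all commits; that definition is not stated on this page, but it matches the listed value. It also counts each listed row as a separate committer, as the page does, even though several rows appear to belong to the same person.

```python
def average_commits(total_commits, committers):
    """Mean number of commits per listed committer."""
    return total_commits / committers


def development_distribution_score(top_commits, total_commits):
    """Assumed definition: 1 - (commits by the most active committer / all commits)."""
    return 1 - top_commits / total_commits


if __name__ == "__main__":
    print(average_commits(1050, 12))  # 87.5
    print(round(development_distribution_score(688, 1050), 3))  # 0.345
```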

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 130
  • Total pull requests: 104
  • Average time to close issues: about 1 month
  • Average time to close pull requests: 1 day
  • Total issue authors: 72
  • Total pull request authors: 18
  • Average comments per issue: 3.13
  • Average comments per pull request: 0.31
  • Merged pull requests: 92
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 13
  • Pull requests: 14
  • Average time to close issues: 12 days
  • Average time to close pull requests: about 8 hours
  • Issue authors: 13
  • Pull request authors: 4
  • Average comments per issue: 0.77
  • Average comments per pull request: 0.29
  • Merged pull requests: 10
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • arenas-guerrero-julian (10)
  • dgarijo (10)
  • paoespinozarias (9)
  • ramcaat (8)
  • KappaGi (7)
  • fcharras (7)
  • david-martinez-garcia (4)
  • Stiksels (3)
  • IshanDindorkar (3)
  • 00ade (3)
  • midorna (3)
  • Crispae (2)
  • lambdakris (2)
  • idomingu (2)
  • neobernad (2)
Pull Request Authors
  • arenas-guerrero-julian (75)
  • LuciaCabanillasRodriguez (12)
  • ahmad88me (6)
  • mabounassif (4)
  • Spothedog1 (4)
  • TheRazorace (3)
  • StephaneBranly (3)
  • christophbrosch (2)
  • david-martinez-garcia (2)
  • dachafra (2)
  • bollwyvl (2)
  • mielvds (2)
  • achiminator (2)
  • eltociear (1)
  • ershimen (1)
Top Labels
Issue Labels
  • bug (56)
  • question (42)
  • enhancement (31)
  • rml-fnml (19)
  • yarrrml (11)
  • rml-io (6)
  • rml-star (6)
  • build (5)
  • needs triage (3)
  • duplicate (1)
  • documentation (1)
Pull Request Labels
  • yarrrml (7)
  • enhancement (2)
  • rml-io (2)
  • build (2)
  • rml-core (2)
  • rml-fnml (2)

Dependencies

docs/requirements.txt pypi
  • mkdocs-material ==8.2.9
.github/workflows/ci.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/pypi-publish.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
pyproject.toml pypi
  • SQLAlchemy >=1.4.0, <2.0.0
  • duckdb >=0.6.0, <2.0.0
  • elementpath >=4.0.1, <5.0.0
  • falcon >=3.0.0, <4.0.0
  • jsonpath-python >=1.0.6, <2.0.0
  • pandas >=1.4.0, <2.0.0
  • pyoxigraph >=0.3.10, <1.0.0
  • rdflib >=6.1.1, <7.0.0
  • sql-metadata >=2.6.0, <3.0.0