flex-rml

FlexRML: A Memory-Efficient Interpreter for RML.

https://github.com/wintechis/flex-rml

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.0%) to scientific vocabulary

Keywords

data-integration knowledge-graph rdf rml
Last synced: 6 months ago

Repository

FlexRML: A Memory-Efficient Interpreter for RML.

Basic Info
  • Host: GitHub
  • Owner: wintechis
  • License: agpl-3.0
  • Language: C++
  • Default Branch: main
  • Homepage:
  • Size: 1.47 MB
Statistics
  • Stars: 6
  • Watchers: 2
  • Forks: 0
  • Open Issues: 2
  • Releases: 2
Topics
data-integration knowledge-graph rdf rml
Created over 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

FlexRML - A Flexible RML Processor

FlexRML provides a robust RML processing solution tailored for different devices. Whether you're working with microcontrollers, single-board computers, consumer hardware, or cloud environments, FlexRML ensures seamless integration and efficient processing.

Description

RML (RDF Mapping Language) is central to data transformation and knowledge graph construction. FlexRML is a flexible RML processor optimized for a wide range of devices:

  • Microcontrollers
  • Single Board Computers
  • Consumer Hardware
  • Cloud Environments

Currently, FlexRML only supports data in CSV format. However, future versions will include support for additional data formats such as JSON and XML.

Installation

Using Prebuilt Binaries

Prebuilt binaries for various systems are available in the releases section.

Compiling from Source

Prerequisites

Before compilation, set up a build environment on your system. On Debian-based systems, this can be done using:

    apt install build-essential cmake git curl zip unzip tar

Additionally, ensure that you have vcpkg installed, as it is used to manage dependencies.

Compilation Process:

  1. Clone or download the repository from GitHub and navigate to the project directory.

         git clone git@github.com:wintechis/flex-rml.git
         cd flex-rml

  2. Install vcpkg as the package manager. If you haven't installed vcpkg, clone it from GitHub and bootstrap it.

         git clone https://github.com/microsoft/vcpkg.git
         ./vcpkg/bootstrap-vcpkg.sh

  3. Configure the project with CMake, specifying the vcpkg toolchain file and, if necessary, the paths to dependencies. Note: You need to adjust the path to vcpkg.

         cmake -B build -S . -DCMAKE_TOOLCHAIN_FILE=/path/to/vcpkg/scripts/buildsystems/vcpkg.cmake

  4. Compile the project.

         cmake --build build

  5. After compilation, the executable flexrml will be available in the build directory. You can run it using:

Troubleshooting

  • If you encounter errors during the CMake configuration, ensure the paths to serd and cityhash are correctly specified.
  • Make sure your system has the correct C++ compiler installed (GCC or Clang).
  • Clean the build directory if you face repeated configuration issues:

        rm -rf build/*

Getting Started

Depending on the use case and the environment in which FlexRML is executed, different configurations are useful.

Fastest Execution Speed

This mode prioritizes execution speed at the expense of increased memory usage. It uses a 128-bit hash function to identify duplicates and skips the result size estimation step. To use this mode, run the following command:

    ./flexrml -m [path] -d -t
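The idea of identifying duplicates through a fixed-size hash, rather than by keeping every generated statement in memory, can be sketched as follows. This is a minimal illustration and not FlexRML's own code: `DuplicateFilter`, `fingerprint128`, and the FNV-1a-based stand-in hash are hypothetical choices made for this example (the project lists CityHash as a dependency, which this sketch does not use).

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <unordered_set>
#include <utility>

// A 128-bit fingerprint held as two 64-bit halves.
using Fingerprint = std::pair<std::uint64_t, std::uint64_t>;

// Stand-in hash: 64-bit FNV-1a with a caller-supplied offset basis.
// This is NOT CityHash; it only illustrates the fixed-size fingerprint idea.
inline std::uint64_t fnv1a(std::string_view s, std::uint64_t basis) {
    std::uint64_t h = basis;
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;  // 64-bit FNV prime
    }
    return h;
}

// Combine two passes with different bases into one 128-bit value.
inline Fingerprint fingerprint128(std::string_view s) {
    return {fnv1a(s, 14695981039346656037ULL), fnv1a(s, 0x9E3779B97F4A7C15ULL)};
}

// Bucket hash for the set: fold the two halves into one size_t.
struct FingerprintHash {
    std::size_t operator()(const Fingerprint& f) const noexcept {
        return static_cast<std::size_t>(f.first ^ (f.second * 0x9E3779B97F4A7C15ULL));
    }
};

// Hypothetical helper: remembers fingerprints of statements already emitted.
class DuplicateFilter {
public:
    // Returns true the first time a statement is seen, false for repeats.
    bool insert_if_new(std::string_view statement) {
        return seen_.insert(fingerprint128(statement)).second;
    }
    std::size_t size() const { return seen_.size(); }

private:
    std::unordered_set<Fingerprint, FingerprintHash> seen_;
};
```

A caller would write a statement to the output only when `insert_if_new` returns true. Each stored fingerprint occupies 16 bytes regardless of how long the statement is (plus container overhead), which is the trade-off this kind of scheme makes against storing full strings.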

Lowest memory consumption

This mode is optimized for minimal memory usage by using a result size estimator to approximate the number of N-Quads generated. Although this process takes more time due to the additional computation, it conserves memory, in particular when the estimated number of N-Quads is less than 135,835,773. This approach is beneficial in memory-constrained environments. To enable this mode, use the following command:

    ./flexrml -m [path] -d -t -a
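The README does not say where the figure 135,835,773 comes from. One observation, offered as an assumption and not as a description of FlexRML's internals, is that the standard birthday-bound approximation gives a collision probability of roughly 0.05% for that many items under a 64-bit hash. The sketch below computes that bound and shows a hypothetical policy (`choose_hash_bits`, with candidate widths of 32, 64 and 128 bits chosen here for illustration) that picks the narrowest hash whose collision risk stays under a limit.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <initializer_list>

// Birthday-bound approximation: probability that at least two of n
// uniformly hashed items collide in a space of 2^bits values,
//   p ~= 1 - exp(-n^2 / (2 * 2^bits)).
inline double collision_probability(std::uint64_t n, int bits) {
    assert(bits > 0 && bits <= 128);
    const double space = std::ldexp(1.0, bits);    // 2^bits
    const double x = static_cast<double>(n);
    return -std::expm1(-(x * x) / (2.0 * space));  // 1 - exp(-...)
}

// Hypothetical policy: the smallest width among {32, 64, 128} whose
// collision probability stays at or below max_p for the estimated count.
inline int choose_hash_bits(std::uint64_t estimated_nquads, double max_p) {
    for (int bits : {32, 64}) {
        if (collision_probability(estimated_nquads, bits) <= max_p) {
            return bits;
        }
    }
    return 128;
}
```

With these definitions, `collision_probability(135835773, 64)` evaluates to approximately 0.0005. A narrower hash means fewer bytes stored per statement, which would be consistent with the memory savings described above when the estimate is below the threshold.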

More information about the available flags can be found on the wiki.

Example

In the example folder, there is a mapping.ttl file that contains RML rules for mapping sensor data to RDF, and a sensor_values.csv file.

The sensor_values.csv contains:

| id | name    | value | unit |
| --- | ------- | ----- | ---- |
| 10 | Sensor1 | 24    | C    |
| 20 | Sensor2 | 72.2  | F    |
| 30 | Sensor3 | 34    | C    |

If you are in the example folder, run:

    ./flexrml -m ./mapping.ttl -o output_file.nq -d

The resulting RDF graph is written to output_file.nq. The graph looks like this:

[Figure: Resulting_Graph]
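To give a feel for what materializing a CSV row into RDF statements involves, here is a small sketch. It is not FlexRML's implementation and does not read mapping.ttl: the one-triple-per-column rule and every `http://example.com/` IRI are hypothetical placeholders, since the actual mapping rules are not reproduced in this README.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split one CSV line on commas. No quoting or escaping support;
// that is enough for the sample sensor data only.
inline std::vector<std::string> split_csv_line(const std::string& line) {
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ',')) {
        fields.push_back(field);
    }
    return fields;
}

// Hypothetical mapping: the first column forms the subject IRI and every
// other column becomes one statement with a plain literal object.
// The IRIs are placeholders, not the ones defined in mapping.ttl.
inline std::vector<std::string> row_to_statements(
        const std::vector<std::string>& header,
        const std::vector<std::string>& row) {
    std::vector<std::string> out;
    const std::string subject = "<http://example.com/sensor/" + row.at(0) + ">";
    for (std::size_t i = 1; i < header.size() && i < row.size(); ++i) {
        out.push_back(subject + " <http://example.com/" + header[i] +
                      "> \"" + row[i] + "\" .");
    }
    return out;
}
```

Calling `row_to_statements` with the header `id,name,value,unit` and the row `10,Sensor1,24,C` yields three lines, one each for name, value and unit. Literal escaping and datatypes are left out to keep the sketch short.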

Conformance

FlexRML is validated against applicable RML test cases to ensure conformance with the specification.
Currently, only CSV-related test cases are applicable.

| Specification | Coverage         |
| ------------- | ---------------- |
| RML-Core      | 100% Coverage    |
| RML-IO        | Work in Progress |
| RML-CC        | Work in Progress |

Planned Features for FlexRML

We are constantly working to improve FlexRML and expand its capabilities. Here is what we have planned for the future development of FlexRML:

- [ ] Add support for other data encodings: enhance FlexRML to work with various data formats.
    + [ ] JSON
        - [x] Add JSON reader and JSON Path parser
        - [ ] Adjust generation of index for hash join to JSON
        - [ ] Adjust result size estimation to JSON
    + [ ] XML
- [x] Add support for N-Triples RDF serialization: implement N-Triples format compatibility for broader RDF serialization options.
- [ ] Improve performance of the join algorithm: optimize the current join algorithm for faster and more efficient data processing.
- [ ] Provide a library for Arduinos: develop a specialized library that makes FlexRML easier to use on Arduino devices, expanding its use in IoT applications.
- [ ] Support the latest RML vocabulary: modify the parsing of RML rules to allow the new RML vocabulary to be used.
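The list above refers to an index for a hash join. For readers unfamiliar with the technique, a generic in-memory hash join over tabular rows can be sketched as below. This is a textbook version under assumed names (`Row`, `build_index`, `hash_join`), not the join code used in FlexRML.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Row = std::vector<std::string>;
using JoinIndex = std::unordered_map<std::string, std::vector<std::size_t>>;

// Build phase: map each join-key value to the indices of the parent rows
// that carry it.
inline JoinIndex build_index(const std::vector<Row>& parent, std::size_t key_col) {
    JoinIndex index;
    for (std::size_t i = 0; i < parent.size(); ++i) {
        index[parent[i].at(key_col)].push_back(i);
    }
    return index;
}

// Probe phase: for every child row, look up the matching parent rows in the
// index (average constant time per lookup) and record (child, parent) pairs.
inline std::vector<std::pair<std::size_t, std::size_t>>
hash_join(const std::vector<Row>& child, std::size_t child_col,
          const std::vector<Row>& parent, std::size_t parent_col) {
    const JoinIndex index = build_index(parent, parent_col);
    std::vector<std::pair<std::size_t, std::size_t>> matches;
    for (std::size_t c = 0; c < child.size(); ++c) {
        const auto it = index.find(child[c].at(child_col));
        if (it == index.end()) {
            continue;  // no parent row shares this key
        }
        for (std::size_t p : it->second) {
            matches.emplace_back(c, p);
        }
    }
    return matches;
}
```

The index is built once over one input and probed once per row of the other, so the cost grows with the sum of the two input sizes plus the number of matches, instead of with their product as in a nested-loop join.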

We welcome community feedback and contributions! If you have suggestions or want to contribute to any of these features, please let us know through GitHub issues.

ESP32 Compatible Version

For those working with ESP32, we have a dedicated version of this project. It's tailored specifically for compatibility with ESP32 and the Arduino IDE. You can access it and find detailed instructions for setup and use at the following link: FlexRML ESP32 Repository

JavaScript Compatible Version

For those working with JavaScript, we have created a WebAssembly version of FlexRML. FlexRML-node is published on npm.

Citation

If you use this work in your research, please cite it as:

    @article{Freund_FlexRML_A_Flexible_2024,
      author = {Freund, Michael and Schmid, Sebastian and Dorsch, Rene and Harth, Andreas},
      journal = {Extended Semantic Web Conference},
      title = {{FlexRML: A Flexible and Memory Efficient Knowledge Graph Materializer}},
      year = {2024}
    }

Licenses

Project License

This project is licensed under the GNU Affero General Public License version 3 (AGPLv3). The full text of the license can be found in the LICENSE file in this repository.

External Libraries

This project uses external libraries:

  • Serd is licensed under the ISC License.
  • CityHash is licensed under the MIT License.
  • ArduinoJson is licensed under the MIT License.

Owner

  • Name: Chair of Technical Information Systems, Friedrich-Alexander-University
  • Login: wintechis
  • Kind: organization
  • Location: Nuremberg, Germany

The organization of the Chair of Technical Information Systems, FAU. Also used for cooperation with Fraunhofer SCS-DSIoT

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - given-names: Michael
    family-names: Freund
    email: michael.freund@iis.fraunhofer.de
    orcid: 'https://orcid.org/0000-0003-1601-9331'
  - given-names: Sebastian
    family-names: Schmid
    orcid: 'https://orcid.org/0000-0002-5836-3029'
  - given-names: Rene
    family-names: Dorsch
    orcid: 'https://orcid.org/0000-0001-6857-7314'
  - given-names: Andreas
    family-names: Harth
    orcid: 'https://orcid.org/0000-0002-0702-510X'
title: "FlexRML: A Flexible and Memory Efficient Knowledge Graph Materializer"
url: "https://github.com/wintechis/flex-rml/"
preferred-citation:
  type: article
  authors:
  - given-names: Michael
    family-names: Freund
    orcid: 'https://orcid.org/0000-0003-1601-9331'
  - given-names: Sebastian
    family-names: Schmid
    orcid: 'https://orcid.org/0000-0002-5836-3029'
  - given-names: Rene
    family-names: Dorsch
    orcid: 'https://orcid.org/0000-0001-6857-7314'
  - given-names: Andreas
    family-names: Harth
    orcid: 'https://orcid.org/0000-0002-0702-510X'
  journal: "Extended Semantic Web Conference"
  month: 5
  title: "FlexRML: A Flexible and Memory Efficient Knowledge Graph Materializer"
  year: 2024

GitHub Events

Total
  • Watch event: 3
Last Year
  • Watch event: 3