graflo

A framework for transforming tabular (CSV, SQL) and hierarchical data (JSON, XML) into property graphs and ingesting them into graph databases (ArangoDB, Neo4j)

https://github.com/growgraph/graflo

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.5%) to scientific vocabulary
Last synced: 9 months ago

Repository


Basic Info
Statistics
  • Stars: 11
  • Watchers: 1
  • Forks: 0
  • Open Issues: 5
  • Releases: 11
Created almost 5 years ago · Last pushed 9 months ago
Metadata Files
Readme Changelog Contributing License Citation

README.md

GraFlo

A framework for transforming tabular (CSV, SQL) and hierarchical data (JSON, XML) into property graphs and ingesting them into graph databases (ArangoDB, Neo4j).

⚠️ Package Renamed: This package was formerly known as graphcast.


Core Concepts

Property Graphs

graflo works with property graphs, which consist of:

  • Vertices: Nodes with properties and optional unique identifiers
  • Edges: Relationships between vertices with their own properties
  • Properties: Key-value attributes that both vertices and edges can carry
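
As a minimal illustration of these three ingredients, a property graph can be sketched in plain Python. The `Vertex` and `Edge` classes below are hypothetical names chosen for this example; they are not graflo's own classes.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Vertex:
    """A node with a type label, a unique key, and arbitrary properties."""

    label: str
    key: str
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A directed relationship between two vertices, with its own properties."""

    source: Vertex
    target: Vertex
    label: str
    properties: dict[str, Any] = field(default_factory=dict)


# Two vertices of different types, connected by one edge that carries a property.
alice = Vertex("person", "p1", {"name": "Alice"})
paper = Vertex("work", "w1", {"title": "Graphs 101", "year": 2024})
authored = Edge(alice, paper, "authored", {"position": 1})

print(authored.source.properties["name"], "->", authored.target.properties["title"])
```

Both the vertices and the edge hold their own property dictionaries, which is what distinguishes a property graph from a bare node-and-link structure.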

Schema

The Schema defines how your data should be transformed into a graph and contains:

  • Vertex Definitions: Specify vertex types, their properties, and unique identifiers
  • Edge Definitions: Define relationships between vertices and their properties
  • Resource Mapping: Describe how data sources map to vertices and edges
  • Transforms: Modify data during the casting process
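
To make the four parts concrete, here is a toy sketch in plain Python. The dictionary layout and the `cast_row` helper are assumptions made for illustration only; they do not reproduce graflo's actual schema syntax, which is described in the project documentation.

```python
from typing import Any, Callable

# Hypothetical mapping, NOT graflo's schema format: it only mirrors the four parts above.
toy_schema: dict[str, Any] = {
    # Vertex definitions: vertex types and the fields that identify them
    # (declared here for illustration; this sketch does not enforce them).
    "vertices": {
        "author": {"identity": ["name"]},
        "paper": {"identity": ["title"]},
    },
    # Edge definitions: which vertex types are connected, and how.
    "edges": [{"source": "author", "target": "paper", "label": "wrote"}],
    # Resource mapping: which column of a row feeds which vertex field.
    "resource": {
        "author_name": ("author", "name"),
        "paper_title": ("paper", "title"),
        "paper_year": ("paper", "year"),
    },
    # Transforms: applied to a column value before it is stored.
    "transforms": {"paper_year": int},
}


def cast_row(row: dict[str, str], schema: dict[str, Any]):
    """Turn one flat row into vertex documents and edges, following the mapping."""
    transforms: dict[str, Callable[[str], Any]] = schema["transforms"]
    vertices: dict[str, dict[str, Any]] = {name: {} for name in schema["vertices"]}
    for column, (vertex, field_name) in schema["resource"].items():
        value = row[column]
        if column in transforms:
            value = transforms[column](value)
        vertices[vertex][field_name] = value
    edges = [
        (vertices[e["source"]], e["label"], vertices[e["target"]])
        for e in schema["edges"]
    ]
    return vertices, edges


row = {"author_name": "Alice", "paper_title": "Graphs 101", "paper_year": "2024"}
vertices, edges = cast_row(row, toy_schema)
print(vertices)
print(edges)
```

The point of the declarative style is that the row-processing code stays generic: changing the graph layout means editing the mapping, not the function.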

Resources

Resources are your data sources. They can be:

  • Table-like: CSV files, database tables
  • JSON-like: JSON files, nested data structures
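
The difference between the two kinds of resources can be sketched with the standard library alone. The helper names `iter_table` and `iter_json` are hypothetical and are not part of graflo: table-like input yields flat records, while JSON-like input keeps its nesting.

```python
import csv
import io
import json
from typing import Any, Iterator


def iter_table(text: str) -> Iterator[dict[str, Any]]:
    """Yield one flat dict per CSV row, keyed by the header line."""
    yield from csv.DictReader(io.StringIO(text))


def iter_json(text: str) -> Iterator[dict[str, Any]]:
    """Yield each top-level object of a JSON array; nested structure is preserved."""
    yield from json.loads(text)


csv_text = "name,city\nAlice,Paris\nBob,Lyon\n"
json_text = '[{"name": "Alice", "works": [{"title": "Graphs 101"}]}]'

table_rows = list(iter_table(csv_text))
json_docs = list(iter_json(json_text))

print(table_rows[0])
print(json_docs[0]["works"][0]["title"])
```

A flat row can map to vertices column by column, whereas a nested document may already encode relationships (here, a person and their works) in its structure.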

Features

  • Graph Transformation Meta-language: A powerful declarative language to describe how your data becomes a property graph:
    • Define vertex and edge structures
    • Set compound indexes for vertices and edges
    • Use blank vertices for complex relationships
    • Specify edge constraints and properties
    • Apply advanced filtering and transformations
  • Parallel processing: Use as many cores as you have
  • Database support: Ingest into ArangoDB and Neo4j using the same API (database agnostic)
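
To illustrate why compound indexes matter, here is a small sketch assuming that a vertex is identified by a tuple of fields and that documents with the same identity are merged. The helper `merge_by_compound_key` and the merge-on-conflict behaviour are assumptions for this example, not graflo's implementation.

```python
from typing import Any, Iterable


def merge_by_compound_key(
    docs: Iterable[dict[str, Any]], key_fields: tuple[str, ...]
) -> list[dict[str, Any]]:
    """Merge documents that share the same values for every field in key_fields."""
    merged: dict[tuple, dict[str, Any]] = {}
    for doc in docs:
        key = tuple(doc[f] for f in key_fields)
        # Later documents with the same key add to (or overwrite) earlier ones.
        merged.setdefault(key, {}).update(doc)
    return list(merged.values())


people = merge_by_compound_key(
    [
        {"first": "Ada", "last": "Lovelace"},
        {"first": "Ada", "last": "Lovelace", "born": 1815},
        {"first": "Ada", "last": "Byron"},
    ],
    key_fields=("first", "last"),
)
print(people)
```

With a single-field key on `first`, all three records would collapse into one vertex; the compound key keeps the two distinct people apart.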

Documentation

Full documentation is available at: growgraph.github.io/graflo

Installation

```bash
pip install graflo
```

Usage Examples

Simple ingest

```python
from suthing import ConfigFactory, FileHandle

from graflo import Schema, Caster, Patterns

# Load the schema that describes how the data becomes a graph.
schema = Schema.from_dict(FileHandle.load("schema.yaml"))

# Connection settings for the target database.
conn_conf = ConfigFactory.create_config(
    {
        "protocol": "http",
        "hostname": "localhost",
        "port": 8535,
        "username": "root",
        "password": "123",
        "database": "_system",
    }
)

# Regex pattern registered under the name "work"; the raw string keeps "\S" intact.
patterns = Patterns.from_dict(
    {
        "patterns": {
            "work": {"regex": r"\Sjson$"},
        }
    }
)

schema.fetch_resource()

caster = Caster(schema)

# Ingest the files found under ./data into the configured database.
caster.ingest_files(
    path="./data",
    conn_conf=conn_conf,
    patterns=patterns,
)
```
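
The regular expression in the `patterns` entry can be checked in isolation. The sketch below assumes the pattern is applied with search semantics (a match anywhere in the file name); how graflo applies it internally is not stated here, and the file names are made up for illustration.

```python
import re

# Same expression as in the example above: one non-whitespace character
# followed by "json" at the very end of the string.
work_pattern = re.compile(r"\Sjson$")

filenames = ["works.json", "works.jsonl", "authors.csv", "data_json", "json"]
matched = [name for name in filenames if work_pattern.search(name)]
print(matched)
```

Note that `\S` matches any non-whitespace character, not only a dot, so `data_json` is selected as well. To restrict the pattern to the `.json` extension, `\.json$` would be the stricter choice.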

Development

To install the development requirements:

```shell
git clone git@github.com:growgraph/graflo.git && cd graflo
uv sync --dev
```

Tests

Test databases

Start ArangoDB from the `docker/arango` folder:

```shell
docker-compose --env-file .env up arango
```

and Neo4j from the `docker/neo4j` folder:

```shell
docker-compose --env-file .env up neo4j
```

To run the unit tests:

```shell
pytest test
```

Requirements

  • Python 3.11+
  • python-arango

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Owner

  • Name: GrowGraph
  • Login: growgraph
  • Kind: organization
  • Email: team@growgraph.dev
  • Location: France

Citation (CITATION.cff)

abstract: A framework for transforming tabular data (CSV, SQL) and hierarchical data (JSON, XML) into property graphs and ingesting them into graph databases (ArangoDB, Neo4j).
authors:
- affiliation: GrowGraph
  family-names: Belikov
  given-names: Alexander
  orcid: 0000-0002-5649-0913
cff-version: 1.2.0
date-released: '2025-05-16'
doi: 10.5281/zenodo.15446131
license: []
license-url: https://github.com/growgraph/graflo/blob/main/LICENSE
message: If you use this software, please cite it using the metadata from this file.
title: graflo
type: software

GitHub Events

Total
  • Create event: 1
  • Release event: 1
  • Issues event: 1
  • Watch event: 2
  • Push event: 3
  • Pull request event: 3
Last Year
  • Create event: 1
  • Release event: 1
  • Issues event: 1
  • Watch event: 2
  • Push event: 3
  • Pull request event: 3

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 2
  • Total pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 minute
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 3
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 minute
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • alexander-belikov (2)
Pull Request Authors
  • alexander-belikov (3)
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
  • Total maintainers: 1
pypi.org: graflo

(!) Deprecated: this package has been renamed to GraFlo. A framework for transforming tabular (CSV, SQL) and hierarchical data (JSON, XML) into property graphs and ingesting them into graph databases (ArangoDB, Neo4j)

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 8.6%
Average: 28.6%
Dependent repos count: 48.5%
Maintainers (1)
Last synced: 9 months ago

Dependencies

.github/workflows/pre-commit.yml actions
  • actions/checkout v4 composite
  • astral-sh/setup-uv v5 composite
  • pre-commit/action v3.0.1 composite
docker/arango/docker-compose.yml docker
  • ${IMAGE_VERSION} latest
docker/neo4j/docker-compose.yml docker
  • ${IMAGE_VERSION} latest
pyproject.toml pypi
  • click >=8.1.7,<9
  • ijson >=3.2.3,<4
  • neo4j >=5.22.0,<6
  • networkx ~=3.3
  • pandas >=2.0.3,<3
  • python-arango >=8.1.2,<9
  • suthing ==0.4.1
  • xmltodict >=0.14.2,<0.15
uv.lock pypi
  • certifi 2024.12.14
  • cfgv 3.4.0
  • charset-normalizer 3.4.1
  • click 8.1.8
  • colorama 0.4.6
  • dataclass-wizard 0.34.0
  • distlib 0.3.9
  • filelock 3.17.0
  • graph-cast 0.13.19
  • identify 2.6.6
  • idna 3.10
  • ijson 3.3.0
  • importlib-metadata 8.6.1
  • iniconfig 2.0.0
  • neo4j 5.27.0
  • networkx 3.4.2
  • nodeenv 1.9.1
  • numpy 2.2.2
  • packaging 24.2
  • pandas 2.2.3
  • platformdirs 4.3.6
  • pluggy 1.5.0
  • pre-commit 3.8.0
  • pygraphviz 1.14
  • pyjwt 2.10.1
  • pytest 7.4.4
  • python-arango 8.1.4
  • python-dateutil 2.9.0.post0
  • python-dotenv 1.1.0
  • pytz 2024.2
  • pyyaml 6.0.2
  • requests 2.32.3
  • requests-toolbelt 1.0.0
  • setuptools 75.8.0
  • six 1.17.0
  • suthing 0.4.1
  • typing-extensions 4.13.2
  • tzdata 2025.1
  • urllib3 2.3.0
  • virtualenv 20.29.1
  • xmltodict 0.14.2
  • zipp 3.21.0