pycottas

Python COTTAS library for compressing and querying RDF

https://github.com/arenas-guerrero-julian/pycottas

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.8%) to scientific vocabulary

Keywords

apache-parquet cottas data-engineering knowledge-graph python rdf
Last synced: 6 months ago

Repository

Statistics
  • Stars: 11
  • Watchers: 2
  • Forks: 2
  • Open Issues: 0
  • Releases: 1
Created almost 3 years ago · Last pushed 6 months ago
Metadata Files
Readme License Code of conduct Zenodo

README.md

pycottas

pycottas is a library for working with compressed RDF files in the COTTAS format. COTTAS stores triples as a triple table in Apache Parquet. It is built on top of DuckDB and provides an HDT-like interface.

Features :sparkles:

  • Compression and decompression of RDF files.
  • Querying COTTAS files with triple patterns.
  • RDFLib store backend for querying COTTAS files with SPARQL.
  • Supports RDF datasets (quads).
  • Can be used as a library or via command line.
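
Because COTTAS keeps triples in a Parquet triple table and pycottas runs on DuckDB, a triple pattern can be read as a filter over that table. The sketch below illustrates the idea only and is not pycottas's actual implementation: the function name `triple_pattern_to_sql`, the column names `s`, `p`, `o`, and the way terms are serialized in the file are all assumptions. It also assumes terms contain no whitespace, so literals with spaces are out of scope.

```python
def triple_pattern_to_sql(pattern: str, path: str) -> str:
    """Translate an 's p o' triple pattern into SQL over a Parquet triple table.

    Hypothetical helper for illustration. Terms starting with '?' are
    variables and are projected; any other term is treated as a constant
    and becomes an equality filter. Column names s/p/o are assumed.
    """
    terms = pattern.split()
    if len(terms) != 3:
        raise ValueError("expected three terms: subject predicate object")

    select, where = [], []
    for column, term in zip(("s", "p", "o"), terms):
        if term.startswith("?"):
            # Variable: return this column under the variable's name.
            select.append(f"{column} AS {term[1:]}")
        else:
            # Constant: filter on it, escaping single quotes for SQL.
            escaped = term.replace("'", "''")
            where.append(f"{column} = '{escaped}'")

    # read_parquet is DuckDB's table function for Parquet files.
    sql = f"SELECT {', '.join(select) or '*'} FROM read_parquet('{path}')"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql


print(triple_pattern_to_sql(
    "?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?o",
    "my_file.cottas",
))
```

Only the constant positions of a pattern restrict the scan, so a pattern with three variables reads the whole table.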

Documentation :bookmark_tabs:

Read the documentation.

Getting Started :rocket:

PyPI is the fastest way to install pycottas:

```bash
pip install pycottas
```

We recommend installing pycottas in a virtual environment.

```python
import pycottas
from rdflib import Graph, URIRef

# Compress an RDF file, search it with a triple pattern, and decompress it
pycottas.rdf2cottas('my_file.ttl', 'my_file.cottas', index='spo')
res = pycottas.search('my_file.cottas', '?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?o')
print(res)
pycottas.cottas2rdf('my_file.cottas', 'my_file.nt')

# COTTASDocument class for querying with triple patterns
cottas_doc = pycottas.COTTASDocument('my_file.cottas')
# the triple pattern can be a string (below) or a tuple of RDFLib terms
res = cottas_doc.search('?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?o')

# COTTASStore class for querying with SPARQL
graph = Graph(store=pycottas.COTTASStore('my_file.cottas'))
res = graph.query('''
  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  SELECT DISTINCT ?s ?o WHERE {
    ?s rdf:type ?o .
  } LIMIT 10''')
for row in res:
    print(row)
```

To run pycottas from the command line, see the documentation.

License :unlock:

pycottas is available under the Apache License 2.0.

Author & Contact :mailbox_with_mail:

Universidad Politécnica de Madrid.

Citing :speech_balloon:

If you use pycottas in your work, please cite the ISWC paper:

```bib
@inproceedings{arenas2025cottas,
  title     = {{COTTAS: Columnar Triple Table Storage for Efficient and Compressed RDF Management}},
  author    = {Arenas-Guerrero, Julián and Ferrada, Sebastián},
  booktitle = {Proceedings of the 24th International Semantic Web Conference},
  year      = {2025},
  publisher = {Springer Nature Switzerland},
}
```

GitHub Events

Total
  • Release event: 2
  • Watch event: 8
  • Delete event: 1
  • Public event: 1
  • Push event: 74
  • Fork event: 1
  • Create event: 1
Last Year
  • Release event: 2
  • Watch event: 8
  • Delete event: 1
  • Public event: 1
  • Push event: 74
  • Fork event: 1
  • Create event: 1

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 22 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
  • Total maintainers: 1
pypi.org: pycottas

Python COTTAS library for compressing and querying RDF.

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 22 Last month
Rankings
Dependent packages count: 9.2%
Average: 30.5%
Dependent repos count: 51.8%
Maintainers (1)