https://github.com/algebraicjulia/algebraicrelations.jl

Relational Algebra, now with more algebra!

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 5 committers (20.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.6%) to scientific vocabulary

Keywords

algebra algebraic-structures category-theory relational relational-algebra relational-databases

Keywords from Contributors

chemistry
Last synced: 5 months ago

Repository

Basic Info
Statistics
  • Stars: 54
  • Watchers: 8
  • Forks: 3
  • Open Issues: 16
  • Releases: 8
Topics
algebra algebraic-structures category-theory relational relational-algebra relational-databases
Created over 5 years ago · Last pushed 6 months ago
Metadata Files
Readme License

README.md

AlgebraicRelations.jl

AlgebraicRelations.jl is a Julia library for generating and querying scientific databases. It provides tooling for defining database schemas, generating query visualizations, and connecting directly to a PostgreSQL server. The package is built on top of Catlab.jl, which provides the underlying categorical machinery.

Learning by Doing

The library is best explained through an example, presented in three steps: Defining a Schema, Creating Queries, and Connecting to PostgreSQL.

Defining a Schema

Within this library, we define database schemas based on the presentation of a workflow (more generally, the presentation of a symmetric monoidal category, or SMC). The presentation of a workflow includes the data types of products in the workflow (objects in an SMC) and the processes that transform these products (morphisms in an SMC). As an example, we define the schema of a traditional computer vision workflow: extracting images from a file, performing a test/train split on the images, training a neural network on images, and finally evaluating the network on images. This example is also presented in this notebook.

Defining Types

To define types for the presentation, we provide the name of each type (e.g. File for compressed files of images) and the Julia datatype that can store it (e.g. a filename can be stored uniquely as a String). The definition of all types needed for our example is as follows:

```julia
# Initialize presentation object
present = Presentation()

# Add types to presentation
File, Images, NeuralNet, Accuracy, Metadata = add_types!(present, [(:File, String),
                                                                   (:Images, String),
                                                                   (:NeuralNet, String),
                                                                   (:Accuracy, Real),
                                                                   (:Metadata, String)]);
```

Defining Processes

To define processes that operate on these types, we need three pieces of information: the name of the process (extract for the process that extracts images from files), the input types (File for the file to extract), and the output types (Images for the images which were extracted). The symbol ⊗ (monoidal product) joins two types, allowing for multiple types in the inputs and outputs of a process. To the schema, this means nothing more than that, for the process train, two objects are needed as input: the first of type NeuralNet and the second of type Images.

```julia
# Add processes to presentation
extract, split, train, evaluate = add_processes!(present, [(:extract, File, Images),
                                                           (:split, Images, Images⊗Images),
                                                           (:train, NeuralNet⊗Images, NeuralNet⊗Metadata),
                                                           (:evaluate, NeuralNet⊗Images, Accuracy⊗Metadata)]);
```

Generating the Schema

Once this presentation is defined, the database schema can be generated as follows:

```julia
# Convert to schema
TrainDB = present_to_schema(present);
print(generate_schema_sql(TrainDB()))
```

```sql
CREATE TABLE evaluate (NeuralNet1 text, Images2 text, Accuracy3 real, Metadata4 text);
CREATE TABLE extract (File1 text, Images2 text);
CREATE TABLE split (Images1 text, Images2 text, Images3 text);
CREATE TABLE train (NeuralNet1 text, Images2 text, NeuralNet3 text, Metadata4 text);
```
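The column naming in the generated schema can be sketched in plain Julia. This is only an illustration of the pattern visible in the output above (one table per process, inputs followed by outputs, each column named by its type and position); the names `storage_type`, `sql_type`, and `create_table` are hypothetical and are not part of the AlgebraicRelations.jl API, and the Julia-to-SQL type mapping is an assumption inferred from that output.

```julia
# Hypothetical stand-ins for illustration; not the AlgebraicRelations.jl API.
# Map each presentation type name to the Julia datatype that stores it.
const storage_type = Dict(:File => String, :Images => String, :NeuralNet => String,
                          :Accuracy => Real, :Metadata => String)

# Assumed Julia-to-SQL type mapping, inferred from the generated schema.
sql_type(::Type{<:AbstractString}) = "text"
sql_type(::Type{<:Real}) = "real"

# One table per process: inputs then outputs, each column named <Type><position>.
function create_table(name::Symbol, inputs::Vector{Symbol}, outputs::Vector{Symbol})
    cols = ["$(t)$(i) $(sql_type(storage_type[t]))"
            for (i, t) in enumerate(vcat(inputs, outputs))]
    return "CREATE TABLE $(name) ($(join(cols, ", ")));"
end

train_sql = create_table(:train, [:NeuralNet, :Images], [:NeuralNet, :Metadata])
println(train_sql)
```

Under these assumptions, the train process yields the same CREATE TABLE statement as the one listed for train in the schema.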

Creating Queries

To create queries, we use the @query macro (based on the @relation macro in Catlab). We must specify a list of objects to return as results of the query, a list of all objects used in the query, and finally a list of relationships between these objects (based on the primitives defined for the workflow). In this case, the relationships between objects are the processes from the presentation, and the types of objects are the types defined in the presentation. The following is an example query:

```julia
q = @query TrainDB() (im_train, nn, im_test, acc, md2) where (im_train, im_test, nn, nn_trained, acc, md, md2, _base_acc, im) begin
    train(nn, im_train, nn_trained, md)
    evaluate(nn_trained, im_test, acc, md2)
    split(im, im_train, im_test)
    >=(acc, _base_acc)
end
print(to_sql(q))
```

This produces the following query:

```sql
SELECT t1.Images2 AS im_train, t1.NeuralNet1 AS nn, t2.Images2 AS im_test, t2.Accuracy3 AS acc, t2.Metadata4 AS md2
FROM train AS t1, evaluate AS t2, split AS t3
WHERE t2.NeuralNet1=t1.NeuralNet3 AND t3.Images2=t1.Images2 AND t3.Images3=t2.Images2 AND t2.Accuracy3>=$1
```
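The join conditions in the WHERE clause come from variables shared between relations in the query. The following plain-Julia sketch shows one way to derive them: each repeated variable is equated with the first column it was bound to. The `atoms` representation and the `join_conditions` function are hypothetical and written only for illustration; they are not how the library necessarily implements `to_sql`, and the sketch ignores the comparison against `_base_acc`.

```julia
# Hypothetical representation: each atom is (table alias, column names, variable names).
atoms = [
    ("t1", ["NeuralNet1", "Images2", "NeuralNet3", "Metadata4"], [:nn, :im_train, :nn_trained, :md]),
    ("t2", ["NeuralNet1", "Images2", "Accuracy3", "Metadata4"], [:nn_trained, :im_test, :acc, :md2]),
    ("t3", ["Images1", "Images2", "Images3"], [:im, :im_train, :im_test]),
]

# Record the first column bound to each variable; every later occurrence
# of the same variable yields an equality condition against that column.
function join_conditions(atoms)
    first_seen = Dict{Symbol,String}()
    conds = String[]
    for (alias, cols, vars) in atoms
        for (col, var) in zip(cols, vars)
            ref = "$(alias).$(col)"
            if haskey(first_seen, var)
                push!(conds, "$(ref)=$(first_seen[var])")
            else
                first_seen[var] = ref
            end
        end
    end
    return conds
end

conds = join_conditions(atoms)
println(join(conds, " AND "))
```

Variables that appear only once (such as nn, md, or im) produce no condition, which is why only nn_trained, im_train, and im_test contribute equalities.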

Connecting to PostgreSQL

The connection to PostgreSQL is fairly straightforward. We first create a connection using the LibPQ.jl library:

```julia
conn = Connection("dbname=test_db");
```

We can then prepare statements and run them with arguments:

```julia
statement = prepare(conn, q)
execute(statement, [0.6])
```

This returns every row of the previous query whose accuracy is greater than or equal to 0.6; the value 0.6 is bound to the $1 placeholder that the query generated for _base_acc.

The execute function returns a DataFrame object (from the DataFrames.jl library).

Theory

Some excellent resources for understanding how Bicategories of Relations relate to SQL queries (and inspiration for this library) are as follows:

Owner

  • Name: AlgebraicJulia
  • Login: AlgebraicJulia
  • Kind: organization

GitHub Events

Total
  • Issues event: 1
  • Watch event: 7
  • Delete event: 5
  • Issue comment event: 2
  • Push event: 56
  • Pull request review event: 5
  • Pull request review comment event: 3
  • Pull request event: 21
  • Fork event: 1
  • Create event: 9
Last Year
  • Issues event: 1
  • Watch event: 7
  • Delete event: 5
  • Issue comment event: 2
  • Push event: 56
  • Pull request review event: 5
  • Pull request review comment event: 3
  • Pull request event: 21
  • Fork event: 1
  • Create event: 9

Committers

Last synced: almost 3 years ago

All Time
  • Total Commits: 77
  • Total Committers: 5
  • Avg Commits per committer: 15.4
  • Development Distribution Score (DDS): 0.117
Top Committers
Name Email Commits
bosonbaas b****s@g****m 68
Evan Patterson e****n@e****g 5
Micah Halter m****r@g****u 2
github-actions[bot] 4****]@u****m 1
Micah Halter m****h@b****o 1
Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 14
  • Total pull requests: 51
  • Average time to close issues: 26 days
  • Average time to close pull requests: 3 months
  • Total issue authors: 9
  • Total pull request authors: 9
  • Average comments per issue: 2.29
  • Average comments per pull request: 0.45
  • Merged pull requests: 33
  • Bot issues: 0
  • Bot pull requests: 5
Past Year
  • Issues: 1
  • Pull requests: 20
  • Average time to close issues: N/A
  • Average time to close pull requests: 8 days
  • Issue authors: 1
  • Pull request authors: 3
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.1
  • Merged pull requests: 13
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • jpfairbanks (3)
  • epatters (3)
  • jmatsushita (2)
  • mehalter (1)
  • slwu89 (1)
  • schrauf (1)
  • sdwfrost (1)
  • JuliaTagBot (1)
  • bosonbaas (1)
Pull Request Authors
  • bosonbaas (14)
  • quffaro (13)
  • algebraicjuliabot (6)
  • github-actions[bot] (5)
  • TheCedarPrince (4)
  • epatters (3)
  • mehalter (2)
  • slwu89 (1)
  • jmatsushita (1)
Top Labels
Issue Labels
question (1) bug (1)
Pull Request Labels
enhancement (4) documentation (2)