eggplant

High-Level Rust API for egglog

https://github.com/milkblock/eggplant

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, acm.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.3%) to scientific vocabulary
Last synced: 6 months ago · JSON representation ·

Repository

High-Level Rust API for egglog

Basic Info
  • Host: GitHub
  • Owner: MilkBlock
  • License: mit
  • Language: Rust
  • Default Branch: main
  • Size: 421 KB
Statistics
  • Stars: 6
  • Watchers: 0
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Created 8 months ago · Last pushed 7 months ago
Metadata Files
Readme License Citation

README.md

eggplant

eggplant is the High-Level Rust API repo for the egglog tool accompanying the paper "Better Together: Unifying Datalog and Equality Saturation" (ACM DL, arXiv).

It is hard to do Graph operations directly on EGraph because EGraph is a highly compressed data structure.

Based on that fact, eggplant provides out-of-box Graph API that allows you to do revisions on EGraph intuitively.

There is also a Proc-Macro library for users to quickly define a suite of DSL.

Motivation

Recently, I've been researching the implementation of egglog. Since the egglog ecosystem is not yet mature, I spent some time providing a better API implementation with procedural macros and compile-time type checking for rust-native. This includes:

  1. DSL Define API: Quickly define a set of DSL
  2. Pattern Define API: Pattern definition for pattern matching, using declarative definition approach
  3. Commit API: Since egraph is a highly compressed structure that compresses node relationship information and is naturally averse to deletion operations, we need a bridge between normal graph structures and egraph. I borrowed from git's commit API for batch commits and version control.
  4. Insert API: Insert API with compile-time type checking added on top of the original, so you no longer need to nervously add nodes like with the native API
  5. Run Rule API: Run Pattern Rewrite

Finally, this forms a framework that allows users to implement a pattern rewriting framework supporting integer addition, subtraction, multiplication, and division with constant propagation in just fifty lines of code.

DSL Define API

```rust

[eggplant::ty]

pub enum Expr { Const { num: i64 }, Mul { l: Expr, r: Expr }, Sub { l: Expr, r: Expr }, Add { l: Expr, r: Expr }, Div { l: Expr, r: Expr }, } ```

Yes, you read that correctly. Our addition, subtraction, multiplication, and division constant DSL is defined just like that. Using Rust's Sum Type can represent a set of DSL very well, which is very intuitive.

Pattern Define API

Here's an example of addition pattern definition API:

```rust

[eggplant::pat_vars]

struct AddPat { l: Const, r: Const, p: Add, }

let pat = || { let l = Const::query(); let r = Const::query(); let p = Add::query(&l, &r); AddPat::new(l, r, p) } ```

Note that currently we must add a generic parameter after AddPat. I think we can use procedural macro tricks to omit this generic parameter later :) (Now supports omitting generics).

Defining a Pattern is actually defining two things: what to extract from the Pattern for computation and what this Pattern looks like. From the user's perspective, using a declarative approach to define is definitely more convenient. Here we use a closure to define a Pattern, but as the generic parameter exposed by the struct above shows, there's a global singleton managing all Patterns and submitting them to the database at the appropriate time.

Addition and Pattern Rewrite Definition!

rust let action = |ctx, values| { let cal = ctx.devalue(values.l.num) + ctx.devalue(values.r.num); // Used as struct rather than enum let add_value = ctx.insert_const(cal); ctx.union(values.p, add_value); },

You can see that we still need to use ctx.devalue to convert from database ID to real data. Through compile-time type information deduction, we can save the type annotation for devalue. Also, the procedural macro generates APIs corresponding to the DSL for ctx, maximizing the power of intellisense and compile-time type information.

Running

```rust txrxvt_pr!(MyTx, MyPatRec); // Global singleton definition for storing graph and patterns fn main() { let expr: Expr = Add::new(&Mul::new(&Const::new(3), &Const::new(2)), &Const::new(4)); expr.commit();

let ruleset = MyTx::new_ruleset("constant_prop");
prop!(Add,+,AddPat,ruleset);
prop!(Sub,-,SubPat,ruleset);
prop!(Mul,*,MulPat,ruleset);
prop!(Div,/,DivPat,ruleset);
for _ in 0..4 {
    let _ = MyTx::run_ruleset(ruleset, RunConfig::None);
}
MyTx::sgl().egraph_to_dot("egraph.dot".into());

} ```

Finally, the following EGraph is generated, and you can see that the root node value is directly derived.

Note that the execution count of run_ruleset is not the number of matches, but should be less than the tree depth.

Dependencies

Using eggplant requires three dependencies:

toml eggplant = { git = "https://github.com/MilkBlock/eggplant" } derive_more = "2.0.1" strum = "0.27.2"

Since egglog's database backend is not yet released on crate.io, I'm using git links to add dependencies here. This will be updated as egglog becomes stable and released.

Future Work

Of course, this framework still lacks some other APIs such as:

  • Adding Predicate API to pattern definitions as constraint conditions for pattern judgment (currently can be solved by judging in action)
  • Supporting pattern references for writing complex patterns (feels relatively easy to implement)
  • Cost attributes for extracting optimal values

Complete Code

Here's the complete code for implementing addition, subtraction, multiplication, and division constant propagation:

```rust use eggplant::{prelude::*, txrxvt_pr};

[eggplant::ty]

pub enum Expr { Const { num: i64 }, Mul { l: Expr, r: Expr }, Sub { l: Expr, r: Expr }, Add { l: Expr, r: Expr }, Div { l: Expr, r: Expr }, }

txrxvt_pr!(MyTx, MyPatRec);

macrorules! prop { ($ty:ident,$op:tt,$patname:ident,$ruleset:ident) => { #[eggplant::patvars] struct $patname { l: Const, r: Const, p: $ty, } MyTx::addrule( stringify!($patname), $ruleset, || { let l = Const::query(); let r = Const::query(); let p = $ty::query(&l, &r); $patname::new(l, r, p) }, |ctx, values| { let cal = ctx.devalue(values.l.num) $op ctx.devalue(values.r.num); let opvalue = ctx.insertconst(cal); ctx.union(values.p, opvalue); }, ); }; }

fn main() { let expr: Expr = Add::new(&Mul::new(&Const::new(3), &Const::new(2)), &Const::new(4)); expr.commit();

let ruleset = MyTx::new_ruleset("constant_prop");
prop!(Add,+,AddPat,ruleset);
prop!(Sub,-,SubPat,ruleset);
prop!(Mul,*,MulPat,ruleset);
prop!(Div,/,DivPat,ruleset);
for _ in 0..4 {
    let _ = MyTx::run_ruleset(ruleset, RunConfig::None);
}
MyTx::sgl().egraph_to_dot("egraph.dot".into());

} ```

Documentation

To view documentation, run cargo doc --open.

Contributing

Welcome to submit issues! Hope you have fun!

GitHub repository: https://github.com/MilkBlock/eggplant

Owner

  • Name: MineralSteins
  • Login: MilkBlock
  • Kind: user
  • Location: China Hangzhou

Citation (CITATION.bib)

@article{egglog,
  author = {
    Zhang, Yihong and
    Wang, Yisu Remy and
    Flatt, Oliver and
    Cao, David and
    Zucker, Philip and
    Rosenthal, Eli and
    Tatlock, Zachary and
    Willsey, Max
  },
  title = {Better Together: Unifying Datalog and Equality Saturation},
  year = {2023},
  issue_date = {June 2023},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  volume = {7},
  number = {PLDI},
  url = {https://doi.org/10.1145/3591239},
  doi = {10.1145/3591239},
  abstract = {
    We present egglog, a fixpoint reasoning system that unifies Datalog and equality saturation (EqSat). 
    Like Datalog, egglog supports efficient incremental execution, cooperating analyses, and lattice-based reasoning. 
    Like EqSat, egglog supports term rewriting, efficient congruence closure, and extraction of optimized terms. 
    We identify two recent applications -- a unification-based pointer analysis in Datalog and an EqSat-based floating-point term rewriter -- 
    that have been hampered by features missing from Datalog but found in EqSat or vice-versa. 
    We evaluate our system by reimplementing those projects in egglog. 
    The resulting systems in egglog are faster, simpler, and fix bugs found in the original systems.
  },
  journal = {Proc. ACM Program. Lang.},
  month = {jun},
  articleno = {125},
  numpages = {25},
  keywords = {Equality saturation, Program optimization, Datalog, Rewrite systems}
}

  


GitHub Events

Total
  • Issues event: 1
  • Watch event: 4
  • Issue comment event: 1
  • Push event: 15
  • Create event: 2
Last Year
  • Issues event: 1
  • Watch event: 4
  • Issue comment event: 1
  • Push event: 15
  • Create event: 2

Dependencies

Cargo.lock cargo
  • 167 dependencies
Cargo.toml cargo
proc_macros/Cargo.toml cargo