Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.4%) to scientific vocabulary
Repository
A probabilistic ML tool for science
Basic Info
Statistics
- Stars: 127
- Watchers: 5
- Forks: 9
- Open Issues: 11
- Releases: 4
Metadata Files
README.md
Lace: A Probabilistic Machine Learning tool for Scientific Discovery
Lace is a probabilistic cross-categorization engine written in Rust with an optional interface to Python. Unlike traditional machine learning methods, which learn some function mapping inputs to outputs, Lace learns a joint probability distribution over your dataset, which enables users to...
- predict or compute likelihoods of any number of features conditioned on any number of other features
- identify, quantify, and attribute uncertainty from variance in the data, epistemic uncertainty in the model, and missing features
- determine which variables are predictive of which others
- determine which records/rows are similar to which others on the whole or given a specific context
- simulate and manipulate synthetic data
- work natively with missing data and make inferences about missingness (missing not-at-random)
- work with continuous and categorical data natively, without transformation
- identify anomalies, errors, and inconsistencies within the data
- edit, backfill, and append data without retraining
and more, all in one place, without any explicit model building.
```python
import pandas as pd
import lace

# Create an engine from a dataframe
df = pd.read_csv("animals.csv", index_col=0)
engine = lace.Engine.from_df(df)

# Fit a model to the dataframe over 5000 steps of the fitting procedure
engine.update(5000)

# Show the statistical structure of the data -- which features are likely
# dependent (predictive) on each other
engine.clustermap("depprob", zmin=0, zmax=1)
```
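To make the idea of querying a joint distribution concrete, here is a minimal standard-library sketch. It does not use Lace and says nothing about how Lace represents its model; the table `JOINT`, the feature names, and the helper `conditional` are all hypothetical, with made-up probabilities. It only shows why a joint distribution can answer conditional questions in any direction, whereas a fitted function `f(x) -> y` answers one.

```python
from collections import defaultdict

# Hypothetical toy joint distribution p(habitat, swims) over two categorical
# features. The numbers are invented for illustration; they are not Lace output.
FEATURES = ("habitat", "swims")
JOINT = {
    ("water", "yes"): 0.30,
    ("water", "no"): 0.05,
    ("land", "yes"): 0.15,
    ("land", "no"): 0.50,
}


def conditional(target, given):
    """Return p(target | given) by summing and renormalizing the joint table."""
    t_ix = FEATURES.index(target)
    weights = defaultdict(float)
    for row, p in JOINT.items():
        # Keep only the rows that match every condition.
        if all(row[FEATURES.index(k)] == v for k, v in given.items()):
            weights[row[t_ix]] += p
    total = sum(weights.values())
    return {value: w / total for value, w in weights.items()}


# The same table answers questions in either direction.
print(conditional("swims", {"habitat": "water"}))
print(conditional("habitat", {"swims": "yes"}))
```

Passing an empty `given` returns the marginal distribution of the target, which is the "any number of conditions" case with zero conditions.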

The Problem
The goal of Lace is to fill some of the massive chasm between standard machine learning (ML) methods like deep learning and random forests, and statistical methods like probabilistic programming languages. We wanted to develop a machine that allows users to experience the joy of discovery, and indeed optimizes for it.
Short version
Standard, optimization-based ML methods don't help you learn about your data. Probabilistic programming tools assume you have already learned a lot about your data. Neither approach is optimized for what we think is the most important part of data science, the science part: asking and answering questions.
Long version
Standard ML methods are easy to use. You can throw data into a random forest and start predicting with little thought. These methods attempt to learn a function f(x) -> y that maps inputs x to outputs y. This ease of use comes at a cost. Generally f(x) does not reflect the reality of the process that generated your data, but was instead chosen by whoever developed the approach to be sufficiently expressive to better achieve the optimization goal. This renders most standard ML completely uninterpretable and unable to yield sensible uncertainty estimates.
On the other extreme you have probabilistic tools like probabilistic programming languages (PPLs). A user specifies a model to a PPL in terms of a hierarchy of probability distributions with parameters θ. The PPL then uses a procedure (normally Markov Chain Monte Carlo) to learn about the posterior distribution of the parameters given the data p(θ|x). PPLs are all about interpretability and uncertainty quantification, but they place a number of pretty steep requirements on the user. PPL users must specify the model themselves from scratch, meaning they must know (or at least guess) the model. PPL users must also know how to specify such a model in a way that is compatible with the underlying inference procedure.
Example use cases
- Combine data sources and understand how they interact. For example, we may wish to predict cognitive decline from demographics, survey or task performance, EKG data, and other clinical data. Combined, this data would typically be very sparse (most patients will not have all fields filled in), and it is difficult to know how to explicitly model the interaction of these data layers. In Lace, we would just concatenate the layers and run them through.
- Understand the amount and causes of uncertainty over time. For example, a farmer may wish to understand the likelihood of achieving a specific yield over the growing season. As the season progresses, new weather data can be added to the prediction in the form of conditions. Uncertainty can be visualized as variance in the prediction, disagreement between posterior samples, or multi-modality in the predictive distribution (see this blog post for more information on uncertainty).
- Data quality control. Use `surprisal` to find anomalous data in the table and use `-logp` to identify anomalies before they enter the table. Because Lace creates a model of the data, we can also contrive methods to find data that are inconsistent with that model, which we have used to good effect in error finding.
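As a rough illustration of the data quality use case, the sketch below scores values by their negative log-density, assuming that "surprisal" means the negative log-likelihood of an observed value under a model. Lace would score values under its own learned model; here a single Gaussian fit to the column stands in as a deliberately crude substitute. The function `gaussian_surprisal` and the sample data are hypothetical.

```python
import math
import statistics


def gaussian_surprisal(values):
    """Negative log-density of each value under a Gaussian fit to all values."""
    mu = statistics.fmean(values)
    sigma = statistics.stdev(values)
    scores = []
    for x in values:
        log_p = (
            -0.5 * math.log(2 * math.pi * sigma**2)
            - (x - mu) ** 2 / (2 * sigma**2)
        )
        scores.append(-log_p)
    return scores


# Hypothetical orbital periods in minutes; the last one is an obvious outlier.
periods = [95.0, 97.5, 96.2, 94.8, 98.1, 1436.0]
scores = gaussian_surprisal(periods)

# Rank values from most to least surprising.
ranked = sorted(zip(periods, scores), key=lambda pair: pair[1], reverse=True)
print(ranked[0])
```

Note that the outlier inflates the fitted mean and variance, so a single Gaussian is a weak anomaly detector; this weakness is one reason to prefer a richer model of the data.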
Who should not use Lace
There are a number of use cases for which Lace is not suited:
- Non-tabular data such as images and text
- Highly optimizing specific predictions
  - Lace would rather over-generalize than overfit.
Quick start
Installation
Lace requires Rust.
To install the CLI:
$ cargo install --locked lace-cli
To install pylace:
$ pip install pylace
Examples
Lace comes with two pre-fit example data sets: Satellites and Animals.
```python
>>> from lace.examples import Satellites
>>> engine = Satellites()

# Predict the class of orbit given the satellite has a 75-minute
# orbital period and that it has a missing value of geosynchronous
# orbit longitude, and return epistemic uncertainty via Jensen-
# Shannon divergence.
>>> engine.predict(
...     'Class_of_Orbit',
...     given={
...         'Period_minutes': 75.0,
...         'longitude_radians_of_geo': None,
...     },
... )
('LEO', 0.023981898950561048)

# Find the top 10 most surprising (anomalous) orbital periods in
# the table
>>> engine.surprisal('Period_minutes') \
...     .sort('surprisal', reverse=True) \
...     .head(10)
shape: (10, 3)
┌─────────────────────────────────────┬────────────────┬───────────┐
│ index                               ┆ Period_minutes ┆ surprisal │
│ ---                                 ┆ ---            ┆ ---       │
│ str                                 ┆ f64            ┆ f64       │
╞═════════════════════════════════════╪════════════════╪═══════════╡
│ Wind (International Solar-Terres... ┆ 19700.45       ┆ 11.019368 │
│ Integral (INTErnational Gamma-Ra... ┆ 4032.86        ┆ 9.556746  │
│ Chandra X-Ray Observatory (CXO)     ┆ 3808.92        ┆ 9.477986  │
│ Tango (part of Cluster quartet, ... ┆ 3442.0         ┆ 9.346999  │
│ ...                                 ┆ ...            ┆ ...       │
│ Salsa (part of Cluster quartet, ... ┆ 3418.2         ┆ 9.338377  │
│ XMM Newton (High Throughput X-ra... ┆ 2872.15        ┆ 9.13493   │
│ Geotail (Geomagnetic Tail Labora... ┆ 2474.83        ┆ 8.981458  │
│ Interstellar Boundary EXplorer (... ┆ 0.22           ┆ 8.884579  │
└─────────────────────────────────────┴────────────────┴───────────┘
```
And similarly in Rust:
```rust,noplayground
use lace::prelude::*;
use lace::examples::Example;

fn main() {
    // In Rust, you can create an Engine or an Oracle. The Oracle is an
    // immutable version of an Engine; it has the same inference functions as
    // the Engine, but you cannot train or edit data.
    let engine = Example::Satellites.engine().unwrap();

    // Predict the class of orbit given the satellite has a 75-minute
    // orbital period and that it has a missing value of geosynchronous
    // orbit longitude, and return epistemic uncertainty via Jensen-
    // Shannon divergence.
    let (prediction, uncertainty) = engine
        .predict(
            "Class_of_Orbit",
            &Given::Conditions(vec![
                ("Period_minutes", Datum::Continuous(75.0)),
                ("longitude_radians_of_geo", Datum::Missing),
            ]),
            Some(PredictUncertaintyType::JsDivergence),
            None,
        )
        .unwrap();

    println!("{prediction:?} {uncertainty:?}");
}
```
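Both examples report epistemic uncertainty as a Jensen-Shannon divergence. The following is a minimal sketch of that quantity, assuming the uncertainty reflects disagreement among the predictive distributions of individual posterior samples and that all samples are weighted equally. It is not Lace's implementation and applies no normalization Lace may use; the functions `entropy` and `js_divergence` and the example distributions are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def js_divergence(dists):
    """Equal-weight Jensen-Shannon divergence among categorical distributions.

    Computed as the entropy of the mixture minus the mean entropy of the
    individual distributions.
    """
    n = len(dists)
    mixture = [sum(d[i] for d in dists) / n for i in range(len(dists[0]))]
    return entropy(mixture) - sum(entropy(d) for d in dists) / n


# Hypothetical predictive distributions over (LEO, MEO, GEO) from three
# posterior samples; the numbers are invented for illustration.
agree = [[0.90, 0.05, 0.05], [0.88, 0.07, 0.05], [0.91, 0.04, 0.05]]
disagree = [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05], [0.05, 0.05, 0.90]]

print(js_divergence(agree))     # close to 0: the samples agree
print(js_divergence(disagree))  # much larger: the samples disagree
```

With this definition the divergence is zero when every sample makes the same prediction and is bounded above by the log of the number of samples, so a small value like the one in the Satellites example would indicate that the samples largely agree.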
Fitting a model
To fit a model to your own data you can use the CLI:
$ lace run --csv my-data.csv -n 1000 my-data.lace
...or initialize an engine from a file or dataframe.
```python
import pandas as pd  # Lace supports polars as well
from lace import Engine

engine = Engine.from_df(pd.read_csv("my-data.csv", index_col=0))
engine.update(1000)
engine.save("my-data.lace")
```
You can monitor the progress of the training using diagnostic plots:
```python
from lace.plot import diagnostics

diagnostics(engine)
```

License
Lace is licensed under the Business Source License v1.1, which restricts commercial use. See LICENSE for full details.
If you would like a license for commercial use, please contact lace@redpoll.ai.
Academic use
Lace is free for academic use. Please cite Lace according to the CITATION.cff metadata.
Owner
- Name: promised-ai
- Login: promised-ai
- Kind: organization
- Repositories: 1
- Profile: https://github.com/promised-ai
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: 'Lace: Bayesian Tabular Analysis for Scientific Discovery'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Baxter
    family-names: Eaves
    name-suffix: Jr
    email: bax@redpoll.ai
    affiliation: Redpoll
  - given-names: Michael
    family-names: Schmidt
    email: schmidt@redpoll.ai
    affiliation: Redpoll
  - given-names: Ken
    family-names: Swanson
    email: ken.swanson@redpoll.ai
    affiliation: Redpoll
identifiers:
  - type: url
    value: 'https://github.com/promised-ai/lace'
    description: Github repository
repository-code: 'https://github.com/promised-ai/lace'
url: 'https://lace.dev'
abstract: >-
  Lace is a probabilistic cross-categorization engine
  written in rust.
keywords:
  - Bayesian
  - Machine Learning
license: BUSL-1.1
version: 0.8.0
date-released: '2024-02-07'
GitHub Events
Total
- Watch event: 22
- Push event: 4
- Fork event: 1
- Create event: 1
Last Year
- Watch event: 22
- Push event: 4
- Fork event: 1
- Create event: 1
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 32
- Total pull requests: 199
- Average time to close issues: about 2 months
- Average time to close pull requests: 4 days
- Total issue authors: 10
- Total pull request authors: 4
- Average comments per issue: 1.34
- Average comments per pull request: 0.17
- Merged pull requests: 174
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- BaxterEaves (9)
- schmidmt (6)
- firekg (3)
- amifalk (1)
- TempTemperson (1)
- perplexes (1)
- thomasaarholt (1)
- zamazan4ik (1)
Pull Request Authors
- Swandog (73)
- BaxterEaves (69)
- schmidmt (44)
- TempTemperson (2)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 11
- Total downloads:
  - cargo 112,481 total
  - pypi 1,135 last-month
- Total dependent packages: 29 (may contain duplicates)
- Total dependent repositories: 8 (may contain duplicates)
- Total versions: 96
- Total maintainers: 6
crates.io: lace_utils
Miscellaneous utilities for Lace and shared libraries
- Homepage: https://www.lace.dev/
- Documentation: https://docs.rs/lace_utils/
- License: BUSL-1.1
- Latest release: 0.3.0 (published about 2 years ago)
Rankings
Maintainers (3)
pypi.org: pylace
A probabilistic programming ML tool for science
- Documentation: https://pylace.readthedocs.io/
- License: BUSL-1.1
- Latest release: 0.8.0 (published over 1 year ago)
Rankings
Maintainers (3)
crates.io: lace_data
Data definitions and data container definitions for Lace
- Homepage: https://www.lace.dev/
- Documentation: https://docs.rs/lace_data/
- License: BUSL-1.1
- Latest release: 0.3.0 (published about 2 years ago)
Rankings
Maintainers (3)
crates.io: lace_stats
Contains component model and hyperprior specifications
- Homepage: https://www.lace.dev/
- Documentation: https://docs.rs/lace_stats/
- License: BUSL-1.1
- Latest release: 0.4.0 (published over 1 year ago)
Rankings
Maintainers (3)
crates.io: lace_consts
Default constants for Lace
- Homepage: https://www.lace.dev/
- Documentation: https://docs.rs/lace_consts/
- License: BUSL-1.1
- Latest release: 0.2.1 (published about 2 years ago)
Rankings
Maintainers (3)
crates.io: lace_codebook
Contains the Lace codebook specification as well as utilities for generating defaults
- Homepage: https://www.lace.dev/
- Documentation: https://docs.rs/lace_codebook/
- License: BUSL-1.1
- Latest release: 0.7.0 (published over 1 year ago)
Rankings
Maintainers (3)
crates.io: lace_geweke
Geweke tester for Lace
- Homepage: https://www.lace.dev/
- Documentation: https://docs.rs/lace_geweke/
- License: BUSL-1.1
- Latest release: 0.4.0 (published over 1 year ago)
Rankings
Maintainers (3)
crates.io: lace_cc
Core of the Lace cross-categorization engine library
- Homepage: https://www.lace.dev/
- Documentation: https://docs.rs/lace_cc/
- License: BUSL-1.1
- Latest release: 0.7.0 (published over 1 year ago)
Rankings
Maintainers (3)
crates.io: lace_metadata
Archive of the metadata (savefile) formats for Lace. In charge of versioning and conversion.
- Homepage: https://www.lace.dev/
- Documentation: https://docs.rs/lace_metadata/
- License: BUSL-1.1
- Latest release: 0.7.0 (published over 1 year ago)
Rankings
Maintainers (3)
crates.io: lace
A probabilistic cross-categorization engine
- Homepage: https://www.lace.dev/
- Documentation: https://docs.rs/lace/
- License: BUSL-1.1
- Latest release: 0.8.0 (published over 1 year ago)
Rankings
Maintainers (3)
crates.io: lace-cli
A probabilistic cross-categorization engine
- Homepage: https://www.lace.dev/
- Documentation: https://docs.rs/lace-cli/
- License: BUSL-1.1
- Latest release: 0.7.0 (published about 2 years ago)
Rankings
Maintainers (1)
Dependencies
- actions/checkout v3 composite
- actions/configure-pages v3 composite
- actions/deploy-pages v1 composite
- actions/upload-pages-artifact v1 composite
- dtolnay/rust-toolchain stable composite
- PyO3/maturin-action v1 composite
- Swatinem/rust-cache v2 composite
- actions/checkout v3 composite
- actions/download-artifact v3 composite
- actions/setup-python v4 composite
- actions/upload-artifact v3 composite
- dtolnay/rust-toolchain stable composite
- Swatinem/rust-cache v2 composite
- actions-rs/cargo v1 composite
- actions/checkout v3 composite
- dtolnay/rust-toolchain stable composite
- 363 dependencies
- 328 dependencies
- approx 0.5.1 development
- criterion 0.5 development
- indoc 2.0.3 development
- once_cell 1.13.0 development
- plotly 0.8 development
- tempfile 3.4 development
- bincode 1
- clap 4.3.17
- ctrlc 3.2.1
- dirs 5
- env_logger 0.10
- flate2 1.0.23
- indexmap 2.0.0
- indicatif 0.17.0
- itertools 0.11
- lace_cc 0.2.0
- lace_codebook 0.2.0
- lace_consts 0.1.4
- lace_data 0.1.2
- lace_geweke 0.1.2
- lace_metadata 0.2.0
- lace_stats 0.1.3
- lace_utils 0.1.2
- log 0.4
- maplit 1
- num 0.4
- polars 0.33
- rand 0.8
- rand_distr 0.4
- rand_xoshiro 0.6
- rayon 1.5
- regex 1
- serde 1
- serde_json 1
- serde_yaml 0.9.4
- special 0.10
- thiserror 1.0.19
- toml 0.7
- approx 0.5.1 development
- criterion 0.5 development
- indoc 2.0.3 development
- enum_dispatch 0.3.10
- indicatif 0.17.0
- itertools 0.11
- lace_codebook 0.2.0
- lace_consts 0.1.4
- lace_data 0.1.2
- lace_geweke 0.1.2
- lace_stats 0.1.2
- lace_utils 0.1.2
- once_cell 1
- rand 0.8
- rand_xoshiro 0.6
- rayon 1.5
- serde 1
- special 0.10
- thiserror 1.0.19
- indoc 2 development
- tempfile 3.3.0 development
- flate2 1.0.23
- lace_consts 0.1.4
- lace_data 0.1.2
- lace_stats 0.1.4
- lace_utils 0.1.2
- maplit 1
- polars 0.33
- rand 0.8.5
- rayon 1.5
- serde 1
- serde_yaml 0.9.4
- thiserror 1.0.11
- approx 0.5.1 development
- criterion 0.5 development
- rand 0.8 development
- serde_json 1 development
- lace_utils 0.1.2
- regex 1
- serde 1
- thiserror 1.0.19
- tempfile 3 development
- bincode 1
- dirs 5
- hex 0.4
- lace_cc 0.2.0
- lace_codebook 0.2.0
- lace_data 0.1.2
- lace_stats 0.1.4
- log 0.4
- once_cell 1
- rand_xoshiro 0.6
- rayon 1.5
- serde 1
- serde_json 1
- serde_yaml 0.9.4
- thiserror 1.0.19
- toml 0.7
- approx 0.5.1 development
- criterion 0.5 development
- maplit 1 development
- rand_distr 0.4 development
- serde_json 1 development
- itertools 0.11
- lace_consts 0.1.4
- lace_data 0.1.2
- lace_utils 0.1.2
- rand 0.8
- rand_xoshiro 0.6
- regex 1.6.0
- serde 1
- special 0.10
- thiserror 1.0.11
- approx 0.5.1 development
- rand 0.8
- alpine 3.11 build
- rust 1.42-alpine build