https://github.com/cwida/alp

ALP: Adaptive Lossless Floating-Point Compression

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 7 DOI reference(s) in README
✓ Academic publication links: Links to acm.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.1%) to scientific vocabulary

Last synced: 9 months ago

Repository

ALP: Adaptive Lossless Floating-Point Compression

Basic Info

Host: GitHub
Owner: cwida
License: MIT
Language: C++
Default Branch: main
Homepage:
Size: 60.9 MB

Statistics

Stars: 101
Watchers: 11
Forks: 11
Open Issues: 3
Releases: 0

Created almost 3 years ago · Last pushed about 1 year ago

Metadata Files

Readme, License

ALP: Adaptive Lossless Floating-Point Compression

Authors: Azim Afroozeh, Leonardo Kuffó, Peter Boncz
Conference: ACM SIGMOD 2024

What is this repo?

This repository contains the source code and benchmarks for the paper ALP: Adaptive Lossless Floating-Point Compression, published at ACM SIGMOD 2024.

ALP is a state-of-the-art lossless compression algorithm designed for IEEE 754 floating-point data. It encodes data by exploiting two common patterns found in real-world floating-point values:

Decimal Floating-Point Numbers:
A large portion of the floats/doubles in real-world datasets are decimals. ALP maps these values to integers by multiplying each number by a power of 10, then compresses the result using a FastLanes variant of Frame-of-Reference encoding[^1], which is SIMD-friendly.
Example: the number 10.12 becomes 1012 and is then fed to the FastLanes encoder.
High-Precision Floating-Point Numbers:
The remaining values are typically high-precision floats/doubles. ALP targets compression opportunities only in the left part (the most significant bits) of these values, which it compresses using FastLanes dictionary encoding. The right part is left uncompressed: it is needed to preserve full precision and is often highly random and incompressible.
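
The two patterns above can be illustrated with a minimal scalar sketch. It is not the repository's code: every name below (`try_encode_decimal`, `decode_decimal`, `frame_of_reference`, `split_bits`, `join_bits`) is hypothetical, the Frame-of-Reference step is the plain textbook version rather than the FastLanes variant, and the dictionary step for the left part is omitted. It assumes the caller already knows which power of 10 to use and only shows why a bit-exact round-trip check is needed to stay lossless.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: 10^e as a double, built by repeated multiplication.
inline double pow10_of(int e) {
    assert(e >= 0 && e <= 18);
    double p = 1.0;
    for (int i = 0; i < e; ++i) p *= 10.0;
    return p;
}

// Pattern 1: try to turn a decimal-like double into an integer (10.12 -> 1012 for e = 2).
// Returns false when decoding would not reproduce the input bit for bit; a lossless
// scheme then has to store that value some other way.
inline bool try_encode_decimal(double value, int e, int64_t& digits) {
    const double scaled = value * pow10_of(e);
    if (!std::isfinite(scaled) || std::fabs(scaled) > 9.0e15) return false;
    const int64_t candidate = std::llround(scaled);  // absorbs rounding error in the product
    const double back = static_cast<double>(candidate) / pow10_of(e);
    if (std::memcmp(&back, &value, sizeof(double)) != 0) return false;
    digits = candidate;
    return true;
}

inline double decode_decimal(int64_t digits, int e) {
    return static_cast<double>(digits) / pow10_of(e);
}

// Plain Frame-of-Reference: store the minimum once, then small non-negative offsets.
struct ForBlock {
    int64_t base;
    std::vector<uint64_t> deltas;
};

inline ForBlock frame_of_reference(const std::vector<int64_t>& digits) {
    ForBlock out{0, {}};
    if (digits.empty()) return out;
    out.base = *std::min_element(digits.begin(), digits.end());
    for (int64_t d : digits) {
        out.deltas.push_back(static_cast<uint64_t>(d) - static_cast<uint64_t>(out.base));
    }
    return out;
}

// Pattern 2: split the 64-bit pattern of a double into a left part (top `left_bits`
// bits, the candidate for dictionary encoding) and a right part (stored as-is).
struct SplitDouble {
    uint64_t left;
    uint64_t right;
};

inline SplitDouble split_bits(double value, unsigned left_bits) {
    assert(left_bits >= 1 && left_bits <= 63);
    uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    const unsigned right_bits = 64 - left_bits;
    return {bits >> right_bits, bits & ((uint64_t{1} << right_bits) - 1)};
}

inline double join_bits(SplitDouble parts, unsigned left_bits) {
    const uint64_t bits = (parts.left << (64 - left_bits)) | parts.right;
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

In this sketch, `try_encode_decimal(10.12, 2, d)` succeeds and sets `d` to 1012, while a value such as 3.141592653589793 with the same exponent is rejected because dividing 314 by 100 does not give back the original bits. Values rejected this way are the kind the left/right split is meant for.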

📊 How does ALP perform?

[Figure: ALP Results]

These results show ALP outperforming the other evaluated schemes on all three key metrics of a compression algorithm: decoding speed, compression ratio, and compression speed.

🧪 How to Reproduce Results

Just run the following script:

bash ./publication/script/master_script.sh

For more information on reproducing our benchmarks, refer to our guide,
or read the official ACM reproducibility report:
https://dl.acm.org/doi/10.1145/3687998.3717057

🏅 ACM Artifacts & Awards

We are happy to share that we participated in the SIGMOD Availability & Reproducibility Initiative, and our paper earned all three badges:

ACM Artifacts Available, ACM Artifacts Evaluated, ACM Results Reproduced

🎉 We're also proud to share that ALP won the SIGMOD Best Artifact Award!

⏱️ Want to Benchmark Your Dataset?

Check out our guide: How to Benchmark Your Dataset
It explains how to run ALP on your own data.

🗂️ Repository Structure

src/: Core implementation of ALP and ALP_RD
benchmarks/: Benchmarking tools and datasets
include/: Header files for integration
scripts/: Utility scripts for data processing
test/: Unit tests
publication/: Publications and supplementary materials

📚 Publications

Conference Paper:
ALP: Adaptive Lossless Floating-Point Compression, ACM SIGMOD 2024
https://dl.acm.org/doi/10.1145/3626717
Reproducibility Report:
Reproducibility Report for ACM SIGMOD 2024 Paper: 'ALP: Adaptive Lossless Floating-Point Compression'
https://dl.acm.org/doi/10.1145/3687998.3717057

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

📬 Contact

If you have questions, want to contribute, or just want to stay up to date with ALP and related projects, join our community on Discord.

🧩 Used By

ALP has been integrated into the following systems:

[^1]: Learn more about FastLanes here: https://github.com/cwida/fastlanes

Owner

Name: CWI Database Architectures Group
Login: cwida
Kind: organization
Location: Amsterdam, The Netherlands

Website: https://www.cwi.nl/research/groups/database-architectures
Twitter: cwi_da
Repositories: 19
Profile: https://github.com/cwida

GitHub Events

Total

Issues event: 17
Watch event: 67
Delete event: 3
Member event: 1
Issue comment event: 67
Push event: 106
Pull request event: 15
Fork event: 5
Create event: 6

Last Year

Issues event: 17
Watch event: 67
Delete event: 3
Member event: 1
Issue comment event: 67
Push event: 106
Pull request event: 15
Fork event: 5
Create event: 6

Issues and Pull Requests

Last synced: 9 months ago

All Time

Total issues: 5
Total pull requests: 2
Average time to close issues: 9 days
Average time to close pull requests: about 11 hours
Total issue authors: 5
Total pull request authors: 1
Average comments per issue: 4.2
Average comments per pull request: 0.0
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 5
Pull requests: 2
Average time to close issues: 9 days
Average time to close pull requests: about 11 hours
Issue authors: 5
Pull request authors: 1
Average comments per issue: 4.2
Average comments per pull request: 0.0
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

lovedancer075 (2)
LeeSure5986 (2)
cqujk (1)
Eclesia (1)
Hzc492 (1)
tsctsch (1)
SeeYouLaterPromise (1)
aabduvakhobov (1)
zhou9402 (1)
jiangzhuti (1)
Guan-JW (1)
mwlon (1)

Pull Request Authors

azimafroozeh (18)
SvenHepkema (2)
lyp-bobi (1)
aabduvakhobov (1)
lkuffo (1)
