Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.9%) to scientific vocabulary
Repository
Flock: A Low-Cost Streaming Query Engine on FaaS Platforms
Basic Info
- Host: GitHub
- Owner: flock-lab
- License: agpl-3.0
- Language: Rust
- Default Branch: master
- Homepage: https://flock-lab.github.io/flock/
- Size: 4.08 MB
Statistics
- Stars: 270
- Watchers: 11
- Forks: 37
- Open Issues: 17
- Releases: 4
Metadata Files
README.md
Flock: A Low-Cost Streaming Query Engine on FaaS Platforms
Flock is a cloud-native streaming query engine that leverages the on-demand elasticity of Function-as-a-Service (FaaS) platforms to perform real-time data analytics. Traditional server-centric deployments often suffer from resource under- or over-provisioning, leading to performance degradation or wasted resources. Flock addresses these issues with fine-grained elasticity that scales continuously to match demand on a per-query basis, and FaaS billing is metered at millisecond granularity, which makes Flock a low-cost solution for stream processing. Our approach, payload invocation, eliminates the need for both external storage services and a query coordinator in the data architecture. Our evaluation shows that Flock significantly outperforms state-of-the-art systems in terms of cost, especially on ARM processors, making it a promising solution for real-time data analytics on FaaS platforms.
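To make the millisecond-granularity billing concrete, here is a minimal sketch that estimates the compute cost of a batch of invocations from their billed durations and memory size. The function names and the price constant are assumptions for illustration only; the constant is not taken from this README, and per-request charges are ignored.

```rust
/// Price per GB-second of function compute time.
/// ASSUMPTION: illustrative value only; check the current price list for
/// your provider, region, and architecture.
const PRICE_PER_GB_SECOND: f64 = 0.0000166667;

/// Estimates the compute cost (USD) of one invocation that is billed
/// per millisecond. `invocation_cost` is a hypothetical helper.
fn invocation_cost(billed_ms: u64, memory_mb: u64) -> f64 {
    let gb = memory_mb as f64 / 1024.0;
    let seconds = billed_ms as f64 / 1000.0;
    gb * seconds * PRICE_PER_GB_SECOND
}

/// Sums the compute cost of several invocations that share one memory setting.
fn total_cost(billed_ms: &[u64], memory_mb: u64) -> f64 {
    billed_ms
        .iter()
        .map(|&ms| invocation_cost(ms, memory_mb))
        .sum()
}
```

For example, the nine REPORT lines in the Function Output log further down list billed durations of 66, 2, 2, 7, 2, 9, 2, 2, and 2 ms at 128 MB. Passing those to `total_cost` covers 94 ms of billed time, or 0.01175 GB-seconds, for the whole batch.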
The generic lambda function code is built in advance and uploaded to AWS S3.
| FaaS Service | AWS Lambda | GCP Functions | Azure Functions | Architectures | SIMD | YSB | NEXMark |
| :----------: | :--------: | :-----------: | :-------------: | :-----------: | :--: | :-: | :-----: |
| Flock | 🏅🏅🏅🏅 | 👉 TBD | 👉 TBD | Arm, x86 | ✅ | ✅ | ✅ |
Arxiv Paper
@misc{gang2023flock,
title={Flock: A Low-Cost Streaming Query Engine on FaaS Platforms},
author={Gang Liao and Amol Deshpande and Daniel J. Abadi},
year={2023},
eprint={2312.16735},
archivePrefix={arXiv},
primaryClass={cs.DB}
}
Build From Source Code
You can enable the simd feature (to use SIMD instructions) and/or the mimalloc or snmalloc feature (to use the respective allocator) by passing them via --features, for example --features "simd mimalloc".
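As an illustration of how Cargo features like these commonly select an allocator, the sketch below gates a `#[global_allocator]` on the feature names mentioned above. Whether Flock wires its allocators exactly this way is an assumption; `allocator_name` is a hypothetical helper, and the `mimalloc` and `snmalloc_rs` crates are only referenced when the matching feature is enabled.

```rust
// ASSUMPTION: a typical feature-gated allocator setup, not Flock's actual source.
// Only one allocator can be the global allocator, so mimalloc takes precedence
// here if both features are enabled.
#[cfg(feature = "mimalloc")]
#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

#[cfg(all(feature = "snmalloc", not(feature = "mimalloc")))]
#[global_allocator]
static GLOBAL: snmalloc_rs::SnMalloc = snmalloc_rs::SnMalloc;

/// Reports which allocator this build selected (hypothetical helper).
fn allocator_name() -> &'static str {
    if cfg!(feature = "mimalloc") {
        "mimalloc"
    } else if cfg!(feature = "snmalloc") {
        "snmalloc"
    } else {
        // No allocator feature enabled: Rust's default system allocator is used.
        "system"
    }
}
```

Built without any features, `allocator_name()` returns "system"; building with `--features "mimalloc"` switches the whole binary to mimalloc without touching call sites.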
To build and deploy Flock to AWS Lambda in one step, you can use the following command:
```ignore
$ ./configure -c -a x86_64
```
Output
```ignore
============================================================
 Compiling and Deploying Benchmarks
============================================================

Building x86_64-unknown-linux-gnu
[1/3] Compiling Flock Lambda Function...
[2/3] Compiling Flock CLI...
[3/3] Deploying Flock Lambda Function...

============================================================
 Upload function code to S3
============================================================

Packaging code and uploading to S3...
[OK] Upload Succeed.

============================================================
```
If you prefer to use the cargo command to build and deploy Flock, you can use the following commands:
Commands
1. Build Flock for x86_64
```ignore
$ cargo +nightly build --target x86_64-unknown-linux-gnu --release --features "simd mimalloc"
```
2. Deploy Flock binary to AWS S3
```ignore
$ cd ./target/x86_64-unknown-linux-gnu/release
$ ./flock-cli s3 put --path ./flock --key flock_x86_64
```
Client Output
```bash
[Flock ASCII-art logo]

Flock: A Practical Serverless Streaming SQL Query Engine (https://github.com/flock-lab/flock)

Copyright (c) 2020-present, UMD Data System Group.

▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒

This program is free software: you can use, redistribute, and/or modify it under the terms of the GNU Affero General Public License, version 3 or later ("AGPL"), as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see
```
Function Output
```bash
START RequestId: 78a68707-3f3d-4244-a51a-584f9432709d Version: $LATEST
[2021-12-15T15:20:56Z INFO nexmark_lambda::actor] Receiving a data packet: Uuid { tid: "q5-1639581654", seq_num: 0, seq_len: 10 }
[2021-12-15T15:20:56Z INFO nexmark_lambda::actor] Window data collection has not been completed.
END RequestId: 78a68707-3f3d-4244-a51a-584f9432709d
REPORT RequestId: 78a68707-3f3d-4244-a51a-584f9432709d Duration: 38.83 ms Billed Duration: 66 ms Memory Size: 128 MB Max Memory Used: 17 MB Init Duration: 26.32 ms
START RequestId: 23dae113-ccf3-449f-944f-116bb925daaf Version: $LATEST
[2021-12-15T15:20:56Z INFO nexmark_lambda::actor] Receiving a data packet: Uuid { tid: "q5-1639581654", seq_num: 5, seq_len: 10 }
[2021-12-15T15:20:56Z INFO nexmark_lambda::actor] Window data collection has not been completed.
END RequestId: 23dae113-ccf3-449f-944f-116bb925daaf
REPORT RequestId: 23dae113-ccf3-449f-944f-116bb925daaf Duration: 1.71 ms Billed Duration: 2 ms Memory Size: 128 MB Max Memory Used: 17 MB
START RequestId: e5e51594-5819-494c-a6d3-c9c9ed9ab865 Version: $LATEST
[2021-12-15T15:20:56Z INFO nexmark_lambda::actor] Receiving a data packet: Uuid { tid: "q5-1639581654", seq_num: 6, seq_len: 10 }
[2021-12-15T15:20:56Z INFO nexmark_lambda::actor] Window data collection has not been completed.
END RequestId: e5e51594-5819-494c-a6d3-c9c9ed9ab865
REPORT RequestId: e5e51594-5819-494c-a6d3-c9c9ed9ab865 Duration: 1.30 ms Billed Duration: 2 ms Memory Size: 128 MB Max Memory Used: 18 MB
START RequestId: def2fc0b-61da-49f8-80b4-9e49f5f4a091 Version: $LATEST
[2021-12-15T15:20:56Z INFO nexmark_lambda::actor] Receiving a data packet: Uuid { tid: "q5-1639581654", seq_num: 7, seq_len: 10 }
[2021-12-15T15:20:56Z INFO nexmark_lambda::actor] Window data collection has not been completed.
END RequestId: def2fc0b-61da-49f8-80b4-9e49f5f4a091
REPORT RequestId: def2fc0b-61da-49f8-80b4-9e49f5f4a091 Duration: 6.89 ms Billed Duration: 7 ms Memory Size: 128 MB Max Memory Used: 18 MB
START RequestId: a18c2e75-d1a4-4595-aa84-4cde90eecad4 Version: $LATEST
[2021-12-15T15:20:56Z INFO nexmark_lambda::actor] Receiving a data packet: Uuid { tid: "q5-1639581654", seq_num: 8, seq_len: 10 }
[2021-12-15T15:20:56Z INFO nexmark_lambda::actor] Window data collection has not been completed.
END RequestId: a18c2e75-d1a4-4595-aa84-4cde90eecad4
REPORT RequestId: a18c2e75-d1a4-4595-aa84-4cde90eecad4 Duration: 1.16 ms Billed Duration: 2 ms Memory Size: 128 MB Max Memory Used: 18 MB
START RequestId: 01168950-7558-4af6-9e8c-8f71c4542149 Version: $LATEST
[2021-12-15T15:20:56Z INFO nexmark_lambda::actor] Receiving a data packet: Uuid { tid: "q5-1639581654", seq_num: 9, seq_len: 10 }
[2021-12-15T15:20:56Z INFO nexmark_lambda::actor] Window data collection has not been completed.
END RequestId: 01168950-7558-4af6-9e8c-8f71c4542149
REPORT RequestId: 01168950-7558-4af6-9e8c-8f71c4542149 Duration: 8.22 ms Billed Duration: 9 ms Memory Size: 128 MB Max Memory Used: 18 MB
START RequestId: c28193bb-b818-4cf4-863c-2e9d34dd2398 Version: $LATEST
[2021-12-15T15:20:58Z INFO nexmark_lambda::actor] Receiving a data packet: Uuid { tid: "q5-1639581654", seq_num: 1, seq_len: 10 }
[2021-12-15T15:20:58Z INFO nexmark_lambda::actor] Window data collection has not been completed.
END RequestId: c28193bb-b818-4cf4-863c-2e9d34dd2398
REPORT RequestId: c28193bb-b818-4cf4-863c-2e9d34dd2398 Duration: 1.18 ms Billed Duration: 2 ms Memory Size: 128 MB Max Memory Used: 18 MB
START RequestId: 46e54b2e-91da-4609-9f54-f152d38681c7 Version: $LATEST
[2021-12-15T15:20:58Z INFO nexmark_lambda::actor] Receiving a data packet: Uuid { tid: "q5-1639581654", seq_num: 2, seq_len: 10 }
[2021-12-15T15:20:58Z INFO nexmark_lambda::actor] Window data collection has not been completed.
END RequestId: 46e54b2e-91da-4609-9f54-f152d38681c7
REPORT RequestId: 46e54b2e-91da-4609-9f54-f152d38681c7 Duration: 1.15 ms Billed Duration: 2 ms Memory Size: 128 MB Max Memory Used: 18 MB
START RequestId: 2b1c1fe0-9556-4849-8251-39ede796f0f0 Version: $LATEST
[2021-12-15T15:20:58Z INFO nexmark_lambda::actor] Receiving a data packet: Uuid { tid: "q5-1639581654", seq_num: 3, seq_len: 10 }
[2021-12-15T15:20:58Z INFO nexmark_lambda::actor] Window data collection has not been completed.
END RequestId: 2b1c1fe0-9556-4849-8251-39ede796f0f0
REPORT RequestId: 2b1c1fe0-9556-4849-8251-39ede796f0f0 Duration: 1.08 ms Billed Duration: 2 ms Memory Size: 128 MB Max Memory Used: 18 MB
START RequestId: 78c64a1a-b312-4099-b596-541c078b04b7 Version: $LATEST
[2021-12-15T15:20:58Z INFO nexmark_lambda::actor] Receiving a data packet: Uuid { tid: "q5-1639581654", seq_num: 4, seq_len: 10 }
[2021-12-15T15:20:58Z INFO nexmark_lambda::actor] Received all data packets for the window: "q5-1639581654"
[2021-12-15T15:20:58Z INFO nexmark_lambda::actor]
+---------+-----+
| auction | num |
+---------+-----+
| 1500    | 841 |
+---------+-----+
```
Advanced Usage
flock-cli has a number of advanced features that can be used to control and customize the behavior of Flock.
For example, to delete all functions, you can use the flock-cli lambda -D command. Or use the flock-cli lambda -d <function pattern> command to delete specific functions. To list all functions, use the flock-cli lambda -L command.
To see the help for the nexmark run command, issue the command: flock-cli nexmark run -h
```ignore
Runs the NEXMark Benchmark

USAGE:
    flock-cli nexmark run [OPTIONS]

OPTIONS:
    -a, --async-type
            Runs the NEXMark benchmark with async function invocations

    -e, --events-per-second <events per second>
            Runs the NEXMark benchmark with a number of events per second [default: 1000]

    -g, --generators <data generators>
            Runs the NEXMark benchmark with a number of data generators [default: 1]

    -h, --help
            Print help information

        --log-level <log-level>
            Log level [default: info] [possible values: error, warn, info, debug, trace, off]

    -m, --memory-size <memory size>
            Sets the memory size (MB) for the worker function [default: 128]

    -q, --query <query number>
            Sets the NEXMark benchmark query number [default: 3] [possible values: 0, 1, 2, 3, 4, 5,
            6, 7, 8, 9, 10, 11, 12, 13]

    -r, --arch <architecture>
            Sets the architecture for the worker function [default: x86_64] [possible values:
            x86_64, arm64]

    -s, --seconds <duration>
            Runs the NEXMark benchmark for a number of seconds [default: 20]

        --silent
            Suppress all output

    -t, --data-sink-type <data sink type>
            Runs the NEXMark benchmark with a data sink type [default: blackhole] [possible values:
            sqs, s3, dynamodb, efs, blackhole]

        --trace
            Log ultra-verbose (trace level) information
```
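A benchmark run produces worker logs like those under Function Output: packets tagged with `tid`, `seq_num`, and `seq_len` arrive out of order, and the query result is emitted only once all `seq_len` packets of a window have been received. The sketch below shows one way such bookkeeping could work. `PacketId` and `WindowCollector` are hypothetical names that mirror the logged fields; this is not Flock's actual implementation.

```rust
use std::collections::HashMap;

/// Mirrors the fields printed in the log (`Uuid { tid, seq_num, seq_len }`).
/// ASSUMPTION: a stand-in type for illustration, not Flock's own struct.
#[derive(Debug, Clone, PartialEq, Eq)]
struct PacketId {
    tid: String,
    seq_num: usize,
    seq_len: usize,
}

/// Tracks which packets of each window (keyed by `tid`) have arrived.
#[derive(Default)]
struct WindowCollector {
    received: HashMap<String, Vec<bool>>,
}

impl WindowCollector {
    /// Records a packet and returns `true` once every packet of its
    /// window has been seen, regardless of arrival order.
    fn receive(&mut self, id: &PacketId) -> bool {
        let slots = self
            .received
            .entry(id.tid.clone())
            .or_insert_with(|| vec![false; id.seq_len]);
        if id.seq_num < slots.len() {
            // Marking a slot twice is harmless, so duplicate deliveries are tolerated.
            slots[id.seq_num] = true;
        }
        let complete = slots.iter().all(|&seen| seen);
        if complete {
            // Drop the bookkeeping once the window can be processed.
            self.received.remove(&id.tid);
        }
        complete
    }
}
```

Feeding the collector the arrival order from the log (0, 5, 6, 7, 8, 9, 1, 2, 3, 4 with `seq_len` 10) returns `false` for the first nine packets and `true` for the last, matching the point at which the log reports "Received all data packets for the window".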
License
Copyright (c) 2020-present UMD Database Group. The library, examples, and all source code are released under the AGPL-3.0 License.
Owner
- Name: flock-lab
- Login: flock-lab
- Kind: organization
- Repositories: 1
- Profile: https://github.com/flock-lab
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Liao"
    given-names: "Gang"
  - family-names: "Abadi"
    given-names: "Daniel"
title: "Flock: A Practical Serverless Streaming SQL Query Engine"
url: "https://github.com/flock-lab/flock"
GitHub Events
Total
- Watch event: 21
- Fork event: 2
Last Year
- Watch event: 21
- Fork event: 2
Dependencies
- base64 0.13.0
- chrono 0.4.19
- env_logger ^0.9
- futures 0.3
- humantime 2.1.0
- itertools 0.10.0
- lazy_static 1.4
- log 0.4.14
- openssl 0.10.32
- rand 0.8.3
- reqwest 0.11.7
- serde_json 1.0
- tokio 1.4
- async-trait 0.1.42
- aws_lambda_events 0.6
- base64 0.13.0
- bytes 1.0.1
- chrono 0.4.19
- env_logger ^0.9
- fake 2.4
- filetime 0.2
- fixedbitset 0.4.0
- futures 0.3.12
- glob 0.3
- hashbrown 0.12
- humantime 2.1.0
- indoc 1.0.3
- itertools 0.10.0
- json 0.12.4
- lazy_static 1.4
- log 0.4.14
- lz4 1.23.1
- mimalloc 0.1
- num_cpus 1.13.0
- openssl 0.10.32
- rand 0.8.3
- rayon 1.5
- regex 1.4.3
- remove_dir_all 0.7
- rust-ini 0.18
- serde 1.0
- serde_bytes 0.11
- serde_json 1.0
- snap 1.0.3
- snmalloc-rs 0.2
- sqlparser 0.14.0
- text_io 0.1.8
- tokio 1.4
- typetag 0.1.8
- url 2.0
- uuid 0.8.2
- zstd 0.9.0+zstd.1.5.0
- anyhow 1.0.51
- clap 3.0.0
- ctrlc 3.1.1
- env_logger ^0.9
- futures 0.3.12
- lazy_static 1.4.0
- log 0.4.14
- rust-ini 0.18
- rustyline 9.0.0
- sqlparser 0.14.0
- tokio 1.4
- zip 0.5.12
- async-trait 0.1.42
- aws_lambda_events 0.6
- base64 0.13.0
- bytes 1.1.0
- chrono 0.4.19
- env_logger ^0.9
- futures 0.3.12
- itertools 0.10.0
- lazy_static 1.4
- log 0.4.14
- mimalloc 0.1
- openssl 0.10.32
- rand 0.8.3
- rayon 1.5
- serde_json 1.0
- snmalloc-rs 0.2
- text_io 0.1.8
- tokio 1.4
- uuid 0.8.2
- async-trait 0.1.42
- aws_lambda_events 0.6
- chrono 0.4.19
- env_logger ^0.9
- futures 0.3.12
- indoc 1.0.3
- itertools 0.10.0
- lazy_static 1.4
- log 0.4.14
- mimalloc 0.1
- rayon 1.5
- serde 1.0
- serde_json 1.0
- snmalloc-rs 0.2
- tokio 1.4
- typetag 0.1.8
- anyhow 1.0.51
- clap 3.0.0
- env_logger ^0.9
- humantime 2.1.0
- itertools 0.10.0
- log 0.4.14
- Bogdanp/setup-racket v0.10 composite
- actions/checkout main composite
- actions/download-artifact v2 composite
- actions/upload-artifact v2 composite
- peaceiris/actions-gh-pages v3 composite
- actions-rs/toolchain v1 composite
- actions/cache v2 composite
- actions/checkout v2 composite
- dorny/paths-filter v2 composite
- actions-rs/toolchain v1 composite
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/download-artifact v2 composite
- actions/upload-artifact v2 composite
- softprops/action-gh-release v1 composite