https://github.com/awslabs/barometer
A tool to automate analytic platform evaluations. Barometer helps customers get the data points needed for service selection and configuration for a given workload
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json file)
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 9.2%, to scientific vocabulary)
Keywords
Repository
A tool to automate analytic platform evaluations. Barometer helps customers get the data points needed for service selection and configuration for a given workload
Basic Info
Statistics
- Stars: 19
- Watchers: 4
- Forks: 2
- Open Issues: 9
- Releases: 0
Topics
Metadata Files
README.md
Barometer
A tool to automate analytic platform evaluations
Barometer helps customers get the data points needed for service selection and configuration for a given workload.
The Barometer tool was created by the AWS Prototyping team (EMEA).
📋 Table of contents
- Description
- Use cases
- Pre-requisites
- Installing
- Deployment
- Quickstart
- Run Benchmark Only
- Bring your own workload
- Architecture
- Cleanup
- See Also
🔰 Description
Barometer deploys a CDK stack that is used to run benchmarking experiments. An experiment is a combination of a platform and a workload, which can be defined using the cli-wizard provided by the Barometer tool. See the Quickstart for an example of running an experiment.
🛠 Use cases
- Comparison of service performance: Redshift vs Redshift Serverless
- Comparison of configurations: Redshift dc2 vs ra3 node type
- Performance impact of feature: Redshift AQUA vs Redshift WLM
- Right tool for the job selection: Athena vs Redshift for your workload
- Registering your custom platform: Redshift vs My Own Database
- Registering your custom workload: My own dataset vs Redshift
- Run benchmarking only on my platform
- Bring your own workload (dataset, ddl and queries to benchmark)
Barometer supports the combinations below as experiments.
Supported platforms:
Supported workloads:
🎒 Pre-requisites
- Docker: Install the Docker service and Docker CLI. This tool uses Docker to build the image and run containers.
- A minimum of 2 GB of disk space for building and deploying the Docker image
🚀 Installing
Clone this repository and run `docker build -t barometer .` in the `barometer` directory (the root of the git project), as shown below.
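A minimal sketch of the clone-and-build steps, assuming the repository URL shown at the top of this page and a locally running Docker daemon:
```shell
# Clone the repository and build the barometer image from its root directory
git clone https://github.com/awslabs/barometer.git
cd barometer
docker build -t barometer .
```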
🎮 Deployment
- Run the command below to deploy `barometer` to your AWS account.
```shell
# Example 1: Passing local AWS credentials to the docker container for deployment (deploying in the eu-west-1 region)
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock -v ~/.aws:/root/.aws barometer deploy eu-west-1

# Example 2: Using an AWS profile (e.g. dev) to deploy
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock -v ~/.aws:/root/.aws -e AWS_PROFILE=dev barometer deploy eu-west-1

# Example 3: Passing the AWS region as an environment variable
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock \
  -v ~/.aws:/root/.aws -e AWS_PROFILE=dev \
  -e AWS_REGION=eu-west-1 barometer deploy

# Example 4: Using an AWS access key id and secret access key to deploy (with an optional session token for temporary credentials)
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock \
  -e AWS_ACCESS_KEY_ID=<access-key-id> -e AWS_SECRET_ACCESS_KEY=<secret-access-key> \
  -e AWS_SESSION_TOKEN=<session-token> barometer deploy eu-west-1
```
- Run the command below to start the `cli-wizard` once `barometer` is successfully deployed to your AWS account.
```shell
# Example 1: Passing local AWS credentials to the docker container for running the wizard (deployed in the eu-west-1 region)
docker run -it -v /var/run/docker.sock:/var/run/docker.sock -v ~/.aws:/root/.aws \
  --name barometer-wizard \
  barometer wizard eu-west-1

# Example 2: Using an AWS profile (e.g. dev) to run the wizard
docker run -it -v /var/run/docker.sock:/var/run/docker.sock -v ~/.aws:/root/.aws -e AWS_PROFILE=dev \
  --name barometer-wizard \
  barometer wizard eu-west-1

# Example 3: Using an AWS access key id and secret access key to run the wizard (with an optional session token for temporary credentials)
docker run -it -v /var/run/docker.sock:/var/run/docker.sock \
  -e AWS_ACCESS_KEY_ID=<access-key-id> -e AWS_SECRET_ACCESS_KEY=<secret-access-key> \
  -e AWS_SESSION_TOKEN=<session-token> --name barometer-wizard barometer wizard eu-west-1

# Example 4: Reusing wizard configurations
docker start -ia barometer-wizard

# Example 5: Persisting wizard configurations
docker run -it -v /var/run/docker.sock:/var/run/docker.sock -v ~/.aws:/root/.aws \
  -v ~/storage:/build/cli-wizard/storage \
  --name barometer-wizard \
  barometer wizard eu-west-1
```
🎬 Quickstart

Run benchmark only
This option can be used as *Benchmark your own platform* or *Bring your own platform*.
You can directly benchmark any database with this option. The option is available under Manage Experiments > Run benchmarking only. Depending on where the database is hosted, follow the steps below as prerequisites to use the run-benchmarking-only option.

If the database and Barometer are in the same VPC
- Create a new Secrets Manager secret with values in the JSON format defined below (a CLI sketch follows this list). All properties are case-sensitive and required except `dbClusterIdentifier`.
```json
{
  "username": "database-user",
  "password": "*******",
  "engine": "redshift",
  "host": "my-database-host.my-domain.com",
  "port": 5439,
  "dbClusterIdentifier": "redshift-cluster-1",
  "dbname": "dev"
}
```
- Add a tag to the secret with Tag Name = `ManagedBy` and Tag Value = `BenchmarkingStack`. This gives Barometer permission to use the secret.
- Upload your benchmarking queries to the `DataBucket` (the bucket created by the BenchmarkingStack, available as an Output) in a new folder with any name (for example: `my-benchmarking-queries`). Note: the queries can have any name and will be executed in sorted order of their names.
```
s3://benchmarkingstack-databucket-random-id
my-benchmarking-queries
|
| +-- query1.sql
| +-- query2.sql
```
- Allow network connections from `QueryRunnerSG` (available as an Output of the BenchmarkingStack) to your database's security group.
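For reference, a minimal AWS CLI sketch of the three prerequisites above. The secret name, database security group id, `QueryRunnerSG` id, port, and the exact `DataBucket` name are placeholders you would substitute with your own values and the BenchmarkingStack outputs:
```shell
# Create the database secret with the required ManagedBy=BenchmarkingStack tag
aws secretsmanager create-secret \
  --name my-database-secret \
  --secret-string '{"username":"database-user","password":"*******","engine":"redshift","host":"my-database-host.my-domain.com","port":5439,"dbClusterIdentifier":"redshift-cluster-1","dbname":"dev"}' \
  --tags Key=ManagedBy,Value=BenchmarkingStack

# Upload benchmarking queries to a folder in the DataBucket created by the BenchmarkingStack
aws s3 cp query1.sql s3://benchmarkingstack-databucket-random-id/my-benchmarking-queries/query1.sql
aws s3 cp query2.sql s3://benchmarkingstack-databucket-random-id/my-benchmarking-queries/query2.sql

# Allow inbound traffic from QueryRunnerSG to the database security group (ids are placeholders)
aws ec2 authorize-security-group-ingress \
  --group-id sg-database-placeholder \
  --protocol tcp --port 5439 \
  --source-group sg-queryrunner-placeholder
```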
If the database and Barometer are not in the same VPC
In addition to steps 1, 2 and 3 above (both in the same VPC), follow the steps below to establish a VPC peering connection between the BenchmarkingVPC and the VPC where the database is hosted (a CLI sketch follows the list).
- Go to the VPC console > Peering connections menu in the left navigation
- Create a new peering connection selecting both VPCs (BenchmarkingVPC and DatabaseVPC)
- Accept the peering connection request from the Actions menu
- Go to VPC > Route tables and select any route table associated with a BenchmarkingStack subnet
- Add a new route with Destination = the CIDR range of the DatabaseVPC and Target = the peering connection id (starts with `pcx-`)
- Repeat steps 4 and 5 for the route table associated with the second BenchmarkingStack subnet
- Go to VPC > Route tables and select the route table associated with the DatabaseVPC subnet (if using the default VPC, select the only route table available)
- Add a new route with Destination = `10.0.0.0/16` and Target = the peering connection id (starts with `pcx-`)
- Follow the last step (allow network connection) from *If the database and Barometer are in the same VPC* above.
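For reference, a minimal AWS CLI sketch of the peering steps above. All VPC, route table, and peering-connection ids and the DatabaseVPC CIDR are placeholders; only the `10.0.0.0/16` destination comes from the steps above:
```shell
# 1. Create and accept the peering connection between the two VPCs
aws ec2 create-vpc-peering-connection \
  --vpc-id vpc-benchmarking-placeholder \
  --peer-vpc-id vpc-database-placeholder
aws ec2 accept-vpc-peering-connection \
  --vpc-peering-connection-id pcx-placeholder

# 2. Route from a BenchmarkingStack route table to the DatabaseVPC CIDR (placeholder CIDR)
aws ec2 create-route \
  --route-table-id rtb-benchmarking-placeholder \
  --destination-cidr-block 172.31.0.0/16 \
  --vpc-peering-connection-id pcx-placeholder

# 3. Route back from the DatabaseVPC route table to the BenchmarkingVPC CIDR
aws ec2 create-route \
  --route-table-id rtb-database-placeholder \
  --destination-cidr-block 10.0.0.0/16 \
  --vpc-peering-connection-id pcx-placeholder
```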
Bring your own workload (BYOW)
You can bring your own workload to Barometer for benchmarking. In this context, a workload is defined as files arranged in a specific structure in your S3 bucket. To bring your own workload for benchmarking, follow the steps below as prerequisites.

- Prepare the workload in your S3 bucket. It should contain the folder structure defined below. You can create a folder with the name of your workload (e.g. `my-workload`) at any level in your S3 bucket. The root of your workload folder should have three sub-directories called `volumes`, `ddl` and `benchmarking-queries`.
  - `volumes` sub-directory: contains the scale factors for your workload. For example, your workload may have datasets available at `1gb`, `50gb` and `1tb` scales. You can create as many scale factors as you want, with a minimum of one. Within each scale-factor sub-directory you should have a directory matching the table name, with all table data in `.parquet` format under it.
  - `ddl` sub-directory: contains DDL scripts to create the tables for the platform in question. For example, DDL scripts for the `redshift` platform should go under the redshift folder, and DDL specific to `mysql` should be placed under its own directory matching the platform name. You can place more than one DDL script; they will be executed in order of their names.
  - `benchmarking-queries` sub-directory: contains benchmarking queries for the platform in question. You can place more than one benchmarking-query file; they will be executed in order of their names per user session.
```
Requires my-workload (can be any name) to follow convention on s3 bucket
my-workload
| +-- volumes
| | +-- 1gb
| | | +-- table_name_1
| | | | +-- file-1.parquet
| | | | +-- file-2.parquet
| | | +-- table_name_2
| | | | +-- file-1.parquet
| | | | +-- file-2.parquet
| +-- ddl
| | +-- redshift
| | | +-- ddl.query1.sql
| | | +-- ddl.query2.sql
| | +-- mysql
| | | +-- ddl.query.sql
| +-- benchmarking-queries
| | +-- redshift
| | | +-- query1.sql
| | | +-- query2.sql
| | +-- mysql
| | | +-- query1.sql
| | | +-- query2.sql
```
- Run the cli-wizard and go to `Manage workload > Add new workload` to import your workload. The wizard will validate the structure and import the workload if validation is successful.
- The wizard will print a `bucket policy` while importing your workload. Please update your S3 bucket's bucket policy with the printed one. (An upload sketch follows this list.)
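A minimal sketch of uploading a workload that follows the structure above; the bucket name and local `my-workload` folder are placeholders:
```shell
# Upload the local workload folder to S3, preserving the volumes/ddl/benchmarking-queries layout
aws s3 sync ./my-workload s3://my-workload-bucket/my-workload/

# Verify that the three required sub-directories are present
aws s3 ls s3://my-workload-bucket/my-workload/
```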
BYOW sample
In this project you can find a BYOW example (the custom-workload directory). You can create the same structure as described above by copying these three directories (SQL and DDL statements, and the dataset) to your S3 bucket. After this, you can run this workload using the Barometer cli-wizard, configuring it as a "BYOW from S3" workload.
custom-workload/benchmarking-queries/redshift
- Contains 5 OLAP-like SQL queries (.sql files).
- Disables the Redshift query results cache.
- Tags the sessions for better monitoring.
custom-workload/ddl/redshift
- Creates three tables: one fact table and two dimension tables.
- Doesn't specify any distribution styles or sort keys; Redshift will create these automatically based on the workloads. You're free to change these to analyze their query plans and performance.
custom-workload/volumes/small
- A small (less than 30 MB) dataset containing the data for the three tables above in Apache Parquet format.
Architecture
User flow

- The user deploys the Barometer Benchmarking Stack
- The Barometer Benchmarking Stack creates the infrastructure & Step Functions workflows
- The user uses the cli-wizard to define & run experiments, which internally triggers the experiment-runner workflow
- The workflow deploys, benchmarks & destroys the platform (an additional CloudFormation stack to deploy the service, e.g. a Redshift cluster)
- The workflow creates a persistent dashboard registering metrics
- The user uses this dashboard to compare benchmarking results
Detailed architecture for Redshift platform

Cleanup
- To clean up any platform, delete the stack whose name starts with the platform name (see the sketch below). Example: `redshift-xyz`
- Go to the CloudFormation service and delete the stack named `BenchmarkingStack` (or run `cdk destroy` from the `cdk-stack` folder)
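For reference, a minimal AWS CLI sketch of the cleanup steps; `redshift-xyz` is the example platform stack name from above and should be replaced with your actual stack name:
```shell
# Delete the platform stack (name starts with the platform name)
aws cloudformation delete-stack --stack-name redshift-xyz

# Delete the main stack, or alternatively run `cdk destroy` from the cdk-stack folder
aws cloudformation delete-stack --stack-name BenchmarkingStack
```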
👀 See Also
- Architectural & design concepts driving this project
- Benchmarking Stack infrastructure
- Cli Wizard
- How to add new platform support
- How to add new workload support
Owner
- Name: Amazon Web Services - Labs
- Login: awslabs
- Kind: organization
- Location: Seattle, WA
- Website: http://amazon.com/aws/
- Repositories: 914
- Profile: https://github.com/awslabs
AWS Labs
GitHub Events
Total
Last Year
Issues and Pull Requests
Last synced: almost 2 years ago
All Time
- Total issues: 3
- Total pull requests: 20
- Average time to close issues: N/A
- Average time to close pull requests: 10 days
- Total issue authors: 1
- Total pull request authors: 3
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 14
- Bot issues: 0
- Bot pull requests: 18
Past Year
- Issues: 0
- Pull requests: 12
- Average time to close issues: N/A
- Average time to close pull requests: 7 days
- Issue authors: 0
- Pull request authors: 3
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 6
- Bot issues: 0
- Bot pull requests: 10
Top Authors
Issue Authors
- anandshah123 (3)
Pull Request Authors
- dependabot[bot] (17)
- anandshah123 (1)
- badogan (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- com.amazonaws:aws-java-sdk-bom 1.12.178 import
- com.amazon.redshift:redshift-jdbc42 2.1.0.5 provided
- com.amazonaws.secretsmanager:aws-secretsmanager-jdbc 1.0.7
- com.amazonaws:aws-java-sdk-cloudwatchmetrics
- com.amazonaws:aws-java-sdk-s3
- com.amazonaws:aws-lambda-java-core 1.2.1
- com.amazonaws:aws-lambda-java-log4j2 1.5.1
- com.google.code.gson:gson 2.9.0
- 441 dependencies
- @aws-cdk/assert 1.151.0 development
- @types/adm-zip ^0.4.34 development
- @types/jest ^27.4.1 development
- @types/node ^17.0.23 development
- aws-cdk ^1.151.0 development
- jest ^27.3.1 development
- ts-jest ^27.1.4 development
- ts-node ^10.7.0 development
- typescript ~4.6.3 development
- @aws-cdk/aws-dynamodb 1.151.0
- @aws-cdk/aws-ec2 1.151.0
- @aws-cdk/aws-ecs 1.151.0
- @aws-cdk/aws-iam 1.151.0
- @aws-cdk/aws-kms 1.151.0
- @aws-cdk/aws-lambda 1.151.0
- @aws-cdk/aws-s3 1.151.0
- @aws-cdk/aws-sns 1.151.0
- @aws-cdk/aws-sns-subscriptions 1.151.0
- @aws-cdk/aws-stepfunctions 1.151.0
- @aws-cdk/aws-stepfunctions-tasks 1.151.0
- @aws-cdk/core 1.151.0
- source-map-support ^0.5.21
- 636 dependencies
- @aws-cdk/assert 1.150.0 development
- @testing-library/jest-dom ^5.14.1 development
- @testing-library/react ^12.1.0 development
- @testing-library/user-event ^13.2.1 development
- @types/inquirer ^8.2.0 development
- @types/jest ^27.0.1 development
- @types/node ^17.0.23 development
- @typescript-eslint/eslint-plugin ^5.17.0 development
- @typescript-eslint/parser ^5.17.0 development
- aws-cdk ^1.150.0 development
- esbuild ^0.12.28 development
- eslint ^7.32.0 development
- jest ^27.2.0 development
- ts-jest ^27.0.5 development
- ts-node ^10.2.1 development
- typescript ~4.6.3 development
- @aws-cdk/aws-athena ^1.150.0
- @aws-cdk/aws-redshift ^1.150.0
- @aws-cdk/core 1.150.0
- @aws-sdk/client-cloudformation ^3.58.0
- @aws-sdk/client-lambda ^3.58.0
- @aws-sdk/client-s3 ^3.58.0
- @aws-sdk/client-sfn ^3.58.0
- inquirer ^8.2.2
- joi ^17.4.2
- open ^8.4.0
- source-map-support ^0.5.21
- uuid ^8.3.2
- alpine 3.15 build
- maven 3-openjdk-8 build
- node 17-alpine build
- public.ecr.aws/lambda/java 8.al2 build