https://github.com/awslabs/sagemaker-deep-demand-forecast

Using Deep Learning for Demand Forecasting with Amazon SageMaker

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links (links to: arxiv.org)
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 14.3%, to scientific vocabulary)

Keywords

aws-sagemaker deep-learning demand-forecast forecasting gluonts time-series

Last synced: 5 months ago · JSON representation

Repository

Using Deep Learning for Demand Forecasting with Amazon SageMaker

Basic Info

Host: GitHub
Owner: awslabs
License: apache-2.0
Language: Python
Default Branch: mainline
Homepage:
Size: 14.4 MB

Statistics

Stars: 85
Watchers: 10
Forks: 28
Open Issues: 2
Releases: 0

Topics

aws-sagemaker deep-learning demand-forecast forecasting gluonts time-series

Created almost 6 years ago · Last pushed over 3 years ago

Metadata Files

Readme Contributing License Code of conduct

Deep Demand Forecasting with Amazon SageMaker

This project provides an end-to-end solution for the **Demand Forecasting** task using a state-of-the-art *Deep Learning* model, [LSTNet](https://arxiv.org/abs/1703.07015), available in [GluonTS](https://github.com/awslabs/gluon-ts), together with [Amazon SageMaker](https://aws.amazon.com/sagemaker/).

Overview

What Does the Input Data Look Like?

The input data is a **multivariate time series**. One example is the hourly [electricity consumption](https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014) of 321 users over a period of 41 months. Here is a snapshot of the normalized data:

[Figure: sample of the normalized data]

How to Prepare Your Data to Feed into the Model?

The notebook includes an example of how to feed your time-series data to GluonTS. To convert CSV data or other formats to the GluonTS format, see the Customization section.
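As a rough illustration of the conversion mentioned above, the sketch below turns a wide CSV (one column per series) into dictionaries with `start` and `target` fields, which is the entry shape GluonTS datasets use. The column name `timestamp`, the helper `csv_to_gluonts_entries`, and the sample values are assumptions made for this example; the repository's own `data.py` may prepare the data differently.

```python
import csv
import io


def csv_to_gluonts_entries(csv_text, timestamp_column="timestamp"):
    """Convert a wide CSV (one column per series) into GluonTS-style entries.

    Hypothetical helper: each entry holds the series name, the first
    timestamp, and the list of observed values.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return []
    start = rows[0][timestamp_column]
    series_names = [name for name in rows[0] if name != timestamp_column]
    return [
        {
            "item_id": name,
            "start": start,
            "target": [float(row[name]) for row in rows],
        }
        for name in series_names
    ]


# Made-up sample data, for illustration only.
sample_csv = """timestamp,user_1,user_2
2014-01-01 00:00:00,0.10,0.40
2014-01-01 01:00:00,0.20,0.50
2014-01-01 02:00:00,0.30,0.60
"""

entries = csv_to_gluonts_entries(sample_csv)
```

With GluonTS installed, entries of this shape can typically be wrapped with `gluonts.dataset.common.ListDataset(entries, freq="1H")`; the frequency string here is an assumption that matches hourly data.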

What Are the Outputs?

A trained LSTNet model
A SageMaker endpoint that predicts future (multivariate) values over a given prediction interval

For example, we can estimate the hourly electricity consumption of 321 users for the coming week.
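To make the endpoint output more concrete, here is a minimal sketch of how a client might call a deployed endpoint through the SageMaker runtime API. The payload schema (`start`, `freq`, `target`) and the helper names are assumptions for illustration; the actual request format is defined by the repository's `inference.py` and may differ. The `invoke_endpoint` call itself is the standard boto3 `sagemaker-runtime` operation.

```python
import json

# One week of hourly predictions, as in the electricity example above.
HOURS_PER_WEEK = 7 * 24


def build_request(history, start, freq="1H"):
    """Serialize recent observations into a JSON payload.

    Hypothetical schema: `history` is a list of per-series value lists.
    """
    return json.dumps({"start": start, "freq": freq, "target": history})


def predict(endpoint_name, payload, runtime_client=None):
    """Send the payload to a SageMaker endpoint and decode the JSON reply."""
    if runtime_client is None:
        # Imported lazily so build_request works without AWS dependencies.
        import boto3

        runtime_client = boto3.client("sagemaker-runtime")
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=payload,
    )
    return json.loads(response["Body"].read())
```

Passing the client as a parameter keeps the function easy to exercise locally with a stub before pointing it at a real endpoint.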

What Algorithm is Used?

We have implemented LSTNet which is a state-of-the-art Deep Learning model and is available in GluonTS.

What is the Estimated Cost?

Running the solution end-to-end costs less than $5 USD. Make sure you read the Cleaning Up section before you finish.

What Does the Data Flow Look Like?

[Figure: data flow]

Solution Details

Demand forecasting uses historical time-series data to help streamline the supply-demand decision-making process across businesses. Examples include predicting:

The number of customer representatives to hire for multiple locations in the next month
Product sales across multiple regions in the next quarter
Cloud server usage for the next day for a video streaming service
Electricity consumption for multiple regions over the next week
Readings from IoT devices and sensors, such as energy consumption

Deep Learning for Time Series Forecasting

The status quo approaches for time-series forecasting include:

Auto Regressive Integrated Moving Average (ARIMA) for univariate time-series data and
Vector Auto-Regression (VAR) for multi-variate time-series data

These methods often require tedious data preprocessing and feature generation before model training. A main advantage of Deep Learning (DL) methods such as LSTNet is that they automate this feature generation step, for example by incorporating data normalization, lags, different time scales, some categorical data, and the handling of missing values. They also offer better predictive power and fast GPU-enabled training and deployment.
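To illustrate the kind of manual feature generation that classical pipelines tend to need, the sketch below builds lagged features by hand for a single series. The function name `make_lag_features` and the sample values are made up for this example; this is not code from the repository.

```python
def make_lag_features(series, lags):
    """Build (features, targets) pairs from a univariate series.

    Each feature row holds the values observed `lag` steps before the
    target, for every lag in `lags`.
    """
    max_lag = max(lags)
    features, targets = [], []
    for t in range(max_lag, len(series)):
        features.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return features, targets


# Made-up hourly load values, for illustration only.
hourly_load = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]
features, targets = make_lag_features(hourly_load, lags=[1, 2, 3])
```

With a multivariate dataset this step has to be repeated and tuned per series, which is the overhead that models such as LSTNet aim to absorb into training.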

Please check out our blog post for more details.

Getting Started

You will need an AWS account to use this solution. Sign up for an account here.

To run this JumpStart 1P Solution and have the infrastructure deploy to your AWS account you will need to create an active SageMaker Studio instance (see Onboard to Amazon SageMaker Studio). When your Studio instance is Ready, use the instructions in SageMaker JumpStart to 1-Click Launch the solution.

The solution artifacts are included in this GitHub repository for reference.

Note: Solutions are available in most regions, including us-west-2 and us-east-1.

Caution: Cloning this GitHub repository and running the code manually could lead to unexpected issues! Use the AWS CloudFormation template instead. You'll get an Amazon SageMaker Notebook instance that has been correctly set up and configured to access the other resources in the solution.

Contents

cloudformation/
- deep-demand-forecast.yaml: The root CloudFormation nested stack, which creates the AWS stack for this solution
- deep-demand-forecast-sagemaker-notebook-instance.yaml: Creates the SageMaker notebook instance
- deep-demand-forecast-permissions.yaml: Manages all the permissions necessary to launch the stack
- deep-demand-forecast-endpoint.yaml: Creates the demo endpoint used in demo.ipynb
- solution-assistant: Deletes the created resources, such as the endpoint and S3 bucket, during cleanup
src/
- preprocess/
  - container/: Builds and registers the preprocessing container image in Amazon ECR
    - Dockerfile: Docker container config
    - build_and_push.sh: Build and push bash script used in deep-demand-forecast.ipynb
    - requirements.txt: Dependencies for preprocess.py
  - container_build/: Uses CodeBuild to build the container for ECR
  - preprocess.py: Preprocessing script
- deep_demand_forecast/: Contains the train and inference code
  - train.py: SageMaker train code
  - inference.py: SageMaker inference code
  - data.py: GluonTS data preparation
  - metrics.py: A training metric
  - monitor.py: Prepares results for visualization
  - utils.py: Helper functions
  - requirements.txt: Dependencies for the SageMaker MXNet Estimator
- demo.ipynb: Demo notebook to quickly get some predictions from the demo endpoint
- deep-demand-forecast.ipynb: See below

What Does `deep-demand-forecast.ipynb` Offer?

The notebook trains an LSTNet estimator on electricity consumption data, a multivariate time-series dataset that records electricity consumption (in kW) at a 15-minute frequency from 2011-01-01 to 2014-05-26. We assess model performance by plotting the MASE metric against sMAPE.
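For readers unfamiliar with the two metrics, the following sketch computes them from scratch using their common definitions: sMAPE as the mean of 2|y - ŷ| / (|y| + |ŷ|), and MASE as the forecast's mean absolute error divided by the in-sample error of a seasonal naive forecast. The function names, the `season_length` parameter, and the sample numbers are assumptions for illustration; the notebook relies on GluonTS for evaluation, whose implementation details may differ.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error (range 0 to 2)."""
    terms = [
        2.0 * abs(a - f) / (abs(a) + abs(f))
        for a, f in zip(actual, forecast)
        if abs(a) + abs(f) > 0  # skip points where both values are zero
    ]
    return sum(terms) / len(terms) if terms else 0.0


def mase(actual, forecast, history, season_length=1):
    """Mean absolute scaled error.

    The scale is the mean absolute error of a seasonal naive forecast
    on the training history.
    """
    naive_errors = [
        abs(history[t] - history[t - season_length])
        for t in range(season_length, len(history))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale


# Made-up values, for illustration only.
history = [10.0, 12.0, 14.0, 16.0]
actual = [18.0, 20.0]
forecast = [17.0, 22.0]

mase_value = mase(actual, forecast, history)
smape_value = smape(actual, forecast)
```

A MASE below 1 means the forecast beats the naive baseline on average, which makes it a convenient companion to the scale-free sMAPE.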

Finally, we deploy an endpoint for the trained model and interactively assess its performance by comparing the train data, the test data, and the predictions.

[Figure: interactive comparison]

In this example, retraining with more epochs would likely improve model performance, after which the model can be redeployed.

Architecture Overview

Here is the architecture of the end-to-end training and deployment process:

[Figure: solution architecture]

The input data, located in an Amazon S3 bucket
The provided SageMaker notebook, which gets the input data and launches the later stages below
A preprocessing step that normalizes the input data. We use a SageMaker Processing job designed as a microservice, which lets users build and register their own Docker image via Amazon ECR and run the job using Amazon SageMaker
Training an LSTNet model on the preprocessed data from the previous step and evaluating its results using Amazon SageMaker. If desired, the trained model can be deployed to create a SageMaker endpoint
The SageMaker endpoint created in the previous step, which is an HTTPS endpoint capable of producing predictions
Monitoring of the training job and the deployed model via Amazon CloudWatch
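The normalization idea in the preprocessing step can be sketched as follows: statistics are computed on the training data only and then reused for any new data, which is also what the inference flow requires. This sketch assumes a per-series z-score; the repository's `preprocess.py` may use a different scheme, and the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def fit_scaler(train_series):
    """Compute (mean, std) per series from the training data only."""
    stats = []
    for series in train_series:
        std = pstdev(series)
        # Fall back to 1.0 so constant series do not cause division by zero.
        stats.append((mean(series), std if std > 0 else 1.0))
    return stats


def normalize(series_list, stats):
    """Apply the training statistics to each series."""
    return [
        [(x - mu) / sigma for x in series]
        for series, (mu, sigma) in zip(series_list, stats)
    ]


def denormalize(series_list, stats):
    """Map normalized values (for example, predictions) back to original units."""
    return [
        [x * sigma + mu for x in series]
        for series, (mu, sigma) in zip(series_list, stats)
    ]


# Made-up values, for illustration only.
train = [[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]]
new_data = [[2.0, 4.0], [11.0, 9.0]]

stats = fit_scaler(train)
normalized = normalize(new_data, stats)
restored = denormalize(normalized, stats)
```

Reusing the training statistics at inference time keeps new inputs on the same scale the model saw during training.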

Here is the architecture of the inference process:

[Figure: solution architecture]

The input data, located in an Amazon S3 bucket
From the SageMaker notebook, normalize the new input data using the statistics of the training data
Send the requests to the SageMaker endpoint
Receive the predictions

Cleaning Up

When you've finished with this solution, make sure that you delete all unwanted AWS resources. AWS CloudFormation can be used to automatically delete all standard resources that have been created by the solution and notebook. Go to the AWS CloudFormation Console, and delete the parent stack. Choosing to delete the parent stack will automatically delete the nested stacks.
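For those who prefer scripting the cleanup over using the console, the parent stack can also be deleted with boto3. The sketch below is one possible approach rather than part of the solution; the function name is hypothetical, and you would need to supply the name of your own parent stack.

```python
def delete_solution_stack(stack_name, cloudformation_client=None):
    """Delete a CloudFormation stack and wait until deletion completes.

    Deleting the parent stack also deletes its nested stacks.
    """
    if cloudformation_client is None:
        # Imported lazily so the function can be tested with a stub client.
        import boto3

        cloudformation_client = boto3.client("cloudformation")
    cloudformation_client.delete_stack(StackName=stack_name)
    # Block until CloudFormation reports that the stack is gone.
    waiter = cloudformation_client.get_waiter("stack_delete_complete")
    waiter.wait(StackName=stack_name)
```

The function would be called as `delete_solution_stack("<your-parent-stack-name>")`, with the placeholder replaced by the stack name shown in the CloudFormation console.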

Caution: You need to manually delete any extra resources that you may have created in this notebook. Examples include extra Amazon S3 buckets (in addition to the solution's default bucket), extra Amazon SageMaker endpoints (created with a custom name), and extra Amazon ECR repositories.

Customization

To use your own data, take a look at:

The extensive GluonTS tutorials
The GluonTS dataset API

Useful Resources

License

This project is licensed under the Apache-2.0 License.

Owner

Name: Amazon Web Services - Labs
Login: awslabs
Kind: organization
Location: Seattle, WA

Website: http://amazon.com/aws/
Repositories: 914
Profile: https://github.com/awslabs

AWS Labs

GitHub Events

Total

Watch event: 5
Fork event: 3

Last Year

Watch event: 5
Fork event: 3

Committers

Last synced: about 2 years ago

All Time

Total Commits: 113
Total Committers: 5
Avg Commits per committer: 22.6
Development Distribution Score (DDS): 0.177

Past Year

Commits: 2
Committers: 1
Avg Commits per committer: 2.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Ehsan M. Kermani	6****k	93
Ehsan M. Kermani	k**n@a**m	16
Patrick Yang	3****7	2
Amazon GitHub Automation	5****o	1
Alex Voitau	v****u	1

Committer Domains (Top 20 + Academic)

amazon.com: 1

Issues and Pull Requests

Last synced: almost 2 years ago

All Time

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

HaseoBoss (1)

Pull Request Authors

Top Labels

Issue Labels

bug (1)

Pull Request Labels

Dependencies

cloudformation/solution-assistant/requirements.in pypi

crhelper *

cloudformation/solution-assistant/requirements.txt pypi

crhelper ==2.0.6

src/deep_demand_forecast/requirements.in pypi

altair ==4.1.0
gluonts ==0.5.1

src/deep_demand_forecast/requirements.txt pypi

Pillow >=8.1.1
altair ==4.1.0
attrs ==20.2.0
certifi ==2020.6.20
cycler ==0.10.0
dataclasses ==0.7
entrypoints ==0.3
gluonts ==0.5.1
holidays ==0.9.12
importlib-metadata ==2.0.0
jinja2 ==2.11.3
jsonschema ==3.2.0
kiwisolver ==1.2.0
markupsafe ==1.1.1
matplotlib ==3.3.2
numpy ==1.19.2
pandas ==1.0.5
pydantic ==1.7.4
pyparsing ==2.4.7
pyrsistent ==0.17.3
python-dateutil ==2.8.1
pytz ==2020.1
six ==1.15.0
toolz ==0.11.1
tqdm ==4.51.0
ujson ==1.35
zipp ==3.4.0

src/preprocess/container/codebuild/requirements.in pypi

gluonts ==0.5.1
mxnet ==1.6

src/preprocess/container/codebuild/requirements.txt pypi

Pillow >=8.1.1
certifi ==2020.6.20
chardet ==3.0.4
cycler ==0.10.0
dataclasses ==0.7
gluonts ==0.5.1
graphviz ==0.8.4
holidays ==0.9.12
idna ==2.10
kiwisolver ==1.2.0
matplotlib ==3.3.2
mxnet ==1.6
numpy ==1.19.2
pandas ==1.0.5
pydantic ==1.7.4
pyparsing ==2.4.7
python-dateutil ==2.8.1
pytz ==2020.1
requests ==2.24.0
six ==1.15.0
tqdm ==4.51.0
ujson ==1.35
urllib3 >=1.26.5

src/preprocess/container/local/requirements.in pypi

gluonts ==0.5.1
mxnet ==1.6

src/preprocess/container/local/requirements.txt pypi

certifi ==2020.6.20
chardet ==3.0.4
cycler ==0.10.0
dataclasses ==0.7
gluonts ==0.5.1
graphviz ==0.8.4
holidays ==0.9.12
idna ==2.10
kiwisolver ==1.2.0
matplotlib ==3.3.2
mxnet ==1.6
numpy ==1.19.2
pandas ==1.0.5
pillow ==8.0.1
pydantic ==1.7
pyparsing ==2.4.7
python-dateutil ==2.8.1
pytz ==2020.1
requests ==2.24.0
six ==1.15.0
tqdm ==4.51.0
ujson ==1.35
urllib3 ==1.25.11

.github/workflows/ci.yaml actions

actions/checkout v2 composite
actions/setup-python v2 composite

.github/workflows/codeql.yaml actions

actions/checkout v2 composite
github/codeql-action/analyze v1 composite
github/codeql-action/init v1 composite

src/preprocess/container/codebuild/Dockerfile docker

${BASE_IMAGE} latest build

src/preprocess/container/local/Dockerfile docker

python 3.6-slim-buster build
