https://github.com/mikekeith52/scalecast

The practitioner's forecasting library

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 3 DOI reference(s) in README
✓ Academic publication links: Links to springer.com
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.2%) to scientific vocabulary

Keywords

auto-ml data-science deep-learning easy-to-use forecasting keras lstm machine-learning mase msis pandas python recurrent-neural-networks scikit-learn scikit-learn-python smape time-series vecm

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: mikekeith52
License: mit
Language: Python
Default Branch: main
Homepage:
Size: 1.16 GB

Statistics

Stars: 341
Watchers: 5
Forks: 40
Open Issues: 189
Releases: 5

Created over 4 years ago · Last pushed 7 months ago

Metadata Files

Readme Contributing License Code of conduct

Scalecast

About

Scalecast helps you forecast time series. Here is how to initiate its main object:

```python
from scalecast.Forecaster import Forecaster

f = Forecaster(
    y = array_of_values,
    current_dates = array_of_dates,
    future_dates = fcst_horizon_length,
    test_length = 0, # do you want to test all models? if so, on how many or what percent of observations?
    cis = False, # evaluate conformal confidence intervals for all models?
    metrics = ['rmse','mape','mae','r2'], # what metrics to evaluate over the validation/test sets?
)
```

Uniform ML modeling (with models from a diverse set of libraries, including scikit-learn, statsmodels, and tensorflow), reporting, and data visualizations are offered through the `Forecaster` and `MVForecaster` interfaces. Data storage and processing then become easy, as all applicable data, predictions, and many derived metrics are contained in a few objects, with much customization available through different modules. Feature requests and issue reporting are welcome! Don't forget to leave a star! ⭐
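The `metrics` argument names four common error measures. As a rough illustration of what those names usually mean, here is a minimal pure-Python sketch using the textbook definitions. The helper functions and the sample numbers are made up for this example, and scalecast's own implementations may differ in details such as scaling or edge-case handling.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; undefined if any actual value is 0."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def r2(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical test-set values, for illustration only
actual = [100.0, 110.0, 120.0, 130.0]
predicted = [98.0, 112.0, 119.0, 135.0]

scores = {
    "rmse": rmse(actual, predicted),
    "mape": mape(actual, predicted),
    "mae": mae(actual, predicted),
    "r2": r2(actual, predicted),
}
print(scores)
```

Lower is better for `rmse`, `mape`, and `mae`, while `r2` approaches 1 as the fit improves.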

Documentation

Popular Features

1. **Easy LSTM Modeling:** setting up an LSTM model for time series using tensorflow is hard. Using scalecast, it's easy. Many tutorials and Kaggle notebooks that are designed for those getting to know the model use scalecast (see the article).
```python
f.set_estimator('lstm')
f.manual_forecast(
    lags=36,
    batch_size=32,
    epochs=15,
    validation_split=.2,
    activation='tanh',
    optimizer='Adam',
    learning_rate=0.001,
    lstm_layer_sizes=(100,)*3,
    dropout=(0,)*3,
)
```
2. **Auto lag, trend, and seasonality selection:**
```python
f.auto_Xvar_select( # iterate through different combinations of covariates
    estimator = 'lasso', # what estimator?
    alpha = .2, # estimator hyperparams?
    monitor = 'ValidationMetricValue', # what metric to monitor to make decisions?
    cross_validate = True, # cross validate
    cvkwargs = {'k':3}, # 3 folds
)
```
3. **Hyperparameter tuning using grid search and time series cross validation:**
```python
from scalecast import GridGenerator

GridGenerator.get_example_grids()
models = ['ridge','lasso','xgboost','lightgbm','knn']
f.tune_test_forecast(
    models,
    limit_grid_size = .2, # randomized grid search on 20% of original grid sizes
    feature_importance = True, # save pfi feature importance for each model?
    cross_validate = True, # cross validate? if False, uses a separate validation set that the user can specify
    rolling = True, # rolling time series cross validation?
    k = 3, # how many folds?
)
```
4. **Plotting results:** plot test predictions, forecasts, fitted values, and more.
```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(2,1, figsize = (12,6))
f.plot_test_set(models=models,order_by='TestSetRMSE',ax=ax[0])
f.plot(models=models,order_by='TestSetRMSE',ax=ax[1])
plt.show()
```
5. **Pipelines that include transformations, reverting, and backtesting:**
```python
from scalecast import GridGenerator
from scalecast.Pipeline import Transformer, Reverter, Pipeline
from scalecast.util import find_optimal_transformation, backtest_metrics

def forecaster(f):
    models = ['ridge','lasso','xgboost','lightgbm','knn']
    f.tune_test_forecast(
        models,
        limit_grid_size = .2, # randomized grid search on 20% of original grid sizes
        feature_importance = True, # save pfi feature importance for each model?
        cross_validate = True, # cross validate? if False, uses a separate validation set that the user can specify
        rolling = True, # rolling time series cross validation?
        k = 3, # how many folds?
    )

transformer, reverter = find_optimal_transformation(f) # just one of several ways to select transformations for your series

pipeline = Pipeline(
    steps = [
        ('Transform',transformer),
        ('Forecast',forecaster),
        ('Revert',reverter),
    ]
)

f = pipeline.fit_predict(f)
backtest_results = pipeline.backtest(f)
metrics = backtest_metrics(backtest_results)
```
6. **Model stacking:** There are two ways to stack models with scalecast: with the [`StackingRegressor`](https://medium.com/towards-data-science/expand-your-time-series-arsenal-with-these-models-10c807d37558) from scikit-learn, or using [its own stacking procedure](https://medium.com/p/7977c6667d29).
```python
from scalecast.auxmodels import auto_arima

f.set_estimator('lstm')
f.manual_forecast(
    lags=36,
    batch_size=32,
    epochs=15,
    validation_split=.2,
    activation='tanh',
    optimizer='Adam',
    learning_rate=0.001,
    lstm_layer_sizes=(100,)*3,
    dropout=(0,)*3,
)

f.set_estimator('prophet')
f.manual_forecast()

auto_arima(f)

# stack previously evaluated models
f.add_signals(['lstm','prophet','arima'])
f.set_estimator('catboost')
f.manual_forecast()
```
7. **Multivariate modeling and multivariate pipelines:**
```python
from scalecast.MVForecaster import MVForecaster
from scalecast.Pipeline import MVPipeline
from scalecast.util import find_optimal_transformation, backtest_metrics
from scalecast import GridGenerator

GridGenerator.get_mv_grids()

def mvforecaster(mvf):
    models = ['ridge','lasso','xgboost','lightgbm','knn']
    mvf.tune_test_forecast(
        models,
        limit_grid_size = .2, # randomized grid search on 20% of original grid sizes
        cross_validate = True, # cross validate? if False, uses a separate validation set that the user can specify
        rolling = True, # rolling time series cross validation?
        k = 3, # how many folds?
    )

mvf = MVForecaster(f1,f2,f3) # can take N Forecaster objects

transformer1, reverter1 = find_optimal_transformation(f1)
transformer2, reverter2 = find_optimal_transformation(f2)
transformer3, reverter3 = find_optimal_transformation(f3)

pipeline = MVPipeline(
    steps = [
        ('Transform',[transformer1,transformer2,transformer3]),
        ('Forecast',mvforecaster),
        ('Revert',[reverter1,reverter2,reverter3])
    ]
)

f1, f2, f3 = pipeline.fit_predict(f1, f2, f3)
backtest_results = pipeline.backtest(f1, f2, f3)
metrics = backtest_metrics(backtest_results)
```
8. **Transfer Learning (new with 0.19.0):** Train a model in one `Forecaster` object and use that model to make predictions on the data in a separate `Forecaster` object.
```python
from scalecast.Forecaster import Forecaster
from scalecast.util import infer_apply_Xvar_selection

f = Forecaster(...)
f.auto_Xvar_select()
f.set_estimator('xgboost')
f.cross_validate()
f.auto_forecast()

f_new = Forecaster(...) # different series than f
f_new = infer_apply_Xvar_selection(infer_from=f,apply_to=f_new)
f_new.transfer_predict(transfer_from=f,model='xgboost') # transfers the xgboost model from f to f_new
```
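Several of the examples above request rolling time series cross validation with `k = 3` folds. To make the idea concrete, here is a minimal sketch of one common rolling scheme: every fold trains on a fixed-length window and validates on the block that immediately follows it, so no fold ever validates on data older than its training data. The function `rolling_folds` and its `val_len` parameter are hypothetical names for this illustration. This is not scalecast's internal splitting logic, which may size and place folds differently.

```python
def rolling_folds(n_obs, k=3, val_len=None):
    """Return k (train_range, validation_range) index pairs over a series of n_obs points.

    Assumption for this sketch: all folds share the same training length and the
    same validation length, and the last validation block ends at the final observation.
    """
    if val_len is None:
        val_len = n_obs // (k + 1)
    train_len = n_obs - k * val_len
    if val_len < 1 or train_len < 1:
        raise ValueError("series too short for the requested number of folds")

    folds = []
    for i in range(k):
        train_start = i * val_len
        train_end = train_start + train_len  # exclusive upper bound
        folds.append((range(train_start, train_end), range(train_end, train_end + val_len)))
    return folds

folds = rolling_folds(n_obs=12, k=3)
for train_idx, val_idx in folds:
    print(list(train_idx), "->", list(val_idx))
```

Because the validation block always follows the training window, the split respects temporal order, which ordinary shuffled k-fold cross validation does not.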

Installation

Only the base package is needed to get started:
- `pip install --upgrade scalecast`
Optional add-ons:
- `pip install tensorflow` (for RNN/LSTM on Windows) or `pip install tensorflow-macos` (for Mac/M1)
- `pip install darts`
- `pip install prophet`
- `pip install greykite` (for the silverkite model)
- `pip install kats` (changepoint detection)
- `pip install pmdarima` (auto arima)
- `pip install tqdm` (progress bar for notebook)
- `pip install ipython` (widgets for notebook)
- `pip install ipywidgets` (widgets for notebook)
- `jupyter nbextension enable --py widgetsnbextension` (widgets for notebook)
- `jupyter labextension install @jupyter-widgets/jupyterlab-manager` (widgets for Lab)
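Since the add-ons are optional, it can help to check which ones are importable in the current environment before calling a model that needs them. The following is a small standard-library sketch. The `OPTIONAL_ADDONS` mapping and the `missing_addons` helper are hypothetical names for this example, and it assumes each package's import name matches its pip name as listed above.

```python
from importlib.util import find_spec

# Assumed import names, taken from the pip package names in the list above
OPTIONAL_ADDONS = {
    "tensorflow": "RNN/LSTM models",
    "darts": "darts models",
    "prophet": "prophet model",
    "greykite": "silverkite model",
    "kats": "changepoint detection",
    "pmdarima": "auto arima",
    "tqdm": "notebook progress bar",
    "ipywidgets": "notebook widgets",
}

def missing_addons(addons=None):
    """Return the sorted names of add-ons that cannot be found on the import path."""
    if addons is None:
        addons = OPTIONAL_ADDONS
    return sorted(name for name in addons if find_spec(name) is None)

for name in missing_addons():
    print(f"{name} is not installed (needed for: {OPTIONAL_ADDONS[name]})")
```

`find_spec` only locates a module without importing it, so the check stays fast even for heavy packages such as tensorflow.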

Papers that use scalecast

Udemy Course

Scalecast: Machine Learning & Deep Learning

Blog posts and notebooks

Forecasting with Different Model Types

Sklearn Univariate
- Expand your Time Series Arsenal with These Models
- Notebook
Sklearn Multivariate
RNN
ARIMA
- Forecast with ARIMA in Python More Easily with Scalecast
- Notebook
Theta
- Easily Employ A Theta Model For Time Series
- Notebook
VECM
- Employ a VECM to predict FANG Stocks with an ML Framework
- Notebook
Stacking
- Stacking Time Series Models to Improve Accuracy
- Notebook
Other Notebooks

Transforming and Reverting

Confidence Intervals

Dynamic Validation

Model Input Selection

Scaled Forecasting on Many Series

Transfer Learning

Anomaly Detection

Contributing

Contributing.md
Want something that's not listed? Open an issue!

How to cite scalecast

@misc{scalecast,
  title = {{scalecast}},
  author = {Michael Keith},
  year = {2024},
  version = {<your version>},
  url = {https://scalecast.readthedocs.io/en/latest/},
}
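The `version` field is left for you to fill in. One possible way to look up the installed version is the standard library's `importlib.metadata`, sketched below. The helper name `scalecast_bibtex` is hypothetical, and the function keeps the placeholder when the package is not installed.

```python
from importlib.metadata import PackageNotFoundError, version

def scalecast_bibtex():
    """Build the citation entry, filling in the installed scalecast version if available."""
    try:
        installed = version("scalecast")
    except PackageNotFoundError:
        installed = "<your version>"  # package not installed: keep the placeholder
    return (
        "@misc{scalecast,\n"
        "  title = {{scalecast}},\n"
        "  author = {Michael Keith},\n"
        "  year = {2024},\n"
        "  version = {" + installed + "},\n"
        "  url = {https://scalecast.readthedocs.io/en/latest/},\n"
        "}"
    )

entry = scalecast_bibtex()
print(entry)
```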

Owner

Name: Michael Keith
Login: mikekeith52
Kind: user
Location: Salt Lake City, UT

Repositories: 2
Profile: https://github.com/mikekeith52

Data Scientist and Python Developer

GitHub Events

Total

Issues event: 1
Watch event: 11
Issue comment event: 1
Push event: 18
Pull request event: 15
Fork event: 1
Create event: 17

Last Year

Issues event: 1
Watch event: 11
Issue comment event: 1
Push event: 18
Pull request event: 15
Fork event: 1
Create event: 17

Committers

Last synced: 11 months ago

All Time

Total Commits: 523
Total Committers: 3
Avg Commits per committer: 174.333
Development Distribution Score (DDS): 0.216

Past Year

Commits: 8
Committers: 1
Avg Commits per committer: 8.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Michael Keith	m**h@u**v	410
Michael Keith	m**2@g**m	112
snyk-bot	s**t@s**o	1
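The all-time figures above are consistent with a Development Distribution Score defined as one minus the top committer's share of all commits. That definition is an assumption inferred from the numbers on this page, not one stated here. A short sketch of the arithmetic, using the commit counts from the table:

```python
def development_distribution_score(commit_counts):
    """Assumed definition: 1 - (commits by the top committer / total commits)."""
    total = sum(commit_counts)
    if total == 0:
        return 0.0
    return 1 - max(commit_counts) / total

all_time = [410, 112, 1]  # commits per committer identity, from the table above
past_year = [8]           # a single committer with 8 commits

dds_all_time = round(development_distribution_score(all_time), 3)
dds_past_year = round(development_distribution_score(past_year), 3)
avg_commits = round(sum(all_time) / len(all_time), 3)

print(dds_all_time, dds_past_year, avg_commits)
```

Under this definition a score of 0.0 means a single committer made every commit. Note that the table counts the same person under two email addresses as two committers, which raises the all-time score.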

Committer Domains (Top 20 + Academic)

snyk.io: 1
utah.gov: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 50
Total pull requests: 208
Average time to close issues: about 2 months
Average time to close pull requests: about 1 hour
Total issue authors: 33
Total pull request authors: 3
Average comments per issue: 2.96
Average comments per pull request: 0.03
Merged pull requests: 18
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 2
Pull requests: 39
Average time to close issues: 13 days
Average time to close pull requests: N/A
Issue authors: 2
Pull request authors: 1
Average comments per issue: 1.0
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

mikekeith52 (5)
jroy12345 (4)
amengjiao (3)
callmegar (3)
Mehul-Sanghvi (3)
raedbsili1991 (3)
ahmad-shahi (2)
fstayco (2)
fcekalovic (1)
pmudgal-Intel (1)
John-Miller12 (1)
justicedarko1000 (1)
Jansza (1)
bhishanpdl (1)
ricardobarroslourenco (1)

Pull Request Authors

mikekeith52 (322)
snyk-bot (3)
michellebaugraczyk (1)

Top Labels

Issue Labels

bug (16)
question (10)
enhancement (6)
documentation (1)
good first issue (1)

Pull Request Labels

enhancement (1)

Packages

Total packages: 2
Total downloads:
- pypi 832 last-month

Total dependent packages: 0
(may contain duplicates)
Total dependent repositories: 4
(may contain duplicates)
Total versions: 195
Total maintainers: 1

pypi.org: scalecast

The practitioner's time series forecasting library

Homepage: https://github.com/mikekeith52/scalecast
Documentation: https://scalecast.readthedocs.io/
License: MIT
Latest release: 0.19.10 (published over 1 year ago)

Versions: 191
Dependent Packages: 0
Dependent Repositories: 3
Downloads: 818 Last month

Rankings

Stargazers count: 4.0%

Forks count: 6.4%

Downloads: 6.5%

Average: 7.2%

Dependent repos count: 8.9%

Dependent packages count: 10.1%

Maintainers (1)

mikekeith52

Last synced: 6 months ago

pypi.org: scalecastdev

Homepage: https://github.com/mikekeith52/scalecast
Documentation: https://scalecastdev.readthedocs.io/
License: MIT
Latest release: 0.2.0 (published over 4 years ago)

Versions: 4
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 14 Last month

Rankings

Stargazers count: 4.0%

Forks count: 6.4%

Dependent packages count: 10.1%

Average: 21.4%

Dependent repos count: 21.5%

Downloads: 64.8%

Maintainers (1)

mikekeith52

Last synced: 6 months ago

Dependencies

docs/requirements.txt pypi

autodocsumm *
ipywidgets *
myst_parser *
nbsphinx *
numpydoc *
pandoc *
pdflatex *
pyyaml *
scalecast *
sphinx *
sphinx_rtd_theme *
sphinxcontrib-confluencebuilder *
sphinxcontrib-napoleon *
tqdm *

setup.py pypi

eli5 *
lightgbm *
matplotlib *
numpy *
openpyxl *
pandas *
pandas-datareader *
scikit-learn *
scipy *
seaborn *
statsmodels *
xgboost *

src/SCALECAST.egg-info/requires.txt pypi

eli5 *
lightgbm *
matplotlib *
numpy *
openpyxl *
pandas *
pandas-datareader *
scikit-learn *
scipy *
seaborn *
statsmodels *
xgboost *

test/requirements.txt pypi

darts * test
greykite * test
kats * test
pmdarima * test
prophet * test
scalecast * test
shap * test
tensorflow * test