https://github.com/Nike-Inc/timeseries-generator
A library to generate synthetic time series data by easy-to-use factors and generator
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.9%) to scientific vocabulary
Keywords
Repository
Basic Info
Statistics
- Stars: 153
- Watchers: 12
- Forks: 38
- Open Issues: 8
- Releases: 1
Topics
Metadata Files
README.md
timeseries-generator
This repository consists of a Python package that generates synthetic time series datasets in a generic way (under /timeseries_generator) and demo notebooks that show how to generate synthetic time series data (under /examples). The goal is to have non-sensitive data available for demonstrating solutions and for testing the effectiveness of those solutions and algorithms. To test an algorithm, you want time series that contain different kinds of trends. The package helps create such varied time series while remaining maintainable.
timeseries_generator package
For this package, it is assumed that a time series is composed of a base value multiplied by many factors.
ts = base_value * factor1 * factor2 * ... * factorN + Noiser

These factors can be anything from random noise to linear trends to seasonality. The factors can affect different features. For example, some features in your time series may have a seasonal component, while others do not.
Different factors are represented by different classes, which inherit from the BaseFactor class. Factor classes are the input for the Generator class, which creates a dataframe containing the features, the base value, all the factors acting on the base value, and the final factor and value.
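As a rough illustration of the multiplicative model, the sketch below composes a series in plain Python without using the package. The helper `compose_series`, the two example factors, and the noise scale are all assumptions made for illustration; they are not part of the timeseries_generator API.

```python
import random
from datetime import date, timedelta


def compose_series(base_value, factor_columns, noise):
    """Multiply the base value by every factor, then add the noise term."""
    values = []
    for i, eps in enumerate(noise):
        total_factor = 1.0
        for column in factor_columns:
            total_factor *= column[i]
        values.append(base_value * total_factor + eps)
    return values


days = [date(2020, 1, 1) + timedelta(days=i) for i in range(5)]

# Hypothetical factors: a linear trend and a constant per-store multiplier.
linear_trend = [1.0 + 0.1 * i for i in range(len(days))]
store_factor = [2.0] * len(days)

# Additive Gaussian noise, seeded so the run is repeatable.
rng = random.Random(0)
noise = [rng.gauss(0.0, 0.05) for _ in days]

series = compose_series(100.0, [linear_trend, store_factor], noise)
noiseless = compose_series(100.0, [linear_trend, store_factor], [0.0] * len(days))

for day, value in zip(days, series):
    print(day.isoformat(), round(value, 3))
```

Passing a list of zeros as the noise term gives the purely "factorized" series, which makes it easy to check each factor's contribution in isolation.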
Core concept
- Generator: a Python class that generates the time series. A generator contains a list of factors and a noiser. By overlaying the factors and the noiser, the generator can produce a customized time series.
- Factor: a Python class that generates trend, seasonality, holiday effects, etc. Factors take effect by multiplying the base value of the generator.
- Noiser: a Python class that generates time series noise. The noiser takes effect by being added on top of the "factorized" time series. The formula above describes these concepts.
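The three concepts above could be arranged roughly as in the following sketch. All class names here (`SketchFactor`, `SketchLinearTrend`, `SketchNoiser`, `SketchGenerator`) are hypothetical stand-ins written for this illustration, and scaling the noise relative to the base value is an assumption; the package's real classes work on dataframes and may differ in every detail.

```python
import random
from datetime import date, timedelta


class SketchFactor:
    """Hypothetical base class: returns one multiplier per date."""

    def factors(self, dates):
        raise NotImplementedError


class SketchLinearTrend(SketchFactor):
    """Multiplier that grows linearly with the day index: offset + coef * i."""

    def __init__(self, coef, offset):
        self.coef = coef
        self.offset = offset

    def factors(self, dates):
        return [self.offset + self.coef * i for i in range(len(dates))]


class SketchNoiser:
    """Additive Gaussian noise; sigma is a fraction of the base value (assumption)."""

    def __init__(self, stdev_ratio, seed=None):
        self.stdev_ratio = stdev_ratio
        self.rng = random.Random(seed)

    def noise(self, dates, base_value):
        sigma = self.stdev_ratio * base_value
        return [self.rng.gauss(0.0, sigma) for _ in dates]


class SketchGenerator:
    """Overlays factors (multiplicative) and an optional noiser (additive)."""

    def __init__(self, factors, dates, base_value=1.0, noiser=None):
        self.factors = list(factors)
        self.dates = list(dates)
        self.base_value = base_value
        self.noiser = noiser

    def generate(self):
        values = [self.base_value] * len(self.dates)
        for factor in self.factors:
            multipliers = factor.factors(self.dates)
            values = [v * m for v, m in zip(values, multipliers)]
        if self.noiser is not None:
            eps = self.noiser.noise(self.dates, self.base_value)
            values = [v + e for v, e in zip(values, eps)]
        return values


dates = [date(2020, 1, 1) + timedelta(days=i) for i in range(5)]
trend = SketchLinearTrend(coef=2.0, offset=1.0)

# Without a noiser the output is exactly base_value * (offset + coef * i).
clean = SketchGenerator([trend], dates, base_value=10.0).generate()

# With a noiser, Gaussian noise is added on top of the factorized values.
noisy = SketchGenerator(
    [trend], dates, base_value=10.0, noiser=SketchNoiser(0.05, seed=1)
).generate()
```

Keeping factors and the noiser behind separate small interfaces is what lets a generator accept any mix of them; a new effect only needs a new subclass with its own `factors` method.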
Built-in Factors
- LinearTrend: gives a linear trend based on the input slope and intercept
- CountryYearlyTrend: gives a yearly market cap factor based on GDP per capita
- EUEcoTrendComponents: gives a monthly changing factor based on public EU industry product data
- HolidayTrendComponents: simulates the holiday sales peak. It adapts the holiday dates to each country
- BlackFridaySaleComponents: simulates the Black Friday sale event
- WeekendTrendComponents: more sales on weekends than on weekdays
- FeatureRandFactorComponents: sets up different sales amounts for different stores and products
- ProductSeasonTrendComponents: simulates season-sensitive product sales. In this example code, there are 3 different types of product:
  - winter jacket: inversely proportional to the temperature, more sales in winter
  - basketball top: proportional to the temperature, more sales in summer
  - Yoga Mat: temperature insensitive
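To make two of the listed effects concrete, here is a minimal sketch of a weekend multiplier and a temperature-driven product multiplier. The function names, the boost of 1.5, the cosine "temperature" curve peaking in mid-July, and the sensitivity values are all assumptions chosen for illustration; the built-in factors rely on real external data and are implemented differently.

```python
import math
from datetime import date


def weekend_factor(day, boost=1.5):
    """Return `boost` on Saturday and Sunday, and 1.0 on weekdays."""
    return boost if day.weekday() >= 5 else 1.0


def season_factor(day, sensitivity):
    """Toy seasonal multiplier driven by a pseudo-temperature in [-1, 1].

    sensitivity > 0: sells more when warm; < 0: sells more when cold;
    0: insensitive to temperature.
    """
    day_of_year = day.timetuple().tm_yday
    # Assumed curve: warmest around day 196 (mid-July), coldest in mid-January.
    temperature = math.cos(2 * math.pi * (day_of_year - 196) / 365)
    return 1.0 + 0.5 * sensitivity * temperature


# Hypothetical sensitivities for the three example products.
PRODUCT_SENSITIVITY = {
    "winter jacket": -1.0,
    "basketball top": 1.0,
    "yoga mat": 0.0,
}

january = date(2020, 1, 15)
july = date(2020, 7, 15)
for product, sensitivity in PRODUCT_SENSITIVITY.items():
    print(
        product,
        round(season_factor(january, sensitivity), 2),
        round(season_factor(july, sensitivity), 2),
    )
```

Because both functions return multipliers centered on 1.0, they can be combined with any other factor by simple multiplication, in line with the formula given earlier.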
Installation
```sh
pip install timeseries-generator
```
Usage
```python
from timeseries_generator import LinearTrend, Generator, WhiteNoise, RandomFeatureFactor
import pandas as pd

# set up a linear trend
lt = LinearTrend(coef=2.0, offset=1., col_name="my_linear_trend")
g = Generator(factors={lt}, features=None, date_range=pd.date_range(start="01-01-2020", end="01-20-2020"))
g.generate()
g.plot()

# update the generator by adding some white noise
wn = WhiteNoise(stdev_factor=0.05)
g.update_factor(wn)
g.generate()
g.plot()
```
Example Notebooks
We currently have 2 example notebooks available:
1. generate_stationary_process: Good for introducing the basics of the timeseries_generator. Shows how to apply simple linear trends and how to introduce features and labels, as well as random noise.
2. use_external_factors: Goes into more detail and shows how to use the external_factors submodule. Shows how to create seasonal trends.
Web based prototyping UI
We also use Streamlit to build a web-based UI that demonstrates interactively how to use this package to generate synthetic time series data.
```sh
streamlit run examples/streamlit/app.py
```

License
This package is released under the Apache License, Version 2.0
Owner
- Name: Nike Inc.
- Login: Nike-Inc
- Kind: organization
- Location: Beaverton, OR
- Website: http://engineering.nike.com
- Repositories: 74
- Profile: https://github.com/Nike-Inc
GitHub Events
Total
- Watch event: 15
- Pull request review event: 1
- Fork event: 2
Last Year
- Watch event: 15
- Pull request review event: 1
- Fork event: 2
Committers
Last synced: over 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Zhe Sun | z****n@n****m | 1 |
| twobitunicorn | j****z@g****m | 1 |
| Jakub | 6****a | 1 |
| Nils Leger | 4****r | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 7
- Total pull requests: 7
- Average time to close issues: 14 days
- Average time to close pull requests: 2 days
- Total issue authors: 6
- Total pull request authors: 5
- Average comments per issue: 0.57
- Average comments per pull request: 0.86
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- twobitunicorn (2)
- YAYAYru (1)
- athewsey (1)
- haskarb (1)
- isabelsandstrom (1)
- nileger (1)
Pull Request Authors
- twobitunicorn (3)
- jakub-mizera (2)
- mattpitkin (1)
- siegstedt (1)
- nileger (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- altair ==4.1.0
- streamlit ==0.75.0
- build ==0.3.0
- jupyter ==1.0.0
- jupyterlab ==3.0.4
- matplotlib ==3.3.3
- pandas ==1.2.0
- pytest ==6.2.2
- scipy ==1.6.0
- twine ==3.3.0
- workalendar ==15.0.1
- matplotlib ==3.3.3
- pandas ==1.2.0
- workalendar ==15.0.1