https://github.com/sdv-dev/deepecho
Synthetic Data Generation for mixed-type, multivariate time series.
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 12 committers (8.3%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (16.0%) to scientific vocabulary
Repository
Statistics
- Stars: 116
- Watchers: 8
- Forks: 16
- Open Issues: 6
- Releases: 17
README.md
This repository is part of The Synthetic Data Vault Project, a project from DataCebo.
Overview
DeepEcho is a Synthetic Data Generation Python library for mixed-type, multivariate time series. It provides:
- Multiple models based both on classical statistical modeling of time series and the latest in Deep Learning techniques.
- A robust benchmarking framework for evaluating these methods on multiple datasets and with multiple metrics.
- Ability for Machine Learning researchers to submit new methods following our `model` and `sample` API and get evaluated.
| Important Links | |
| --------------------------------------------- | -------------------------------------------------------------------- |
| :computer: Website | Check out the SDV Website for more information about the project. |
| :orange_book: SDV Blog | Regular publishing of useful content about Synthetic Data Generation. |
| :book: Documentation | Quickstarts, User and Development Guides, and API Reference. |
| :octocat: Repository | The link to the Github Repository of this library. |
| :keyboard: Development Status | This software is in its Pre-Alpha stage. |
| Community | Join our Slack Workspace for announcements and discussions. |
| Tutorials | Run the SDV Tutorials in a Binder environment. |
Install
DeepEcho is part of the SDV project and is automatically installed alongside it. For details about this process, please visit the SDV Installation Guide.
Optionally, DeepEcho can also be installed as a standalone library using the following commands:
Using pip:
```bash
pip install deepecho
```
Using conda:
```bash
conda install -c pytorch -c conda-forge deepecho
```
For more installation options, please visit the DeepEcho Installation Guide.
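To confirm which version ended up in the environment, a small standard-library check can help. This is only a sketch and not part of DeepEcho: the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name `deepecho`, as in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="deepecho"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("deepecho is not installed in this environment")
else:
    print(f"deepecho {found} is installed")
```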
Quickstart
DeepEcho is included as part of SDV to model and sample synthetic time series. In most cases, usage through SDV is recommended, since it provides additional functionality that is not available here. For more details about how to use DeepEcho within SDV, please visit the corresponding User Guide.
Standalone usage
DeepEcho can also be used as a standalone library.
In this short quickstart, we show how to learn a mixed-type multivariate time series dataset and then generate synthetic data that resembles it.
We will start by loading the data and preparing the instance of our model.
```python3
from deepecho import PARModel
from deepecho.demo import load_demo

# Load demo data
data = load_demo()

# Define data types for all the columns
data_types = {
    'region': 'categorical',
    'day_of_week': 'categorical',
    'total_sales': 'continuous',
    'nb_customers': 'count',
}

model = PARModel(cuda=False)
```
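Before fitting, it can be useful to sanity-check the column-to-type mapping. The sketch below is not part of DeepEcho: `check_data_types` is a hypothetical helper, `KNOWN_TYPES` lists only the type names that appear in this quickstart (the library may accept others), and the `discount` entry is a deliberately wrong, made-up column added to show what a reported problem looks like.

```python
# Type names that appear in the quickstart; the library may accept more.
KNOWN_TYPES = {"categorical", "continuous", "count"}


def check_data_types(columns, data_types, known_types=KNOWN_TYPES):
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    for column, type_name in data_types.items():
        if column not in columns:
            problems.append(f"'{column}' is declared but not present in the data")
        if type_name not in known_types:
            problems.append(f"'{column}' has unrecognized type '{type_name}'")
    return problems


# Columns assumed to exist in the table being modeled.
columns = ["region", "nb_customers"]

# 'discount' is a made-up entry that triggers both checks.
declared = {
    "region": "categorical",
    "nb_customers": "count",
    "discount": "percent",
}

for problem in check_data_types(columns, declared):
    print(problem)
```

With a pandas table, the `columns` argument could be taken from `data.columns` instead of a hand-written list.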
If we want to use different settings for our model, like increasing the number of epochs or enabling CUDA, we can pass the arguments when creating the model:
```python
# keep this as python (without the 3) to avoid using it in test-readme
model = PARModel(epochs=1024, cuda=True)
```
Note that for small datasets like the one used in this demo, CUDA introduces more overhead than it gains from parallelization, so training is more efficient without CUDA, even if it is available.
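One way to act on this trade-off is to enable CUDA only when a GPU is present and the dataset is large enough to justify it. The following is a minimal sketch under stated assumptions: `should_use_cuda` is a hypothetical helper, and the `min_rows` default is an arbitrary placeholder, not a measured break-even point.

```python
def should_use_cuda(num_rows, cuda_available, min_rows=100_000):
    """Request CUDA only if a GPU exists and the data is large (placeholder threshold)."""
    return bool(cuda_available) and num_rows >= min_rows


try:
    import torch  # optional here: only used to detect a GPU

    cuda_available = torch.cuda.is_available()
except ImportError:
    cuda_available = False

# A small dataset stays on the CPU whether or not a GPU is present.
use_cuda = should_use_cuda(num_rows=500, cuda_available=cuda_available)
print(use_cuda)
```

The resulting flag could then be passed to the model as `PARModel(cuda=use_cuda)`.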
Once we have created our instance, we are ready to learn the data and generate new synthetic data that resembles it:
```python3
# Learn a model from the data
model.fit(
    data=data,
    entity_columns=['store_id'],
    context_columns=['region'],
    data_types=data_types,
    sequence_index='date'
)

# Sample new data
model.sample(num_entities=5)
```
The output will be a table of synthetic time series data with the same properties as the demo data that we used as input.
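The roles of the `fit` arguments can be illustrated without the library. The sketch below rests on assumptions: entity columns identify each sequence, context columns hold one value per sequence, and the sequence index orders the rows within a sequence. The helpers `split_sequences` and `context_is_constant` are hypothetical, the column names are assumed, and the rows are made up rather than taken from the demo dataset.

```python
from collections import defaultdict


def split_sequences(rows, entity_columns, sequence_index):
    """Group rows by entity key and sort each group by the sequence index."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in entity_columns)
        groups[key].append(row)
    return {
        key: sorted(group, key=lambda row: row[sequence_index])
        for key, group in groups.items()
    }


def context_is_constant(sequences, context_columns):
    """Check that every context column holds a single value within each sequence."""
    return all(
        len({row[column] for row in sequence}) == 1
        for sequence in sequences.values()
        for column in context_columns
    )


# Made-up rows for illustration only; not taken from the demo dataset.
rows = [
    {"store_id": 1, "region": "north", "date": "2020-01-02", "total_sales": 12.5},
    {"store_id": 2, "region": "south", "date": "2020-01-01", "total_sales": 7.0},
    {"store_id": 1, "region": "north", "date": "2020-01-01", "total_sales": 10.0},
]

sequences = split_sequences(rows, entity_columns=["store_id"], sequence_index="date")
print(len(sequences))  # number of distinct entities
print(context_is_constant(sequences, context_columns=["region"]))
```

The same checks could be run on a sampled table to confirm that each synthetic entity forms one ordered sequence with a single context value.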
What's next?
For more details about DeepEcho and all its possibilities and features, please check and run the tutorials.
If you want to see how we evaluate the performance and quality of our models, please have a look at the SDGym Benchmarking framework.
Also, please feel welcome to visit our contributing guide to help us develop new features or cool ideas!
The Synthetic Data Vault Project was first created at MIT's Data to AI Lab in 2016. After 4 years of research and traction with enterprise, we created DataCebo in 2020 with the goal of growing the project. Today, DataCebo is the proud developer of SDV, the largest ecosystem for synthetic data generation & evaluation. It is home to multiple libraries that support synthetic data, including:
- 🔄 Data discovery & transformation. Reverse the transforms to reproduce realistic data.
- 🧠 Multiple machine learning models -- ranging from Copulas to Deep Learning -- to create tabular, multi table and time series data.
- 📊 Measuring quality and privacy of synthetic data, and comparing different synthetic data generation models.
Get started using the SDV package -- a fully integrated solution and your one-stop shop for synthetic data. Or, use the standalone libraries for specific needs.
Owner
- Name: The Synthetic Data Vault Project
- Login: sdv-dev
- Kind: organization
- Email: sdv@sdv.dev
- Website: https://sdv.dev
- Repositories: 9
- Profile: https://github.com/sdv-dev
GitHub Events
Total
- Create event: 27
- Release event: 2
- Issues event: 16
- Watch event: 14
- Delete event: 22
- Issue comment event: 14
- Push event: 41
- Pull request review comment event: 5
- Pull request review event: 29
- Pull request event: 40
- Fork event: 2
Last Year
- Create event: 27
- Release event: 2
- Issues event: 16
- Watch event: 14
- Delete event: 22
- Issue comment event: 14
- Push event: 41
- Pull request review comment event: 5
- Pull request review event: 29
- Pull request event: 40
- Fork event: 2
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Carles Sala | c****s@p****m | 144 |
| Andrew Montanez | a****w@s****v | 34 |
| Kevin Alex Zhang | k****z@m****u | 21 |
| SDV Team | 9****m | 20 |
| Felipe Alex Hofmann | f****o@g****m | 17 |
| R-Palazzo | 1****o | 11 |
| Katharine Xiao | 2****o | 10 |
| Plamen Valentinov Kolev | 4****r | 10 |
| lajohn4747 | j****n@d****m | 7 |
| Gaurav Sheni | g****i@g****m | 3 |
| Frances Hartwell | f****9@g****m | 3 |
| Roy Wedge | r****e@g****m | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 46
- Total pull requests: 116
- Average time to close issues: 4 months
- Average time to close pull requests: 5 days
- Total issue authors: 15
- Total pull request authors: 13
- Average comments per issue: 0.13
- Average comments per pull request: 0.58
- Merged pull requests: 105
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 8
- Pull requests: 39
- Average time to close issues: about 2 months
- Average time to close pull requests: 2 days
- Issue authors: 4
- Pull request authors: 8
- Average comments per issue: 0.0
- Average comments per pull request: 0.64
- Merged pull requests: 35
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- amontanez24 (14)
- fealho (8)
- R-Palazzo (4)
- npatki (4)
- gsheni (3)
- pvk-developer (3)
- Ale0x78 (2)
- sarahmish (2)
- csala (1)
- alextopology (1)
- Myprojectjoy (1)
- HarisNaveed17 (1)
- joanvaquer (1)
- frances-h (1)
- marketneutral (1)
Pull Request Authors
- sdv-team (52)
- R-Palazzo (16)
- csala (15)
- pvk-developer (15)
- fealho (14)
- amontanez24 (13)
- gsheni (10)
- frances-h (5)
- k15z (5)
- lajohn4747 (4)
- katxiao (2)
- rwedge (2)
- CristianCuadrado (1)
Packages
- Total packages: 5
-
Total downloads:
- pypi 121,248 last-month
- Total docker downloads: 34,514
-
Total dependent packages: 5
(may contain duplicates) -
Total dependent repositories: 13
(may contain duplicates) - Total versions: 65
- Total maintainers: 6
pypi.org: deepecho
Create sequential synthetic data of mixed types using a GAN.
- Documentation: https://deepecho.readthedocs.io/
- License: BSL-1.1
- Latest release: 0.7.0 (published about 1 year ago)
Rankings
Maintainers (5)
proxy.golang.org: github.com/sdv-dev/deepecho
- Documentation: https://pkg.go.dev/github.com/sdv-dev/deepecho#section-documentation
- License: other
- Latest release: v0.7.0 (published about 1 year ago)
Rankings
proxy.golang.org: github.com/sdv-dev/DeepEcho
- Documentation: https://pkg.go.dev/github.com/sdv-dev/DeepEcho#section-documentation
- License: other
- Latest release: v0.7.0 (published about 1 year ago)
Rankings
spack.io: py-deepecho
DeepEcho is a Synthetic Data Generation Python library for mixed-type, multivariate time series.
- Homepage: https://github.com/sdv-dev/DeepEcho
- License: []
- Latest release: 0.3.0.post1 (published almost 4 years ago)
Rankings
Maintainers (1)
conda-forge.org: deepecho
- Homepage: https://github.com/sdv-dev/DeepEcho
- License: BUSL-1.1
- Latest release: 0.2.1 (published over 4 years ago)
Rankings
Dependencies
- numpy >=1.18.0,<1.20.0
- numpy >=1.20.0,<2
- pandas >=1.1.3,<2
- torch >=1.8.0,<2
- tqdm >=4.15,<5
- actions/checkout v1 composite
- actions/setup-python v2 composite
- codecov/codecov-action v2 composite
