synthpred
A Julia package for synthetic data analysis, advanced imputation (ARIMA, RNN), AutoML, and ensemble modeling.
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.2%) to scientific vocabulary
Repository
Statistics
- Stars: 2
- Watchers: 1
- Forks: 2
- Open Issues: 0
- Releases: 2
Metadata Files
README.md
SynthPred.jl
SynthPred.jl is a Julia package for synthetic data analysis, advanced imputation (ARIMA, RNN), AutoML, and ensemble modeling.
🚀 Features
- 🔍 Descriptive statistics and missing data reporting
- 🧼 Simple and advanced imputation:
- Mean, median, mode
- Forward/backward fill
- Gaussian distribution sampling
- Time series-based: ARIMA
- Sequence learning-based: RNN (Flux.jl)
- 🤖 AutoML for classification (MLJ.jl-based)
- ⚖️ Blending top-performing models via ensembling
- 📊 Predictions on new data
- 📑 JSON/CSV imputation reports
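The simple imputation strategies listed above can be illustrated with a minimal sketch in plain Julia. The helper names `fill_mean`, `fill_forward`, and `fill_gaussian` are hypothetical and are not part of the SynthPred.jl API; they only show the idea on a single numeric column, and the package's own implementation may differ.

```julia
using Statistics
using Random

# Hypothetical helper: replace missing entries with the mean of the observed values.
function fill_mean(v::AbstractVector)
    m = mean(skipmissing(v))
    return [ismissing(x) ? m : x for x in v]
end

# Hypothetical helper: carry the last observed value forward.
# A missing value at the start stays missing because there is nothing to carry.
function fill_forward(v::AbstractVector)
    out = collect(Union{Missing, Float64}, v)
    for i in 2:length(out)
        if ismissing(out[i])
            out[i] = out[i - 1]
        end
    end
    return out
end

# Hypothetical helper: draw each missing entry from a normal distribution
# fitted to the observed values (sample mean and standard deviation).
function fill_gaussian(v::AbstractVector; rng = Random.default_rng())
    obs = collect(skipmissing(v))
    mu, sigma = mean(obs), std(obs)
    return [ismissing(x) ? mu + sigma * randn(rng) : x for x in v]
end

v = [1.0, missing, 3.0, missing]
println(fill_mean(v))
println(fill_forward(v))
println(fill_gaussian(v; rng = MersenneTwister(1)))
```

Mean filling and forward filling are deterministic, while Gaussian sampling preserves the spread of the column at the cost of reproducibility, which is why the sketch accepts an explicit `rng`.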
📦 Installation
```julia
using Pkg
Pkg.add(url="https://github.com/TyMill/SynthPred.jl")
```
🧪 Quick Example
```julia
using SynthPred
using CSV, DataFrames

# Load training data
df = CSV.read("data/example.csv", DataFrame)

# Explore data
SynthPred.Exploration.describe_data(df)

# Impute missing values (e.g. RNN strategy)
dfclean, report = SynthPred.Imputer.imputeadvanced(df, "rnn", threshold=0.1)
SynthPred.Imputer.saveimputationreport(report, "reports/imputation_report.json")

# Run AutoML pipeline
topmodels, scores = SynthPred.AutoML.runautoml(dfclean, :target)
X = select(dfclean, Not(:target))
y = dfclean[:, :target]
ensemble = SynthPred.AutoML.blendtopmodels(topmodels, X, y)

# Predict on new data
Xnew = CSV.read("data/new_data.csv", DataFrame)
preds = SynthPred.AutoML.predictensemble(ensemble, Xnew)
println(preds)
```
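To make the blending step in the example above more concrete, here is a minimal sketch of one common ensembling scheme for classification, majority voting. This is an assumption for illustration only: the README does not say which blending rule SynthPred.jl uses, and `majority_vote` is a hypothetical helper, not a package function.

```julia
# Hypothetical helper: combine per-model label predictions by majority vote.
# `preds` holds one vector of predicted labels per model, all of equal length.
function majority_vote(preds::AbstractVector{<:AbstractVector{T}}) where {T}
    n = length(first(preds))
    votes = Vector{T}(undef, n)
    for i in 1:n
        # Count how many models predicted each label for observation i.
        counts = Dict{T, Int}()
        for p in preds
            counts[p[i]] = get(counts, p[i], 0) + 1
        end
        # findmax on a Dict returns (largest count, label with that count).
        # Tied labels are resolved arbitrarily in this sketch.
        votes[i] = findmax(counts)[2]
    end
    return votes
end

model_a = ["a", "b", "a"]
model_b = ["a", "a", "a"]
model_c = ["b", "b", "a"]
println(majority_vote([model_a, model_b, model_c]))
```

Using an odd number of models for a binary target avoids ties; weighted or probability-averaging schemes are alternatives when the models report class probabilities.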
📚 Documentation
Full documentation is available at: https://your-username.github.io/SynthPred.jl
🧪 Project Structure
SynthPred/
├── Project.toml
├── src/
│ ├── SynthPred.jl
│ ├── Exploration.jl
│ ├── Imputer.jl
│ └── AutoML.jl
├── data/
│ ├── example.csv
│ └── new_data.csv
├── reports/
│ └── imputation_report.json
├── docs/
│ └── src/index.md
├── test/
│ └── runtests.jl
└── main.jl
📌 Roadmap
- [x] Core modules: Exploration, Imputer, AutoML
- [x] ARIMA and RNN-based imputations
- [x] AutoML + model blending with MLJ.jl
- [x] Imputation reports (CSV/JSON)
- [x] Documentation (Documenter.jl + GitHub Pages)
- [ ] Exporting trained models (JLD2, BSON)
- [ ] Web GUI with Pluto.jl or Dash.jl
- [ ] Integration with JuliaHub and Zenodo DOI
🤝 Contributing
Pull requests are welcome! For major changes, please open an issue first to discuss your proposal.
📜 License
MIT License © 2025 Tymoteusz Miller
📬 Contact
📧 me@tymoteuszmiller.dev
Built with ❤️ in Julia for real-world ML and scientific discovery.
Owner
- Login: TyMill
- Kind: user
- Repositories: 2
- Profile: https://github.com/TyMill
Citation (CITATION.bib)
@software{miller2025synthpred,
author = {Tymoteusz Miller},
title = {SynthPred.jl: A Julia Library for Synthetic Data Analysis, Advanced Imputation, and Ensemble AutoML},
year = {2025},
publisher = {GitHub},
journal = {SoftwareX (planned)},
url = {https://github.com/TyMill/SynthPred.jl},
version = {v0.1.0},
doi = {10.5281/zenodo.15090893}
}
GitHub Events
Total
- Release event: 2
- Watch event: 2
- Push event: 42
- Pull request event: 2
- Fork event: 2
- Create event: 3
Last Year
- Release event: 2
- Watch event: 2
- Push event: 42
- Pull request event: 2
- Fork event: 2
- Create event: 3
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- TyMill (2)