Scikit-Longitudinal
Scikit-Longitudinal: A Machine Learning Library for Longitudinal Classification in Python - Published in JOSS (2025)
Science Score: 93.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 7 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: Links to joss.theoj.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ✓ JOSS paper metadata: Published in Journal of Open Source Software
Repository
☂️ Scikit-longitudinal (Sklong) is an open-source, Scikit-learn API-compliant Python library tailored to longitudinal machine learning classification tasks. It is aimed at researchers, data scientists, and analysts, and provides specialist tools for the challenges of repeated-measures data.
Basic Info
- Host: GitHub
- Owner: simonprovost
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://scikit-longitudinal.readthedocs.io/latest/
- Size: 69.6 MB
Statistics
- Stars: 66
- Watchers: 4
- Forks: 3
- Open Issues: 3
- Releases: 6
Metadata Files
README.md
Scikit-longitudinal
A specialised Python library for longitudinal data analysis built on Scikit-learn
💡 About The Project
Scikit-longitudinal (Sklong) is a machine learning library designed to analyse
longitudinal data (currently focused on classification tasks). It offers tools and models for processing, analysing,
and predicting longitudinal data, with a user-friendly interface that
integrates with the Scikit-learn ecosystem.
Wait, what is longitudinal data, in layman's terms?
Longitudinal data is a "time-lapse" view of the same subject, entity, or group tracked over several time periods, similar to checking in on patients to see how they change. For instance, doctors may monitor a patient's blood pressure, weight, and cholesterol every year for a decade to identify health trends or risk factors. This data is more useful for predicting future results than a one-time survey because it captures evolution, patterns, and cause and effect over time.
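To make the idea concrete, here is a minimal sketch of how repeated measurements might be laid out, with one row per patient and one column per measurement and wave. The `_w<N>` column suffix, the helper names, and the values are illustrative assumptions, not part of Sklong's API or any real dataset.

```python
# Hypothetical wide-format longitudinal records: one row per patient,
# one column per (measurement, wave) pair. Values are illustrative only.
patients = [
    {"id": "p1", "systolic_w1": 120, "systolic_w2": 126, "systolic_w3": 135},
    {"id": "p2", "systolic_w1": 118, "systolic_w2": 117, "systolic_w3": 116},
]


def trajectory(record, feature, n_waves):
    """Return the values of `feature` in chronological (wave) order."""
    return [record[f"{feature}_w{wave}"] for wave in range(1, n_waves + 1)]


def net_change(record, feature, n_waves):
    """Difference between the last and the first wave for one subject."""
    values = trajectory(record, feature, n_waves)
    return values[-1] - values[0]


changes = {p["id"]: net_change(p, "systolic", 3) for p in patients}
print(changes)  # {'p1': 15, 'p2': -2}
```

A one-time survey would only hold one of these columns; the trajectory across waves is what a longitudinal model can exploit.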
Not enough?
- For more scientific details, you can refer to our paper published in the Journal of Open Source Software (JOSS).
- For more technical details, visit the official documentation.
🛠️ Installation
[!NOTE]
Want to use Jupyter Notebook, Marimo, Google Colab, or JupyterLab? Head to the Getting Started section of the documentation, where we explain it all! 🎉
To install Scikit-longitudinal:
- ✅ Install the latest version: `pip install Scikit-longitudinal`
- To install a specific version: `pip install Scikit-longitudinal==0.1.0`
[!CAUTION]
Scikit-longitudinal is currently compatible with Python 3.9 only. Ensure you have this version installed before proceeding with the installation.

We understand that this is a limitation, but for the time being we are tied to it by Deep Forest, a dependency of Scikit-longitudinal that is not compatible with Python versions greater than 3.9. Deep Forest provides the Deep Forest algorithm, which we have modified to accommodate Lexicographical Deep Forest. To follow up on this discussion, please refer to this GitHub issue.
If you encounter any errors, see the installation section in the Getting Started part of the documentation. If it still doesn't work, please open an issue on GitHub.
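Given the Python 3.9 constraint, a quick interpreter check before installing can save a failed build. The following is a minimal sketch; `is_supported_python` is a hypothetical helper and not something Sklong provides.

```python
import sys

SUPPORTED = (3, 9)  # the only minor version the README lists as compatible


def is_supported_python(version_info=sys.version_info):
    """True when the interpreter's major.minor matches the supported one."""
    return tuple(version_info[:2]) == SUPPORTED


if not is_supported_python():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} detected; "
        "Scikit-longitudinal currently requires Python 3.9."
    )
```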
🚀 Getting Started
Here's how to analyse longitudinal data with Scikit-longitudinal:
```py
from sklearn.metrics import classification_report

from scikit_longitudinal.data_preparation import LongitudinalDataset
from scikit_longitudinal.estimators.ensemble.lexicographical.lexico_gradient_boosting import LexicoGradientBoostingClassifier

dataset = LongitudinalDataset('./stroke.csv')  # Note this is a fictional dataset. Use yours!
dataset.load_data_target_train_test_split(
    target_column="class_stroke_wave_4",
)

# Pre-set or manually set your temporal dependencies
dataset.setup_features_group(input_data="elsa")

model = LexicoGradientBoostingClassifier(
    features_group=dataset.feature_groups(),
    threshold_gain=0.00015,  # Refer to the API for more hyper-parameters and their meaning
)

model.fit(dataset.X_train, dataset.y_train)
y_pred = model.predict(dataset.X_test)

# Classification report
print(classification_report(dataset.y_test, y_pred))
```
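The "temporal dependencies" step tells the model which columns are repeated measurements of the same variable. As a rough illustration of that idea only, the sketch below groups column indices by base feature name and orders them by wave. It assumes a hypothetical `_w<N>` column suffix and assumes that feature groups can be expressed as lists of column indices; `build_feature_groups` is not part of Sklong, whose own `LongitudinalDataset` handles this step.

```python
import re
from collections import defaultdict


def build_feature_groups(columns, pattern=r"^(?P<name>.+)_w(?P<wave>\d+)$"):
    """Group column indices by base feature name, ordered by wave number.

    Columns that do not match the wave pattern are returned separately as
    non-longitudinal (static) column indices.
    """
    waves = defaultdict(list)
    static = []
    for index, column in enumerate(columns):
        match = re.match(pattern, column)
        if match:
            waves[match["name"]].append((int(match["wave"]), index))
        else:
            static.append(index)
    # Sort each group by wave so indices run from oldest to most recent.
    groups = [[idx for _, idx in sorted(pairs)] for pairs in waves.values()]
    return groups, static


columns = ["age", "smoke_w2", "smoke_w1", "cholesterol_w1", "cholesterol_w2"]
groups, static = build_feature_groups(columns)
print(groups)  # [[2, 1], [3, 4]]
print(static)  # [0]
```

Ordering indices chronologically matters because the columns may not appear in wave order in the file, as with `smoke_w2` and `smoke_w1` here.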
📝 How to Cite
If you use Sklong in your research, please cite our paper:
@article{Provost2025,
doi = {10.21105/joss.08481},
url = {https://doi.org/10.21105/joss.08481},
year = {2025},
publisher = {The Open Journal},
volume = {10},
number = {112},
pages = {8481},
author = {Provost, Simon and Freitas, Alex A.},
title = {Scikit-Longitudinal: A Machine Learning Library for Longitudinal Classification in Python},
journal = {Journal of Open Source Software}
}
🔐 License
Scikit-longitudinal is licensed under the MIT License.
Owner
- Name: Provost Simon
- Login: simonprovost
- Kind: user
- Location: London
- Company: @UniversityOfKent
- Website: https://bento.me/simon-provost
- Repositories: 26
- Profile: https://github.com/simonprovost
Incoming Visiting Researcher @ NYU | TANDON 🇺🇸 –– Ph.D student @ University of Kent 🎓
JOSS Publication
Scikit-Longitudinal: A Machine Learning Library for Longitudinal Classification in Python
Tags
machine learning longitudinal data classification Scikit-learn
GitHub Events
Total
- Create event: 7
- Commit comment event: 2
- Issues event: 19
- Release event: 2
- Watch event: 35
- Delete event: 7
- Issue comment event: 43
- Push event: 50
- Pull request event: 13
- Fork event: 2
Last Year
- Create event: 7
- Commit comment event: 2
- Issues event: 19
- Release event: 2
- Watch event: 35
- Delete event: 7
- Issue comment event: 43
- Push event: 50
- Pull request event: 13
- Fork event: 2
Committers
Last synced: 6 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Provost Simon | s****t@e****u | 157 |
| sgp28 | s****v@g****n | 17 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 27
- Total pull requests: 40
- Average time to close issues: 23 days
- Average time to close pull requests: about 11 hours
- Total issue authors: 7
- Total pull request authors: 1
- Average comments per issue: 2.0
- Average comments per pull request: 0.1
- Merged pull requests: 36
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 13
- Pull requests: 10
- Average time to close issues: 4 days
- Average time to close pull requests: 43 minutes
- Issue authors: 7
- Pull request authors: 1
- Average comments per issue: 3.62
- Average comments per pull request: 0.2
- Merged pull requests: 8
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- simonprovost (15)
- blengerich (4)
- TahiriNadia (3)
- klemenpe (2)
- orduek (1)
- edqzhang (1)
- SteffKL (1)
Pull Request Authors
- simonprovost (40)
Packages
- Total packages: 2
- Total downloads:
  - pypi: 330 last month
- Total dependent packages: 0 (may contain duplicates)
- Total dependent repositories: 0 (may contain duplicates)
- Total versions: 16
- Total maintainers: 1
pypi.org: scikit-longitudinal
Scikit-longitudinal is an open-source Python library for longitudinal data analysis, building on Scikit-learn's foundation with tools tailored for repeated measures data.
- Homepage: https://github.com/simonprovost/scikit-longitudinal
- Documentation: https://scikit-longitudinal.readthedocs.io/latest/
- License: MIT
- Latest release: 0.1.1 (published 4 months ago)
pypi.org: scikit-lexicographical-trees
A set of python modules for machine learning and data mining
- Homepage: https://simonprovost.github.io/scikit-longitudinal/
- Documentation: https://scikit-lexicographical-trees.readthedocs.io/
- License: new BSD
- Latest release: 0.0.4 (published over 1 year ago)
