scikit-learn-template
Generic template to bootstrap your scikit-learn project
Science Score: 41.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.4%) to scientific vocabulary
Keywords
Repository
Generic template to bootstrap your scikit-learn project
Basic Info
- Host: GitHub
- Owner: insane-group
- License: apache-2.0
- Language: Jupyter Notebook
- Default Branch: master
- Homepage: https://insane-group.github.io/scikit-learn-template/
- Size: 1010 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
Project Name
Accompanying code for the paper Paper Title.
Add a brief description of your project here. You can use Markdown syntax for formatting, such as bold, italics, and links.
Make sure you:
- [ ] Replace the Project Name with the name of your project
- [ ] Add a brief description of your project
- [ ] Remove the :thinking: Why ? section
- [ ] Rename the project appropriately
- [ ] Change the project details (e.g. name, description, URLs) in the following files:
  - [ ] pyproject.toml
  - [ ] mkdocs.yml
  - [ ] README.md
- [ ] Update CITATION.cff (and the :bookmark_tabs: Citation section below)
- [ ] Update the arXiv badge in the README.md with the correct arXiv ID (when available)
- [ ] Update the logo in the README.md
:thinking: Why ?
When starting a new project, we frequently run into challenges such as:
- Reproducibility: How can we ensure that our results are reproducible across different environments?
- Boilerplate Code: We often find ourselves writing the same boilerplate code over and over again.
To address these challenges, we have created a template for scikit-learn projects that streamlines the setup process and helps you focus on your research.
Main Technologies
- scikit-learn: A Python library for machine learning that provides simple and efficient tools for data mining and analysis. Built on NumPy, SciPy, and Matplotlib, it is widely used for both research and production applications.
- Hydra: A powerful configuration framework for managing complex applications. It enables dynamic composition of hierarchical configurations, allowing overrides via config files and the command line.
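To make the idea of command-line overrides on a hierarchical configuration more concrete, here is a minimal sketch in plain Python. It is not Hydra's implementation and does not use Hydra's API; the `apply_overrides` helper and the keys in `defaults` are hypothetical names chosen for illustration only.

```python
"""Illustrative sketch of dotted-key overrides on a nested config (not Hydra itself)."""
import ast
import copy


def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Return a copy of ``config`` with ``key.path=value`` overrides applied."""
    result = copy.deepcopy(config)
    for override in overrides:
        dotted_key, _, raw_value = override.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Walk down the hierarchy, creating missing levels on the way.
            node = node.setdefault(name, {})
        try:
            # Interpret numbers, booleans, lists, etc. when possible.
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            value = raw_value  # fall back to a plain string
        node[leaf] = value
    return result


# Hypothetical defaults, standing in for the YAML files under configs/.
defaults = {
    "model": {"name": "random_forest", "n_estimators": 100},
    "data": {"test_size": 0.2},
}

config = apply_overrides(defaults, ["model.n_estimators=250", "data.test_size=0.3"])
print(config)
```

In a real project Hydra does this merging for you, along with composing several config files; the sketch only shows why a `group.key=value` argument is enough to change one nested setting without touching the defaults.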
:rocket: Getting Started
Click Use this template to create a new repository.
Once your repository is set up from the template, clone it and get started with the following commands (we use the Rye Python package manager):
```shell
# Install Rye (https://rye.astral.sh/guide/installation/)
curl -sSf https://rye.astral.sh/get | bash

# Clone the repository & cd into it
git clone https://github.com/insane-group/

# Rename the project and make sure you change the project details (e.g. name, description, URLs) in the following files:
#   1. pyproject.toml
#   2. mkdocs.yml
#   3. README.md
#   4. CITATION.cff
mv src/project src/

# Install dependencies using Rye
rye sync

# Activate the virtual environment
source .venv/bin/activate

# Install the pre-commit hooks
rye run pre-commit install

# Run the training/evaluation script (Run with --help to see all options)
# Example usage
# Override any config parameter from command line
python src/project/train.py

# Test checkpoint on validation dataset
python src/project/test.py checkpoint="/path/to/ckpt/name.ckpt"

# Make predictions on test dataset
python src/project/predict.py checkpoint="/path/to/ckpt/name.ckpt"
```
Feel free to share any relevant details to help others get started, for example, content similar to the Setup and Quickstart sections in Google’s Prompt-to-Prompt.
Performing tasks using poethepoet
We use poethepoet to perform various development-oriented tasks. Formatting, type-checking, and a few other operations can be performed by running
```shell
poe <task>
```
where <task> is one of the tasks listed by running:
```shell
poe --help
Poe the Poet - A task runner that works well with poetry.
version 0.28.0

Result: No task specified.

Usage:
  poe [global options] task [task arguments]

Global options:
  -h, --help            Show this help page and exit
  --version             Print the version and exit
  -v, --verbose         Increase command output (repeatable)
  -q, --quiet           Decrease command output (repeatable)
  -d, --dry-run         Print the task contents but don't actually run it
  -C PATH, --directory PATH
                        Specify where to find the pyproject.toml
  -e EXECUTOR, --executor EXECUTOR
                        Override the default task executor
  --ansi                Force enable ANSI output
  --no-ansi             Force disable ANSI output

Configured tasks:
  clean                 Clean up any auxiliary files
  format                Format your codebase
  hooks                 Run all pre-commit hooks
  test                  Run the test suite
  type-check            Run static type checking on your codebase
  lint                  Lint your code for errors
  docs                  Build and serve the documentation
```
Consider installing poe as a global dependency to make your life easier, using rye install poethepoet :stuck_out_tongue:.
:open_file_folder: Project Structure
The project follows a standard structure for a Python project.
```shell
├── CITATION.cff <- Citation file for referencing the project
├── configs <- Hydra configuration files
│ ├── cross_validate <- Cross-validation configuration
│ ├── data <- Configs for loading the dataset
│ ├── hydra <- Hydra-specific settings
│ ├── metrics <- Metrics configuration
│ ├── model <- Model-specific config
│ ├── predict.yaml <- Prediction configuration file
│ ├── test.yaml <- Test configuration file
│ └── train.yaml <- Training configuration file
├── data <- Dataset storage directory
├── docs <- Project documentation
│ ├── code <- Source code documentation
│ ├── CODE_OF_CONDUCT.md <- Guidelines for community behavior
│ ├── CONTRIBUTING.md <- Instructions for contributing to the project
│ ├── LICENSE.md <- License information
│ ├── index.md <- Main documentation page
│ └── welcome.md <- Welcome page for the project
├── .editorconfig <- Editor configuration for consistent formatting
├── .github <- GitHub-specific configurations
│ └── workflows <- CI/CD workflow definitions for GitHub Actions
├── .gitignore <- Files and directories to ignore in Git
├── models <- Trained models and related files
├── notebooks <- Jupyter notebooks for experiments and analysis
│ └── template.ipynb <- Notebook template for new experiments
├── .pre-commit-config.yaml <- Pre-commit hook configurations
├── src <- Source code directory
│ └── project <- Main project codebase
├── tests <- Unit tests for the project
│ ├── __init__.py <- Init file for test module
│ └── test_model.py <- Tests for model functionality
├── LICENSE <- License information for the project
├── README.md <- Main project README file
├── mkdocs.yml <- Configuration for MkDocs documentation site
├── pyproject.toml <- Python project configuration file
├── .python-version <- Python version specification
├── requirements-dev.lock <- Locked dependencies for development
├── requirements.lock <- Locked dependencies for production
└── .vscode <- VS Code workspace settings
    ├── extensions.json <- Recommended extensions for VS Code
    ├── launch.json <- Debugging configurations
    └── settings.json <- VS Code-specific settings
```
:book: Exploring the Documentation
The source code documentation is generated from Python docstrings using MkDocs and mkdocstrings, while the rest is written in standard Markdown. To view it, run poe docs in the terminal or visit https://insane-group.github.io/scikit-learn-template/.
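Since the API pages come from docstrings, what you write in them is what readers see. The function below is a hypothetical example (it is not part of the template) documented in Google style, which is an assumption here: it is the style mkdocstrings' Python handler parses by default, but the template may configure a different one in mkdocs.yml.

```python
def accuracy(y_true: list[int], y_pred: list[int]) -> float:
    """Compute the fraction of predictions that match the true labels.

    Args:
        y_true: Ground-truth class labels.
        y_pred: Predicted class labels, in the same order as ``y_true``.

    Returns:
        The share of positions where both sequences agree, between 0 and 1.

    Raises:
        ValueError: If the inputs are empty or differ in length.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    # Count the positions where the prediction equals the true label.
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 3 of 4 labels match -> 0.75
```

The Args, Returns, and Raises sections would each be rendered as a structured block on the generated page, so keeping them accurate pays off directly in the published documentation.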
:bookmark_tabs: Citation
Please use the following citation if you use this project in your work:
```bibtex
@software{Sioros_scikit-learn-template,
  author = {Sioros, Vassilis},
  license = {Apache-2.0},
  title = {{scikit-learn-template}},
  url = {https://github.com/insane-group/scikit-learn-template}
}
```
:coin: Credits
This template was created by INSANE Group and is based on the following projects:
Owner
- Name: INSANE: Intelligent Science & Engineering
- Login: insane-group
- Kind: organization
- Email: insane@iit.demokritos.gr
- Location: Greece
- Website: https://insane.iit.demokritos.gr/
- Repositories: 1
- Profile: https://github.com/insane-group
INSANE Group is a research lab, a knowledge-sharing hub, and a key agent enabling the interaction between Artificial Intelligence and Physical Sciences.
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: scikit-learn-template
message: 'If you use this software, please cite it as below.'
type: software
authors:
  - given-names: Vassilis
    family-names: Sioros
    email: v.sioros@iit.demokritos.gr
    affiliation: NCSR “Demokritos”
    orcid: 'https://orcid.org/0000-0002-6266-0755'
repository-code: 'https://github.com/insane-group/scikit-learn-template'
url: 'https://insane-group.github.io/scikit-learn-template/'
abstract: Generic template to bootstrap your scikit-learn project
keywords:
- template
- scikit-learn
- hydra
license: Apache-2.0
GitHub Events
Total
- Push event: 24
- Create event: 3
Last Year
- Push event: 24
- Create event: 3
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0