https://github.com/axect/pytorch_template
A flexible PyTorch template for ML experiments with configuration management, logging, and hyperparameter optimization.
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 14.5%, to scientific vocabulary)
Keywords
Repository
Basic Info
- Host: GitHub
- Owner: Axect
- Language: Python
- Default Branch: main
- Homepage: https://axect.github.io/pytorch_template
- Size: 193 KB
Statistics
- Stars: 9
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
PyTorch Template Project
A flexible and reusable template for PyTorch-based machine learning experiments. Streamline your workflow with YAML configurations, integrated hyperparameter optimization (Optuna), experiment tracking (Weights & Biases), custom components, and easy result analysis.
Key Features
- YAML Configuration: Easily manage all experiment settings (model, optimizer, scheduler, training parameters) using simple YAML files
- Hyperparameter Optimization: Automatically find the best hyperparameters using Optuna integration
- Experiment Tracking: Log metrics, configurations, and models seamlessly with Weights & Biases
- Advanced Pruning: Speed up optimization using custom pruners like the Predicted Final Loss (PFL) Pruner
- Customizable Components: Easily add or modify models (`model.py`), learning rate schedulers, optimizers, and the training loop (`Trainer` in `util.py`)
- Analysis Tools: Interactively load, analyze, and evaluate trained models and optimization results (`analyze.py`)
- Reproducibility: Ensure consistent results with built-in seed management
Quick Start
1. Create Your Repository: Click "Use this template" on the GitHub page to create your own repository based on this template.
2. Clone Your Repository:
   ```bash
   git clone https://github.com/<your-username>/<your-new-repository-name>.git
   cd <your-new-repository-name>
   ```
3. Set Up Environment & Install Dependencies (using uv is recommended):
   ```bash
   # Create and activate virtual environment
   uv venv
   source .venv/bin/activate  # On Windows use .venv\Scripts\activate

   # Install prerequisites (recommended)
   uv pip install -U torch wandb rich beaupy fireducks numpy optuna matplotlib scienceplots

   # Or execute install script (shell script)
   sh install_requirements.sh

   # Or sync requirements (caution: this version is optimized for a CUDA environment)
   uv pip sync requirements.txt

   # Or using pip
   pip install -r requirements.txt
   ```
4. (Optional) Log in to Weights & Biases:
   ```bash
   # Only once per machine
   wandb login
   ```
5. Run a Default Experiment:
   ```bash
   python main.py --run_config configs/run_template.yaml
   ```
6. Run Hyperparameter Optimization:
   ```bash
   python main.py --run_config configs/run_template.yaml --optimize_config configs/optimize_template.yaml
   ```
7. Analyze Results:
   ```bash
   python analyze.py
   ```
Documentation
For a deeper dive into the components and customization options, check out the detailed documentation:
- Project Documentation (Covers Configuration, Execution, Training Loop, Model Definition, Optimization, Pruning, Analysis) (Generated by Tutorial-Codebase-Knowledge)
Outline
- PyTorch Template Project
- Project Structure
- Prerequisites
- Usage
- Configuration Files
- Customization
- Analysis Script (`analyze.py`)
- Contributing
- License
- Appendix
Project Structure
- `config.py`: Defines `RunConfig` and `OptimizeConfig` for managing experiment and optimization settings.
- `main.py`: Entry point, handles arguments and experiment execution.
- `model.py`: Contains model architectures (e.g., MLP).
- `util.py`: Utility functions (data loading, training loop, analysis helpers, etc.).
- `analyze.py`: Script for analyzing completed runs and optimizations.
- `hyperbolic_lr.py`: Implementation of custom hyperbolic learning rate schedulers.
- `pruner.py`: Contains custom pruners like `PFLPruner`.
- `configs/`: Directory for configuration files.
  - `run_template.yaml`: Template for basic run configuration.
  - `optimize_template.yaml`: Template for optimization configuration.
- `runs/`: Directory where experiment results (models, configs) are saved.
- `requirements.txt`: Lists project dependencies.
- `README.md`: This file.
- `RELEASES.md`: Project release notes.
Prerequisites
- Python 3.x
- Git
Usage
1. Configure Your Run:
   - Modify `configs/run_template.yaml` or create a copy (e.g., `configs/my_experiment.yaml`) and adjust the parameters. See the Customization section for details.
2. (Optional) Configure Optimization:
   - If you want to perform hyperparameter optimization, modify `configs/optimize_template.yaml` or create a copy (e.g., `configs/my_optimization.yaml`). Define the `search_space`, `sampler`, and `pruner`. See the Customization section.
3. Run the Experiment:
   - **Single Run:**
     ```sh
     python main.py --run_config configs/run_template.yaml
     ```
     (Replace `run_template.yaml` with your specific run configuration file if needed.)
   - **Optimization Run:**
     ```sh
     python main.py --run_config configs/run_template.yaml --optimize_config configs/optimize_template.yaml
     ```
     (Replace file names as needed.) This will use Optuna to search for the best hyperparameters based on your `optimize_template.yaml`.
4. Analyze Results:
   - Use the interactive analysis script:
     ```sh
     python analyze.py
     ```
   - Follow the prompts to select the project, run group, and seed to load and analyze the model.
Configuration Files
Run Configuration (`run_template.yaml`)
- `project`: Project name (used for wandb and results saving).
- `device`: Device (`'cpu'`, `'cuda:0'`, etc.).
- `net`: Path to the model class (e.g., `model.MLP`).
- `optimizer`: Path to the optimizer class (e.g., `torch.optim.adamw.AdamW`, `splus.SPlus`).
- `scheduler`: Path to the scheduler class (e.g., `hyperbolic_lr.ExpHyperbolicLR`, `torch.optim.lr_scheduler.CosineAnnealingLR`).
- `epochs`: Number of training epochs.
- `batch_size`: Training batch size.
- `seeds`: List of random seeds for running the experiment multiple times.
- `net_config`: Dictionary of arguments passed to the model's `__init__` method.
- `optimizer_config`: Dictionary of arguments for the optimizer.
- `scheduler_config`: Dictionary of arguments for the scheduler.
- `early_stopping_config`: Configuration for early stopping (a generic sketch of these parameters follows this list).
  - `enabled`: `true` or `false`.
  - `patience`: How many epochs to wait after the last improvement.
  - `mode`: `'min'` or `'max'`.
  - `min_delta`: Minimum change to qualify as an improvement.
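For readers unfamiliar with these early-stopping parameters, here is a minimal generic sketch of how `patience`, `mode`, and `min_delta` typically interact. This is not the template's actual implementation in `util.py`; the class name and behavior are assumptions for illustration only.

```python
class EarlyStopping:
    """Generic early-stopping helper illustrating patience/mode/min_delta semantics."""
    def __init__(self, patience: int = 10, mode: str = "min", min_delta: float = 0.0):
        self.patience = patience
        self.mode = mode
        self.min_delta = min_delta
        self.best = None
        self.counter = 0

    def step(self, metric: float) -> bool:
        """Return True when training should stop."""
        improved = (
            self.best is None
            or (self.mode == "min" and metric < self.best - self.min_delta)
            or (self.mode == "max" and metric > self.best + self.min_delta)
        )
        if improved:
            self.best = metric
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience

# Example: with patience=2, the second consecutive non-improving epoch triggers a stop
stopper = EarlyStopping(patience=2, mode="min", min_delta=1e-4)
for val_loss in [0.50, 0.40, 0.41, 0.42]:
    if stopper.step(val_loss):
        print("early stop at val_loss", val_loss)  # fires at 0.42
```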
Optimization Configuration (`optimize_template.yaml`)
- `study_name`: Name for the Optuna study.
- `trials`: Number of optimization trials to run.
- `seed`: Random seed for the optimization sampler.
- `metric`: Metric to optimize (e.g., `val_loss`).
- `direction`: `'minimize'` or `'maximize'`.
- `sampler`: Optuna sampler configuration.
  - `name`: Path to the sampler class (e.g., `optuna.samplers.TPESampler`).
  - `kwargs`: (Optional) Arguments for the sampler.
- `pruner`: (Optional) Optuna pruner configuration.
  - `name`: Path to the pruner class (e.g., `pruner.PFLPruner`).
  - `kwargs`: Arguments for the pruner.
- `search_space`: Defines hyperparameters to search. Nested under `net_config`, `optimizer_config`, etc. (see the sketch after this list).
  - `type`: `'int'`, `'float'`, or `'categorical'`.
  - `min`, `max`: Range for numerical types.
  - `log`: `true` for logarithmic scale (float).
  - `step`: Step size (int).
  - `choices`: List of options (categorical).
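The translation from these YAML entries to Optuna's suggestion API lives in `config.py`, which is not reproduced here. The following is a minimal illustrative sketch of how a `search_space` entry of each `type` could map onto `trial.suggest_*` calls; the `suggest_from_spec` helper, the dummy objective, and the inline spec dictionaries are assumptions, not the template's code.

```python
import optuna

def suggest_from_spec(trial: optuna.Trial, name: str, spec: dict):
    """Map one hypothetical search_space entry onto Optuna's suggest API."""
    if spec["type"] == "int":
        return trial.suggest_int(name, spec["min"], spec["max"], step=spec.get("step", 1))
    if spec["type"] == "float":
        return trial.suggest_float(name, spec["min"], spec["max"], log=spec.get("log", False))
    if spec["type"] == "categorical":
        return trial.suggest_categorical(name, spec["choices"])
    raise ValueError(f"Unknown type: {spec['type']}")

def objective(trial: optuna.Trial) -> float:
    # Specs mirror the kinds of entries shown in optimize_template.yaml
    lr = suggest_from_spec(trial, "lr", {"type": "float", "min": 1e-3, "max": 1e-2, "log": True})
    layers = suggest_from_spec(trial, "layers", {"type": "int", "min": 3, "max": 5})
    nodes = suggest_from_spec(trial, "nodes", {"type": "categorical", "choices": [32, 64, 128]})
    return (lr - 5e-3) ** 2 + layers / nodes  # dummy objective for illustration

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=5)
```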
Customization
This template is designed for flexibility. Here’s how to customize different parts:
1. Customizing Run Configurations
Modify the parameters in a run configuration YAML file (like configs/run_template.yaml) to change experiment settings.
Example: Let's create configs/run_mlp_small_fastlr.yaml based on run_template.yaml but with a smaller network and a different learning rate.
Original `configs/run_template.yaml` (simplified):
```yaml
# configs/run_template.yaml
project: PyTorchTemplate
device: cuda:0
net: model.MLP
optimizer: torch.optim.adamw.AdamW
scheduler: hyperbolic_lr.ExpHyperbolicLR
epochs: 50
seeds: [89, 231, 928, 814, 269]
net_config:
  nodes: 64        # Original nodes
  layers: 4
optimizer_config:
  lr: 1.e-3        # Original LR
scheduler_config:
  upper_bound: 250
  max_iter: 50
  infimum_lr: 1.e-5
# ...
```
New `configs/run_mlp_small_fastlr.yaml`:
```yaml
# configs/run_mlp_small_fastlr.yaml
project: PyTorchTemplateSmallMLP           # Maybe change project name
device: cuda:0
net: model.MLP
optimizer: torch.optim.adamw.AdamW
scheduler: hyperbolic_lr.ExpHyperbolicLR   # Or change scheduler
epochs: 50
seeds: [42, 123]                           # Use different seeds if desired
net_config:
  nodes: 32                                # Changed nodes
  layers: 3                                # Changed layers
optimizer_config:
  lr: 5.e-3                                # Changed learning rate
scheduler_config:                          # Adjust scheduler params if needed, e.g., related to epochs or LR
  upper_bound: 250
  max_iter: 50
  infimum_lr: 1.e-5
# ... keep or adjust other settings like early stopping
```
Now you can run this specific configuration:
```sh
python main.py --run_config configs/run_mlp_small_fastlr.yaml
```
2. Customizing Optimization Search Space
Modify the search_space section in your optimization configuration file (e.g., configs/optimize_template.yaml) to change which hyperparameters Optuna searches over and their ranges/choices.
Example: Adjusting the search space in configs/optimize_template.yaml.
Original `search_space` (simplified):
```yaml
# configs/optimize_template.yaml
# ...
search_space:
  net_config:
    nodes:
      type: categorical
      choices: [32, 64, 128]   # Original choices
    layers:
      type: int
      min: 3
      max: 5                   # Original max
  optimizer_config:
    lr:
      type: float
      min: 1.e-3               # Original min LR
      max: 1.e-2
      log: true
  scheduler_config:
    infimum_lr:                # Only searching infimum_lr
      type: float
      min: 1.e-7
      max: 1.e-4
      log: true
# ...
```
Modified `search_space`:
```yaml
# configs/optimize_template.yaml
# ...
search_space:
  net_config:
    nodes:
      type: categorical
      choices: [64, 128, 256]  # Changed choices for nodes
    layers:
      type: int
      min: 4                   # Changed min layers
      max: 6                   # Changed max layers
  optimizer_config:
    lr:
      type: float
      min: 5.e-4               # Changed min LR
      max: 5.e-3               # Changed max LR
      log: true
  scheduler_config:
    upper_bound:               # Add search for upper_bound
      type: int
      min: 100
      max: 300
      step: 50
    infimum_lr:
      type: float
      min: 1.e-6               # Changed range
      max: 1.e-5
      log: true
# ...
```
This updated configuration will search over different node sizes, layer counts, learning rates, and scheduler parameters.
3. Using Different Optuna Samplers (e.g., GridSampler)
You can change the sampler used by Optuna by modifying the sampler section in configs/optimize_template.yaml.
Example: Switching from TPESampler to GridSampler.
Original `sampler` section:
```yaml
# configs/optimize_template.yaml
# ...
sampler:
  name: optuna.samplers.TPESampler
  # kwargs:
  #   n_startup_trials: 10
# ...
```
Using GridSampler:
```yaml
# configs/optimize_template.yaml
# ...
sampler:
  name: optuna.samplers.GridSampler  # Changed sampler name
  # kwargs: {}                       # GridSampler often doesn't need kwargs here
# ...

# IMPORTANT CONDITION for GridSampler:
# All parameters defined in the 'search_space' MUST be of type 'categorical'.
# GridSampler explores all combinations of the categorical choices.
# If your search_space contains 'int' or 'float' types, using GridSampler
# will cause an error based on the current implementation in config.py.
# (See the create_sampler and grid_search_space methods.)

# Example search_space compatible with GridSampler:
search_space:
  net_config:
    nodes:
      type: categorical
      choices: [64, 128]
    layers:
      type: categorical          # Must be categorical
      choices: [3, 4]
  optimizer_config:
    lr:
      type: categorical          # Must be categorical
      choices: [1.e-3, 5.e-3]
  scheduler_config:
    infimum_lr:
      type: categorical          # Must be categorical
      choices: [1.e-5, 1.e-6]
# ...
```
Condition: To use GridSampler, ensure all parameters listed under `search_space` have `type: categorical`. The code automatically constructs the required format for GridSampler, but only if this condition is met.
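For reference, here is a minimal standalone sketch of why this restriction exists: Optuna's `GridSampler` must be constructed with an explicit dictionary of finite choice lists, so every searched parameter has to enumerate its values. The parameter names mirror the template's YAML, but the objective function below is a hypothetical stand-in, not the template's training code.

```python
import optuna

# GridSampler requires a finite list of choices for every parameter,
# which is why the template demands type: categorical for all entries.
grid = {
    "nodes": [64, 128],
    "layers": [3, 4],
    "lr": [1e-3, 5e-3],
    "infimum_lr": [1e-5, 1e-6],
}

def objective(trial: optuna.Trial) -> float:
    nodes = trial.suggest_categorical("nodes", grid["nodes"])
    layers = trial.suggest_categorical("layers", grid["layers"])
    lr = trial.suggest_categorical("lr", grid["lr"])
    infimum_lr = trial.suggest_categorical("infimum_lr", grid["infimum_lr"])
    return lr * layers / nodes + infimum_lr  # dummy objective for illustration

study = optuna.create_study(
    direction="minimize",
    sampler=optuna.samplers.GridSampler(grid),  # explores all 2*2*2*2 = 16 combinations
)
study.optimize(objective, n_trials=16)
```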
4. Adding Custom Models, Optimizers, Schedulers, Pruners
- Models: Create your model class (inheriting from `torch.nn.Module`) in `model.py` or a new Python file. Ensure its `__init__` method accepts a config dictionary (e.g., `net_config` from the YAML) as the first argument. Update the `net:` path in your run config YAML (see the sketch after this list).
- Optimizers/Schedulers: Implement your custom classes or use existing ones from `torch.optim` or elsewhere (like `hyperbolic_lr.py`). Update the `optimizer:` or `scheduler:` path and the `*_config` dictionaries in the YAML. The template uses `importlib` to load classes dynamically based on the paths provided.
- Pruners: Create your pruner class (inheriting from `pruner.BasePruner` or implementing the Optuna pruner interface) in `pruner.py` or a new file. Update the `pruner:` section in the optimization YAML.
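As an illustration of the first two points, here is a minimal sketch of a custom model whose `__init__` takes the config dictionary as its first argument, together with the kind of `importlib`-based dynamic loading the template describes. The class name `SimpleMLP`, the `load_class` helper, and the exact config keys are assumptions for the example, not the template's actual implementation.

```python
import importlib
import torch
import torch.nn as nn

class SimpleMLP(nn.Module):
    """Hypothetical custom model: __init__ takes the net_config dict as its first argument."""
    def __init__(self, config: dict):
        super().__init__()
        nodes, layers = config["nodes"], config["layers"]
        blocks = [nn.Linear(1, nodes), nn.ReLU()]
        for _ in range(layers - 1):
            blocks += [nn.Linear(nodes, nodes), nn.ReLU()]
        blocks.append(nn.Linear(nodes, 1))
        self.net = nn.Sequential(*blocks)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def load_class(path: str):
    """Hypothetical importlib-based loader for dotted paths like 'model.MLP'."""
    module_name, class_name = path.rsplit(".", 1)
    return getattr(importlib.import_module(module_name), class_name)

# Usage analogous to the YAML keys: net / net_config, optimizer / optimizer_config
net = SimpleMLP({"nodes": 32, "layers": 3})
OptimCls = load_class("torch.optim.adamw.AdamW")   # same dotted path style as the template
opt = OptimCls(net.parameters(), lr=1e-3)
print(net(torch.randn(4, 1)).shape)                # torch.Size([4, 1])
```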
5. Customizing Data Loading
- Modify the `load_data` function in `util.py` to load your specific dataset. It should return PyTorch `Dataset` objects for training and validation (a minimal sketch follows).
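The real `load_data` in `util.py` is not reproduced here; the sketch below only illustrates the described contract (return a training and a validation `Dataset`). The synthetic data and the zero-argument signature are assumptions.

```python
import torch
from torch.utils.data import Dataset, TensorDataset

def load_data() -> tuple[Dataset, Dataset]:
    """Hypothetical replacement: return (train_dataset, val_dataset)."""
    # Synthetic regression data purely for illustration
    x = torch.linspace(-1.0, 1.0, 1000).unsqueeze(1)
    y = torch.sin(3.0 * x) + 0.05 * torch.randn_like(x)
    train_ds = TensorDataset(x[:800], y[:800])
    val_ds = TensorDataset(x[800:], y[800:])
    return train_ds, val_ds
```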
6. Customizing the Training Loop
- Modify the `Trainer` class in `util.py`. Adjust the `train_epoch`, `val_epoch`, and `train` methods for your specific task, loss functions, or metrics. Ensure the `train` method returns the value specified as the `metric` in your optimization config if applicable (a sketch of this shape follows).
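The actual `Trainer` in `util.py` has more responsibilities (logging, schedulers, early stopping, multi-seed handling); the following is only a minimal sketch of the described method shape, with all names and signatures assumed for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

class Trainer:
    """Hypothetical minimal trainer mirroring the train_epoch/val_epoch/train shape."""
    def __init__(self, model: nn.Module, optimizer: torch.optim.Optimizer, device: str = "cpu"):
        self.model = model.to(device)
        self.optimizer = optimizer
        self.device = device
        self.criterion = nn.MSELoss()  # swap in your task's loss function

    def train_epoch(self, loader: DataLoader) -> float:
        self.model.train()
        total = 0.0
        for x, y in loader:
            x, y = x.to(self.device), y.to(self.device)
            self.optimizer.zero_grad()
            loss = self.criterion(self.model(x), y)
            loss.backward()
            self.optimizer.step()
            total += loss.item() * len(x)
        return total / len(loader.dataset)

    @torch.no_grad()
    def val_epoch(self, loader: DataLoader) -> float:
        self.model.eval()
        total = 0.0
        for x, y in loader:
            x, y = x.to(self.device), y.to(self.device)
            total += self.criterion(self.model(x), y).item() * len(x)
        return total / len(loader.dataset)

    def train(self, train_loader: DataLoader, val_loader: DataLoader, epochs: int) -> float:
        val_loss = float("inf")
        for epoch in range(epochs):
            train_loss = self.train_epoch(train_loader)
            val_loss = self.val_epoch(val_loader)
            print(f"epoch {epoch}: train={train_loss:.4f} val={val_loss:.4f}")
        return val_loss  # the value Optuna would optimize when metric is val_loss
```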
Analysis Script (analyze.py)
The analyze.py script provides an interactive command-line interface to load and inspect results from completed runs.
- It uses helper functions from `util.py` (like `select_project`, `select_group`, `select_seed`, `load_model`, `load_study`, `load_best_model`) to navigate the saved runs in the `runs/` directory.
- You can easily extend the `main` function in `analyze.py` to perform more detailed analysis, plotting, or evaluation specific to your project needs (a hedged sketch follows).
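As an example of the kind of extension meant here, the sketch below adds a quick prediction plot after the interactive selection. The exact signatures of the `util.py` helpers are not shown in this README, so the argument lists and returned objects below are assumptions, not the template's real API.

```python
# Hypothetical extension of analyze.py's main(); helper signatures are assumed.
import matplotlib.pyplot as plt
import torch
from util import select_project, select_group, select_seed, load_model  # helpers named in the README

def main():
    project = select_project()                 # assumed: interactive prompt returning a project name
    group = select_group(project)              # assumed: returns a run-group name within the project
    seed = select_seed(project, group)         # assumed: returns one of the run's seeds
    model = load_model(project, group, seed)   # assumed: returns the trained nn.Module

    # Example extension: evaluate the model on a fixed grid and plot its prediction
    x = torch.linspace(-1.0, 1.0, 200).unsqueeze(1)
    with torch.no_grad():
        y = model(x)
    plt.plot(x.squeeze(), y.squeeze())
    plt.title(f"{project}/{group} (seed {seed})")
    plt.show()

if __name__ == "__main__":
    main()
```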
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is provided as a template and is intended to be freely used, modified, and distributed. Users of this template are encouraged to choose a license that best suits their specific project needs.
For the template itself:
- You are free to use, modify, and distribute this template.
- No attribution is required, although it is appreciated.
- The template is provided "as is", without warranty of any kind.
When using this template for your own project, please remember to:
- Remove this license section or replace it with your chosen license.
- Ensure all dependencies and libraries used in your project comply with their respective licenses.
For more information on choosing a license, visit https://choosealicense.com/.
Acknowledgments
This template includes copies of external libraries and tools, such as:
- HyperbolicLR for hyperbolic-curve-based learning rate scheduling.
- SPlus for the SPlus optimizer.
Appendix
PFL (Predicted Final Loss) Pruner
Overview

The PFL pruner (`pruner.PFLPruner`) is a custom pruner inspired by techniques to predict the final performance of a training run based on early-stage metrics. It helps optimize hyperparameter search by early-stopping unpromising trials based on their predicted final loss (`pfl`).

Key Features
- Maintains a list of the `top_k` best-performing completed trials based on their final validation loss.
- For ongoing trials (after a warmup period), it predicts the final loss based on the current loss history.
- It compares the current trial's predicted final loss (`pfl`) with the minimum `pfl` observed among the `top_k` completed trials.
- Prunes the current trial if its predicted final loss is worse (lower, since `pfl` is -log10(loss)) than the worst `pfl` in the top-k list.
- Supports multi-seed runs by averaging metrics across seeds for decision making.
- Integrates with Optuna's study mechanism.

Configuration

In your `optimize_template.yaml`, configure the pruner under the `pruner` section:
```yaml
pruner:
  name: pruner.PFLPruner     # Path to the pruner class
  kwargs:
    n_startup_trials: 10     # Number of trials to complete before pruning starts
    n_warmup_epochs: 10      # Number of epochs within a trial before pruning is considered
    top_k: 10                # Number of best completed trials to keep track of
    target_epoch: 50         # The target epoch used for predicting final loss
```

How It Works
1. The first `n_startup_trials` run to completion without being pruned to establish baseline performance.
2. For subsequent trials, pruning is considered only after `n_warmup_epochs`.
3. The pruner calculates the average predicted final loss (`pfl`) for the current trial based on the loss history across its seeds.
4. It compares this `pfl` to the `pfl` values of the `top_k` trials that have already completed.
5. If the current trial's `pfl` is lower than the minimum `pfl` recorded among the top completed trials, the trial is pruned (as lower `pfl` indicates worse predicted performance).
6. When a trial completes, its final validation loss and `pfl` are considered for inclusion in the `top_k` list.
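To make the decision rule above concrete, here is a tiny sketch of the comparison only, not the actual `PFLPruner`: `pfl` is taken as `-log10` of a predicted final loss, and a trial is pruned when its `pfl` falls below the minimum `pfl` among the tracked top-k completed trials. The prediction function here is a naive placeholder for whatever the real pruner uses.

```python
import math

def pfl(predicted_final_loss: float) -> float:
    """pfl = -log10(loss); larger values mean better predicted performance."""
    return -math.log10(predicted_final_loss)

def naive_final_loss_prediction(loss_history: list[float]) -> float:
    """Placeholder prediction: assume the last observed loss persists to target_epoch."""
    return loss_history[-1]

def should_prune(loss_history: list[float], top_k_pfls: list[float]) -> bool:
    """Prune if the trial's predicted pfl is worse than every tracked top-k trial."""
    if not top_k_pfls:
        return False  # nothing to compare against yet (startup phase)
    current_pfl = pfl(naive_final_loss_prediction(loss_history))
    return current_pfl < min(top_k_pfls)

# Example: top-k completed trials reached final losses of 1e-3 and 5e-4 (pfl = 3.0 and ~3.3)
print(should_prune([0.5, 0.2, 0.05], top_k_pfls=[3.0, 3.3]))    # True: pfl ~1.3 is worse
print(should_prune([0.5, 0.01, 1e-4], top_k_pfls=[3.0, 3.3]))   # False: pfl 4.0 is better
```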
Owner
- Name: Tae-Geun Kim
- Login: Axect
- Kind: user
- Location: Seoul, South Korea
- Company: Yonsei Univ.
- Website: https://axect.github.io
- Repositories: 21
- Profile: https://github.com/Axect
Ph.D student of particle physics & Rustacean
GitHub Events
Total
- Watch event: 2
- Push event: 39
- Fork event: 1
Last Year
- Watch event: 2
- Push event: 39
- Fork event: 1
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0