Recent Releases of torchdistill

torchdistill - Add a new text classification example, Interface & YAML util updates

Examples

  • Fix typos (PRs #500, #501)
  • Rename text_classification.py (PR #508)
  • Move GLUE-specific code to general_language_understanding.py (PR #509)
  • Upgrade the text classification script, supporting multiple evaluation metrics with GoEmotions example (PR #510)

Interfaces

  • Add a new forward proc (PR #502)
  • Fix potential bugs (PR #507)

YAML util

  • Support call_method (PR #506)
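
The `call_method` support added in PR #506 points at instantiating values by calling a named method on an imported object. As a rough, hypothetical sketch of that idea in plain Python (the helper name and signature below are illustrative, not torchdistill's actual YAML syntax or implementation):

```python
from importlib import import_module

def call_method(module_name, obj_name, method_name, *args, **kwargs):
    # Import the module, look up the object, then call the named method on it.
    # Hypothetical helper for illustration; torchdistill's real interface differs.
    obj = getattr(import_module(module_name), obj_name)
    return getattr(obj, method_name)(*args, **kwargs)

# e.g., call str.upper on string.ascii_lowercase
result = call_method('string', 'ascii_lowercase', 'upper')
```

A YAML-driven version would register such a helper as a PyYAML constructor so a config entry can name the module, object, and method instead of hardcoding them.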

Documentation

  • Update projects (PR #498)

Misc

  • Add repo_or_dir (PR #493)
  • Update README (PRs #494, #496, #497, #499)
  • Simplify example (PR #495)
  • Update versions (PRs #492, #511)

- Python
Published by yoshitomo-matsubara 10 months ago

torchdistill - PyTorch 2.5 support, model migrations, end of Python 3.8 support

Python version

  • Update Python versions due to Python 3.8 EOL (PR #486)

Models

  • Migrate all bottleneck-injected models to sc2bench (PR #490)
  • Update with PyTorch Hub links (PR #489)
  • Update yaml example (PR #487)

Misc

  • Force-use PyTorch Hub when repo_or_dir is given (PR #488)
  • Explicitly add evaluate as a dependency (PR #483)
  • Update versions (PRs #480, #491)

Published by yoshitomo-matsubara about 1 year ago

torchdistill - A new KD method, new benchmark results, and updated YAML constructors

New method

  • Add KD with logits standardization (PR #460)

YAML configs

  • Fix the official config for SRD (Issue #471, PR #473)
  • Fix SRD config (Issue #471, PR #472)
  • Add os.path YAML constructors (PR #454)

Logs

  • Disable an auto-configuration for def_logger (Issue #465, PR #469)
  • Use warning (PR #468)

Documentation

  • Add a new benchmark (PR #464)
  • Update Projects page (PRs #456, #475)

Misc

  • Update README (PRs #461, #470)
  • Update a URL (PR #459)
  • Update GitHub Actions versions (PRs #457, #458)
  • Update CITATION (PR #455)
  • Add a new DOI badge (PR #453)
  • Update version (PRs #452, #479)

Published by yoshitomo-matsubara over 1 year ago

torchdistill - New KD methods, updated YAML constructors, and low-level loss support

New methods

  • Add SRD method (PRs #436, #444, #446)
  • Add Knowledge Distillation from A Stronger Teacher method (PR #433)
  • Add Inter-Channel Correlation for Knowledge Distillation method (PR #432)

YAML constructor

  • Update functions in yaml_util (PR #447)
  • Fix docstrings and add import_call_method & YAML constructor (PR #442)

Distillation/Training boxes

  • Enable auxiliary model wrapper builder to redesign input model (PR #437)

Registries

  • Add low-level registry and get functions (PR #426)

Documentation

  • Update benchmarks (PR #435)
  • Fix a typo (PR #424)

Examples

  • Replace dst with src (Issue https://github.com/roymiles/Simple-Recipe-Distillation/issues/1, PR #445)
  • Add Amazon SageMaker Studio Lab badges (PR #422)

Tests

  • Add a test case for import_call_method (PR #443)
  • Add import test (PR #441)

Misc

  • Update citation info (PRs #438, #439, #440)
  • Update publication links (PR #430)
  • Update version (PRs #425, #449, #451)
  • Update README (PRs #423, #434, #450)
  • Update image url (PR #421)

Published by yoshitomo-matsubara almost 2 years ago

torchdistill - New generation with new features and documentation

torchdistill v1.0.0 Release Notes

This major release supports PyTorch 2.0 and contains a lot of new features, documentation support, and breaking changes.

PyYAML configurations and executable scripts with torchdistill <= v0.3.3 should be considered "legacy" and are no longer supported by torchdistill >= v1.0.0. New PyYAML configurations and executable scripts are provided for the major release.

This release adds support for Python 3.10 and 3.11, and Python 3.7 is no longer supported.

Documentation

  • Update documents (PRs #400, #408)
  • Add docstrings (PRs #392, #393, #394, #395, #396, #397)
  • Add torchdistill logos (PRs #401, #402, #403)

Dependencies & Instantiation

  • Add getattr constructor (PR #325)
  • Make package arg optional (PR #322)
  • Enable dynamic module import/get/call (PR #319)
  • Add a function to import dependencies e.g., to register modules (PR #265)

Module registry

  • Add *args (PR #345)
  • Fix default value-related issues (PR #327)
  • No longer use lowered keys (PRs #326, #332)
  • Disable lowering by default (PR #323)
  • Rename type/name key (PR #312)
  • Rename registry dicts and arguments for registry key (PR #269)
  • Raise errors when requested module keys are not registered (PR #263)
  • Enable naming modules to be registered (PR #262)
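
Several bullets above (named registration, errors for unregistered keys) describe standard registry behavior. A minimal, hypothetical sketch of such a registry, not torchdistill's actual code:

```python
# Minimal module registry sketch: modules register under an explicit key
# (or their class name), and lookups of unknown keys raise instead of
# silently returning None. Illustrative only; names are made up here.
MODEL_DICT = {}

def register_model(arg=None, key=None):
    # Usable both as @register_model and @register_model(key='alias')
    def _register(cls):
        MODEL_DICT[key if key is not None else cls.__name__] = cls
        return cls
    if callable(arg):
        return _register(arg)
    return _register

def get_model(key, *args, **kwargs):
    if key not in MODEL_DICT:
        # Fail loudly when the requested module key was never registered
        raise KeyError(f'model `{key}` is not registered')
    return MODEL_DICT[key](*args, **kwargs)

@register_model(key='tiny')
class TinyModel:
    def __init__(self, width=8):
        self.width = width
```

The explicit-key variant is what "enable naming modules to be registered" amounts to: the registry key no longer has to be the class name.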

Distillation/Training boxes

  • Remove default forward_proc for transparency (PR #417)
  • Rename a forward_proc function (PR #414)
  • Simplify (D)DP wrapper init (PR #410)
  • Change the timing to print model setup info (PR #335)
  • Add an option to specify find_unused_parameters for DDP (PR #334)
  • Do not touch teacher model by default (PR #333)
  • Training box does not have to inherit nn.Module class (PR #317)
  • Add interfaces package to core (PR #310)
  • Update forward interfaces (PRs #307, #308)
  • Rename post_process to post_epoch_process for consistency (PR #306)
  • Consider CosineAnnealingWarmRestarts in default post-epoch process functions (PR #305)
  • Make some common procedures in training box registrable/replaceable (PR #304)
  • Introduce {pre,post}-{epoch,forward} processes and registries (PR #274)
  • Rename post_forward functions (PR #272)
  • Make loss a kwarg (PR #273)

Forward hooks

  • Fix initialization issues in IO dict for SELF_MODULE_PATH (PR #328)

Dataset modules

  • Redesign split_dataset and remove unused functions (PR #360)
  • Update CRD dataset wrapper (PR #352)
  • Fix a bug (PR #351)
  • Add default args and kwargs (PR #347)
  • Add get_dataset (PR #324)

Loss modules

  • Fix a typo (PRs #413, #415)
  • Add doc artifacts and an option to pass pre-instantiated loss module (PR #399)
  • Add DictLossWrapper (PR #337)
  • Rename an old function name (PR #309)
  • Rename single loss to middle-level loss (PR #300)
  • Explicitly define criterion wrapper (PR #298)
  • Change concepts of OrgLoss and org_term (PR #296)
  • Rename loss-related classes and functions (PR #294)
  • Add default forward process function and KDLoss back as a single loss (PR #275)
  • Remove org loss module and introduce self-module path (PR #271)

Model modules

  • Support parameter operations (Discussion #387, PR #388)
  • Replace pretrained with weights (PR #354)

Auxiliary model wrapper modules

  • Add find_unused_parameters arg (PR #340)
  • Rename special in configs to auxiliary_model_wrapper (PR #291)
  • Rename special module for clarity (PR #276)

Optimizer/Scheduler modules

  • Fix bugs around optimizer/scheduler (PR #358)
  • epoch arg is deprecated for some LR schedulers (PR #338)

Examples

  • Revert legacy file paths to non-legacy ones (PR #419)
  • Update kwargs and scripts (PR #382)
  • Update yaml util and sample configs (CIFAR-10, CIFAR-100) for the next major release (PR #361)
  • Update sample script and configs (GLUE) for the next major release (PR #259)
  • --log was replaced with --run_log (PR #350)
  • dst_ckpt should be used when using -test_only (PR #349)
  • Simplify the semantic segmentation script (PR #339)
  • Move hardcoded-torchvision-specific code to local custom package (PR #331)
  • Update world_size, cudnn configs, and checkpoint message (PR #330)
  • Rename log argument due to the (abstract) conflict with torchrun (PR #329)
  • Restructure examples and export some example-specific packages (PR #320)
  • Add an option to disable torch.backends.cudnn.benchmark (PR #316)
  • Support stage-wise loading/saving checkpoints (PR #315)
  • Support src_ckpt and dst_ckpt for initialization and saving checkpoints, respectively (PR #314)
  • Use legacy configs and scripts tentatively (PRs #292, #295)
  • Add legacy examples and configs (PR #289)

Configs

  • Declare forward_proc explicitly (PR #416)
  • Add configs used in NLP-OSS 2023 paper (PR #407)
  • Fix value based on log (PR #284)
  • Update sample configs (ILSVRC 2012, COCO 2017, and PASCAL VOC 2012) for the next major release (PR #357)
  • Update official configs for the next major release (PR #355)
  • Merge single/multistage directories (PR #346)
  • Rename variables (PR #344)
  • Rename "factor" to "weight" (PR #302)
  • Restructure criterion (PR #301)
  • Consistently use "params" to indicate learnable parameters, not hyperparameters (PR #297)

Misc.

  • Add Google Analytics ID (PR #406)
  • Add sitemap.xml (PR #405)
  • Update timm repo (PR #375)
  • Add acknowledgments (PR #369)
  • Update file paths (PR #356)
  • Fix a typo and replace pretrained with weights (PR #353)
  • Remove the dict option as it is not intuitive for building transform(s) (PR #303)
  • Temporarily remove registry test (PR #293)
  • Add an important notice (PR #286)
  • Add read permission for content, following the new template (PR #284)
  • Refactor (PRs #268, #270, #283, #343)
  • Update README (PRs #252, #290, #299, #341, #342, #348, #364, #400, #409, #418)
  • Update versions (PRs #251, #391, #420)

Workflows

  • Add a GitHub Action for deploying Sphinx documentation (PR #404)

Published by yoshitomo-matsubara over 2 years ago

torchdistill - Updates, bug fixes, and end of apex support

Updates in APIs/scripts

  • Add square-sized random crop option (PR #224)
  • Replace torch.no_grad() with torch.inference_mode() (PR #245)
  • Terminate apex support due to its maintenance mode (PRs #248, #249)

Bug fixes

  • Add a default value (Discussion #229, PR #230)
  • Fix a bug raised in torchvision (PR #231)
  • Fix a default parameter (PR #235)

Misc.

  • Fix a typo (PR #232)
  • Update Travis (PR #236)
  • Update README (PRs #228, #238, #240)
  • Update versions (PRs #223, #250)

Published by yoshitomo-matsubara over 3 years ago

torchdistill - Minor bug fix and updates

Bug fix

  • Fix a potential bug in split_dataset (Issue #209, PR #210)

Misc.

  • Update GitHub workflow (PR #217)
  • Add local epoch for LambdaLR (PR #219)
  • Update versions (PRs #208, #220)

Published by yoshitomo-matsubara almost 4 years ago

torchdistill - Minor updates

Minor updates

  • Freeze module before rebuild if applicable (PR #205)
  • Refactor and improve result summary message (PR #206)
  • Update version (PRs #204, #207)

Published by yoshitomo-matsubara about 4 years ago

torchdistill - Bug fix

Bug fix

  • strict should not be used here (PR #202)

Minor update

  • Update version (PRs #201, #203)

Published by yoshitomo-matsubara about 4 years ago

torchdistill - Example and minor updates

Example updates

  • Restructure and make download=True (PR #190)
  • Make log_freq configurable for test (PR #191)
  • Refactor (PR #192)
  • Probably torch.cuda.synchronize() is no longer needed (PR #194)
  • Add an option to use teacher output (PR #195)
  • Replace no_grad with inference_mode (PR #199)

Minor updates

  • Add strict arg (PR #193)
  • Add assert error message (PR #196)
  • Check if ckpt file path is string (PR #197)
  • Check if batch images are instance of Tensor (PR #198)
  • Update version (PRs #189, #200)

Published by yoshitomo-matsubara about 4 years ago

torchdistill - Add new features, PASCAL examples and pretrained models

New features

  • Add wrapped resize to enable specifying interpolation for resize (PR #182)
  • Add wrapped random crop resize to enable specifying interpolation for random crop resize (PR #183)
  • Enable loading a ckpt containing only a specific module and via URL (PR #187)
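
Loading a ckpt that contains only a specific module boils down to filtering a state dict by a key prefix. A hedged stdlib sketch of that filtering step (the helper name and dict layout below are illustrative, not torchdistill's API):

```python
def extract_submodule_state(state_dict, prefix):
    # Keep only entries under `prefix` and strip the prefix so the
    # resulting sub-state dict can be loaded into the submodule directly.
    # Illustrative helper; torchdistill's actual loader differs.
    prefix_dot = prefix + '.'
    return {k[len(prefix_dot):]: v
            for k, v in state_dict.items() if k.startswith(prefix_dot)}

# Toy checkpoint with keys for two submodules
ckpt = {'backbone.conv.weight': 1, 'backbone.conv.bias': 2, 'head.fc.weight': 3}
backbone_state = extract_submodule_state(ckpt, 'backbone')
```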

New examples and trained models

  • Add examples for PASCAL VOC 2012 (PRs #184, #186)
  • Update README (PR #185)
  • Add model weights of DeepLabv3 with ResNet-50/101 fine-tuned on PASCAL VOC 2012 (Segmentation)

| | mean IoU | global pixelwise acc |
|-------------------------|---------:|---------------------:|
| DeepLabv3 w/ ResNet-50 | 80.6 | 95.7 |
| DeepLabv3 w/ ResNet-101 | 82.4 | 96.2 |

Model implementations are available in torchvision. These model weights are originally pretrained on COCO 2017 dataset (available in torchvision) and then fine-tuned on PASCAL VOC 2012 (Segmentation) dataset.

Minor updates

  • Add a version constant (PR #175)
  • Rename and add functions for ResNet-50 and ResNet-101 (PR #176)
  • Add CITATION file (PR #178)
  • Update version (PRs #174, #188)
  • Update README (PRs #179, #180)

Published by yoshitomo-matsubara about 4 years ago

torchdistill - Minor updates and bug fix to support PyTorch v1.10

Minor updates

  • Update version (PRs #161, #162, #173)
  • Update README (PRs #163, #164)
  • Add an option to log config (PR #169)

Bug fix

  • In PyTorch v1.10, load_state_dict_from_url is no longer available in torchvision.models.utils (PR #172)

Published by yoshitomo-matsubara over 4 years ago

torchdistill - Add KTAAD method and improve examples

New method

  • Add knowledge translation and adaptation + affinity distillation for semantic segmentation (PR #158)

Minor updates

  • Update version (PRs #151, #160)
  • Update README (PRs #153, #159)
  • Stop training when facing NaN or Infinity (PR #157)
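
The "stop training when facing NaN or Infinity" guard (PR #157) is a one-line finiteness check on the loss. A minimal sketch of the idea, not the actual implementation:

```python
import math

def should_stop(loss_value):
    # Abort training as soon as the loss becomes NaN or +/-Inf, so a
    # diverged run fails fast instead of wasting epochs.
    # Sketch of the guard described in PR #157, not torchdistill's code.
    return not math.isfinite(loss_value)
```

In a training loop this would be checked after each loss computation, raising or breaking out when it returns True.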

Published by yoshitomo-matsubara over 4 years ago

torchdistill - Add knowledge review method and new features

New method

  • Add knowledge review method (PRs #141, #145, #146)

The experimental result shown in README.md can be reproduced with this yaml file. The log and checkpoint file (including student model weights) are provided as part of Assets below.

New features

  • Make nn.ModuleList hookable (PR #139)
  • Support negative index in module path (PR #144)
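
Supporting a negative index in a module path (PR #144) means a dotted path like `encoder.layers.-1` can address the last child of a container. A conceptual stdlib sketch over a nested dict/list structure (real torchdistill walks nn.Module children; this helper is hypothetical):

```python
def get_module(root, module_path):
    # Resolve a dotted module path, allowing negative integer parts to
    # index list-like containers from the end. Illustrative sketch only.
    node = root
    for part in module_path.split('.'):
        if part.lstrip('-').isdigit():
            node = node[int(part)]  # negative index picks from the end
        elif isinstance(node, dict):
            node = node[part]
        else:
            node = getattr(node, part)
    return node

model = {'encoder': {'layers': ['block0', 'block1', 'block2']}}
last_layer = get_module(model, 'encoder.layers.-1')
```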

Minor updates

  • Update version (PRs #137, #148)
  • Update README (PRs #138, #147)
  • Fix a typo (PR #142)

Published by yoshitomo-matsubara over 4 years ago

torchdistill - Minor updates and potential bug fix in package

Minor updates

  • Update version (PRs #128, #136)
  • Fix a typo (PR #130)
  • Make pin_memory configurable (PR #134)

Bug fix

  • Clear io_dict in pre-process (Issue #132, PR #135)

Published by yoshitomo-matsubara over 4 years ago

torchdistill - Bug fixes in package

Bug fixes

  • DistributedDataParallel is no longer allowed for wrapping models with no updatable parameters (Issue #122, PRs #124, #125)
  • Fix a bug in detecting collate function type (Issue #123, PR #126)

Misc

  • Update version (PR #127)

Published by yoshitomo-matsubara over 4 years ago

torchdistill - Update examples and support PyTorch v1.9.0

Examples

  • Improve log format (PR #111)
  • Tune hyperparameters for GLUE tasks (PRs #112, #113)
  • Add sample KD configs for GLUE tasks (PR #114)

Misc

  • Update notebooks (PR #115)
  • Update README (PR #116)
  • Update version (PRs #118, #120)
  • Support PyTorch v1.9.0 (PR #119)

Published by yoshitomo-matsubara over 4 years ago

torchdistill - Update HF support, examples and notebooks

Examples

  • Update GLUE example (PRs #97, #98, #99, #104, #106, #108)
  • Enable test prediction to make a submission for GLUE leaderboard (PR #102)
  • Add notebook (PRs #105, #109)

Bug fixes

  • Provide kwargs (PR #94)
  • Enable teacher to run in fp16 mode (PR #110)

Minor updates

  • Update README (PRs #93, #101, #102, #103, #107, #108)
  • Refactor / Fix typos (PRs #95, #96, #100, #101, #104)

Published by yoshitomo-matsubara almost 5 years ago

torchdistill - Support Hugging Face Transformers and Accelerate

New features and example

  • Introduce Hugging Face's Accelerate to better collaborate with their Transformers package (PR #91)
  • Introduce example of text classification (GLUE tasks) with Hugging Face's Transformers and datasets (PR #92)

Minor updates

  • Update README (PR #93)
  • Allow non-function collator and make filtering optimizer's params optional (PR #89)

Published by yoshitomo-matsubara almost 5 years ago

torchdistill - Example updates

Example updates

  • Add an example to show how to import models via PyTorch Hub (PR #83)
  • Add an option to set random seed for reproducibility (PR #85)
  • Add an example of segmentation model training (PR #86)

Restructuring

  • Refactor function util (PR #84)

Typo fixes

  • Fix typos in dataset util and examples (PR #88)

Published by yoshitomo-matsubara almost 5 years ago

torchdistill - Minor updates and bug fixes

Minor updates

  • Make IoU type selection model-free (PR #74)
  • Update loss string (PR #74)
  • Disable DDP when no params are updatable (PR #77)
  • Update README (PR #78)

Bug/Typo fixes

  • Fix typos in example commands (PR #76)
  • Fix typos in sample configs (PR #79)
  • Fix bugs for clip grad norm (PR #80)
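
Gradient-norm clipping, referenced in the bullet above, rescales all gradients uniformly when their global L2 norm exceeds a threshold (the same rule as torch.nn.utils.clip_grad_norm_). A plain-Python sketch over a flat list of gradient values, for illustration only:

```python
import math

def clip_grad_norm(grads, max_norm):
    # Compute the global L2 norm of all gradients; if it exceeds max_norm,
    # scale every gradient by max_norm / total_norm so the clipped norm
    # equals max_norm. Plain-Python sketch, not the torch implementation.
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        return [g * scale for g in grads]
    return list(grads)

clipped = clip_grad_norm([3.0, 4.0], max_norm=1.0)  # norm 5.0 scaled down to 1.0
```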

Published by yoshitomo-matsubara almost 5 years ago

torchdistill - Minor updates and bug fixes

Minor updates

  • Update functions for object detection models (PR #59)
  • Update README (PRs #61, #62)

Minor bug fixes

  • Rename (PR #60)
  • Bug fixes (PR #73)

Published by yoshitomo-matsubara about 5 years ago

torchdistill - Support more detailed training configs and update official configs

Updated official README and configs

  • More detailed instructions (PRs #55, #56)
  • Restructured official configs (PR #55)
  • Updated FT config for ImageNet (PR #55)

Support detailed training configurations

  • Step-wise parameter update besides epoch-wise parameter update (PR #58)
  • Gradient accumulation (PR #58)
  • Max gradient norm (PR #58)
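
Gradient accumulation, listed above, defers the parameter update until gradients from several mini-batches have been summed, emulating a larger batch size. A conceptual sketch with scalar "gradients" (real training would go through an optimizer; this loop is illustrative only):

```python
def train_with_accumulation(grad_batches, accum_steps, lr=0.1, param=0.0):
    # Accumulate per-batch gradients and apply one SGD-style update with
    # the averaged gradient every `accum_steps` batches.
    # Conceptual sketch of gradient accumulation, not torchdistill's code.
    accum = 0.0
    updates = []
    for step, g in enumerate(grad_batches, 1):
        accum += g
        if step % accum_steps == 0:
            param -= lr * (accum / accum_steps)  # one update per accum window
            updates.append(param)
            accum = 0.0
    return param, updates

final_param, update_history = train_with_accumulation(
    [1.0, 3.0, 2.0, 2.0], accum_steps=2)
```

With four batches and `accum_steps=2`, the parameter is updated twice, each time using the mean gradient of two batches.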

Bug/Typo fixes

  • Bug fixes (PRs #54, #57)
  • Typo fixes (PRs #53, #58)

Published by yoshitomo-matsubara about 5 years ago

torchdistill - Google Colab Examples and bug fixes

New examples

Bug fixes

  • Fixed a bug in init of DenseNet-BC (PR #48)
  • Resolved checkpoint name conflicts (PR #49)

Published by yoshitomo-matsubara about 5 years ago

torchdistill - TrainingBox, PyTorch Hub, random split, pretrained models for CIFAR-10 and CIFAR-100 datasets

New features

  • Added TrainingBox to train models without teachers (PR #39)
  • Supported PyTorch Hub in registry (PR #40)
  • Supported random split, e.g., splitting a training dataset into training and validation datasets (PR #41)
  • Added reimplemented models for CIFAR-10 and CIFAR-100 datasets (PR #41)
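
The random split feature above amounts to shuffling dataset indices with a fixed seed and slicing them into subsets. A stdlib sketch of the idea (the library itself builds on torch.utils.data; this helper is illustrative):

```python
import random

def random_split(dataset, lengths, seed=42):
    # Shuffle indices with a seeded RNG for reproducibility, then slice
    # them into consecutive chunks, e.g. a training/validation split.
    # Illustrative sketch of the idea behind PR #41.
    assert sum(lengths) == len(dataset), 'lengths must cover the dataset'
    indices = list(range(len(dataset)))
    random.Random(seed).shuffle(indices)
    splits, start = [], 0
    for n in lengths:
        splits.append([dataset[i] for i in indices[start:start + n]])
        start += n
    return splits

train_set, val_set = random_split(list(range(10)), [8, 2])
```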

Pretrained models

Referred to the following repositories for training methods:

  • ResNet: https://github.com/facebookarchive/fb.resnet.torch
  • WRN (Wide ResNet): https://github.com/szagoruyko/wide-residual-networks
  • DenseNet-BC: https://github.com/liuzhuang13/DenseNet

Note that there are some accuracy gaps between these and those reported in their original studies.

| | CIFAR-10 | CIFAR-100 |
|-------------------------------|---------:|----------:|
| ResNet-20 | 91.92 | N/A |
| ResNet-32 | 93.03 | N/A |
| ResNet-44 | 93.20 | N/A |
| ResNet-56 | 93.57 | N/A |
| ResNet-110 | 93.50 | N/A |
| WRN-40-4 | 95.24 | 79.44 |
| WRN-28-10 | 95.53 | 81.27 |
| WRN-16-8 | 94.76 | 79.26 |
| DenseNet-BC (k=12, depth=100) | 95.53 | 77.14 |

Published by yoshitomo-matsubara about 5 years ago

torchdistill - Extended ForwardHookManager and bug fix

  • Extended ForwardHookManager (Issue #32, PR #33)
  • Fixed bugs around post_forward function caused by a gathering paradigm introduced to I/O dict (Issue #34, PR #35)

Published by yoshitomo-matsubara about 5 years ago

torchdistill - The first release of torchdistill

torchdistill

The first release of torchdistill with code and assets for "torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation"

Published by yoshitomo-matsubara over 5 years ago