groupmap
GroupMap: beyond mean and variance matching for deep learning
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.7%) to scientific vocabulary
Repository
GroupMap: beyond mean and variance matching for deep learning
Basic Info
Statistics
- Stars: 10
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
GroupMap: beyond mean and variance matching for deep learning
Defines `GroupMap`, `InstanceMap`, and `LayerMap` modules that transform the input so that it follows a prescribed arbitrary distribution, like uniform, Gaussian, sparse, etc.
The main difference between the `GroupMap`, `InstanceMap`, and `LayerMap` modules and their normalization-based counterparts `GroupNorm`, `InstanceNorm`, and `LayerNorm` is that they enforce the output to match a whole distribution instead of just its mean and variance.

:warning: In this simplified implementation, there is no tracking of the input statistics: the module always uses the batch statistics for mapping, both at training and test time, similarly to `nn.LayerNorm`, `nn.GroupNorm`, or the default behaviour of `nn.InstanceNorm2d`.
What it does
Let $x$ be the input tensor, of arbitrary shape (B, C, ...) and let $x_{nc\boldsymbol{f}}$ be one of its entries for sample $n$, channel $c$ and a tuple of indices $\boldsymbol{f}$ corresponding to features. For instance, for images, we would have 2D features $\boldsymbol{f}=(i,j)$ for the row and column of a pixel.
For each element of the input tensor, the following transformation is applied:
$y_{nc\boldsymbol{f}}=Q\left(F_{nc}\left(x_{nc\boldsymbol{f}}\right)\right) \cdot \gamma_{c\boldsymbol{f}} + \beta_{c\boldsymbol{f}}$
Where:
* $\forall q\in[0, 1],~Q(q)\in\mathbb{R}$ is the target quantile function. It describes what the distribution of the output should be and is provided by the user. The GroupMap module guarantees that the output will have a distribution that matches this target.
> Typically, $Q$ is the quantile function for a classical distribution like uniform, Gaussian, Cauchy, etc.
$Q(0)$ is the minimum of the target distribution, $Q(0.5)$ its median, $Q(1)$ its maximum, etc.
* $F_{nc}(v)=\mathbb{P}(x_{nc\boldsymbol{f}}\leq v)\in[0, 1]$ is the input cumulative distribution function (cdf) for sample $n$ and channel $c$.
It is estimated on the data for sample $n$. Several behaviours are possible, depending on which part of $x_n$ it is computed from:
  * It can be the cdf for just a particular channel $x_{nc}$, then behaving like an optimal-transport version of InstanceNorm.
  * It can be computed and shared over all channels of sample $x_n$ (as in LayerNorm).
  * It can be computed and shared over groups of channels (as in GroupNorm).
> $F_{nc}(v)=0$ if $v$ is the minimum of the input distribution, $0.5$ for the median, $1$ for the maximum, etc.
* $\gamma_{c\boldsymbol{f}}$ and $\beta_{c\boldsymbol{f}}$ are parameters of an affine transform that may or may not be activated. When activated, it matches the classical behaviour: $\gamma_{c\boldsymbol{f}}=\gamma_{c}$ and $\beta_{c\boldsymbol{f}}=\beta_{c}$ for InstanceMap and GroupMap, while LayerMap uses elementwise parameters.
This formula is the classical increasing-rearrangement method for optimally transporting scalar data from one distribution to another: it maps quantile to quantile (min to min, median to median, max to max, etc.).
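For intuition, here is a minimal sketch of this mapping for a single (sample, channel) slice, written in plain PyTorch. The function below is illustrative only and does not reproduce the package's actual internals:

```python
import torch

def map_to_target(x: torch.Tensor, target_quantiles) -> torch.Tensor:
    """Map a 1D tensor so that its values follow the distribution
    described by the quantile function `target_quantiles`."""
    n = x.numel()
    # Empirical cdf via ranks: the k-th smallest entry gets cdf (k + 0.5) / n,
    # which keeps all quantiles strictly inside (0, 1).
    ranks = x.argsort().argsort().float()
    cdf = (ranks + 0.5) / n
    # Increasing rearrangement: send each input quantile to the matching
    # quantile of the target distribution (min to min, median to median, ...).
    return target_quantiles(cdf)

# Example: map skewed data onto a standard Gaussian (probit of the cdf).
gaussian = lambda q: (2.0 ** 0.5) * torch.erfinv(2.0 * q - 1.0)
y = map_to_target(torch.rand(1000) ** 3, gaussian)
```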
Usage
Specifics
The usage of the modules offered by groupmap purposefully matches that of classical normalization modules, so that they may be used as a drop-in replacement. There are two main subtleties with respect to the normalization-based ones.
target quantiles. All modules offer a `target_quantiles` parameter, which must be a callable taking a Tensor of numbers between 0 and 1 as input, and returning a Tensor of the same shape containing the corresponding quantiles of the target distribution.
The module offers several default target quantile functions:
* `groupmap.uniform`: the uniform distribution, $Q(q)=q$.
* `groupmap.gaussian`: the Gaussian distribution, $Q(q) = \sqrt{2}\,\text{erf}^{-1}(2q-1)$ (also known as the probit function).
* `groupmap.cauchy`: the Cauchy distribution, $Q(q)=\tan(\pi(q-1/2))$.
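Any callable with that signature can serve as a custom target. As an illustration (this one is not shipped with the package, it is just a sketch), the quantile function of a standard Laplace distribution, which yields sparse-looking outputs:

```python
import torch

def laplace(q: torch.Tensor) -> torch.Tensor:
    # Quantile function of the standard Laplace distribution:
    # Q(q) = -sign(q - 1/2) * log(1 - 2 |q - 1/2|)
    return -torch.sign(q - 0.5) * torch.log1p(-2.0 * (q - 0.5).abs())
```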
Below is a short description of the interface, for quick reference. For a detailed description of the parameters to GroupMap, LayerMap and InstanceMap, please check the documentation for GroupNorm, LayerNorm and InstanceNorm, respectively.
eps. Instead of being a constant added to a denominator to avoid division by zero as in $\star\text{Norm}$ modules, eps serves as the standard deviation of actual random Gaussian noise added to the input. This avoids duplicate values in the input, so that the computation of the input cdf is well behaved. However, a value $\epsilon\neq 0$ is not mandatory in mapping-based transformations; see the sketch after the warning below.
:warning: All modules have `eps=0` by default.
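As a sketch of what this amounts to, assuming `eps` acts as plain additive dithering before the mapping:

```python
import torch

def dither(x: torch.Tensor, eps: float) -> torch.Tensor:
    # Add Gaussian noise of standard deviation eps to break ties, so the
    # empirical cdf of the input contains no duplicate values.
    return x + eps * torch.randn_like(x) if eps > 0 else x
```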
GroupMap
The input distribution is computed over groups of channels.
* `num_groups`: number of groups to separate the channels into.
* `num_channels`: number of channels expected in an input of shape `(N, C, ...)`.
* `target_quantiles`: as detailed above; default is `groupmap.gaussian`.
* `eps`: as detailed above.
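A hypothetical invocation matching the parameter list above (the import path and exact signature are assumptions based on this README):

```python
import torch
import groupmap

# Split 64 channels into 8 groups and map each group onto a Gaussian.
gm = groupmap.GroupMap(num_groups=8, num_channels=64,
                       target_quantiles=groupmap.gaussian)
x = torch.randn(16, 64, 32, 32)  # (N, C, H, W)
y = gm(x)                        # same shape, per-group Gaussian distribution
```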
LayerMap
The input distribution is computed over each whole sample.
* `normalized_shape`: shape of each sample.
* `elementwise_affine`: whether or not to activate an elementwise affine transformation of the output. If so, `weight` and `bias` parameters are created with shape `normalized_shape`.
* `target_quantiles`: as detailed above; default is `groupmap.gaussian`.
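Again as a hypothetical example, mirroring `nn.LayerNorm` usage:

```python
import torch
import groupmap

# Map each whole sample of shape (64, 32, 32) onto a uniform distribution.
lm = groupmap.LayerMap(normalized_shape=(64, 32, 32),
                       elementwise_affine=True,
                       target_quantiles=groupmap.uniform)
y = lm(torch.randn(16, 64, 32, 32))
```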
InstanceMap
The input distribution is computed over each channel separately.
* `num_features`: number of channels `C` for an input of shape `(N, C, ...)`.
* `affine`: whether or not to apply a channelwise affine transform.
* `track_running_stats`, `momentum`: ignored in this implementation. Statistics are computed from the input signal *anyway*, both at training and test time.
* `target_quantiles`: as detailed above; default is `groupmap.gaussian`.
* `eps`: as detailed above.
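And a hypothetical `InstanceMap` example, with a small `eps` to dither away duplicate input values:

```python
import torch
import groupmap

# Map each channel of each sample onto a Cauchy distribution.
im = groupmap.InstanceMap(num_features=64, affine=True,
                          target_quantiles=groupmap.cauchy, eps=1e-5)
y = im(torch.randn(16, 64, 32, 32))
```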
Owner
- Name: Antoine Liutkus
- Login: aliutkus
- Kind: user
- Location: France
- Company: @INRIA
- Repositories: 7
- Profile: https://github.com/aliutkus
- Bio: Researcher at Inria
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: Liutkus
given-names: Antoine
orcid: https://orcid.org/0000-0002-3458-6498
title: "GroupMap: beyond mean and variance matching for deep learning"
version: 1.0
url: https://www.github.com/aliutkus/groupmap
date-released: 2022-09-22
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0