groupmap
GroupMap: beyond mean and variance matching for deep learning
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.7%) to scientific vocabulary
Repository
GroupMap: beyond mean and variance matching for deep learning
Basic Info
Statistics
- Stars: 10
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
GroupMap: beyond mean and variance matching for deep learning
Defines `GroupMap`, `InstanceMap`, and `LayerMap` modules that transform the input so that it follows a prescribed arbitrary distribution, like uniform, Gaussian, sparse, etc.
The main difference between the `GroupMap`, `InstanceMap`, and `LayerMap` modules and their normalization-based counterparts `GroupNorm`, `InstanceNorm`, and `LayerNorm` is that they enforce the output to match a whole distribution instead of just its mean and variance.

:warning: In this simplified implementation, there is no tracking of the input statistics: the module always uses the batch statistics for mapping, both at training and test time, similarly to `nn.LayerNorm`, `nn.GroupNorm`, or the default behaviour of `nn.InstanceNorm2d`.
What it does
Let $x$ be the input tensor, of arbitrary shape (B, C, ...) and let $x_{nc\boldsymbol{f}}$ be one of its entries for sample $n$, channel $c$ and a tuple of indices $\boldsymbol{f}$ corresponding to features. For instance, for images, we would have 2D features $\boldsymbol{f}=(i,j)$ for the row and column of a pixel.
For each element of the input tensor, the following transformation is applied:
$y_{nc\boldsymbol{f}}=Q\left(F_{nc}\left(x_{nc\boldsymbol{f}}\right)\right) \cdot \gamma_{c\boldsymbol{f}} + \beta_{c\boldsymbol{f}}$
Where:
* $\forall q\in[0, 1],~Q(q)\in\mathbb{R}$ is the target quantile function. It describes what the distribution of the output should be and is provided by the user. The GroupMap module guarantees that the output will have a distribution that matches this target.
> Typically, $Q$ is the quantile function for a classical distribution like uniform, Gaussian, Cauchy, etc.
$Q(0)$ is the minimum of the target distribution, $Q(0.5)$ its median, $Q(1)$ its maximum, etc.
* $F_{nc}(v)=\mathbb{P}(x_{nc\boldsymbol{f}}\leq v)\in[0, 1]$ is the input cumulative distribution function (cdf) for sample $n$ and channel $c$.
It is estimated on the data for sample $n$. Several behaviours are possible, depending on which part of $x_n$ it is computed from:
  * It can be the cdf for just a particular channel $x_{nc}$, then behaving like an optimal-transport version of InstanceNorm.
  * It can be computed and shared over all channels of sample $x_n$ (as in LayerNorm).
  * It can be computed and shared over groups of channels (as in GroupNorm).
> $F_{nc}(v)=0$ if $v$ is the minimum of the input distribution, $0.5$ for the median, $1$ for the maximum, etc.
* $\gamma_{c\boldsymbol{f}}$ and $\beta_{c\boldsymbol{f}}$ are parameters of an affine transform that may or may not be activated. When activated, it matches the classical behaviour: $\gamma_{c\boldsymbol{f}}=\gamma_{c}$ and $\beta_{c\boldsymbol{f}}=\beta_{c}$ for InstanceMap and GroupMap, while LayerMap uses elementwise parameters.
This formula is the classical increasing-rearrangement method for optimally transporting scalar data from one distribution to another: it maps quantile to quantile (min to min, median to median, max to max, etc.).
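For intuition, here is a minimal sketch of this mapping for a single (sample, channel) slice, written in plain PyTorch. The function below is illustrative only and does not reproduce the package's actual internals:

```python
import torch

def map_to_target(x: torch.Tensor, target_quantiles) -> torch.Tensor:
    """Map a 1D tensor so that its values follow the distribution
    described by the quantile function `target_quantiles`."""
    n = x.numel()
    # Empirical cdf via ranks: the k-th smallest entry gets cdf (k + 0.5) / n,
    # which keeps all quantiles strictly inside (0, 1).
    ranks = x.argsort().argsort().float()
    cdf = (ranks + 0.5) / n
    # Increasing rearrangement: send each input quantile to the matching
    # quantile of the target distribution (min to min, median to median, ...).
    return target_quantiles(cdf)

# Example: map skewed data onto a standard Gaussian (probit of the cdf).
gaussian = lambda q: (2.0 ** 0.5) * torch.erfinv(2.0 * q - 1.0)
y = map_to_target(torch.rand(1000) ** 3, gaussian)
```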
Usage
Specifics
The usage of the modules offered by groupmap purposefully matches that of classical normalization modules, so that they may be used as a drop-in replacement. There are two main subtleties with respect to the normalization-based ones.
target quantiles. All modules offer a `target_quantiles` parameter, which must be a callable taking a Tensor of numbers between 0 and 1 as input, and returning a Tensor of the same shape containing the corresponding quantiles of the target distribution.
The module offers several default target quantile functions:
* `groupmap.uniform`: the uniform distribution, $Q(q)=q$.
* `groupmap.gaussian`: the Gaussian distribution, $Q(q) = \sqrt{2}\,\text{erf}^{-1}(2q-1)$ (also known as the probit function).
* `groupmap.cauchy`: the Cauchy distribution, $Q(q)=\tan(\pi(q-1/2))$.
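Any callable with that signature can serve as a custom target. As an illustration (this one is not shipped with the package, it is just a sketch), the quantile function of a standard Laplace distribution, which yields sparse-looking outputs:

```python
import torch

def laplace(q: torch.Tensor) -> torch.Tensor:
    # Quantile function of the standard Laplace distribution:
    # Q(q) = -sign(q - 1/2) * log(1 - 2 |q - 1/2|)
    return -torch.sign(q - 0.5) * torch.log1p(-2.0 * (q - 0.5).abs())
```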
Below is a short description of the interface, for quick reference. For a detailed description of the parameters to GroupMap, LayerMap and InstanceMap, please check the documentation for GroupNorm, LayerNorm and InstanceNorm, respectively.
eps. Instead of being a constant added to a denominator to avoid division by zero as in $\star\text{Norm}$ modules, eps serves as the standard deviation of actual random Gaussian noise added to the input. This avoids duplicate values in the input, so that the computation of the input cdf is well behaved. However, a value $\epsilon\neq 0$ is not mandatory in mapping-based transformations; see the sketch after the warning below.
:warning: All modules have `eps=0` by default.
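As a sketch of what this amounts to, assuming `eps` acts as plain additive dithering before the mapping:

```python
import torch

def dither(x: torch.Tensor, eps: float) -> torch.Tensor:
    # Add Gaussian noise of standard deviation eps to break ties, so the
    # empirical cdf of the input contains no duplicate values.
    return x + eps * torch.randn_like(x) if eps > 0 else x
```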
GroupMap
The input distribution is computed over groups of channels.
* `num_groups`: number of groups to separate the channels into.
* `num_channels`: number of channels expected in an input of shape `(N, C, ...)`.
* `target_quantiles`: as detailed above; default is `groupmap.gaussian`.
* `eps`: as detailed above.
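A hypothetical invocation matching the parameter list above (the import path and exact signature are assumptions based on this README):

```python
import torch
import groupmap

# Split 64 channels into 8 groups and map each group onto a Gaussian.
gm = groupmap.GroupMap(num_groups=8, num_channels=64,
                       target_quantiles=groupmap.gaussian)
x = torch.randn(16, 64, 32, 32)  # (N, C, H, W)
y = gm(x)                        # same shape, per-group Gaussian distribution
```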
LayerMap
The input distribution is computed over each whole sample.
* `normalized_shape`: shape of each sample.
* `elementwise_affine`: whether or not to activate an elementwise affine transformation of the output. If so, `weight` and `bias` parameters are created with shape `normalized_shape`.
* `target_quantiles`: as detailed above; default is `groupmap.gaussian`.
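Again as a hypothetical example, mirroring `nn.LayerNorm` usage:

```python
import torch
import groupmap

# Map each whole sample of shape (64, 32, 32) onto a uniform distribution.
lm = groupmap.LayerMap(normalized_shape=(64, 32, 32),
                       elementwise_affine=True,
                       target_quantiles=groupmap.uniform)
y = lm(torch.randn(16, 64, 32, 32))
```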
InstanceMap
The input distribution is computed over each channel separately.
* `num_features`: number of channels `C` for an input of shape `(N, C, ...)`.
* `affine`: whether or not to apply a channelwise affine transform.
* `track_running_stats`, `momentum`: ignored in this implementation. Statistics are computed from the input signal *anyway*, both at training and test time.
* `target_quantiles`: as detailed above; default is `groupmap.gaussian`.
* `eps`: as detailed above.
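And a hypothetical `InstanceMap` example, with a small `eps` to dither away duplicate input values:

```python
import torch
import groupmap

# Map each channel of each sample onto a Cauchy distribution.
im = groupmap.InstanceMap(num_features=64, affine=True,
                          target_quantiles=groupmap.cauchy, eps=1e-5)
y = im(torch.randn(16, 64, 32, 32))
```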
Owner
- Name: Antoine Liutkus
- Login: aliutkus
- Kind: user
- Location: France
- Company: @INRIA
- Repositories: 7
- Profile: https://github.com/aliutkus
- Bio: Researcher at Inria
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: Liutkus
given-names: Antoine
orcid: https://orcid.org/0000-0002-3458-6498
title: "GroupMap: beyond mean and variance matching for deep learning"
version: 1.0
url: https://www.github.com/aliutkus/groupmap
date-released: 2022-09-22
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0