https://github.com/aimilefth/fxpytorch
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.2%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: aimilefth
- License: mit
- Language: Python
- Default Branch: main
- Size: 159 KB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
FxPyTorch: Fixed-Point Quantization for PyTorch Layers
FxPyTorch is a Python library that extends PyTorch's nn.Module system to support symmetric, linear fixed-point quantization. It provides tools for simulating fixed-point arithmetic where the scaling factor is constrained to be a power of 2, which is often efficient for hardware implementations. The library helps analyze quantization effects and prepare models for deployment on hardware with fixed-point capabilities.
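To make the scheme concrete, here is a minimal plain-Python sketch of symmetric quantization with a power-of-2 scale. It is not FxPyTorch's code: the function name `quantize_pow2`, the symmetric code range, and the round-to-nearest rule are assumptions made for illustration. A value is scaled by 2^fractional_bits, rounded to an integer code, clamped to a range symmetric around zero, and scaled back.

```python
import math

def quantize_pow2(x: float, total_bits: int, fractional_bits: int) -> float:
    """Simulate symmetric fixed-point quantization of one value (illustrative only)."""
    scale = 2.0 ** fractional_bits          # power-of-2 scaling factor
    q_max = 2 ** (total_bits - 1) - 1       # largest integer code
    q_min = -q_max                          # symmetric range around zero (assumption)
    code = math.floor(x * scale + 0.5)      # round to nearest, ties toward +inf
    code = max(q_min, min(q_max, code))     # saturate instead of wrapping around
    return code / scale                     # back to a real value on the grid

print(quantize_pow2(0.3, total_bits=8, fractional_bits=4))    # 0.3125
print(quantize_pow2(100.0, total_bits=8, fractional_bits=4))  # 7.9375
```

With 8 total bits and 4 fractional bits the grid step is 1/16, so 0.3 lands on 5/16 and any value beyond the representable range saturates at 127/16.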
Features
- Fixed-Point Layer Implementations:
  - `FxPLinear`: Fixed-point Linear layer.
  - `FxPLayerNorm`: Fixed-point Layer Normalization.
  - `FxPMultiheadAttention`: Fixed-point Multi-Head Attention.
  - `FxPTransformerEncoderLayer`: Fixed-point Transformer Encoder Layer.
  - `FxPSoftmax`: Fixed-point Softmax.
  - `FxPDropout`: Fixed-point Dropout (quantizes input/output; the dropout operation itself is standard).
- Flexible Quantization Configuration (Symmetric, Power-of-2 Scaling):
  - Implements symmetric linear quantization around zero.
  - Uses power-of-2 scaling factors (determined by `fractional_bits`) for efficient hardware mapping (e.g., bit shifts instead of multiplications).
  - Define `total_bits` and `fractional_bits` for weights, biases, and activations.
  - Choose rounding methods (e.g., `ROUND_SATURATE`, `TRUNC_SATURATE`).
  - Pydantic-based configuration models (`QType`, `LinearQConfig`, etc.) for validation and clarity.
- Helper Utilities:
  - `set_high_precision_quant()`: Configure layers for maximum precision (e.g., 24 fractional bits) given their dynamic range, adhering to the symmetric, power-of-2 scheme.
  - `set_no_overflow_quant()`: Configure layers to use a specified total number of bits for parameters, automatically calculating fractional bits to prevent overflow based on weight/bias dynamic range, adhering to the symmetric, power-of-2 scheme.
- Transparent Base Layers:
  - Includes "transparent" versions of standard PyTorch layers (`LinearTransparent`, `LayerNormTransparent`, etc.) that act as drop-in replacements for their `nn.Module` equivalents but include hooks for activation logging. These serve as the base for the `FxP` layers.
- Activation Logging:
  - `ActivationLogger` utility to inspect intermediate tensor values and their quantized counterparts throughout the model.
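The hardware benefit of power-of-2 scaling noted in the list above can be illustrated with integer codes. In this sketch (the names `to_code` and `fxp_mul` are hypothetical and not part of FxPyTorch), a real value v is stored as the integer round(v * 2^f). Multiplying two codes yields a product whose fractional bits are the sum of the operands' fractional bits, and bringing it back to the target format is a single arithmetic right shift rather than a multiplication or division.

```python
def to_code(value: float, fractional_bits: int) -> int:
    """Encode a real value as an integer code with the given fractional bits."""
    return round(value * (1 << fractional_bits))

def fxp_mul(a_code: int, a_frac: int, b_code: int, b_frac: int, out_frac: int) -> int:
    """Multiply two fixed-point codes and rescale the product with a shift."""
    product = a_code * b_code               # carries a_frac + b_frac fractional bits
    shift = a_frac + b_frac - out_frac
    if shift < 0:
        raise ValueError("out_frac must not exceed a_frac + b_frac")
    return product >> shift                 # arithmetic shift: truncates toward -inf

a = to_code(1.5, 8)      # 384
b = to_code(-0.25, 6)    # -16
c = fxp_mul(a, 8, b, 6, out_frac=8)
print(c, c / (1 << 8))   # -96 -0.375
```

Saturation to a target bit-width is left out here to keep the shift visible; a real datapath would clamp the result after the shift.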
Installation
Prerequisites
- Python (>=3.8 recommended)
- PyTorch (>=2.2.0 recommended, see `pyproject.toml` for specific version)
- Pydantic (>=2.0, see `pyproject.toml`)
From Git (Recommended for development or as a submodule)
You can include FxPyTorch in your project as a Git submodule:
    git submodule add https://github.com/yourusername/FxPyTorch.git
Quick Start
```python
import torch
from FxPyTorch.fxp.fxp_linear import FxPLinear, LinearQConfig
from FxPyTorch.fxp.symmetrics_quant import QType, QMethod

# Define a quantization configuration for a linear layer
# Example: 8-bit weights, 8-bit bias, 16-bit input/activation with 8 fractional bits
linear_q_config = LinearQConfig(
    input=QType(total_bits=16, fractional_bits=8, q_method=QMethod.ROUND_SATURATE),
    # Fractional bits for weight and bias are determined by set_no_overflow_quant
    weight=QType(total_bits=8, q_method=QMethod.ROUND_SATURATE),
    bias=QType(total_bits=8, q_method=QMethod.ROUND_SATURATE),
    activation=QType(total_bits=16, fractional_bits=8, q_method=QMethod.ROUND_SATURATE),
)

# Create a fixed-point linear layer
fxp_linear_layer = FxPLinear(in_features=10, out_features=5, bias=True, q_config=linear_q_config)

# Initialize weights (e.g., load from a pre-trained floating-point model)
# fxp_linear_layer.load_state_dict(...)

# If total_bits for weights/bias are set but fractional_bits are not,
# you can automatically determine fractional_bits to avoid overflow:
fxp_linear_layer.set_no_overflow_quant()
print("Quantization Config after set_no_overflow_quant:")
print(fxp_linear_layer.q_config.model_dump_json(indent=2))

# Create dummy input
dummy_input = torch.randn(1, 10)

# Forward pass (simulates fixed-point arithmetic)
# apply_ste=True uses Straight-Through Estimator for gradients during training
output = fxp_linear_layer(dummy_input, apply_ste=True)
print("\nOutput:", output)

# To get truly quantized weights (e.g., for export):
fxp_linear_layer.quantize_weights_bias()
print("\nQuantized Weight:", fxp_linear_layer.weight.data)
```
See the tests/ directory for more detailed usage examples of different layers and quantization scenarios.
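The `apply_ste=True` argument in the Quick Start refers to the straight-through estimator (STE). Rounding has zero gradient almost everywhere, so quantization-aware training commonly treats the quantizer as the identity in the backward pass. The sketch below shows one common way to express this in plain PyTorch. It illustrates the general technique and is not taken from FxPyTorch; `fake_quant_ste` is a hypothetical name, and the symmetric code range is an assumption.

```python
import torch

def fake_quant_ste(x: torch.Tensor, total_bits: int, fractional_bits: int) -> torch.Tensor:
    """Quantize in the forward pass, pass gradients straight through in the backward pass."""
    scale = 2.0 ** fractional_bits
    q_max = 2 ** (total_bits - 1) - 1
    codes = torch.clamp(torch.round(x * scale), -q_max, q_max)
    x_q = codes / scale
    # The forward value equals x_q up to floating-point rounding; the detached
    # term contributes no gradient, so d(output)/dx is 1 everywhere.
    return x + (x_q - x).detach()

x = torch.tensor([0.30, -1.26, 0.77], requires_grad=True)
y = fake_quant_ste(x, total_bits=8, fractional_bits=4)
y.sum().backward()
print(y.detach())  # values snapped to multiples of 1/16
print(x.grad)      # all ones
```

This variant also passes gradients through saturated values; some implementations zero the gradient outside the representable range instead.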
Core Concepts
- `QType`: Defines the bit-width (`total_bits`, `fractional_bits`) and `QMethod` for a specific tensor (input, weight, bias, activation).
- `*QConfig` (e.g., `LinearQConfig`): A Pydantic model that groups `QType` configurations for all relevant tensors within a specific layer type.
- `FxP*` layers: PyTorch modules that implement fixed-point behavior. They typically inherit from a corresponding `*Transparent` layer.
  - If `q_config` is `None`, they behave like standard floating-point layers.
  - If `q_config` is provided, they simulate quantization during the forward pass.
- `set_no_overflow_quant()`: A method on `FxP*` layers. If `total_bits` is specified in the `QType` for weights/biases, this method calculates the optimal `fractional_bits` to maximize precision while ensuring the current weight/bias values do not overflow.
- `set_high_precision_quant()`: A method that configures weights/biases to use a high number of fractional bits (e.g., 24) and calculates the `total_bits` needed to represent their current dynamic range.
- `quantize_weights_bias()`: A method to permanently alter the layer's weight and bias tensors to their quantized values. Useful before exporting weights.
- `ActivationLogger`: A utility to log intermediate tensor values during the forward pass for debugging and analysis.
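The two helper methods described above come down to a small amount of arithmetic on the largest absolute value in a tensor. The plain-Python sketch below shows one way that arithmetic could work. The function names, the symmetric code range from -(2^(n-1) - 1) to 2^(n-1) - 1, and round-to-nearest are all assumptions; the library's own methods may differ in detail.

```python
import math

def integer_bits_needed(max_abs: float) -> int:
    """Smallest number of magnitude bits m with max_abs < 2**m (can be negative)."""
    if max_abs <= 0:
        raise ValueError("max_abs must be positive")
    return math.floor(math.log2(max_abs)) + 1

def no_overflow_fractional_bits(max_abs: float, total_bits: int) -> int:
    """Largest fractional_bits for which max_abs still fits in total_bits."""
    frac = total_bits - 1 - integer_bits_needed(max_abs)   # one bit for the sign
    q_max = 2 ** (total_bits - 1) - 1
    # Rounding can push the code just past q_max (e.g. max_abs = 0.999); back off if so.
    while math.floor(max_abs * 2.0 ** frac + 0.5) > q_max:
        frac -= 1
    return frac

def high_precision_total_bits(max_abs: float, fractional_bits: int = 24) -> int:
    """total_bits needed to hold max_abs at a fixed number of fractional bits."""
    code = math.floor(max_abs * 2.0 ** fractional_bits + 0.5)
    return code.bit_length() + 1                            # +1 for the sign bit

print(no_overflow_fractional_bits(0.73, total_bits=8))   # 7
print(no_overflow_fractional_bits(2.5, total_bits=8))    # 5
print(high_precision_total_bits(2.5))                    # 27
```

In practice `max_abs` would be taken from the layer's parameters, for example `weight.abs().max().item()` on a PyTorch tensor.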
Modules
- `fxp/`: Contains the fixed-point layer implementations and core quantization logic.
  - `symmetrics_quant.py`: Core symmetric quantization functions and `QType`/`QConfig` base.
  - `utils.py`: Helper utilities like `ValueRange`.
  - `fxp_*.py`: Specific fixed-point layer implementations.
- `transparent/`: Contains "transparent" base layers that mirror standard PyTorch layers but include hooks for activation logging.
  - `activation_logger.py`: The `ActivationLogger` class.
  - `trans_*.py`: Specific transparent layer implementations.
- `tests/`: Unit tests and usage examples.
TODO / Future Work
- [ ] Explore quantization schemes with non-power-of-2 scaling factors.
- [ ] Add support for asymmetric quantization.
- [ ] More comprehensive testing scenarios.
- [ ] Detailed documentation for each module and function.
- [ ] Performance benchmarking.
- [ ] Examples of exporting quantized weights for specific hardware targets.
License
This project is licensed under the MIT License.
Owner
- Login: aimilefth
- Kind: user
- Repositories: 1
- Profile: https://github.com/aimilefth
GitHub Events
Total
- Watch event: 1
- Push event: 5
- Public event: 1
- Fork event: 1
- Create event: 2
Last Year
- Watch event: 1
- Push event: 5
- Public event: 1
- Fork event: 1
- Create event: 2
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 6 minutes
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 6 minutes
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- panagbouras (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- pydantic >=2.11.3
- torch >=2.2.0