imdd-task

Short-reach Optical Communication: A Real-world Task for Neuromorphic Hardware

https://github.com/imdd-task/imdd-task

Keywords

benchmark dataset event-based neuromorphic optical-communication

Last synced: 6 months ago · JSON representation ·

Repository

Short-reach Optical Communication: A Real-world Task for Neuromorphic Hardware

Basic Info

Host: GitHub
Owner: imdd-task
Language: Python
Default Branch: master
Homepage:
Size: 1.3 MB

Statistics

Stars: 2
Watchers: 1
Forks: 1
Open Issues: 1
Releases: 0

Topics

benchmark dataset event-based neuromorphic optical-communication

Created about 1 year ago · Last pushed 11 months ago

Metadata Files

Readme Citation

Short-reach Optical Communications Datasets

Here, we describe an IM/DD task that is relevant to high-speed optical communication systems used in data centers. Compared to other machine learning-inspired benchmarks, the task offers several advantages. First, the dataset is inherently time-dependent, i.e., there is a time dimension that can be natively mapped to the dynamic evolution of SNNs. Second, small-scale SNNs can achieve the target accuracy required by technical communication standards. Third, due to the small scale and the defined target accuracy, the task facilitates the optimization for real-world aspects, such as energy efficiency, resource requirements, and system complexity.

The following description is widely taken from Arnold et al (2024), to which the reader is referred to for more details.

Communications

The goal of a communication system is to transmit a binary message $\boldsymbol{b}\in{0,1}^{MN}$ of length $MN$ from a transmitter to a receiver over a physical channel without error. To transmit the binary message, a group of ${M=\log2|\mathcal{X}|}$ bits is mapped to a transmit symbol ${x\in\mathcal{X}}$, where ${\mathcal{X} \subset \mathbb{R}}$ denotes the set of transmit symbols. Hence, transmission of $M \cdot N$ bits results in a sequence of $N$ transmits symbols ${\boldsymbol{x} \in \mathcal{X}^N}$. The transmit symbols $\boldsymbol{x}$ are disturbed by the channel, resulting in the channel output ${\boldsymbol{y} \in \mathbb{C}^N}$. Based on the observation of $\boldsymbol{y}$, the receiver outputs an estimate of the transmit bit sequence $\hat{\boldsymbol{b}}$. The metric to classify the performance of the communication system is the BER: ```math \mathrm{BER} = \frac{1}{MN} \sum{n=0}^{MN-1} 1\left( bn \neq \hat{b}n \right) \quad \text{with}\quad 1\left( bn \neq \hat{b}n \right) = \begin{cases} 0 \quad &\text{if} \quad bn = \hat{b}n\, , \ 1 \quad &\text{if} \quad bn \neq \hat{b}n \ \end{cases} \, . ```

Intensity modulation-direct detection (IM/DD) Link

The IM/DD system is commonly used, e.g. in datacenters. $M$ bits are grouped together and mapped to a real-valued transmit symbol $x$ using a pulse-amplitude modulation (PAM), where each of the ${m=2^M}$ symbols of the PAM represents an amplitude level. In the tasks, PAM 4-level with $m=4$ intensity levels is used, and pairs of $M=\log2 m = 2$ bits are mapped to one transmit symbol at index $q$ in $\mathcal{X}$. We denote the bits mapped to $x[k]$ as $b1[k]$ and $b_2[k]$:

| Symbol index $q$ | 1 | 2 | 3 | 4 | |---|:-:|:-:|---|---| | $b1[k]\,b2[k]$ | 00 | 01 | 11 | 10 |

The transmit sequence $\boldsymbol{x}$ is converted into a continuous-time signal, which is modulated onto the laser intensity. Then, the modulated light is coupled to the optical fiber. Inside the fiber, the frequency-dependent propagation speed results in chromatic dispersion (CD), and the symbols experience dispersion in the time domain. Hence, consecutive symbols overlap and interfere, yielding the so-called inter-symbol interference (ISI). The impact of CD on the $k$-th received symbol $y[k]$ can be narrowed down to the $n\mathrm{ISI}$ neighboring symbols. At the output of the optical fiber, the signal intensity is measured by a photo diode (PD), corresponding to a squaring of the electric field and introducing a non-linear distortion. Additionally, thermal noise in the trans-impedance amplifier following the PD is modeled as AWGN with variance $\sigma\mathrm{n}^2$. In summary, we can write

math y[k] = f\left( x\left[k - \Big\lfloor \frac{n_\mathrm{ISI}}{2} \Big\rfloor \right],\ldots,x[k],\ldots, x\left[k + \Big\lfloor \frac{n_\mathrm{ISI}}{2} \Big\rfloor \right] \right) + n[k] \quad \text{with} \quad n[k] \sim \mathcal{N}(0,\sigma_\mathrm{n}^2),

where $f(\cdot): \mathcal{X}^{n_{\mathrm{ISI}}}\rightarrow \mathbb{R}$ is an arbitrary function modeling both ISI and the non-linear distortion; $n[k]$ models the AWGN.

After the PD, the sequence of received values $\boldsymbol{y}$ is sampled over time. The sequence $\boldsymbol{y}$ is passed to the receiver, which implements both channel equalization and demapping. The receiver tries to mitigate the impact of the IM/DD link and recover the transmit bits $\boldsymbol{\hat{b}}$. For a more detailed description of the implementation of the IM/DD link, the interested reader is referred to [1].

To ensure reliable transmission in practical systems, at the output of the receiver, a target BER is defined. For a given link and noise power $\sigma_\mathrm{n}^2$, more powerful receivers can achieve a lower BER, resulting in a greater margin to the target BER. This margin can be used to either realize higher data rates, transmit over longer fibers, or use less transmit power.

Tasks

We define two different parametrizations of the IM/DD link as separate tasks. During training, the user is free to alter the parameters, e.g., the noise power $\sigma_\mathrm{n}^2$. However, during testing, only \texttt{n_taps} can be chosen freely.

LCD-Task

The low chromatic dispersion Task (LCD-Task) emphasizes non-linear impairment with little CD and thus weak ISI. Receivers require a smaller number of $n_\text{taps}$ consecutive input samples or shorter memory, making it an excellent dataset for exploring resource efficiency and novel algorithms with small-scale networks.

Parameters

| Parameter | Value | |---|:-:| | N | 10000 | | n_taps | 7 | | alphabet | [-3, -1, 1, 3] | | oversampling_factor | 3 | | baudrate | 112 GBd | | wavelength | 1270 mn | | dispersion_parameter | -5 ps/nm/km | | fiber_length | 4 km | | noise_power_db | -20 dB | | roll_off | 0.2 | | bias | 2.25 |

SSMF-Task

The standard single-mode fiber Task (SSMF-Task) models the standard single-mode fiber with a used wavelength of 1550 nm, which is widely used in practice due to its low attenuation and, thus, wide range. The IM/DD link suffers from severe ISI, increasing the number of required input samples $n_\text{taps}$ and hence the need for more powerful equalizers.

Parameters

| Parameter | Value | |---|:-:| | N | 10000 | | n_taps | 21 | | alphabet | [0, 1, $\sqrt{2}$, $\sqrt{3}$] | | oversampling_factor | 3 | | baudrate | 50 GBd | | wavelength | 1550 mn | | dispersion_parameter | -17 ps/nm/km | | fiber_length | 5 km | | noise_power_db | -20 dB | | roll_off | 0.2 | | bias | 0.25 |

Evaluation

To demonstrate the demapping performance, the achieved BER at a noise power of $\sigma\mathrm{n}^2=-20\text{\,dB}$ should be reported. The BER for various $\sigma\mathrm{n}^2$ should be plotted, showing the ability of the receiver to generalize over a wide range of $\sigma\mathrm{n}^2$. Incrementing the noise power $\sigma\mathrm{n}^2$ by $1\text{\,dB}$ provides a sufficiently fine resolution. To ensure statistical significance of the reported BER, for each $\sigma_\mathrm{n}^2$ and thus BER data point, new data should be transmitted until a minimum of 2000 bit error events are encountered. To measure the complexity of the model and anticipate its energy consumption, the number of trainable parameters and, in the case of SNN, the average number of spikes sent to the SNN and the spikes emitted by the neurons within the SNN per processing of one symbol are to be reported.

Using the Dataset

The PyTorch-based datasets corresponding to the defined tasks are imported as LCDDataset and SSMFDataset.

```python from torch.utils.data import DataLoader from IMDD import LCDDataset

Dataset

dataset = LCDDataset(bitwise=False) dataset.setntaps(ntap)

Data loader

dataloader = DataLoader(dataset, batchsize, shuffle=True) for (ychunk, q) in dataloader: ... # train ```

After completing one epoch (i.e., accessing all $\texttt{N}$ samples in the dataset), the dataset generates new data instead of reusing the same transmit samples, which can be disabled by setting continuous_sampling=False. The number of taps $n\text{taps}$ and the noise power $\sigma\text{n}^2$ are set by via set_n_taps and set_noise_power_db.

An IMDDModel in the dataset generates indices $q[k]$ of transmit symbols $\boldsymbol{x}$ in $\mathcal{X}$ and outputs the received chunks $\boldsymbol{y}^\text{c}[k]$, from which the pairs of bits $\hat{b}1[k]\hat{b}2[k]$ are estimated. The dataset returns a chunk tensor at index k of shape (n_taps,), representing the chunked received data together with the index $q[k]$ of shape (1,). Hence, the tensor holds all symbols required to process $y[k]$ at position n_taps//2. This allows shuffling and batching along the first dimension using a PyTorch Dataloader, resulting in y_chunk of shape (batch_size, n_taps) and q of shape (batch_size,). Each transmit symbol is labeled with its assigned bits, generated using the provided get_graylabels function, and accessed at the corresponding index q. By default, the dataset does not directly return bit-level labels to maintain flexibility. This design supports symbol-level receivers, allowing models to output class index $\hat{q}$ corresponding to the predicted symbol $\hat{x} \in \mathcal{X}$, which can be used together with $q$ in a cross-entropy loss function. For tasks requiring bit-level outputs, setting bit_level=True in the dataset configuration changes the format of $\boldsymbol{q}$ to (N, 2), where each entry contains binary values corresponding to the respective bits. To compute the BER based on either predicted symbols or bits, we provide a helper function bit_error_rate.

An arbitrarily parameterized IM/DD link can be created by passing an instance of IMDDParams to an IMDDModel. A dataset is created by instantiating IMDDDataset with the given parameters. This allows users to adapt the provided link implementations to suit their specific requirements.

```python from IMDD import IMDDDataset, IMDDParams

Parameterization

params = IMDDParams(...)

Link model

link = IMDDModel(params)

Dataset

dataset = IMDDDataset(params, bit_level=False) ```

Citation

If you use this dataset, please cite as: python ...

References

[1] E. Arnold, G. Böcherer, F. Strasser, E. Müller, P. Spilger, S. Billaudelle, et al., “Spiking neural network nonlinear demapping on neuromorphic hardware for IM/DD optical communication,” Journal of Lightwave Technology, vol. 41, no. 11, pp. 1–8, 2023. DOI: 10.1109/JLT.2023. 3252819.

Owner

Name: IM/DD Task
Login: imdd-task
Kind: organization

Repositories: 1
Profile: https://github.com/imdd-task

Short-reach Optical Communication: A Real-world Task for Neuromorphic Hardware

Citation (CITATION.cff)

cff-version: 1.2.0
title: >-
  Short-reach Optical Communications: A Real-world Task for
  Neuromorphic Hardware
message: >-
  If you use this dataset, please cite it using the metadata
  from this file.
type: dataset
authors:
  - given-names: Elias
    family-names: Arnold
  - given-names: Eike-Manuel
    family-names: Edelmann
  - given-names: Alexander
    family-names: von Bank
  - given-names: Eric
    family-names: Müller
  - given-names: Laurent
    family-names: Schmalen
  - given-names: Johannes
    family-names: Schemmel
identifiers:
  - type: doi
    value: 10.48550/arXiv.2412.03129
    description: Preprint
repository-code: 'https://github.com/imdd-task/imdd-task'
url: 'https://github.com/imdd-task/imdd-task'
abstract: >-
  Spiking neural networks (SNNs) emulated on dedicated
  neuromorphic accelerators promise to offer
  energy-efficient signal processing. However, the
  neuromorphic advantage over traditional algorithms still
  remains to be demonstrated in real-world applications.
  Here, we describe an intensity-modulation,
  direct-detection (IM/DD) task that is relevant to
  high-speed optical communication systems used in data
  centers. Compared to other machine learning-inspired
  benchmarks, the task offers several advantages. First, the
  dataset is inherently time-dependent, i.e., there is a
  time dimension that can be natively mapped to the dynamic
  evolution of SNNs. Second, small-scale SNNs can achieve
  the target accuracy required by technical communication
  standards. Third, due to the small scale and the defined
  target accuracy, the task facilitates the optimization for
  real-world aspects, such as energy efficiency, resource
  requirements, and system complexity.
keywords:
  - dataset
  - benchmark
  - neuromorphic
  - event-based
  - optical communication
version: 0.0.1
date-released: '2024-12-04'

GitHub Events

Total

Watch event: 3
Push event: 17
Pull request event: 1
Fork event: 1
Create event: 2

Last Year

Watch event: 3
Push event: 17
Pull request event: 1
Fork event: 1
Create event: 2

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 0
Total pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

Pull Request Authors

AlexandervonBank (1)

Top Labels

Issue Labels

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- pypi 21 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 1
Total maintainers: 1

pypi.org: imdd-task

The IMDD Task package provides an IM/DD channel implementation and two predefined dataset for training demappers / equalizers

Homepage: https://github.com/imdd-task/imdd-task
Documentation: https://imdd-task.readthedocs.io/
Latest release: 0.0.1
published about 1 year ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 21 Last month

Rankings

Dependent packages count: 9.9%

Forks count: 31.8%

Average: 34.8%

Stargazers count: 41.5%

Dependent repos count: 55.9%

Maintainers (1)

muffgaga

Last synced: 6 months ago

imdd-task

Science Score: 57.0%

Keywords

Repository

Basic Info

Statistics

Topics

Metadata Files

README.md

Short-reach Optical Communications Datasets

Communications

Intensity modulation-direct detection (IM/DD) Link

Tasks

LCD-Task

Parameters

SSMF-Task

Parameters

Evaluation

Using the Dataset

Dataset

Data loader

Parameterization

Link model

Dataset

Citation

References

Owner

Citation (CITATION.cff)

GitHub Events

Total

Last Year

Issues and Pull Requests

All Time

Past Year

Top Authors

Issue Authors

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels

Packages

pypi.org: imdd-task

Rankings

Maintainers (1)

Dependencies