NFVdeep: Deep Reinforcement Learning for Online Orchestration of Service Function Chains

https://github.com/cn-upb/nfvdeep

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: acm.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.6%) to scientific vocabulary

Keywords

deep-learning machine-learning networking nfv orchestration python python3 reinforcement-learning sfc stable-baselines virtualization vnf-management vnf-placement
Last synced: 5 months ago

Repository

NFVdeep: Deep Reinforcement Learning for Online Orchestration of Service Function Chains

Basic Info
  • Host: GitHub
  • Owner: CN-UPB
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 420 KB
Statistics
  • Stars: 31
  • Watchers: 6
  • Forks: 13
  • Open Issues: 0
  • Releases: 0
Topics
deep-learning machine-learning networking nfv orchestration python python3 reinforcement-learning sfc stable-baselines virtualization vnf-management vnf-placement
Created over 5 years ago · Last pushed over 3 years ago
Metadata Files
Readme License

README.md


NFVdeep

Deep Reinforcement Learning for Online Orchestration of Service Function Chains

Disclaimer: This is an unofficial implementation that attempts to reproduce the deep reinforcement learning approach described in the NFVdeep paper by Xiao et al. as part of a graduate student project. While the implemented agent did learn over time, we were not able to reproduce the results reported in the paper.

Advisor: Stefan Schneider

Developers: Nils Rudminat, Stefan Werner

Setup

Assuming an Anaconda (version 4.8.4) distribution has already been installed on an Ubuntu 18.04 machine, the environment can be created via `conda env create -f environment.yml`. Depending on your system's setup, installing additional packages for Ray Tune and the applied TensorFlow version might be necessary.

Experiments

The `script.py` file serves as an interface for running either baseline or DRL agents on the NFVdeep environment with their default parameterization, i.e. without employing hyperparameter optimization. Here, you can specify the overlay topology and the network's resources, as well as properties of the arrival process. For instance, we may train a Stable Baselines PPO DRL agent on the Abilene network with incoming requests arising from a Poisson process by executing:

```console
python script.py --agent PPO --overlay <data path>/abilene.gpickle --requests <request path>/requests.json --output <output path> --logs <log path>
```
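The overlay topology and request workload are plain files, so they can be inspected independently of the training scripts. The sketch below is only illustrative: the file paths are placeholders for the `<data path>`/`<request path>` arguments above, and it assumes the `.gpickle` file is a pickled NetworkX graph and `requests.json` is a regular JSON document; the exact schema may differ from this repository's files.

```python
import json
import pickle

# Load the overlay topology (a .gpickle file is a pickled NetworkX graph;
# with networkx < 3.0 you could equivalently use nx.read_gpickle).
with open("data/abilene.gpickle", "rb") as f:   # example path, adjust to your setup
    overlay = pickle.load(f)
print(f"Overlay: {overlay.number_of_nodes()} nodes, {overlay.number_of_edges()} links")

# Inspect the request specification that drives the arrival process.
with open("data/requests.json") as f:           # example path
    requests = json.load(f)
print(f"Loaded request specification with {len(requests)} top-level entries")
```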

Hyperparameter Optimization

We employ distributed (single node) Bayesian Optimization with BoTorch and Ray Tune in order to facilitate scalable hyperparameter optimization for our reinforcement learning agent. Specifically, we first specify a parameter search space from which agent configurations are sampled and subsequently evaluated. Here, `tune.py` provides an interface to our implementation's tuned DRL agents. Note, however, that absolute paths must be used, for instance by executing:

```console
python tune.py --agent PPO --overlay <abs data path>/abilene.gpickle --requests <abs request path>/requests.json --output <abs output path> --logs <abs log path>
```
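For orientation, the sample-then-evaluate loop can be sketched with plain Ray Tune primitives as below. This is an illustrative outline against an older Ray Tune API (`tune.run`/`tune.report`), not a reflection of `tune.py`: it omits the BoTorch searcher wiring, and the placeholder trainable, parameter names and ranges are assumptions.

```python
from ray import tune

def train_agent(config):
    """Placeholder trainable: build an agent from `config`, train it on the
    NFVdeep environment, and report its reward (training code omitted here)."""
    # Stand-in reward so the example runs; a real trainable would evaluate the agent.
    episode_reward = -abs(config["learning_rate"] - 3e-4)
    tune.report(episode_reward=episode_reward)

# Parameter search space from which agent configurations are sampled.
search_space = {
    "learning_rate": tune.loguniform(1e-5, 1e-3),
    "gamma": tune.uniform(0.9, 0.999),
}

analysis = tune.run(
    train_agent,
    config=search_space,
    num_samples=20,            # number of sampled configurations
    metric="episode_reward",
    mode="max",
)
print(analysis.best_config)
```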

Retrieving Placement Decisions

The placement decisions for VNFs of arriving service requests are automatically tabulated in the `placements.txt` file (shown below) under the `--output` path, whereas recorded monitoring metrics such as the obtained reward or acceptance rate are logged to `results.csv`. For each episode, trial and arriving service function request, we tabulate its arrival time, time-to-live, bandwidth demand, maximum end-to-end latency, requested VNFs (CPUs & memory) as well as the list of taken placement decisions (node indices). If the list of placements is empty, the request was not embedded into the substrate network and NFVdeep used its built-in backtracking mechanism to release bound resources.

````
Episode  Trial  Arrival  TTL  Bandwidth  Max Latency  VNFs (CPUs & memory)     Placements
      0      0     4495   73    71.2339        10000  [(7, 7.52), (10, 5.69)]  [0, 2]
````
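Assuming the whitespace-separated layout shown above, such a log can be post-processed easily, e.g. to estimate the acceptance rate (requests with a non-empty placement list). The sketch below is not part of the repository; the output path is an example for the `--output` directory.

```python
import re

accepted, total = 0, 0
with open("output/placements.txt") as f:        # example path under --output
    for line in f:
        # Each data row ends with a bracketed list of placement node indices;
        # an empty list "[]" means the request was rejected and rolled back.
        match = re.search(r"\[([^\[\]]*)\]\s*$", line)
        if match is None:
            continue                            # skip header and blank lines
        total += 1
        if match.group(1).strip():
            accepted += 1

if total:
    print(f"Acceptance rate: {accepted / total:.2%} ({accepted}/{total} requests)")
```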

Experimental Study

Our evaluation is primarily based on generating arrival times with respect to a Poisson process (exponential inter-arrival and service times) and is only loosely based on the evaluation proposed in the original NFVdeep paper. The load of individual SFCs and VNFs is uniformly sampled within the bounds specified in the respective `requests.json` files. All results simulate the SFC embedding problem on the real-world Abilene network topology.
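As a concrete illustration of such an arrival process: inter-arrival times of a Poisson process with rate λ are exponentially distributed with mean 1/λ, and exponential service times give each request its time-to-live. The rates below are arbitrary examples, unrelated to the values in the repository's `requests.json` files.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

rate = 0.05          # example: on average one request per 20 time units
num_requests = 10

# Exponential inter-arrival times => Poisson arrival process.
inter_arrival = rng.exponential(scale=1.0 / rate, size=num_requests)
arrival_times = np.cumsum(inter_arrival)

# Exponential service times model each request's time-to-live (TTL).
ttl = rng.exponential(scale=60.0, size=num_requests)   # mean service time of 60 is illustrative

for t, life in zip(arrival_times, ttl):
    print(f"arrival={t:8.1f}  ttl={life:6.1f}")
```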

Sampled Input Traffic

First, we train and evaluate under randomly sampled input traffic traces and compare both PPO and its tuned variant against two heuristic baselines, i.e. a random placement policy and a greedy first fit ('FirstFit') heuristic.
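For reference, the first fit idea can be summarized in a few lines: place each VNF of a request on the first substrate node with enough remaining CPU and memory. The following is only a schematic sketch of such a heuristic, not the repository's implementation, and the example resources are made up.

```python
def first_fit(vnfs, node_resources):
    """Schematic FirstFit: assign each VNF (cpu, memory) to the first node with
    sufficient free CPU and memory; return [] if any VNF cannot be placed."""
    remaining = {node: dict(res) for node, res in node_resources.items()}
    placements = []
    for cpu, memory in vnfs:
        for node, res in remaining.items():
            if res["cpu"] >= cpu and res["memory"] >= memory:
                res["cpu"] -= cpu
                res["memory"] -= memory
                placements.append(node)
                break
        else:
            return []   # reject the request; bound resources are (conceptually) released
    return placements

# Example: two VNFs on a three-node substrate.
nodes = {0: {"cpu": 8, "memory": 8.0},
         1: {"cpu": 4, "memory": 4.0},
         2: {"cpu": 16, "memory": 16.0}}
print(first_fit([(7, 7.52), (10, 5.69)], nodes))   # -> [0, 2]
```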

(Figure: cumulated episode reward of PPO, tuned PPO, the random policy, and FirstFit under randomly sampled input traffic)

Evidently, neither DRL agent matches the greedy baseline's performance in terms of cumulated episode reward. However, both DRL agents improve upon random placement decisions and in a few cases even achieve competitive results.

Static Input Traffic

The exogenous input process has a significant influence on an episode's trajectory, independent of the respective agent's placement decisions. Training on episodes subject to randomly generated input traffic might therefore cause high variance in the reward signal and ultimately prevent effective policy improvement. Hence, in our experiments with 'static' input we replay the same input traffic trace in every episode.
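One way to realize this, sketched below under the assumption that an episode is driven by a list of request inter-arrival times, is to generate the trace once and replay it verbatim instead of resampling it per episode; the function name and rate are illustrative only.

```python
import random

def sample_trace(seed, num_requests=100):
    """Sample one input traffic trace; with a fixed seed the trace is reproducible."""
    rng = random.Random(seed)
    return [rng.expovariate(0.05) for _ in range(num_requests)]   # inter-arrival times

# Randomly sampled traffic: every episode sees a different exogenous input.
sampled_episodes = [sample_trace(seed=episode) for episode in range(3)]

# 'Static' traffic: the same trace is replayed in every episode, so reward
# differences between episodes stem from the agent's decisions, not the input.
static_trace = sample_trace(seed=0)
static_episodes = [static_trace for _ in range(3)]
```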

(Figure: cumulated episode reward of PPO, tuned PPO, the random policy, and FirstFit under static input traffic)

In comparison to the previous evaluation setup, the (tuned) PPO policy significantly narrows the reward gap to the FirstFit baseline, even though the greedy heuristic ultimately remains superior.

Latency Constraints

Lastly, we demonstrate that NFVdeep lacks effective means to learn concepts related to latency. Specifically, the agent is not provided with information about its last placement decision and therefore cannot make an informed decision that minimizes latency. Hence, we compare the performance in two related scenarios where only the maximum latency constraints of the SFCs vary.
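To see why the previous placement matters, consider how an SFC's end-to-end latency is typically accumulated: each consecutive pair of placements contributes the latency of the path between the two chosen nodes, so the delay caused by a new placement cannot be judged without knowing where the previous VNF went. The latency values in the sketch below are made up and do not correspond to any topology in this repository.

```python
# Hypothetical pairwise path latencies between substrate nodes (symmetric, in ms).
latency = {
    (0, 0): 0.0, (0, 1): 5.0, (0, 2): 12.0,
    (1, 1): 0.0, (1, 2): 7.0, (2, 2): 0.0,
}

def path_latency(a, b):
    return latency[(min(a, b), max(a, b))]

def sfc_latency(placements):
    """End-to-end latency of a chain: sum of latencies between consecutive placements."""
    return sum(path_latency(a, b) for a, b in zip(placements, placements[1:]))

# The same choice for the second VNF has very different consequences
# depending on where the first VNF was placed.
print(sfc_latency([0, 2]))   # 12.0 ms
print(sfc_latency([1, 2]))   #  7.0 ms
```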

(Figure: cumulated episode reward of the DRL agent and FirstFit under the two maximum latency scenarios)

While the FirstFit baseline achieves similar performance in both scenarios (the imposed maximum latency does not significantly constrain the placements), we find that the DRL agent's performance deteriorates, which is consistent with the prior hypothesis.

Owner

  • Name: Computer Networks group @ UPB
  • Login: CN-UPB
  • Kind: organization
  • Location: Paderborn, Germany

Computer Networks Group of Paderborn University

GitHub Events

Total
  • Watch event: 7
  • Fork event: 1
Last Year
  • Watch event: 7
  • Fork event: 1

Dependencies

.github/workflows/python-package.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite