MininetGym

A Live Demonstration of RL-Based Cybersecurity Training

Salvo Finistrella · Stefano Mariani · Franco Zambonelli

DISMI — University of Modena and Reggio Emilia

AAMAS 2026 · Paphos, Cyprus

SDN Emulation · Reinforcement Learning · Cybersecurity · Multi-Agent

System Architecture

Web Dashboard
Flask · SocketIO · Chart.js · Bootstrap

↓

Agent Manager
Training · Evaluation · Metrics · PDF Export

↓

Tabular Agents
Q-Learning · SARSA

Deep RL (SB3)
DQN · PPO · A2C

↓

Gymnasium Environments
Classification · Attack-Net · Attack-PerHost · MARL

↓

Mininet SDN Network
OVS Switch · Hosts · IoT Devices · OpenFlow Monitor

↓

OpenDayLight
SDN Controller · Drop Rules

Attack Generator
UDP/TCP/ICMP Flood · Slowloris · SYN Flood

RL Cybersecurity Scenarios

Scenario 1

Traffic Classification

Classify live traffic: None · Ping · UDP · TCP
Obs: ℝ⁴ global aggregates · Actions: {0,1,2,3}
Reward: graded by distance from correct label

Alert only
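
A graded reward of this kind can be sketched as follows. This is only an illustration under stated assumptions: the function name `graded_reward`, the linear scale, and the range from 1.0 down to -1.0 are hypothetical and are not taken from the MininetGym source; only the four class labels come from the scenario description.

```python
# Class order as listed in the scenario description.
LABELS = ("None", "Ping", "UDP", "TCP")


def graded_reward(action: int, label: int, n_classes: int = len(LABELS)) -> float:
    """Hypothetical graded reward.

    Returns 1.0 for the correct label and decreases linearly with the
    index distance, reaching -1.0 at the maximum possible distance.
    The linear scale is an assumption made for this sketch.
    """
    if not (0 <= action < n_classes and 0 <= label < n_classes):
        raise ValueError("action and label must be valid class indices")
    distance = abs(action - label)
    return 1.0 - 2.0 * distance / (n_classes - 1)


if __name__ == "__main__":
    for guess in range(len(LABELS)):
        print(LABELS[guess], "vs UDP:", graded_reward(guess, LABELS.index("UDP")))
```

Whether index distance is a meaningful notion of "closeness" depends on how the labels are ordered, so the ordering itself is a design choice.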

Scenario 2

Network-Level Attack Detection

Binary: Normal vs. Attack at global network level
Obs: ℝ⁴ (pkts, Δpkts%, bytes, Δbytes%)
Actions: {Normal, Attack}

Alert only
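
The four-value observation could be assembled from two consecutive counter samples roughly as below. This is a minimal sketch: the `Counters` type, the helper names, and the convention used when the previous sample is zero are assumptions for illustration, and the counters are assumed to be per-interval totals rather than cumulative ones.

```python
from typing import NamedTuple


class Counters(NamedTuple):
    """Hypothetical per-interval network totals."""
    packets: int
    byte_count: int


def pct_change(current: float, previous: float) -> float:
    """Percentage change from previous to current.

    When the previous value is zero the change is undefined; this sketch
    returns 0.0 if both are zero and 100.0 otherwise (an assumption).
    """
    if previous == 0:
        return 0.0 if current == 0 else 100.0
    return 100.0 * (current - previous) / previous


def build_observation(prev: Counters, curr: Counters) -> list:
    """Return [pkts, delta_pkts_pct, bytes, delta_bytes_pct]."""
    return [
        float(curr.packets),
        pct_change(curr.packets, prev.packets),
        float(curr.byte_count),
        pct_change(curr.byte_count, prev.byte_count),
    ]


if __name__ == "__main__":
    print(build_observation(Counters(100, 1000), Counters(150, 500)))
```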

Scenario 3 · Primary

Per-Host Attack Detection

Per-host: Normal / Victim / Attacker
Obs: ℝ⁸ per host (TX+RX pkt/byte+Δ)
PerHostScanWrapper — constant obs size

SDN Link Blocking
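
One plausible way to keep the observation size constant regardless of how many hosts the topology has is to present one host per step in round-robin order. The sketch below illustrates that idea only; `RoundRobinHostScanner` is a hypothetical class written for this example and is not the project's `PerHostScanWrapper`, whose actual behavior may differ.

```python
from itertools import cycle

# TX/RX packets and bytes plus their deltas, per the scenario description.
FEATURES_PER_HOST = 8


class RoundRobinHostScanner:
    """Hypothetical scanner that yields one host's 8 features per step."""

    def __init__(self, host_names):
        names = list(host_names)
        if not names:
            raise ValueError("at least one host is required")
        self._order = cycle(names)

    def next_observation(self, stats):
        """Return (host, features) for the next host in the scan order.

        `stats` maps each host name to its 8 numeric features.
        """
        host = next(self._order)
        features = list(stats[host])
        if len(features) != FEATURES_PER_HOST:
            raise ValueError(f"expected {FEATURES_PER_HOST} features for {host}")
        return host, features


if __name__ == "__main__":
    scanner = RoundRobinHostScanner(["h1", "h2", "h3"])
    stats = {name: [0.0] * FEATURES_PER_HOST for name in ("h1", "h2", "h3")}
    for _ in range(4):
        print(scanner.next_observation(stats)[0])
```

With this design the agent's input stays in ℝ⁸ whether the network has three hosts or thirty, at the cost of needing one step per host to cover the whole topology.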

Scenario 4

Multi-Agent Hierarchical (MARL)

Coordinator + Host agents with message-bus comms
Coordinator: ℝ⁵ · Host: ℝ⁹ · Parallel threads

Distributed MARL
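
The coordinator and host agents exchanging messages from parallel threads can be pictured with a thread-safe queue acting as the bus. This is a sketch under assumptions: the function names are hypothetical, and the fixed 0.5 threshold is a stand-in for a learned host policy, not the method used in MininetGym.

```python
import queue
import threading


def host_agent(name, local_score, bus):
    """Hypothetical host agent: posts its local verdict to the shared bus.

    The threshold rule stands in for a trained policy.
    """
    verdict = "attack" if local_score > 0.5 else "normal"
    bus.put((name, verdict))


def coordinate(host_scores):
    """Run one host agent per thread and collect their reports."""
    bus = queue.Queue()
    threads = [
        threading.Thread(target=host_agent, args=(name, score, bus))
        for name, score in host_scores.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # all messages are on the bus once every thread has finished

    reports = {}
    while not bus.empty():
        name, verdict = bus.get()
        reports[name] = verdict
    return reports


if __name__ == "__main__":
    print(coordinate({"h1": 0.9, "h2": 0.1, "h3": 0.7}))
```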

Live Demo Walkthrough

1. Introduction

Open the URL on any device — laptop or mobile. Single-page app: Configuration · Training · Results.

2. Configuration

Set topology (hosts & IoT), choose scenario, add agents (PPO, DQN, Q-Learning), tune hyperparameters.

3. Experiment

Live reward and accuracy charts update via WebSocket. The host-status panel shows ongoing attacks and SDN blocking. The dashboard also works on mobile.

4. Result Evaluation

Per-agent accuracy tables, confusion matrices, mitigation gauge. PDF export in one click.

Watch the Live Demo Video


youtu.be/sSzUz6w-4H8

▶ Watch here

Step 3 — Experiment in Action

MininetGym Training Dashboard

Real-time WebSocket updates · Host task monitor (attacks visible) · SDN drop rule triggered · Responsive mobile UI

Step 4 — Result Evaluation

MininetGym Results Panel

Metrics comparison chart · Radar chart (agent profiles) · Mean accuracy up to 84.47% · PDF export available

Supported Agents & Key Metrics

Supported RL Agents

Q-Learning (log-bin discretization)
SARSA (on-policy)
DQN — Deep Q-Network
PPO — Proximal Policy Optimization
A2C — Advantage Actor-Critic
Supervised Agent (baseline)
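
Tabular agents need discrete states, which is where log-bin discretization comes in: traffic counters span several orders of magnitude, so one bin per power of ten keeps the table small. The sketch below is illustrative only; the bin layout, the function names, and the hyperparameter defaults are assumptions, and discarding the sign of negative values is a simplification of this example.

```python
from collections import defaultdict


def log_bin(value: float, n_bins: int = 10, base: float = 10.0) -> int:
    """Bin 0 holds values below 1; each later bin covers one power of `base`.

    The last bin absorbs everything beyond the covered range.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    bin_index = 0
    threshold = 1.0
    while value >= threshold and bin_index < n_bins - 1:
        bin_index += 1
        threshold *= base
    return bin_index


def discretize(observation, n_bins: int = 10):
    """Turn a continuous observation into a hashable state tuple.

    The sign of each feature is discarded in this sketch.
    """
    return tuple(log_bin(abs(x), n_bins) for x in observation)


def q_learning_update(q, state, action, reward, next_state, n_actions,
                      alpha=0.1, gamma=0.99):
    """Standard off-policy Q-learning update on a dict-backed Q-table."""
    best_next = max(q[(next_state, a)] for a in range(n_actions))
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


if __name__ == "__main__":
    q_table = defaultdict(float)
    state = discretize([150, 50, 5000, 0.2])
    q_learning_update(q_table, state, action=1, reward=1.0,
                      next_state=state, n_actions=2)
    print(state, q_table[(state, 1)])
```

SARSA would differ only in the target: it uses the Q-value of the action actually taken in the next state instead of the maximum.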

Key Evaluation Metrics

Accuracy · Precision · Recall · F1

Mitigation Ratio: SDN blocks / attacks

False Negative Rate: penalty ×2

Attack Latency: steps to block

Best mean accuracy: 84.47%
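
For reference, the classification metrics and the mitigation ratio listed above follow the usual definitions and can be computed from confusion counts as sketched here. The function names are hypothetical, returning 0.0 for an empty denominator is an assumption of this example, and the ×2 false-negative penalty is not modeled because the poster does not specify how it is applied.

```python
def _ratio(numerator: float, denominator: float) -> float:
    """Safe division: returns 0.0 when the denominator is zero (assumption)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary metrics, with 'attack' as the positive class."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
        "false_negative_rate": _ratio(fn, fn + tp),
    }


def mitigation_ratio(sdn_blocks: int, attacks: int) -> float:
    """SDN blocks divided by attacks, as defined in the metrics list."""
    return _ratio(sdn_blocks, attacks)


if __name__ == "__main__":
    print(classification_metrics(tp=8, fp=2, tn=8, fn=2))
    print(mitigation_ratio(sdn_blocks=3, attacks=4))
```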

MininetGym

Open-source · Browser-accessible · Reproducible

GitHub

github.com/dipi-unimore/mininet-gym

Contact

salvo.finistrella@unimore.it

Personal website

finix77.github.io
