MininetGym
A Live Demonstration of RL-Based Cybersecurity Training
Salvo Finistrella  ·  Stefano Mariani  ·  Franco Zambonelli
DISMI — University of Modena and Reggio Emilia
AAMAS 2026  ·  Paphos, Cyprus
SDN Emulation · Reinforcement Learning · Cybersecurity · Multi-Agent
System Architecture
Web Dashboard
Flask · SocketIO · Chart.js · Bootstrap
Agent Manager
Training · Evaluation · Metrics · PDF Export
Tabular Agents
Q-Learning · SARSA
Deep RL (SB3)
DQN · PPO · A2C
Gymnasium Environments
Classification · Attack-Net · Attack-PerHost · MARL
Mininet SDN Network
OVS Switch · Hosts · IoT Devices · OpenFlow Monitor
OpenDaylight
SDN Controller · Drop Rules
Attack Generator
UDP/TCP/ICMP Flood · Slowloris · SYN Flood
RL Cybersecurity Scenarios
Scenario 1
Traffic Classification
Classify live traffic: None · Ping · UDP · TCP
Obs: ℝ⁴ global aggregates · Actions: {0,1,2,3}
Reward: graded by distance from correct label
Alert only
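The graded reward of Scenario 1 can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `graded_reward`, the integer encoding of the labels, and the linear scaling to [-1, 1] are all assumptions. The slide states only that the reward depends on the distance from the correct label.

```python
# Hypothetical label encoding, following the order in which the slide lists the classes.
LABELS = ("None", "Ping", "UDP", "TCP")


def graded_reward(action: int, true_label: int, n_classes: int = len(LABELS)) -> float:
    """Return 1.0 for a correct guess, decreasing linearly to -1.0 at maximum distance.

    The linear scale is an assumption made for illustration only.
    """
    if not (0 <= action < n_classes and 0 <= true_label < n_classes):
        raise ValueError("action and true_label must be valid class indices")
    distance = abs(action - true_label)
    max_distance = n_classes - 1
    return 1.0 - 2.0 * distance / max_distance
```

Under this assumed scale, a near miss (distance 1) is rewarded more than a far miss (distance 3), which is the property the slide describes.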
Scenario 2
Network-Level Attack Detection
Binary: Normal vs. Attack at global network level
Obs: ℝ⁴ (pkts, Δpkts%, bytes, Δbytes%)
Actions: {Normal, Attack}
Alert only
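A minimal sketch of how the four-dimensional observation (pkts, Δpkts%, bytes, Δbytes%) could be assembled from two consecutive monitoring samples. The helper names, the dictionary keys, and the choice to report 0.0 when the previous sample is zero are assumptions for illustration. The slide does not say how the percentage deltas are computed.

```python
def percent_change(current: float, previous: float) -> float:
    """Percentage change from previous to current; 0.0 when there is no baseline."""
    if previous == 0:
        return 0.0
    return 100.0 * (current - previous) / previous


def build_observation(prev: dict, cur: dict) -> list[float]:
    """Build [pkts, dpkts%, bytes, dbytes%] from two consecutive counter samples.

    `prev` and `cur` are hypothetical per-interval samples with keys "pkts" and "bytes".
    """
    return [
        float(cur["pkts"]),
        percent_change(cur["pkts"], prev["pkts"]),
        float(cur["bytes"]),
        percent_change(cur["bytes"], prev["bytes"]),
    ]
```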
Scenario 3 · Primary
Per-Host Attack Detection
Per-host: Normal / Victim / Attacker
Obs: ℝ⁸ per host (TX and RX packet and byte counts, plus their Δ)
PerHostScanWrapper — constant obs size
SDN Link Blocking
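One way to keep the observation size independent of the number of hosts is to present one host per step, in round-robin order. The sketch below illustrates that idea only. `RoundRobinHostScanner` and `read_stats` are hypothetical names, and the slide does not confirm that PerHostScanWrapper works this way.

```python
from itertools import cycle
from typing import Callable, Iterator, Sequence


class RoundRobinHostScanner:
    """Yield one fixed-length observation per step, cycling over hosts.

    Hypothetical helper for illustration; not the project's PerHostScanWrapper.
    """

    OBS_SIZE = 8  # per-host feature count stated on the slide

    def __init__(self, hosts: Sequence[str], read_stats: Callable[[str], Sequence[float]]):
        if not hosts:
            raise ValueError("at least one host is required")
        self._hosts: Iterator[str] = cycle(hosts)
        self._read_stats = read_stats

    def next_observation(self) -> tuple[str, list[float]]:
        """Return (host, observation) for the next host in the scan order."""
        host = next(self._hosts)
        obs = list(self._read_stats(host))
        if len(obs) != self.OBS_SIZE:
            raise ValueError(f"expected {self.OBS_SIZE} features, got {len(obs)}")
        return host, obs
```

With this design the agent's input stays at 8 features whether the topology has 2 hosts or 20. The cost is that a full scan of the network takes as many steps as there are hosts.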
Scenario 4
Multi-Agent Hierarchical (MARL)
Coordinator + Host agents with message-bus comms
Coordinator: ℝ⁵ · Host: ℝ⁹ · Parallel threads
Distributed MARL
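The coordinator/host pattern with a message bus and parallel threads can be sketched with the Python standard library. This is an illustrative assumption, not the project's code: `host_agent`, `run_round`, the message format, and the placeholder decision rule are all hypothetical.

```python
import queue
import threading


def host_agent(host_id: int, bus: "queue.Queue[tuple[int, str]]") -> None:
    """Hypothetical host agent: publish one local verdict to the shared bus."""
    verdict = "attack" if host_id % 2 else "normal"  # placeholder decision rule
    bus.put((host_id, verdict))


def run_round(n_hosts: int) -> dict[int, str]:
    """Run host agents in parallel threads, then let the coordinator collect reports."""
    bus: "queue.Queue[tuple[int, str]]" = queue.Queue()
    threads = [threading.Thread(target=host_agent, args=(i, bus)) for i in range(n_hosts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Coordinator side: drain the bus once every host thread has finished.
    reports: dict[int, str] = {}
    while not bus.empty():
        host_id, verdict = bus.get()
        reports[host_id] = verdict
    return reports
```

`queue.Queue` is thread-safe, so host threads can publish concurrently without extra locking. A real coordinator would feed the collected reports into its own policy instead of returning them.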
Live Demo Walkthrough
1. Introduction
Open the URL on any device, laptop or mobile. The dashboard is a single-page app with three views: Configuration · Training · Results.
2. Configuration
Set the topology (hosts and IoT devices), choose a scenario, add agents (PPO, DQN, Q-Learning), and tune hyperparameters.
3. Experiment
Reward and accuracy charts update live over WebSocket. The host-status panel shows attacks and SDN blocking. Also works on mobile.
4. Result Evaluation
Per-agent accuracy tables, confusion matrices, and a mitigation gauge. PDF export in one click.
Watch the Live Demo Video
youtu.be/sSzUz6w-4H8
Step 3 — Experiment in Action
[Screenshots: MininetGym training dashboard, desktop and mobile views]
Real-time WebSocket updates · Host task monitor (attacks visible) · SDN drop rule triggered · Responsive mobile UI
Step 4 — Result Evaluation
[Screenshots: MininetGym results panel, desktop and mobile views]
Metrics comparison chart · Radar chart of agent profiles · Mean accuracy up to 84.47% · PDF export available
Supported Agents & Key Metrics
Supported RL Agents
  • Q-Learning (log-bin discretization)
  • SARSA (on-policy)
  • DQN — Deep Q-Network
  • PPO — Proximal Policy Optimization
  • A2C — Advantage Actor-Critic
  • Supervised Agent (baseline)
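For the tabular agents, log-bin discretization maps unbounded traffic counters to a small set of states. The sketch below is a minimal illustration under stated assumptions: base-10 bins, eight bins, and the helper names `log_bin` and `q_update` are hypothetical, and the update shown is the standard Q-learning rule, not a confirmed excerpt from the project.

```python
import math
from collections import defaultdict


def log_bin(value: float, n_bins: int = 8) -> int:
    """Map a non-negative counter to a bin index that grows with log10(value + 1)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return min(int(math.log10(value + 1.0)), n_bins - 1)


def q_update(q, state, action, reward, next_state, n_actions, alpha=0.1, gamma=0.99):
    """Standard Q-learning update on a table keyed by (state, action)."""
    best_next = max(q[(next_state, a)] for a in range(n_actions))
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


# Usage: a state is a tuple of binned features (here packets and bytes, as an example).
q_table = defaultdict(float)
state = (log_bin(50), log_bin(5000))
q_update(q_table, state, action=1, reward=1.0, next_state=state, n_actions=2)
```

Logarithmic bins keep the table small while still separating idle traffic from floods, whose counters differ by orders of magnitude.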
Key Evaluation Metrics
Accuracy · Precision · Recall · F1
Mitigation Ratio: SDN blocks / attacks
False Negative Rate: penalty ×2
Attack Latency: steps to block
Best mean accuracy: 84.47%
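The classification metrics and the mitigation ratio listed above follow standard definitions, sketched here from confusion-matrix counts. The function names and the convention of returning 0.0 for an empty denominator are assumptions. The ×2 false-negative penalty is not modeled, because the slide does not specify how it is applied.

```python
def _ratio(numerator: float, denominator: float) -> float:
    """Safe division: 0.0 when the denominator is zero (an assumed convention)."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Accuracy, precision, recall, F1, and false negative rate from raw counts."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
        "fnr": _ratio(fn, fn + tp),
    }


def mitigation_ratio(sdn_blocks: int, attacks: int) -> float:
    """Fraction of attacks that ended with an SDN block, as defined on the slide."""
    return _ratio(sdn_blocks, attacks)
```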
MininetGym
Open-source · Browser-accessible · Reproducible
GitHub
github.com/dipi-unimore/mininet-gym
Contact
salvo.finistrella@unimore.it
Personal website