MininetGym
A Live Demonstration of RL-Based Cybersecurity Training
Salvo Finistrella · Stefano Mariani · Franco Zambonelli
DISMI — University of Modena and Reggio Emilia
AAMAS 2026 · Paphos, Cyprus
SDN Emulation
Reinforcement Learning
Cybersecurity
Multi-Agent
System Architecture
Web Dashboard: Flask · SocketIO · Chart.js · Bootstrap
↓
Agent Manager: Training · Evaluation · Metrics · PDF Export
↓
Tabular Agents
Q-Learning · SARSA
Deep RL (SB3)
DQN · PPO · A2C
↓
Gymnasium Environments: Classification · Attack-Net · Attack-PerHost · MARL
↓
Mininet SDN Network: OVS Switch · Hosts · IoT Devices · OpenFlow Monitor
↓
OpenDaylight
SDN Controller · Drop Rules
Attack Generator
UDP/TCP/ICMP Flood · Slowloris · SYN Flood
RL Cybersecurity Scenarios
Scenario 1
Traffic Classification
Classify live traffic: None · Ping · UDP · TCP
Obs: ℝ⁴ global aggregates · Actions: {0, 1, 2, 3}
Reward: graded by distance from correct label
Alert only
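The graded reward of Scenario 1 can be illustrated with a short sketch. The slide only states that the reward is graded by distance from the correct label; the linear shape, the [-1, 1] range, and the function name `graded_reward` are assumptions made here for illustration.

```python
# Label indices follow the slide: 0=None, 1=Ping, 2=UDP, 3=TCP.
LABELS = ("None", "Ping", "UDP", "TCP")


def graded_reward(action: int, true_label: int, n_classes: int = len(LABELS)) -> float:
    """Hypothetical graded reward (assumed shape, not taken from the slides).

    Returns +1.0 for the correct label and decreases linearly to -1.0
    at the largest possible index distance.
    """
    if not (0 <= action < n_classes and 0 <= true_label < n_classes):
        raise ValueError("action and true_label must be valid class indices")
    distance = abs(action - true_label)
    max_distance = n_classes - 1
    return 1.0 - 2.0 * distance / max_distance


if __name__ == "__main__":
    # Reward for each possible action when the live traffic is UDP.
    for a, name in enumerate(LABELS):
        print(name, graded_reward(a, true_label=2))
```

With this shape, a near miss still earns a small positive reward while the most distant label is penalised most.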
Scenario 2
Network-Level Attack Detection
Binary: Normal vs. Attack at global network level
Obs: ℝ⁴ (pkts, Δpkts%, bytes, Δbytes%)
Actions: {Normal, Attack}
Alert only
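A minimal sketch of how the four-element observation of Scenario 2 could be assembled from two consecutive monitoring intervals. The `Counters` type, the helper names, and the convention of reporting 0% when the previous interval is empty are assumptions; the slide only lists the four components.

```python
from typing import List, NamedTuple


class Counters(NamedTuple):
    """Traffic seen during one monitoring interval (hypothetical container)."""
    packets: int
    n_bytes: int


def pct_change(current: float, previous: float) -> float:
    """Percentage change versus the previous interval (0.0 if previous is 0, by assumption)."""
    if previous == 0:
        return 0.0
    return 100.0 * (current - previous) / previous


def build_observation(current: Counters, previous: Counters) -> List[float]:
    """Return [pkts, delta_pkts_pct, bytes, delta_bytes_pct] in the order shown on the slide."""
    return [
        float(current.packets),
        pct_change(current.packets, previous.packets),
        float(current.n_bytes),
        pct_change(current.n_bytes, previous.n_bytes),
    ]


if __name__ == "__main__":
    print(build_observation(Counters(150, 3000), Counters(100, 2000)))
```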
Scenario 3 · Primary
Per-Host Attack Detection
Per-host: Normal / Victim / Attacker
Obs: ℝ⁸ per host (TX/RX packets and bytes, each with its Δ)
PerHostScanWrapper — constant obs size
SDN Link Blocking
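One way to keep the observation size constant regardless of the number of hosts is to scan hosts one at a time. The sketch below illustrates that idea only; it is not the project's `PerHostScanWrapper`, and the class name `RoundRobinHostScanner`, the round-robin order, and the stats layout are assumptions.

```python
from typing import Dict, List, Tuple

FEATURES_PER_HOST = 8  # TX/RX packets and bytes plus their deltas, as on the slide


class RoundRobinHostScanner:
    """Hypothetical scanner: exposes exactly one host's 8 features per step."""

    def __init__(self, hosts: List[str]) -> None:
        if not hosts:
            raise ValueError("at least one host is required")
        self._hosts = list(hosts)
        self._cursor = 0

    def next_observation(self, stats: Dict[str, List[float]]) -> Tuple[str, List[float]]:
        """Return (host, features) for the current host and advance the cursor."""
        host = self._hosts[self._cursor]
        features = stats[host]
        if len(features) != FEATURES_PER_HOST:
            raise ValueError(f"expected {FEATURES_PER_HOST} features for {host}")
        # Wrap around so the scan restarts after the last host.
        self._cursor = (self._cursor + 1) % len(self._hosts)
        return host, list(features)


if __name__ == "__main__":
    hosts = ["h1", "h2", "h3"]
    stats = {h: [float(i)] * FEATURES_PER_HOST for i, h in enumerate(hosts)}
    scanner = RoundRobinHostScanner(hosts)
    for _ in range(4):
        print(scanner.next_observation(stats))
```

Because the agent always sees an 8-element vector, the same policy can be reused when hosts are added to or removed from the topology.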
Scenario 4
Multi-Agent Hierarchical (MARL)
Coordinator + Host agents with message-bus comms
Coordinator: ℝ⁵ · Host: ℝ⁹ · Parallel threads
Distributed MARL
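The coordinator and host agents communicating over a message bus from parallel threads can be sketched with the standard library. Everything below is illustrative: a `queue.Queue` stands in for the message bus, a fixed threshold stands in for each host agent's learned policy, and the names `host_agent` and `run_round` are hypothetical.

```python
import queue
import threading
from typing import Dict


def host_agent(host: str, rx_packets: int, threshold: int, bus: queue.Queue) -> None:
    """Hypothetical host agent: publishes its local verdict on the bus.

    A fixed threshold replaces the learned policy purely for illustration.
    """
    verdict = "Victim" if rx_packets > threshold else "Normal"
    bus.put((host, verdict))


def run_round(traffic: Dict[str, int], threshold: int = 1000) -> Dict[str, str]:
    """Run one decision round: host agents in parallel threads, then the coordinator."""
    bus: queue.Queue = queue.Queue()
    threads = [
        threading.Thread(target=host_agent, args=(host, rx, threshold, bus))
        for host, rx in traffic.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Coordinator side: drain the bus once every host agent has reported.
    verdicts: Dict[str, str] = {}
    while not bus.empty():
        host, verdict = bus.get()
        verdicts[host] = verdict
    return verdicts


if __name__ == "__main__":
    print(run_round({"h1": 50, "h2": 5000, "h3": 120}))
```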
Live Demo Walkthrough
1. Introduction
Open the URL on any device — laptop or mobile. Single-page app: Configuration · Training · Results.
[ Landing Page ]
2. Configuration
Set topology (hosts & IoT), choose scenario, add agents (PPO, DQN, Q-Learning), tune hyperparameters.
[ Config Panel ]
3. Experiment
Live reward and accuracy charts update via WebSocket. The host-status view shows attacks and SDN blocking. Also works on mobile.
[ Training Dashboard ]
4. Result Evaluation
Per-agent accuracy tables, confusion matrices, mitigation gauge. PDF export in one click.
[ Results Panel ]
Watch the Live Demo Video
Scan to watch on your device
youtu.be/sSzUz6w-4H8
Step 3 — Experiment in Action
MininetGym Training Dashboard
Real-time WebSocket updates
Host task monitor — attacks visible
SDN drop rule triggered
Responsive mobile UI
Step 4 — Result Evaluation
Metrics comparison chart
Radar chart — agent profiles
Mean accuracy up to 84.47%
PDF export available
Supported Agents & Key Metrics
Supported RL Agents
- Q-Learning (log-bin discretization)
- SARSA (on-policy)
- DQN — Deep Q-Network
- PPO — Proximal Policy Optimization
- A2C — Advantage Actor-Critic
- Supervised Agent (baseline)
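For the tabular agents, the sketch below shows one possible log-bin discretization of traffic counters, together with the standard Q-Learning (off-policy) and SARSA (on-policy) updates. The binning scheme (`floor(log10(x + 1))`, capped at 10 bins), the hyperparameter values, and all function names are assumptions, not the project's implementation.

```python
import math
from collections import defaultdict
from typing import DefaultDict, Sequence, Tuple

State = Tuple[int, ...]
QTable = DefaultDict[Tuple[State, int], float]


def log_bin(value: float, n_bins: int = 10) -> int:
    """Map a counter to a bin on a log10 scale (assumed scheme; negatives count as 0)."""
    value = max(value, 0.0)
    return min(int(math.log10(value + 1.0)), n_bins - 1)


def discretize(obs: Sequence[float]) -> State:
    """Turn a continuous observation into a hashable tabular state."""
    return tuple(log_bin(x) for x in obs)


def q_learning_update(q: QTable, s: State, a: int, r: float, s_next: State,
                      n_actions: int, alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Off-policy target: bootstrap from the best action in the next state."""
    best_next = max(q[(s_next, b)] for b in range(n_actions))
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def sarsa_update(q: QTable, s: State, a: int, r: float, s_next: State,
                 a_next: int, alpha: float = 0.1, gamma: float = 0.9) -> None:
    """On-policy target: bootstrap from the action actually taken next."""
    q[(s, a)] += alpha * (r + gamma * q[(s_next, a_next)] - q[(s, a)])


if __name__ == "__main__":
    q: QTable = defaultdict(float)
    s, s_next = discretize([50, 5000]), discretize([0, 50])
    q_learning_update(q, s, a=1, r=1.0, s_next=s_next, n_actions=2)
    print(s, s_next, q[(s, 1)])
```

Log-scale bins keep the table small even though packet and byte counters span several orders of magnitude during floods.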
Key Evaluation Metrics
Accuracy · Precision · Recall · F1
Mitigation Ratio: SDN blocks / attacks
False Negative Rate: penalty ×2
Attack Latency: steps to block
Best mean accuracy: 84.47%
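The evaluation metrics listed above can be computed as in the following sketch, shown for the binary Normal vs. Attack case. The function names and input layout are assumptions. The ×2 penalty on false negatives is not modeled, since the slide does not say where it is applied.

```python
from typing import Dict, Sequence


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_negative_rate": fn / (tp + fn) if (tp + fn) else 0.0,
    }


def mitigation_ratio(sdn_blocks: int, attacks: int) -> float:
    """SDN blocks divided by attacks, as defined on the slide."""
    return sdn_blocks / attacks if attacks else 0.0


def mean_attack_latency(start_steps: Sequence[int], block_steps: Sequence[int]) -> float:
    """Mean number of steps between the start of an attack and its block."""
    latencies = [block - start for start, block in zip(start_steps, block_steps)]
    return sum(latencies) / len(latencies) if latencies else 0.0


if __name__ == "__main__":
    print(detection_metrics(tp=8, fp=2, tn=88, fn=2))
    print(mitigation_ratio(sdn_blocks=3, attacks=4))
    print(mean_attack_latency([10, 20], [13, 25]))
```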
MininetGym
Open-source · Browser-accessible · Reproducible
Website · MininetGym · AAMAS 2026 · Paphos, Cyprus · github.com/dipi-unimore/mininet-gym