Optimizing Network Routing with Deep Reinforcement Learning and Graph Neural Networks
Problem Statement
Efficient routing optimization in optical networks is crucial for managing network resources. Existing Deep Reinforcement Learning (DRL) solutions fail to generalize to network topologies not seen during training, which limits their deployability in real-world scenarios. Traditional approaches such as Integer Linear Programming (ILP) are too slow for online operation, while heuristics are fast but yield suboptimal routing.
Objectives
- Develop a DRL-based agent that integrates Graph Neural Networks (GNNs) to generalize across unseen network topologies.
- Outperform state-of-the-art DRL solutions in terms of scalability, generalization, and performance on unseen topologies.
- Ensure low overhead and real-time decision-making capabilities.
Approach
- Hybrid Approach: Combined the strengths of DRL for decision-making and GNNs for handling graph-structured data.
- Designed a DRL agent with a GNN-based model for understanding the relationships between network nodes and links.
- Used a DQN (Deep Q-Network) algorithm to approximate the Q-value function, allowing the agent to learn optimal routing policies (a path-selection sketch follows this list).
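As a rough illustration of the decision step, the sketch below scores each of the k candidate paths with a learned Q-function and routes the demand along the highest-valued one. The `q_network` callable and the state/path representation are hypothetical placeholders rather than the project's exact interface.

```python
import numpy as np

def select_path(q_network, link_state, candidate_paths):
    """Greedy routing decision: evaluate the Q-value of applying each
    candidate path to the current link state and pick the best one.

    q_network       -- callable (link_state, path) -> scalar Q-value
                       (hypothetical interface; in the project this is a GNN)
    link_state      -- per-link features such as remaining capacity
    candidate_paths -- the k shortest paths for the current traffic demand
    """
    q_values = np.array([q_network(link_state, path) for path in candidate_paths])
    return candidate_paths[int(np.argmax(q_values))]
```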
Data and Features
Data: Experiments were conducted on 180 synthetic network topologies and 232 real-world topologies, ensuring a diverse range of network structures.
Features:
- Link capacity: Represents the available bandwidth for traffic demands.
- Link betweenness: Indicates the centrality of links, helping the agent prioritize paths.
- Action space: Defined by the k-shortest paths (k=4) between each source-destination node pair, from which the agent selects a route for each traffic demand (see the feature-extraction sketch below).
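As a minimal sketch of how these inputs could be derived, the snippet below uses `networkx` to build the k-shortest-path action space and the two per-link features; the `capacity` edge attribute and the toy topology are illustrative assumptions, not the project's actual data pipeline.

```python
import itertools
import networkx as nx

def k_shortest_paths(G, src, dst, k=4):
    """Enumerate the k shortest simple paths that form the action space."""
    return list(itertools.islice(nx.shortest_simple_paths(G, src, dst), k))

def link_features(G):
    """Per-link features: available capacity and edge betweenness centrality."""
    betweenness = nx.edge_betweenness_centrality(G)
    return {(u, v): {"capacity": data.get("capacity", 0.0),
                     "betweenness": betweenness[(u, v)]}
            for u, v, data in G.edges(data=True)}

# Toy usage on a 6-node ring with uniform 10-unit link capacities
G = nx.cycle_graph(6)
nx.set_edge_attributes(G, 10.0, "capacity")
actions = k_shortest_paths(G, 0, 3, k=4)   # candidate paths for demand 0 -> 3
features = link_features(G)
```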
Model Architecture
- Graph Neural Network: Implemented using Message-Passing Neural Networks (MPNN), which propagate information between links in the network topology.
- Deep Reinforcement Learning: The agent uses DQN with the GNN as its Q-value approximator. The GNN captures the topological information of the network state, while the DQN logic selects the best action based on the learned Q-values (see the sketch after this list).
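The sketch below shows one way such an MPNN-based Q-value estimator can be wired up in PyTorch: link embeddings, a fixed number of message-passing steps with a GRU update, and a sum readout into a scalar Q-value. Layer sizes and the exact message/update functions are illustrative assumptions, not the project's published architecture.

```python
import torch
import torch.nn as nn

class MPNNQNetwork(nn.Module):
    """Message passing over link hidden states followed by a Q-value readout."""

    def __init__(self, feat_dim=2, hidden_dim=32, msg_steps=7):
        super().__init__()
        self.msg_steps = msg_steps
        self.embed = nn.Linear(feat_dim, hidden_dim)           # initial link embedding
        self.message = nn.Linear(2 * hidden_dim, hidden_dim)   # message function
        self.update = nn.GRUCell(hidden_dim, hidden_dim)       # per-link state update
        self.readout = nn.Sequential(                          # graph-level Q-value
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1))

    def forward(self, link_feats, link_adjacency):
        """link_feats: (num_links, feat_dim) per-link features;
        link_adjacency: (num_links, num_links) float matrix with 1 where two
        links share a node (i.e. the line graph of the topology)."""
        h = torch.relu(self.embed(link_feats))
        for _ in range(self.msg_steps):
            neighbour_sum = link_adjacency @ h                  # aggregate neighbours
            messages = torch.relu(self.message(torch.cat([h, neighbour_sum], dim=-1)))
            h = self.update(messages, h)                        # GRU state update
        return self.readout(h.sum(dim=0))                       # scalar Q-value
```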
Training Strategy
- Exploration-Exploitation: Used an ε-greedy strategy to balance exploration of new routing actions and exploitation of the best-known paths (a training-step sketch follows this list).
- Optimizer: Stochastic Gradient Descent with momentum for efficient training.
- Regularization: Applied L2 regularization and dropout to avoid overfitting and improve model generalization.
- Message Passing: 7 steps of message passing were used to propagate information across the graph.
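Putting these pieces together, the sketch below outlines one DQN training step consistent with the bullets above: ε-greedy action selection, a one-step TD target, and SGD with momentum where `weight_decay` supplies the L2 regularization. The `q_fn`, `target_q_fn`, and `env_step` callables are hypothetical hooks standing in for the GNN and the routing environment.

```python
import random
import torch

def dqn_train_step(q_fn, target_q_fn, env_step, optimizer, state, actions,
                   gamma=0.95, epsilon=0.1):
    """One illustrative DQN update.

    q_fn / target_q_fn -- callables (state, action) -> Q-value tensor,
                          e.g. the MPNN above applied to the post-routing state
    env_step           -- callable (state, action) -> (reward, next_state,
                          next_actions, done); hypothetical environment hook
    actions            -- the k candidate paths for the current traffic demand
    """
    # ε-greedy: occasionally explore a random path, otherwise exploit Q-estimates
    if random.random() < epsilon:
        action = random.choice(actions)
    else:
        action = max(actions, key=lambda a: q_fn(state, a).item())

    reward, next_state, next_actions, done = env_step(state, action)

    # One-step TD target computed with the target network, no gradient tracking
    with torch.no_grad():
        future = 0.0 if done else gamma * max(
            target_q_fn(next_state, a).item() for a in next_actions)
    target = reward + future

    loss = (q_fn(state, action) - target) ** 2   # squared TD error
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Optimizer as described above: SGD with momentum; weight_decay adds L2 regularization
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9, weight_decay=1e-4)
```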
Experiment Results
- Generalization: The DRL+GNN agent outperformed traditional DRL-based solutions on unseen network topologies, by 6.6% on the NSFNet topology and by 45% on other topologies such as Geant2.
- Scalability: The agent demonstrated robust scalability, performing well even on larger topologies (up to 100 nodes) with a modest performance drop of only 15% compared to smaller networks.
- Real-Time Decision Making: Achieved routing decisions in milliseconds, with computation time scaling linearly with the network size.
Key Takeaways
- Superior Generalization: GNN integration enabled the DRL agent to perform effectively in unseen and larger network topologies, a significant improvement over previous DRL models.
- Scalable and Efficient: The agent’s decision-making process is highly scalable, with near real-time operation, making it suitable for production networks.
- Real-World Applicability: This project opens the door for deployment in production networks without requiring additional training or instrumentation, significantly reducing operational overhead.
The GitHub repository is available here.
