
Decentralized Autonomous Traffic Management through Corridor Networks

Presented at the Air Transportation Research and Development Symposium (ATRD Symposium) 2026

Also presented at the 3rd Annual Northeast Systems and Control Workshop (NESCW)
Princeton University, May 2026

This page covers our full corridor-network study, presented at the ATRD Symposium (preprint on arXiv below). An earlier, more limited version of this work — covering a single corridor, a two-corridor sequence, and a simple split, without zero-shot multi-corridor transfer — appeared at AIAA SciTech 2026; see citation below.

Jasmine Jerry Aloor
MIT

Aadarsh Govada*
University of Maryland

Hamsa Balakrishnan
MIT

*Aadarsh did this work as a collaborator at MIT

Abstract

As autonomous aircraft are introduced at scale and traffic density increases, centralized management becomes insufficient to coordinate the large numbers of crewed and uncrewed aircraft. Dedicated Advanced Air Mobility (AAM) corridors have been proposed for organizing high-density autonomous traffic flows. The desire to scalably provide autonomous aircraft flexibility in trajectory planning motivates the development of decentralized approaches to traffic management in AAM corridors.

In this work, we extend a multi-agent reinforcement learning (MARL) framework for aircraft navigation and self-separation to corridor networks. We test policies trained in a single-corridor setting on increasingly complex multi-corridor networks with combinations of merges and splits in a zero-shot manner. Experimental results demonstrate that learned behaviors transfer well to scenarios with varying traffic density, network geometry, and heterogeneous vehicle performance, without needing centralized coordination or model retraining.

Motivation

Advanced Air Mobility (AAM) is expected to introduce large numbers of autonomous aircraft into dense urban airspace, creating coordination challenges that centralized air traffic control (ATC) cannot scale to address.
Dedicated AAM corridors organize aircraft into predefined routes, reducing coordination complexity while supporting decentralized decision-making.
Within corridors, aircraft must maintain separation, conform to corridor boundaries, and minimize congestion-induced delays — all without centralized scheduling.
MARL enables agents to discover coordination strategies through environmental interaction rather than hand-crafted rules.

Central Question: Can a policy trained in a single corridor coordinate traffic in a complex multi-corridor network — zero-shot?

Method

Aircraft Coordination as a Dec-POMDP

We formulate the multi-aircraft system as a decentralized partially observable Markov decision process (Dec-POMDP). Each agent observes a local neighborhood and selects actions based on state observations and graph-structured interaction information — no centralized coordinator, no global state.
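
To make the local-observation idea concrete, here is a minimal sketch of how one agent's observation could be assembled from nearby aircraft only. It is an illustration, not the observation model used in the paper: the function names, the 2D state, the dictionary layout, and the `sensing_radius` parameter are all assumptions.

```python
import math

def neighbors_within(i, positions, sensing_radius):
    """Indices of other agents within sensing_radius of agent i."""
    xi, yi = positions[i]
    return [
        j for j, (xj, yj) in enumerate(positions)
        if j != i and math.hypot(xj - xi, yj - yi) <= sensing_radius
    ]

def local_observation(i, positions, velocities, sensing_radius):
    """Agent i's own velocity plus neighbor states relative to agent i.

    Each entry in "edges" is one edge of the (hypothetical) interaction
    graph; agents outside the sensing radius are invisible to agent i.
    """
    xi, yi = positions[i]
    vxi, vyi = velocities[i]
    edges = []
    for j in neighbors_within(i, positions, sensing_radius):
        xj, yj = positions[j]
        vxj, vyj = velocities[j]
        edges.append({
            "neighbor": j,
            "rel_pos": (xj - xi, yj - yi),
            "rel_vel": (vxj - vxi, vyj - vyi),
        })
    return {"own_velocity": (vxi, vyi), "edges": edges}
```

Because every quantity is relative to agent i and limited to its neighborhood, the size of the observation depends on local traffic rather than on the total number of aircraft in the network.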

Corridor Navigation Phases

Phase 1 (Pre-corridor): Agents approach the corridor, maintain inter-agent spacing, and merge as needed to enter.
Phase 2 (In-corridor): Agents maintain line formation and longitudinal spacing within the corridor lane.
Phase 3 (Post-corridor): Agents exit and disperse toward their individual downstream goals.
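
The three phases above can be read as a small forward-only state machine. The sketch below assumes a single straight corridor and a scalar `along_track` coordinate (distance along the centerline from the entry); both the coordinate and the transition conditions are illustrative assumptions, not the paper's definitions.

```python
from enum import IntEnum

class Phase(IntEnum):
    PRE_CORRIDOR = 1
    IN_CORRIDOR = 2
    POST_CORRIDOR = 3

def update_phase(phase, along_track, corridor_length):
    """Advance at most one phase per call; never move backward.

    along_track: hypothetical position projected onto the corridor
    centerline, 0 at the entry and corridor_length at the exit.
    """
    if phase == Phase.PRE_CORRIDOR and along_track >= 0.0:
        return Phase.IN_CORRIDOR
    if phase == Phase.IN_CORRIDOR and along_track > corridor_length:
        return Phase.POST_CORRIDOR
    return phase
```

Advancing one phase at a time keeps the sequence Pre, In, Post intact, which matches the idea that skipped or reversed transitions are undesirable.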

Key Methodological Advances

Rotation-invariant policy representation: Observations expressed in the agent's heading frame remove dependence on global orientation, enabling zero-shot generalization across corridor orientations.
Curriculum-based training: Reward terms are introduced progressively — early training emphasizes corridor adherence and phase progression; separation penalties are gradually raised as policies mature.
Zero-shot transfer: The trained policy is deployed directly in multi-corridor settings — merging, branching, and combined 18-corridor layouts — with no retraining or centralized scheduling.
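
The rotation-invariant representation can be illustrated with a plain 2D change of frame. This is a sketch under the assumption that headings are angles in radians measured counterclockwise from the global x-axis; the function names are hypothetical, and the paper's actual observation features may differ.

```python
import math

def to_heading_frame(own_pos, own_heading, other_pos):
    """Express other_pos relative to own_pos in the agent's heading frame.

    The returned x component is distance ahead of the agent and the
    y component is distance to its left.
    """
    dx = other_pos[0] - own_pos[0]
    dy = other_pos[1] - own_pos[1]
    c, s = math.cos(own_heading), math.sin(own_heading)
    # Rotate the global offset by -own_heading.
    return (c * dx + s * dy, -s * dx + c * dy)

def rotate(point, theta):
    """Rotate a global-frame point about the origin (used to turn the whole scene)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * point[0] - s * point[1], s * point[0] + c * point[1])

# Turning the entire scene (positions and heading) leaves the observation unchanged.
own, heading, other = (2.0, 1.0), 0.3, (5.0, -2.0)
base = to_heading_frame(own, heading, other)
turned = to_heading_frame(rotate(own, 1.2), heading + 1.2, rotate(other, 1.2))
assert all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(base, turned))
```

Since the policy only ever sees heading-frame quantities, a corridor pointing in a new direction produces the same inputs as the training corridor.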

Reward Structure

Separation maintenance: Penalty when inter-agent spacing falls below threshold; stronger penalty when aircraft are also closing in on each other.
Phase transition reward: Structured bonuses for correct phase transitions; penalties for skipped or reversed transitions.
Goal completion bonus: Terminal reward upon reaching the goal, only after fully traversing the assigned corridor sequence.
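
As a rough illustration of the separation term and of a progressively raised penalty weight, the sketch below uses made-up constants (`base`, `closing_scale`, `w_start`, `w_end`) and a linear ramp. None of these values or functional forms come from the paper; they only show one way such a term could be structured.

```python
import math

def closing_speed(rel_pos, rel_vel):
    """Rate at which the range between two aircraft shrinks (positive when converging).

    rel_pos and rel_vel are the other aircraft's position and velocity
    minus the agent's own.
    """
    dist = math.hypot(rel_pos[0], rel_pos[1])
    if dist == 0.0:
        return 0.0
    return -(rel_pos[0] * rel_vel[0] + rel_pos[1] * rel_vel[1]) / dist

def separation_penalty(rel_pos, rel_vel, min_sep, base=1.0, closing_scale=2.0):
    """Zero above min_sep; -base below it; scaled up if the pair is also closing."""
    dist = math.hypot(rel_pos[0], rel_pos[1])
    if dist >= min_sep:
        return 0.0
    penalty = base
    if closing_speed(rel_pos, rel_vel) > 0.0:
        penalty *= closing_scale
    return -penalty

def curriculum_weight(step, ramp_steps, w_start=0.1, w_end=1.0):
    """Linearly raise the separation weight from w_start to w_end over ramp_steps."""
    frac = min(max(step / ramp_steps, 0.0), 1.0)
    return w_start + frac * (w_end - w_start)
```

In a training loop, the separation term at a given step would be `curriculum_weight(step, ramp_steps) * separation_penalty(...)`, so early episodes are dominated by the corridor-adherence and phase terms.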

Evaluation Setup

Test Topologies (Zero-Shot)

Merge

Two upstream corridors merge into one downstream corridor. 3 corridors total. (40 agents shown)

Double Merge

Two sequential merging points with compounded interaction density. 5 corridors total. (10 agents shown)

Split-Merge

Aircraft diverge into parallel corridors then reconverge. 6 corridors total. (40 agents shown)

Combined Network

Merges, splits, and sequential chains totaling 18 corridors, 3 entries, 2 exits, 8 routes. (10 agents shown)

Performance Metrics

Conformance C%: Average percentage of time an aircraft remains within corridor boundaries during the in-corridor phase.
Completion S%: Percentage of aircraft that successfully navigate their full corridor sequence within the episode.
Average speed (knots): Total distance traveled divided by time taken.
Tactical intervention I%: Percentage of time an agent requires tactical deconfliction because of a minimum-separation violation.
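
The four metrics can be computed from per-step logs along the following lines. The log format (lists of booleans per agent, distance in nautical miles, time in hours) is an assumption made for this sketch, as are the function names.

```python
def conformance_pct(inside_flags_per_agent):
    """Mean over agents of the share of in-corridor-phase steps spent inside the boundaries.

    inside_flags_per_agent: one list of booleans per agent, one entry per
    time step of that agent's in-corridor phase (hypothetical log format).
    Agents with no in-corridor steps are skipped.
    """
    shares = [sum(flags) / len(flags) for flags in inside_flags_per_agent if flags]
    return 100.0 * sum(shares) / len(shares)

def completion_pct(completed_flags):
    """Percentage of aircraft that finished their full corridor sequence."""
    return 100.0 * sum(completed_flags) / len(completed_flags)

def average_speed_kts(distance_nmi, elapsed_hours):
    """Total distance divided by time; nautical miles per hour are knots."""
    return distance_nmi / elapsed_hours

def intervention_pct(intervention_flags):
    """Percentage of time steps in which an agent needed tactical deconfliction."""
    return 100.0 * sum(intervention_flags) / len(intervention_flags)
```

For example, `conformance_pct([[True, True, True, False], [True, True]])` averages per-agent shares of 75% and 100%, so each aircraft counts equally regardless of how long it spent in the corridor.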

Traffic demand is varied with 10, 20, 30, and 40 simultaneous aircraft. The heterogeneous case assigns 3 agents a reduced maximum speed of 140 knots (20% slower) while remaining agents retain 175 knots.
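
A fleet configuration for these scenarios might be set up as below. The helper name and the choice of which agents are slowed (here simply the first `n_slow`) are assumptions; the source only gives the counts and the two speed caps.

```python
def build_fleet(n_agents, n_slow=0, max_speed_kts=175.0, slow_speed_kts=140.0):
    """Return a per-agent list of maximum speeds in knots.

    Which agents are slowed is not specified in the source; this sketch
    assigns the reduced cap to the first n_slow agents.
    """
    if not 0 <= n_slow <= n_agents:
        raise ValueError("n_slow must be between 0 and n_agents")
    return [slow_speed_kts] * n_slow + [max_speed_kts] * (n_agents - n_slow)

# Homogeneous demand levels and the heterogeneous 10-agent case.
homogeneous = {n: build_fleet(n) for n in (10, 20, 30, 40)}
heterogeneous = build_fleet(10, n_slow=3)

# 140 knots is 20% below the 175-knot cap.
reduction = 1.0 - 140.0 / 175.0
```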

Results

18-Corridor Combined Network — Homogeneous Fleet

Table 1: Performance metrics for the combined corridor network (18 corridors). The learned decentralized policy maintains strong performance across all traffic levels.

| # Aircraft | Conformance C% (↑) | Completion S% (↑) | Avg. Speed (kts) (↑) | Tactical Intervention I% (↓) |
| --- | --- | --- | --- | --- |
| 10 | 98% | 99% | 171.4 | 3.7% |
| 20 | 98% | 99% | 171.9 | 4.2% |
| 30 | 97% | 98% | 172.8 | 3.6% |
| 40 | 96% | 97% | 173.2 | 3.3% |

Heterogeneous Fleet

Faster agents learn to opportunistically overtake slower aircraft in inter-corridor gaps — adaptive speed harmonization emerges without explicit reward engineering. The presence of slower aircraft does not lead to persistent congestion; agents learn to extend their paths outside corridors to avoid boundary violations while overtaking.

In the scenario shown, 3 of 10 agents are capped at 140 knots (20% slower) and are overtaken opportunistically by faster agents.

Efficiency Summary

Average speed deviates from maximum by only 5.4% in the 40-agent scenario.
Actual distance traveled is within 8% of the shortest possible routes.
Peak throughput reaches 14 aircraft/min vs. a theoretical optimal of 15 aircraft/min.
Tactical intervention is needed less than 5% of the time, even in congested 18-corridor scenarios with multiple merges.

Key Takeaway: A decentralized MARL policy trained in a single corridor achieves ≥96% conformance and ≥97% completion in an 18-corridor network with up to 40 aircraft — zero-shot, no retraining, no centralized coordination required.

NESCW 2026 Poster

Presented at the 3rd Annual Northeast Systems and Control Workshop, Princeton University, May 2026.

Project page: jaroan.github.io/jasminejerrya/AAM_Corridor_MARL.html

Conclusions

A rotation-invariant MARL policy trained in a single corridor generalizes zero-shot to 18-corridor networks with up to 40 aircraft.
Corridor conformance (≥96%) and completion rates (≥97%) remain strong even at the highest traffic densities tested.
Tactical interventions are needed less than 5% of the time, demonstrating that learned strategic behaviors handle the majority of coordination.
Emergent overtaking and speed harmonization arise in heterogeneous fleets without explicit reward engineering.
Structured airspace design and decentralized policy learning are complementary approaches to scalable AAM traffic management.

Contact

For any questions, please contact Jasmine.

Citation

If you find our work useful in your research, please consider citing the ATRD Symposium paper (full multi-corridor network study, preprint on arXiv):

@inproceedings{aloor2026corridor_networks,
  author        = {Aloor, Jasmine Jerry and Govada, Aadarsh and Balakrishnan, Hamsa},
  title         = {Decentralized Autonomous Traffic Management through Corridor Networks},
  year          = {2026},
  booktitle     = {Second US-Europe Air Transportation Research and Development Symposium (ATRDS)},
  url           = {https://arxiv.org/abs/2606.23585}
}

The earlier, more limited version of this work (single corridor, two-corridor sequence, and a simple split) appeared at AIAA SciTech 2026:

@inbook{aloor2026corridor,
  author    = {Aloor, Jasmine J. and Balakrishnan, Hamsa},
  title     = {Decentralized Coordination of Autonomous Traffic Through Advanced Air Mobility Corridors},
  booktitle = {AIAA SCITECH 2026 Forum},
  year      = {2026},
  doi       = {10.2514/6.2026-0236},
  url       = {https://arc.aiaa.org/doi/abs/10.2514/6.2026-0236}
}

Acknowledgments

The authors thank the MIT SuperCloud and Lincoln Laboratory Supercomputing Center for high-performance computing resources. This work was supported in part by NASA under grant #80NSSC23M0220 and the University Leadership Initiative (grants #80NSSC21M0071 and #80NSSC20M0163). J.J. Aloor was also supported in part by a MathWorks Fellowship. This research was also sponsored by the Department of the Air Force Artificial Intelligence Accelerator and was accomplished under Cooperative Agreement Number FA8750-19-2-1000. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Department of the Air Force or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.
