Dynamic soaring in UAVs: a deep reinforcement learning approach



Akhtar M., Maqsood A., Mir I., Güngördü B.

AERONAUTICAL JOURNAL, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Publication Date: 2026
  • DOI: 10.1017/aer.2026.10155
  • Journal Name: AERONAUTICAL JOURNAL
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC
  • Middle East Technical University Affiliated: Yes

Abstract

Dynamic soaring (DS) enables unmanned aerial vehicles (UAVs) to extend endurance by extracting energy from atmospheric wind gradients. While prior DS research has focused primarily on fixed-wing platforms using nonlinear optimal control and trajectory optimisation, these methods typically require solving computationally demanding optimisation problems online. In contrast, deep reinforcement learning (DRL) shifts the computationally intensive training offline, so that real-time deployment requires only lightweight policy inference. This study investigates autonomous dynamic soaring in a hybrid tricopter UAV, in which the two forward-facing rotors provide limited thrust assistance and the rear rotor remains inactive during soaring. A six-degree-of-freedom nonlinear flight model is implemented in MATLAB/Simulink to capture aerodynamic forces and wind-gradient energy interactions. The DS task is formulated as a DRL problem, and three representative algorithms are evaluated: deep deterministic policy gradient (DDPG), proximal policy optimisation (PPO) and trust region policy optimisation (TRPO). Simulation results reveal distinct performance characteristics: PPO yields the most stable and repeatable cycles, TRPO produces smoother control inputs, and DDPG converges rapidly but relies more heavily on propulsive thrust. Compared with DDPG, TRPO and PPO improve net energy gain by approximately 42.0% and 30.3%, respectively. These findings demonstrate the feasibility of DS in a tricopter-based hybrid UAV and highlight DRL as an effective framework for autonomous, energy-aware flight.
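As an illustration of how a DS task can be cast as a reinforcement-learning environment with an energy-based reward, the sketch below implements a toy point-mass model crossing a logarithmic wind-shear layer. Everything here is a hypothetical simplification (the class name `ToySoaringEnv`, the wind, drag and thrust constants, the reward weighting); the paper's actual model is a six-degree-of-freedom nonlinear tricopter simulation in MATLAB/Simulink, and this Python sketch only mirrors a Gym-style `reset`/`step` interface.

```python
import numpy as np

G = 9.81     # gravity, m/s^2
DT = 0.05    # integration step, s (assumed)
W_REF = 8.0  # wind speed at reference height, m/s (assumed)
H_REF = 10.0 # reference height for the wind profile, m (assumed)
H0 = 0.05    # surface roughness length, m (assumed)

def wind_speed(h):
    """Horizontal wind magnitude from a logarithmic shear profile."""
    h = max(h, H0 + 1e-6)
    return W_REF * np.log(h / H0) / np.log(H_REF / H0)

class ToySoaringEnv:
    """Toy point-mass UAV; state = [height h, climb rate vz, airspeed va]."""

    def reset(self):
        self.h, self.vz, self.va = 5.0, 0.0, 12.0
        return np.array([self.h, self.vz, self.va])

    def specific_energy(self):
        # Total specific energy (per unit mass): potential + kinetic.
        return G * self.h + 0.5 * self.va ** 2

    def step(self, action):
        # action = [climb command in [-1, 1], thrust fraction in [0, 1]]
        climb_cmd, thrust = np.clip(action, [-1.0, 0.0], [1.0, 1.0])
        e_before = self.specific_energy()

        # Crude vertical dynamics and height update.
        self.vz += 2.0 * climb_cmd * DT
        new_h = max(self.h + self.vz * DT, 0.5)

        # Climbing upwind through positive shear converts the wind-speed
        # change over the step directly into airspeed (the DS mechanism).
        dw = wind_speed(new_h) - wind_speed(self.h)
        self.h = new_h
        drag_decel = 0.02 * self.va ** 2          # quadratic drag (assumed)
        self.va += dw + (3.0 * thrust - drag_decel) * DT

        # Reward: net specific-energy gain, penalising propulsive thrust so
        # the agent prefers harvesting the gradient over using the rotors.
        reward = (self.specific_energy() - e_before) - 0.5 * thrust
        return np.array([self.h, self.vz, self.va]), reward
```

A continuous-action algorithm such as DDPG, PPO or TRPO would then be trained against this `step` loop, with the thrust penalty steering the learned policy toward gradient-harvesting cycles rather than powered flight.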