Reinforcement learning control for autorotation of a simple point-mass helicopter model


Thesis Type: Master's

Institution Where the Thesis Was Conducted: Middle East Technical University, Faculty of Engineering, Department of Aerospace Engineering, Turkey

Thesis Approval Date: 2018

Student: KADİRCAN KOPŞA

Advisor: ALİ TÜRKER KUTAY

Abstract:

This study presents an application of an actor-critic reinforcement learning method to the guidance problem of a simple point-mass helicopter model during autorotation. A point-mass model of an OH-58A helicopter in autorotation was built. A reinforcement learning agent was trained with a model-free asynchronous actor-critic algorithm, with training episodes parallelized on a multi-core CPU. The objective of the training was defined as achieving near-zero horizontal and vertical kinetic energies at the instant of touchdown. During each training episode, the agent was presented with a reward at each discrete time-step according to a multi-conditional reward function. The reward function was programmed to output the negative of a weighted sum of the squared vertical and horizontal velocities at touchdown. The agent consists of two separate neural network function approximators, namely the actor and the critic. The critic approximates the value of a set of states. The actor generates a set of actions given a set of states; the actions are sampled from a Gaussian distribution whose mean values are the outputs of the actor network. Updates to the parameters of both networks were calculated from gradients accumulated during each episode and applied once per episode to improve training stability. The RMSProp algorithm was used for optimization. The results achieved by the agent indicate that the method is successful at guiding the point-mass helicopter to the ground with minimal kinetic energy for most initial conditions. The controls generated by the reinforcement learning agent were found to be similar to a helicopter pilot’s technique.
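
The touchdown reward and the Gaussian action sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The function names, the weights `w_vertical` and `w_horizontal`, the zero reward before touchdown, and the fixed standard deviation `std` are all assumptions made for illustration; the abstract states only that the reward is multi-conditional and that the action means come from the actor network.

```python
import random


def touchdown_reward(v_vertical, v_horizontal, touched_down,
                     w_vertical=1.0, w_horizontal=1.0):
    """Negative weighted sum of squared velocities at touchdown.

    Assumption: the reward is zero on every step before touchdown.
    The weights are hypothetical placeholders, not values from the thesis.
    """
    if not touched_down:
        return 0.0
    return -(w_vertical * v_vertical ** 2 + w_horizontal * v_horizontal ** 2)


def sample_action(means, std, rng):
    """Draw one action per control channel from a Gaussian.

    `means` stands in for the outputs of the actor network; `std` is a
    fixed, hypothetical standard deviation shared by all channels.
    """
    return [rng.gauss(m, std) for m in means]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical actor outputs for two control channels.
    action = sample_action([0.2, -0.1], std=0.05, rng=rng)
    print("sampled action:", action)
    print("reward at touchdown:", touchdown_reward(2.0, 3.0, True, 1.0, 0.5))
```

Under these assumptions, a touchdown with lower vertical and horizontal speed yields a reward closer to zero, which matches the stated training objective of near-zero kinetic energy at touchdown.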