9th International Conference on Control, Decision and Information Technologies, CoDIT 2023, Rome, Italy, 3-6 July 2023, pp. 1524-1529
This paper proposes three reinforcement learning-based guidance strategies for the quadcopter guidance problem under non-ideal conditions such as target movement, measurement delay and system delay. The engagement between the quadcopter and the target in the vertical plane is formalized as a Markov Decision Process (MDP) whose reward function aims to intercept the target with high precision in the shortest time. The three strategies learn (1) the guidance command directly, (2) a bias command added to the guidance command generated by a classical guidance algorithm, and (3) both the guidance gain and the bias command. Numerical simulations under various non-ideal conditions compare the performance of the proposed reinforcement learning-based guidance strategies with that of classical closed-form guidance laws.
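The three strategies differ only in how the learned action is turned into a guidance command, which can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes proportional navigation (command = gain × closing speed × line-of-sight rate) as the classical guidance law, which the abstract does not specify, and the names `pn_command`, `guidance_command`, `nominal_gain`, and the strategy labels are hypothetical.

```python
from typing import Sequence


def pn_command(gain: float, closing_speed: float, los_rate: float) -> float:
    """Classical command, ASSUMED here to be proportional navigation."""
    return gain * closing_speed * los_rate


def guidance_command(
    strategy: str,
    action: Sequence[float],
    closing_speed: float,
    los_rate: float,
    nominal_gain: float = 3.0,  # hypothetical fixed gain for strategy 2
) -> float:
    """Map a learned action to a guidance command under each strategy."""
    if strategy == "direct":
        # Strategy 1: the policy outputs the guidance command itself.
        return action[0]
    if strategy == "bias":
        # Strategy 2: the policy outputs a bias added to the classical command.
        return pn_command(nominal_gain, closing_speed, los_rate) + action[0]
    if strategy == "gain_and_bias":
        # Strategy 3: the policy outputs both the guidance gain and the bias.
        gain, bias = action
        return pn_command(gain, closing_speed, los_rate) + bias
    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    for name, act in [("direct", [1.5]), ("bias", [0.5]), ("gain_and_bias", [4.0, -1.0])]:
        print(name, guidance_command(name, act, closing_speed=10.0, los_rate=0.5))
```

In this sketch, strategies 2 and 3 reduce to the classical law when the learned bias is zero (and, for strategy 3, when the learned gain equals the nominal one), which is one plausible reason to learn a correction rather than the full command.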