Hierarchical reinforcement Thompson composition

Tanık G. O., Ertekin Bolelli Ş.

Neural Computing and Applications, vol.36, no.20, pp.12317-12326, 2024 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 36 Issue: 20
  • Publication Date: 2024
  • DOI Number: 10.1007/s00521-024-09732-9
  • Journal Name: Neural Computing and Applications
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Applied Science & Technology Source, Biotechnology Research Abstracts, Compendex, Computer & Applied Sciences, Index Islamicus, INSPEC, zbMATH
  • Page Numbers: pp.12317-12326
  • Keywords: Artificial intelligence agents, Reinforcement learning, Soft attention, Thompson sampling
  • Middle East Technical University Affiliated: Yes

Abstract

Modern real-world control problems call for continuous control domains and robust, sample-efficient, and explainable control frameworks. We present a framework for recursively composing control skills to solve compositional and progressively more complex tasks. The framework promotes the reuse of skills and, as a result, quick adaptation to new tasks. The decision tree can be inspected, providing insight into the agent's behavior. Furthermore, the skills can be transferred, modified, or trained independently, which can simplify reward shaping and considerably increase training speed. This paper is concerned with the efficient composition of control algorithms using reinforcement learning and soft attention. Compositional and temporal abstraction is key to improving learning and planning in reinforcement learning. Our Thompson-sampling-inspired soft-attention model is demonstrated to solve the composition problem efficiently.
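
The abstract describes composing control skills through a Thompson-sampling-inspired soft-attention gate. Purely as an illustration of that idea, and not the paper's actual architecture, the minimal sketch below samples one relevance logit per sub-skill from a Gaussian posterior, turns the sampled logits into soft-attention weights, and returns the weighted sum of the sub-skills' continuous actions. All names (ThompsonSoftAttentionComposer, go_left, go_right) and the bandit-style update rule are invented for this example.

```python
import numpy as np


class ThompsonSoftAttentionComposer:
    """Samples per-skill relevance logits (Thompson-style) and mixes the
    sub-skills' continuous actions with the resulting softmax weights."""

    def __init__(self, skills, lr=0.05):
        self.skills = skills                    # callables: observation -> action array
        self.mu = np.zeros(len(skills))         # posterior means of the relevance logits
        self.log_sigma = np.zeros(len(skills))  # posterior log standard deviations
        self.lr = lr

    def act(self, obs):
        # Thompson step: draw one logit per skill from its Gaussian posterior,
        # then convert the sampled logits into soft-attention weights.
        logits = self.mu + np.exp(self.log_sigma) * np.random.randn(len(self.skills))
        weights = np.exp(logits - logits.max())
        weights /= weights.sum()
        actions = np.stack([skill(obs) for skill in self.skills])
        return weights @ actions, weights       # weighted sum of the skills' actions

    def update(self, weights, reward):
        # Crude bandit-style update (hypothetical): shift posterior means toward
        # skills whose weights coincided with high reward, and slowly shrink the
        # posterior spread so exploration decays over time.
        self.mu += self.lr * reward * (weights - weights.mean())
        self.log_sigma -= self.lr * 0.1


if __name__ == "__main__":
    # Two toy sub-skills on a 1-D action space; the toy reward favors go_right.
    def go_left(obs):
        return np.array([-1.0])

    def go_right(obs):
        return np.array([1.0])

    composer = ThompsonSoftAttentionComposer([go_left, go_right])
    for _ in range(200):
        action, w = composer.act(np.zeros(1))
        composer.update(w, reward=float(action[0]))
    print("posterior means of the relevance logits:", composer.mu)
```

In this toy setup the gate's posterior mean for go_right grows over the 200 steps, so the soft-attention weights concentrate on the rewarding skill while the sampled logits still provide exploration early on; the paper's actual model, training procedure, and hierarchy are described in the article itself.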