Landmark Based Reward Shaping in Reinforcement Learning with Hidden States

DEMİR A., Cilden E., POLAT F.

18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), Montreal, Canada, 13 - 17 May 2019, pp.1922-1924, (Full Text)

Publication Type: Conference Paper / Full Text
City of Publication: Montreal
Country of Publication: Canada
Page Numbers: pp.1922-1924
Keywords: reward shaping, landmarks, reinforcement learning, hidden states
Affiliated with Middle East Technical University: Yes

Abstract

While most work on reward shaping focuses on fully observable problems, very few studies couple reward shaping with partial observability. Moreover, for problems with hidden states, where there is no prior information about the underlying states, reward shaping opportunities remain unexplored. In this paper, we show that landmarks can be used to shape the rewards in reinforcement learning with hidden states. The proposed approach is empirically shown to improve learning performance in terms of speed and quality.