Landmark Based Reward Shaping in Reinforcement Learning with Hidden States


DEMİR A., Cilden E., POLAT F.

18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), Montreal, Canada, 13 - 17 May 2019, pp. 1922-1924

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume Number:
  • City of Publication: Montreal
  • Country of Publication: Canada
  • Pages: pp. 1922-1924
  • Keywords: reward shaping, landmarks, reinforcement learning, hidden states
  • Affiliated with Middle East Technical University: Yes

Abstract

While most work on reward shaping focuses on fully observable problems, very few studies couple reward shaping with partial observability. Moreover, for problems with hidden states, where there is no prior information about the underlying states, reward shaping opportunities remain unexplored. In this paper, we show that landmarks can be used to shape the rewards in reinforcement learning with hidden states. The proposed approach is empirically shown to improve learning performance in terms of both speed and quality.