Landmark Based Reward Shaping in Reinforcement Learning with Hidden States


DEMİR A. , Cilden E., POLAT F.

18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), Montreal, Canada, 13 - 17 May 2019, pp.1922-1924

  • Volume number:
  • City of publication: Montreal
  • Country of publication: Canada
  • Page numbers: pp.1922-1924

Abstract

While most work on reward shaping focuses on fully observable problems, very few studies couple reward shaping with partial observability. Moreover, for problems with hidden states, where there is no prior information about the underlying states, reward shaping opportunities remain unexplored. In this paper, we show that landmarks can be used to shape the rewards in reinforcement learning with hidden states. The proposed approach is empirically shown to improve learning performance in terms of speed and quality.
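
The abstract does not say how the landmarks enter the reward signal. As a generic illustration only, and not the authors' method, the sketch below uses the standard potential-based form of reward shaping, r + gamma * Phi(o') - Phi(o), with a potential that is nonzero only at landmark observations. The function name, the observation labels, and the potential values are all hypothetical.

```python
from typing import Dict, Hashable


def shaped_reward(
    reward: float,
    obs: Hashable,
    next_obs: Hashable,
    potential: Dict[Hashable, float],
    gamma: float = 0.95,
) -> float:
    """Return reward + gamma * Phi(next_obs) - Phi(obs).

    Observations with no entry in `potential` (assumed here to be
    non-landmark observations) get a potential of 0.
    """
    phi = potential.get(obs, 0.0)
    phi_next = potential.get(next_obs, 0.0)
    return reward + gamma * phi_next - phi


# Hypothetical potentials: higher values for landmarks assumed to lie
# closer to the goal. These numbers are illustrative, not from the paper.
landmark_potential = {"door": 1.0, "key": 2.0}

# Moving from a non-landmark observation to the "door" landmark earns a
# shaping bonus on top of the environment's step reward of -1.
r = shaped_reward(-1.0, "corridor", "door", landmark_potential, gamma=0.9)
print(r)
```

Under these assumptions the agent is rewarded for reaching a landmark and pays the potential back when it leaves, which is the property that keeps potential-based shaping from changing the optimal policy in the fully observable case. Whether that guarantee carries over to the hidden-state setting is not something the abstract addresses.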