Landmark Based Reward Shaping in Reinforcement Learning with Hidden States


DEMİR A., Cilden E., POLAT F.

18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS), Montreal, Canada, 13 - 17 May 2019, pp.1922-1924

  • Publication Type: Conference Paper / Full Text
  • City: Montreal
  • Country: Canada
  • Page Numbers: pp.1922-1924
  • Keywords: reward shaping, landmarks, reinforcement learning, hidden states

Abstract

While most work on reward shaping focuses on fully observable problems, very few studies couple reward shaping with partial observability. Moreover, for problems with hidden states, where there is no prior information about the underlying states, reward shaping opportunities remain unexplored. In this paper, we show that landmarks can be used to shape the rewards in reinforcement learning with hidden states. The proposed approach is empirically shown to improve learning performance in terms of both speed and quality.
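
To make the idea of shaping rewards around landmarks concrete, the following is a minimal Python sketch of generic potential-based reward shaping, in which the shaping term is gamma * phi(o') - phi(o). It is not the authors' algorithm: the abstract does not describe their formulation, and the landmark names, potential values, discount factor, and function names here are all hypothetical choices made for illustration.

```python
# Hypothetical sketch: potential-based reward shaping with landmark potentials.
# Landmark names and potential values are illustrative assumptions,
# not taken from the paper.

GAMMA = 0.9  # discount factor (assumed)

# Assumed potentials: landmark observations closer to the goal get higher values.
LANDMARK_POTENTIAL = {"door": 0.5, "key": 0.8, "goal": 1.0}


def potential(observation):
    """Return the potential of an observation; non-landmarks get 0."""
    return LANDMARK_POTENTIAL.get(observation, 0.0)


def shaped_reward(reward, observation, next_observation, gamma=GAMMA):
    """Add the shaping term gamma * phi(o') - phi(o) to the environment reward."""
    return reward + gamma * potential(next_observation) - potential(observation)
```

In this sketch, moving between two non-landmark observations leaves the environment reward unchanged, while moving from a lower-potential landmark to a higher-potential one adds a positive bonus. Whether the usual guarantees of potential-based shaping carry over when potentials are defined on observations rather than on underlying states is outside the scope of this sketch.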