Automatic landmark discovery for learning agents under partial observability


Demir A., Cilden E., Polat F.

KNOWLEDGE ENGINEERING REVIEW, vol.34, 2019 (SCI-Expanded)

  • Publication Type: Article / Review
  • Volume: 34
  • Publication Date: 2019
  • DOI Number: 10.1017/s026988891900002x
  • Journal Name: KNOWLEDGE ENGINEERING REVIEW
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Middle East Technical University Affiliated: Yes

Abstract

In the reinforcement learning context, for problems with hidden states, a landmark is a compact piece of information that is uniquely coupled with a state. Landmarks have been shown to support finding good memoryless policies for Partially Observable Markov Decision Processes (POMDPs) that contain at least one landmark. SarsaLandmark, an adaptation of Sarsa(λ), is known to promise better learning performance under the assumption that all landmarks of the problem are known in advance.
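
To make the definition of a landmark concrete, the following is a minimal Python sketch, not the method of the paper. It assumes a toy problem in which the mapping from hidden states to observations is fully known, which a learning agent would not normally have. The function name `find_landmarks`, the state labels, and the observation labels are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def find_landmarks(observation_of):
    """Return {observation: state} for observations emitted by exactly one state.

    `observation_of` is an assumed, fully known map from hidden state to
    observation; it exists here only to illustrate the definition.
    """
    states_by_obs = defaultdict(set)
    for state, obs in observation_of.items():
        states_by_obs[obs].add(state)
    # An observation is a landmark when it is coupled with a single state.
    return {
        obs: next(iter(states))
        for obs, states in states_by_obs.items()
        if len(states) == 1
    }


# Hypothetical four-state corridor: the two middle states look identical.
observation_of = {
    "s0": "wall-left",
    "s1": "corridor",
    "s2": "corridor",
    "s3": "wall-right",
}
landmarks = find_landmarks(observation_of)
```

In this toy corridor the two end observations each identify a single state and so act as landmarks, while the shared "corridor" observation is ambiguous between two hidden states.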