Automatic landmark discovery for learning agents under partial observability

Demir, Alper; Cilden, Erkin; Polat, Faruk

doi:10.1017/s026988891900002x

Demir A., Cilden E., Polat F.

KNOWLEDGE ENGINEERING REVIEW, vol. 34, 2019 (SCI-Expanded, Scopus)

Publication Type: Article / Review
Volume: 34
Publication Date: 2019
DOI: 10.1017/s026988891900002x
Journal Name: KNOWLEDGE ENGINEERING REVIEW
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Affiliated with Middle East Technical University: Yes

Abstract

In the reinforcement learning context, a landmark is a compact piece of information that is uniquely coupled to a state, for problems with hidden states. Landmarks have been shown to support finding good memoryless policies for Partially Observable Markov Decision Processes (POMDPs) that contain at least one landmark. SarsaLandmark, an adaptation of Sarsa(lambda), is known to promise better learning performance under the assumption that all landmarks of the problem are known in advance.