Positive impact of state similarity on reinforcement learning performance

Girgin, Sertan; Polat, Faruk; Alhaj, Reda

doi:10.1109/tsmcb.2007.899419

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, vol.37, no.5, pp.1256-1270, 2007 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 37 Issue: 5
Publication Date: 2007
DOI Number: 10.1109/tsmcb.2007.899419
Journal Name: IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Pages: pp.1256-1270
Keywords: action-value function, learning performance, optimal policies, reinforcement learning (RL), similarity function, state similarity
Middle East Technical University Affiliated: Yes

Abstract

In this paper, we propose a novel approach to identify states with similar subpolicies and show how they can be integrated into the reinforcement learning framework to improve learning performance. The method utilizes a specialized tree structure to identify common action sequences of states, which are derived from possible optimal policies, and defines a similarity function between two states based on the number of such sequences. Using this similarity function, updates on the action-value function of a state are reflected onto all similar states. This allows experience that is acquired during learning to be applied to a broader context. The effectiveness of the method is demonstrated empirically.
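The idea of reflecting an action-value update onto similar states can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the update rule, and the tree-based construction of the similarity function is omitted entirely. The function name `q_update_with_similarity`, the `similar` lookup table, and the rule of scaling the step size by the similarity value are all assumptions made here for illustration, layered on a standard tabular Q-learning step.

```python
from collections import defaultdict


def q_update_with_similarity(Q, similar, s, a, r, s_next, actions,
                             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step to (s, a), then reflect it onto similar states.

    Q       : dict mapping (state, action) -> value
    similar : hypothetical table mapping state -> list of (other_state, similarity),
              with similarity assumed to lie in [0, 1]
    """
    # Standard Q-learning target for the observed transition.
    best_next = max(Q[(s_next, b)] for b in actions)
    target = r + gamma * best_next

    # Ordinary update for the state that was actually visited.
    Q[(s, a)] += alpha * (target - Q[(s, a)])

    # Assumed propagation rule: pull each similar state's value toward the
    # same target, with the step size scaled by how similar it is to s.
    for t, sim in similar.get(s, []):
        Q[(t, a)] += alpha * sim * (target - Q[(t, a)])
    return Q


# Toy usage: state "s2" is assumed to be 0.5-similar to "s1".
actions = ["left", "right"]
Q = defaultdict(float)
similar = {"s1": [("s2", 0.5)]}
q_update_with_similarity(Q, similar, "s1", "right", 1.0, "s3", actions)
```

In this toy run, a single experience in "s1" also moves the estimate for "s2", which is the sense in which acquired experience is applied to a broader context. How similar states and their corresponding actions are actually matched is defined in the paper itself.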