Option discovery in reinforcement learning using frequent common subsequences of actions


Girgin S., POLAT F.

International Conference on Computational Intelligence for Modelling, Control and Automation/International Conference on Intelligent Agents Web Technologies and International Commerce, Vienna, Austria, 28 - 30 November 2005, pp.371-372

  • Publication Type: Conference Paper / Full Text
  • City of Publication: Vienna
  • Country of Publication: Austria
  • Page Numbers: pp.371-372
  • Middle East Technical University Affiliated: Yes

Abstract

Temporally abstract actions, or options, facilitate learning in large and complex domains by exploiting sub-tasks and the hierarchical structure of the problem formed by these sub-tasks. In this paper, we study the automatic generation of options using common sub-sequences derived from the state transition histories collected as learning progresses. The standard Q-learning algorithm is extended to use the generated options transparently, and the effectiveness of the method is demonstrated in Dietterich's Taxi domain.
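The abstract describes deriving options from frequent common sub-sequences of actions found in collected histories. A minimal sketch of one way such sub-sequences might be mined from action trajectories is shown below; the function name, thresholds, and example data are illustrative assumptions, not details from the paper:

```python
from collections import Counter

def frequent_subsequences(histories, min_len=2, max_len=4, min_support=2):
    """Collect contiguous action sub-sequences of bounded length from each
    trajectory and keep those occurring in at least `min_support` trajectories.
    (Illustrative mining step; the paper's exact procedure may differ.)"""
    support = Counter()
    for actions in histories:
        # Count each distinct sub-sequence once per trajectory.
        seen = set()
        for n in range(min_len, max_len + 1):
            for i in range(len(actions) - n + 1):
                seen.add(tuple(actions[i:i + n]))
        support.update(seen)
    return {seq for seq, count in support.items() if count >= min_support}

# Hypothetical action histories from a Taxi-like grid task.
histories = [
    ["north", "north", "east", "pickup"],
    ["south", "north", "north", "east", "pickup"],
    ["north", "north", "east", "dropoff"],
]
common = frequent_subsequences(histories, min_len=2, max_len=3, min_support=2)
```

Sub-sequences such as `("north", "north", "east")`, which recur across trajectories, would then be candidates for packaging as options usable by the extended Q-learning agent.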