Hierarchical Temporal Memory Based Autonomous Agent for Partially Observable Video Game Environments

Thesis Type: Master's

Institution Where the Thesis Was Conducted: Middle East Technical University (Orta Doğu Teknik Üniversitesi), Informatics Institute, Department of Modeling and Simulation, Turkey

Thesis Approval Date: 2017

Student: Ali Kaan Sungur

Principal Supervisor (For Co-Supervised Theses): ELİF SÜRER

Abstract:

Believable non-player characters (NPCs) can have a profound impact on the experience that a video game provides. This thesis presents an autonomous agent with online, unsupervised, lifelong learning that the player can interact with. Its architecture combines Hierarchical Temporal Memory with Temporal Difference learning, TD(λ), and is guided by neurobiological research. The agent has a visual sensor that produces an online data stream, and input from this sensor feeds the architecture to model the surrounding environment. The goal of the agent is to learn rewarding sequences of behavior based on the stimulation it receives as a result of its actions. It navigates a procedurally generated three-dimensional environment and remains in a continuous learning state, adapting the synapses of its neural connectome. The architecture can also be stored and loaded at any point, allowing learning to persist across multiple simulation sessions. The study presents the learning characteristics of the agent on a video-game-related learning task. We compare the data collected from experiments with varying parameters and report the runtime and serialization performance. The proposed methodology results in an autonomous NPC that can learn rewarding behaviors without any supervision. Moreover, it is capable of learning specific action sequences via player guidance. The result is a promising and novel NPC architecture that is also relatively open to incremental improvement through relevant neurobiological studies and advances in the theory of Hierarchical Temporal Memory.
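
For readers unfamiliar with the Temporal Difference Learning Lambda component named in the abstract (usually written TD(λ)), the following is a minimal textbook-style sketch of tabular TD(λ) value prediction with accumulating eligibility traces. It is not the agent's implementation: the thesis couples the learning signal to Hierarchical Temporal Memory, which is not modeled here. The function names, the three-state chain, and the values of `alpha`, `gamma`, and `lam` are assumptions chosen only for illustration.

```python
def td_lambda_update(values, traces, state, reward, next_state,
                     alpha=0.1, gamma=0.9, lam=0.8):
    """Apply one tabular TD(lambda) step in place and return the TD error.

    values and traces are lists indexed by state; all parameter defaults
    are illustrative assumptions, not values taken from the thesis.
    """
    # TD error: how much better or worse the outcome was than predicted.
    delta = reward + gamma * values[next_state] - values[state]

    # Decay every eligibility trace, then mark the state just visited.
    for s in range(len(traces)):
        traces[s] *= gamma * lam
    traces[state] += 1.0

    # Credit all recently visited states in proportion to their trace.
    for s in range(len(values)):
        values[s] += alpha * delta * traces[s]
    return delta


def run_chain_episode():
    """Hypothetical 3-state chain 0 -> 1 -> 2, reward 1 on reaching state 2."""
    values = [0.0, 0.0, 0.0]
    traces = [0.0, 0.0, 0.0]
    td_lambda_update(values, traces, state=0, reward=0.0, next_state=1)
    td_lambda_update(values, traces, state=1, reward=1.0, next_state=2)
    return values


if __name__ == "__main__":
    print(run_chain_episode())
```

After the single rewarded transition, state 0 also receives part of the credit even though the reward arrived one step later. This is the property that makes eligibility traces a natural fit for learning rewarding sequences of behavior rather than isolated actions.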