Automated Video Game Testing Using Synthetic and Humanlike Agents

Ariyurek S., Betin-Can A., SÜRER E.

IEEE TRANSACTIONS ON GAMES, vol.13, no.1, pp.50-67, 2021 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 13 Issue: 1
  • Publication Date: 2021
  • Doi Number: 10.1109/tg.2019.2947597
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.50-67
  • Keywords: Games, Testing, Avatars, Water, Computer bugs, Sprites (computer), Monte Carlo methods, Automated game testing, graph coverage, inverse reinforcement learning (IRL), Monte Carlo tree search (MCTS), reinforcement learning (RL)
  • Middle East Technical University Affiliated: Yes


In this article, we present a new methodology that employs tester agents to automate video game testing. We introduce two types of agents, synthetic and humanlike, and two distinct approaches to create them. Our agents are derived from Sarsa and Monte Carlo tree search (MCTS) but focus on finding defects, whereas traditional game-playing agents focus on maximizing game scores. The synthetic agent uses test goals generated from game scenarios, and these goals are further modified to examine the effects of unintended game transitions. The humanlike agent uses test goals extracted from tester trajectories by our proposed multiple greedy-policy inverse reinforcement learning (MGP-IRL) algorithm, which captures the multiple policies executed by human testers. We use our agents to produce test sequences and run the game with these sequences. At each run, we use an automated test oracle to check for bugs. We analyze the proposed method in two parts: we compare the success of humanlike and synthetic agents in bug finding, and we evaluate the similarity between humanlike agents and human testers. We collected 427 trajectories from human testers using the General Video Game Artificial Intelligence (GVG-AI) framework and created three games with 12 levels that contain 45 bugs. Our experiments reveal that humanlike and synthetic agents are competitive with human testers in bug-finding performance. Moreover, we show that MGP-IRL increases the humanlikeness of agents while improving their bug-finding performance.