Online Antenna Tuning in Heterogeneous Cellular Networks With Deep Reinforcement Learning


Balevi E., Andrews J. G.

IEEE TRANSACTIONS ON COGNITIVE COMMUNICATIONS AND NETWORKING, vol.5, no.4, pp.1113-1124, 2019 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 5 Issue: 4
  • Publication Date: 2019
  • DOI: 10.1109/tccn.2019.2933420
  • Journal Name: IEEE TRANSACTIONS ON COGNITIVE COMMUNICATIONS AND NETWORKING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.1113-1124
  • Keywords: Deep reinforcement learning, online antenna tuning, Q-learning, HetNets, 5G
  • Affiliated with Middle East Technical University: No

Abstract

We aim to jointly optimize the antenna tilt angle and the vertical and horizontal half-power beamwidths of the macrocells in a heterogeneous cellular network (HetNet). The interactions between the cells, most notably their coupled interference, render this optimization prohibitively complex. A single-agent reinforcement learning (RL) algorithm is scalable but yields quite suboptimal solutions for this problem, whereas multi-agent RL algorithms yield better solutions at the expense of scalability. Hence, we propose a two-step compromise algorithm. Specifically, a multi-agent mean-field RL algorithm is first used in the offline phase to transfer information as features to the single-agent RL algorithm of the second (online) phase, which employs a deep neural network to learn users' locations. This two-step approach is a practical solution for real deployments, which should automatically adapt to environmental changes in the network. Our results illustrate that, assuming relatively low environmental dynamics, the proposed algorithm approaches the performance of multi-agent RL, which requires millions of trials, using only hundreds of online trials, and that it performs much better than single-agent RL. Furthermore, the proposed algorithm is compact and implementable, and empirically appears to provide a performance guarantee regardless of the amount of environmental dynamics.
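
To give a rough sense of what tuning an antenna parameter by trial and error can look like, the following is a minimal sketch of stateless, epsilon-greedy Q-learning over a handful of discrete tilt angles. It is not the authors' algorithm: it has no mean-field offline phase, no deep neural network, and no beamwidth optimization. The names `toy_reward` and `tune_tilt`, the quadratic reward, the assumed best tilt of 8 degrees, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def toy_reward(tilt_deg, best_tilt_deg=8.0):
    # Hypothetical reward (an assumption, not from the paper):
    # it peaks at 0 when the tilt equals an assumed best tilt.
    return -(tilt_deg - best_tilt_deg) ** 2


def tune_tilt(tilts, trials=500, alpha=0.1, epsilon=0.1, seed=0):
    """Stateless epsilon-greedy Q-learning over discrete tilt angles."""
    rng = random.Random(seed)
    q = {t: 0.0 for t in tilts}  # one Q-value per candidate tilt
    for _ in range(trials):
        if rng.random() < epsilon:
            tilt = rng.choice(tilts)       # explore a random tilt
        else:
            tilt = max(q, key=q.get)       # exploit the current best estimate
        reward = toy_reward(tilt)
        # Move the estimate toward the observed reward.
        q[tilt] += alpha * (reward - q[tilt])
    return max(q, key=q.get), q


best, q_values = tune_tilt([0.0, 4.0, 8.0, 12.0, 16.0])
print(best)
```

In this toy setting each trial corresponds to one online measurement, so the trial budget plays the role of the "hundreds of online trials" mentioned in the abstract; a real deployment would replace `toy_reward` with a measured network performance metric.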