Early-exit convolutional neural networks


Thesis Type: Postgraduate

Institution Of The Thesis: Orta Doğu Teknik Üniversitesi, Graduate School of Natural and Applied Sciences, Turkey

Approval Date: 2019

Student: Edanur Demir

Principal Supervisor (For Co-Supervisor Theses): EMRE AKBAŞ

Abstract:

This thesis aims to develop a method that reduces the computational cost of convolutional neural networks (CNNs) during inference. Conventionally, the input data pass through a fixed neural network architecture. However, easy examples can be classified at early stages of processing, and conventional networks do not take this into account. In this thesis, we introduce “Early-exit CNNs”, EENets for short, which adapt their computational cost to the input by stopping the inference process at certain exit locations. An EENet contains a number of exit blocks, each of which consists of a confidence branch and a softmax branch. The confidence branch computes the confidence score for exiting (i.e. stopping the inference process) at that location, while the softmax branch outputs a classification probability vector. Both branches are learnable and independent of each other. During training of EENets, the computational cost of inference is taken into account in addition to the classical classification loss. As a result, the network adapts its confidence branches to the inputs so that less computation is spent on easy examples. Inference works as in conventional feed-forward networks; however, when the output of a confidence branch is larger than a certain threshold, inference stops for that specific example. Through comprehensive experiments, we show that EENets significantly reduce the computational cost, to as little as 2% of the original, without degrading the testing accuracy. The idea of EENets is applicable to existing CNN architectures such as ResNets. On the MNIST, SVHN and CIFAR10 datasets, early-exit (EE) ResNets achieve accuracy similar to that of their non-EE versions while reducing the computational cost to 20% of the original.
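The early-exit inference rule described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. All names here (`ExitBlock`, `early_exit_inference`, `final_head`) are hypothetical, and so are the linear heads, the sigmoid confidence, the toy stages and the threshold of 0.5. These are assumptions chosen only to keep the example self-contained in plain Python.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a sequence of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class ExitBlock:
    """Hypothetical exit block: two independent heads on the same features."""

    def __init__(self, conf_weights: Vector, conf_bias: float,
                 class_weights: List[Vector]) -> None:
        self.conf_weights = conf_weights
        self.conf_bias = conf_bias
        self.class_weights = class_weights  # one weight row per class

    def confidence(self, features: Vector) -> float:
        # Confidence branch: a score in (0, 1) for exiting at this location.
        z = sum(w * f for w, f in zip(self.conf_weights, features))
        return sigmoid(z + self.conf_bias)

    def class_probs(self, features: Vector) -> Vector:
        # Softmax branch: a classification probability vector.
        logits = [sum(w * f for w, f in zip(row, features))
                  for row in self.class_weights]
        return softmax(logits)


def early_exit_inference(
    x: Vector,
    stages: Sequence[Callable[[Vector], Vector]],
    exit_blocks: Sequence[ExitBlock],
    final_head: Callable[[Vector], Vector],
    threshold: float = 0.5,  # assumed value, not taken from the thesis
) -> Tuple[Vector, int, int]:
    """Run stages in order and stop at the first confident exit.

    Returns (class probabilities, exit index, number of stages executed).
    Assumes exit block i sits after stage i; the final head follows the last stage.
    """
    features = x
    for i, stage in enumerate(stages):
        features = stage(features)
        if i < len(exit_blocks):
            block = exit_blocks[i]
            if block.confidence(features) > threshold:
                return block.class_probs(features), i, i + 1
    return final_head(features), len(exit_blocks), len(stages)


# Toy two-stage "network" with a single exit block after the first stage.
toy_stages = [lambda f: [2.0 * v for v in f], lambda f: [v + 1.0 for v in f]]
toy_exits = [ExitBlock([1.0, 1.0], -4.0, [[1.0, 0.0], [0.0, 1.0]])]

easy_probs, easy_exit, easy_cost = early_exit_inference(
    [3.0, 0.0], toy_stages, toy_exits, softmax)
hard_probs, hard_exit, hard_cost = early_exit_inference(
    [0.5, 0.4], toy_stages, toy_exits, softmax)
```

In this toy setup the first input leaves at the exit block after one stage, while the second passes through both stages and is classified by the final head. The number of stages executed stands in for the computational cost that an EENet would save on easy examples.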