Localization recall precision (LRP): A new performance metric for object detection


Creative Commons License

Öksüz K., Çam B. C., Akbaş E., Kalkan S.

15th European Conference on Computer Vision, ECCV 2018, Munich, Germany, 8 - 14 September 2018, pp.521-537

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1007/978-3-030-01234-2_31
  • City: Munich
  • Country: Germany
  • Page Numbers: pp.521-537
  • Keywords: Average precision, Object detection, Performance metric, Optimal threshold, Recall-precision
  • Middle East Technical University Affiliated: Yes

Abstract

Average precision (AP), the area under the recall-precision (RP) curve, is the standard performance measure for object detection. Despite its wide acceptance, it has a number of shortcomings, the most important of which are (i) the inability to distinguish very different RP curves, and (ii) the lack of a direct measure of bounding box localization accuracy. In this paper, we propose "Localization Recall Precision (LRP) Error", a new metric specifically designed for object detection. LRP Error is composed of three components related to localization, false negative (FN) rate and false positive (FP) rate. Based on LRP, we introduce the "Optimal LRP" (oLRP), the minimum achievable LRP error, which represents the best achievable configuration of the detector in terms of recall-precision and the tightness of the boxes. In contrast to AP, which considers precisions over the entire recall domain, oLRP determines the "best" confidence score threshold for a class, which balances the trade-off between localization and recall-precision. In our experiments, we show that oLRP provides richer and more discriminative information than AP. We also demonstrate that the best confidence score thresholds vary significantly among classes and detectors. Moreover, we present LRP results of a simple online video object detector and show that class-specific optimized thresholds improve accuracy over the common approach of using a single general threshold for all classes. Our experiments demonstrate that LRP is more competent than AP in capturing the performance of detectors. Our source code for the PASCAL VOC and MSCOCO datasets is provided at https://github.com/cancam/LRP.
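
The abstract describes LRP Error as a combination of a localization term, an FN term and an FP term, and oLRP as its minimum over confidence score thresholds, but it does not state the formula. The following Python sketch is one way to make that concrete. It is an illustration under stated assumptions, not the authors' released implementation: it assumes the error is the normalized sum of (1 - IoU) / (1 - tau) over true positives plus the FP and FN counts, that a detection counts as a true positive when its IoU is at least tau, and that each detection has already been matched to at most one ground-truth box. The names `lrp_error` and `optimal_lrp` are hypothetical.

```python
def lrp_error(tp_ious, n_fp, n_fn, tau=0.5):
    """Assumed form of LRP Error: lower is better, range [0, 1].

    tp_ious: IoU values of the true-positive detections (each >= tau).
    n_fp, n_fn: numbers of false positives and false negatives.
    """
    total = len(tp_ious) + n_fp + n_fn
    if total == 0:
        return 0.0  # nothing to detect and nothing detected
    # Localization penalty, scaled so that IoU == tau costs 1 and IoU == 1 costs 0.
    localization = sum((1.0 - iou) / (1.0 - tau) for iou in tp_ious)
    return (localization + n_fp + n_fn) / total


def optimal_lrp(detections, n_gt, tau=0.5):
    """Sweep confidence thresholds and return (min LRP error, threshold).

    detections: list of (score, iou) pairs for one class, where iou is the
    overlap with the matched ground-truth box (0.0 if unmatched). Keeping the
    matching fixed across thresholds is a simplifying assumption.
    n_gt: number of ground-truth boxes for the class.
    """
    best_err, best_thr = 1.0, None  # keeping no detection gives an error of 1
    for thr in sorted({score for score, _ in detections}):
        kept = [iou for score, iou in detections if score >= thr]
        tp_ious = [iou for iou in kept if iou >= tau]
        n_fp = len(kept) - len(tp_ious)
        n_fn = n_gt - len(tp_ious)
        err = lrp_error(tp_ious, n_fp, n_fn, tau)
        if err < best_err:
            best_err, best_thr = err, thr
    return best_err, best_thr


if __name__ == "__main__":
    # Two well-localized detections and one low-score false positive.
    dets = [(0.9, 0.9), (0.8, 0.75), (0.3, 0.2)]
    print(optimal_lrp(dets, n_gt=2))
```

In this toy case the threshold that drops the low-score false positive while keeping both true positives gives the lowest error, which mirrors the abstract's point that the best threshold is found per class rather than fixed in advance.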