Rank & Sort Loss for Object Detection and Instance Segmentation

Oksuz K., Cam B. C., Akbaş E., Kalkan S.

18th IEEE/CVF International Conference on Computer Vision (ICCV), ELECTR NETWORK, 11 - 17 October 2021, pp.2989-2998

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/iccv48922.2021.00300
  • Page Numbers: pp.2989-2998
  • Middle East Technical University Affiliated: Yes


We propose Rank & Sort (RS) Loss, a ranking-based loss function to train deep object detection and instance segmentation methods (i.e. visual detectors). RS Loss supervises the classifier, a sub-network of these methods, to rank each positive above all negatives as well as to sort positives among themselves with respect to (wrt.) their localisation qualities (e.g. Intersection-over-Union - IoU). To tackle the non-differentiable nature of ranking and sorting, we reformulate the incorporation of error-driven update with backpropagation as Identity Update, which enables us to model our novel sorting error among positives. With RS Loss, we significantly simplify training: (i) Thanks to our sorting objective, the positives are prioritized by the classifier without an additional auxiliary head (e.g. for centerness, IoU, mask-IoU), (ii) due to its ranking-based nature, RS Loss is robust to class imbalance, and thus, no sampling heuristic is required, and (iii) we address the multi-task nature of visual detectors using tuning-free task-balancing coefficients. Using RS Loss, we train seven diverse visual detectors only by tuning the learning rate, and show that it consistently outperforms baselines: e.g. our RS Loss improves (i) Faster R-CNN by ~3 box AP and aLRP Loss (ranking-based baseline) by ~2 box AP on COCO dataset, (ii) Mask R-CNN with repeat factor sampling (RFS) by 3.5 mask AP (~7 AP for rare classes) on LVIS dataset; and also outperforms all counterparts.
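
The two objectives named in the abstract, ranking each positive above all negatives and sorting positives by localisation quality, can be illustrated with a small sketch. This is only one plausible reading of those objectives and not the authors' implementation: the function name `rank_and_sort_errors`, the hard (unsmoothed) score comparison, and the exact error definitions are assumptions made for illustration. The Identity Update gradients and the task-balancing coefficients are not reproduced.

```python
def rank_and_sort_errors(scores, ious):
    """Illustrative per-positive errors (hypothetical helper, not the paper's code).

    scores: classifier confidence for each detection.
    ious:   localisation quality label per detection; 0.0 marks a negative.
    Returns {positive_index: (ranking_error, excess_sorting_error)}.
    """
    positives = [i for i, q in enumerate(ious) if q > 0]
    errors = {}
    for i in positives:
        # Detections scored strictly higher than positive i (hard comparison,
        # assumed here in place of any smoothed step function).
        higher = [j for j in range(len(scores)) if scores[j] > scores[i]]
        higher_pos = [j for j in higher if ious[j] > 0]

        # Ranking error: share of negatives among everything ranked at or above i.
        num_false_pos = len(higher) - len(higher_pos)
        rank = 1 + len(higher)
        ranking_error = num_false_pos / rank

        # Sorting error: mean (1 - IoU) over positives ranked at or above i.
        at_or_above = higher_pos + [i]
        sorting_error = sum(1 - ious[j] for j in at_or_above) / len(at_or_above)

        # Target: the same mean restricted to positives that deserve to be
        # ranked at or above i, i.e. those with IoU at least as large.
        deserving = [j for j in at_or_above if ious[j] >= ious[i]]
        target = sum(1 - ious[j] for j in deserving) / len(deserving)

        errors[i] = (ranking_error, sorting_error - target)
    return errors


# Positive 2 has the best IoU (0.9) but is scored below a negative (index 1)
# and below a worse-localised positive (index 0), so both errors are non-zero.
example = rank_and_sort_errors(scores=[0.9, 0.8, 0.7, 0.6],
                               ious=[0.5, 0.0, 0.9, 0.0])
```

Under these assumed definitions, both errors vanish for every positive when all positives outscore all negatives and their scores are ordered by IoU, which matches the goal stated in the abstract. Because the errors depend only on the relative order of scores, the number of negatives ranked below a positive has no effect on them, which is consistent with the abstract's claim of robustness to class imbalance.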