IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 4, pp. 2723-2735, 2024 (SCI-Expanded)
Despite the widespread use of deep learning methods for semantic segmentation from single imaging modalities, their performance in exploiting multi-domain data remains limited. However, the decision-making process in radiology is often guided by data from multiple sources, such as the pre-operative evaluation of donors in living donor liver transplantation. In such cases, the cross-modality performance of deep models becomes more important. Unfortunately, the domain dependency of existing techniques limits their clinical acceptability, primarily confining their performance to individual domains. This issue is further formulated as a multi-source domain adaptation problem, an emerging field driven mainly by the diverse pattern characteristics exhibited by cross-modality data. This paper presents a novel method that learns robust representations from unpaired cross-modal (CT-MR) data by encapsulating distinct and shared patterns from multiple modalities. In our solution, the covariate shift property is maintained through structural modifications in our architecture. In addition, an adversarial loss is adopted to boost the representation capacity. As a result, sparse and rich representations are obtained. A further advantage of our model is that no modality information is required during training or inference. Tests on unpaired CT and MR liver data from the cross-modality task of the CHAOS grand challenge demonstrate that our approach achieves state-of-the-art results by a large margin in both individual metrics and overall scores.
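To make the role of the adversarial loss concrete, the following is a minimal sketch, not the paper's actual architecture, of how an adversarial modality-discrimination objective with gradient reversal could be set up for unpaired CT and MR batches. All names here (GradReverse, ModalityDiscriminator, adversarial_modality_loss, the toy encoder) are illustrative assumptions, not identifiers from the published method.

```python
# Illustrative sketch only: an adversarial loss that pushes a shared encoder
# toward modality-invariant features on unpaired CT/MR data.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients backward."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None


def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)


class ModalityDiscriminator(nn.Module):
    """Predicts whether a feature map came from CT (0) or MR (1)."""

    def __init__(self, in_ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1),
        )

    def forward(self, feats):
        return self.net(feats)  # logits of shape (B, 1)


def adversarial_modality_loss(encoder, discriminator, ct_imgs, mr_imgs, lam=1.0):
    """The discriminator tries to separate CT from MR features; the reversed
    gradient pushes the encoder toward representations it cannot separate."""
    logits_ct = discriminator(grad_reverse(encoder(ct_imgs), lam))
    logits_mr = discriminator(grad_reverse(encoder(mr_imgs), lam))
    loss_ct = F.binary_cross_entropy_with_logits(logits_ct, torch.zeros_like(logits_ct))
    loss_mr = F.binary_cross_entropy_with_logits(logits_mr, torch.ones_like(logits_mr))
    return loss_ct + loss_mr


# Usage sketch with a toy single-channel encoder on unpaired batches.
encoder = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(inplace=True))
disc = ModalityDiscriminator(in_ch=64)
ct, mr = torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64)
adversarial_modality_loss(encoder, disc, ct, mr).backward()
```

Note that such a loss needs no modality labels at inference time; the discriminator and its labels are used only during training, which is consistent with the modality-agnostic inference described in the abstract.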