Channel-Attentive Transformer-Based Multimodal Semantic Segmentation Model for Early Detection of Wheat Yellow Rust Disease


ÜLKÜ İ., AKAGÜNDÜZ E., TANRIÖVER Ö. Ö.

18th International Conference on Machine Vision, ICMV 2025, Paris, France, 19 - 22 October 2025, vol.14114, (Full Text Paper)

  • Publication Type: Conference Paper / Full Text Paper
  • Volume: 14114
  • DOI Number: 10.1117/12.3090951
  • City of Publication: Paris
  • Country of Publication: France
  • Keywords: Multimodal semantic segmentation, remote sensing, wheat yellow rust disease
  • Middle East Technical University Affiliated: Yes

Abstract

Early detection of wheat yellow rust is vital for timely fungicide application before infections exceed 5% of all plants in the monitored plot. While RGB imagery offers high spatial detail, NIR sensing captures early biochemical changes, such as chlorophyll loss and water stress, that are undetectable in RGB alone. We therefore propose a multimodal semantic segmentation model that uses a Transformer architecture to fuse the RGB and NIR modalities. To further improve the Transformer-based model, we incorporate adaptive channel re-weighting through lightweight squeeze-and-excitation blocks. When evaluated on UAV-collected field data specifically curated for wheat yellow rust disease, our model achieves an IoU of 0.689, outperforming CNN-based multimodal baselines by 14.1% and the best NIR-only CNN-based model by 11.3%. These findings highlight the potential of channel-attentive multimodal Transformer architectures for precise wheat yellow rust monitoring.
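
As a rough illustration of the two technical ingredients named in the abstract, the sketch below shows a generic squeeze-and-excitation channel re-weighting step and a binary IoU computation in NumPy. It is not the authors' implementation: the abstract does not say where the blocks sit in the model or how they are configured, so the function names, the weight shapes, and the bottleneck layout are assumptions made for illustration only.

```python
import numpy as np


def squeeze_excite(features, w1, b1, w2, b2):
    """Re-weight the channels of a (C, H, W) feature map.

    Hypothetical parameter shapes, with r an assumed reduction ratio:
    w1: (C // r, C), b1: (C // r,), w2: (C, C // r), b2: (C,)
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    z = features.mean(axis=(1, 2))                      # shape (C,)
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate in (0, 1).
    hidden = np.maximum(0.0, w1 @ z + b1)               # shape (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))    # shape (C,)
    # Scale: broadcast each channel's gate over its spatial positions.
    return features * gate[:, None, None]


def binary_iou(pred, target):
    """Intersection over union of two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty; treated here as a perfect match (a convention).
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    channels, reduction = 8, 4  # illustrative sizes, not from the paper
    fused = rng.normal(size=(channels, 16, 16))  # stand-in for fused RGB+NIR features
    w1 = rng.normal(size=(channels // reduction, channels))
    b1 = np.zeros(channels // reduction)
    w2 = rng.normal(size=(channels, channels // reduction))
    b2 = np.zeros(channels)
    reweighted = squeeze_excite(fused, w1, b1, w2, b2)
    print(reweighted.shape)
    print(binary_iou([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
```

In a trained model the gate weights would be learned, so that channels carrying informative RGB or NIR cues receive gates closer to 1 and less useful channels are suppressed.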