Hybrid statistical and machine learning modeling of cognitive neuroscience data


Çakar S., Gökalp Yavuz F.

Journal of Applied Statistics, vol.51, no.6, pp.1076-1097, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 51 Issue: 6
  • Publication Date: 2024
  • DOI: 10.1080/02664763.2023.2176834
  • Journal Name: Journal of Applied Statistics
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, ABI/INFORM, Aerospace Database, Business Source Elite, Business Source Premier, CAB Abstracts, Veterinary Science Database, zbMATH
  • Pages: pp.1076-1097
  • Keywords: Machine learning, mixed model, n-back data, cognitive studies, fNIRS, MEDIAL TEMPORAL-LOBE, BRAIN, CONNECTIVITY, CORTEX
  • Middle East Technical University Affiliated: Yes

Abstract

Nested data structures are prevalent in cognitive measurement experiments because observations are taken repeatedly from different brain locations within subjects. Analysis methods for this type of data should account for the dependency structure among the repeated measurements. However, this dependency is largely ignored in the cognitive neuroscience data analysis literature. We consider both statistical and machine learning methods extended to repeated data analysis and compare distinct algorithms in terms of their advantages and disadvantages. Unlike basic algorithm comparison studies, this article analyzes novel neuroscience data, considering the dependency structure for the first time, with several statistical and machine learning methods and their hybrid forms. In addition, the fitting performances of the different algorithms are compared using contaminated data sets and cross-validation. One of our findings suggests that the GLMM tree, which includes random-term indices indicating the location of functional near-infrared spectroscopy (fNIRS) optodes nested within experimental units, shows the best predictive performance, with the lowest MSE, RMSE, and MAE. However, there is a trade-off between accuracy and speed, since this algorithm requires the most computational time.
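
For readers unfamiliar with the three performance metrics named in the abstract, the following is a minimal sketch of how MSE, RMSE, and MAE are conventionally computed from observed and predicted values. The function name `regression_metrics` and the numbers are illustrative assumptions only; they are not taken from the article and do not reproduce the authors' code or results.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE and MAE for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Per-observation prediction errors (observed minus predicted).
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / len(errors)   # mean squared error
    mae = sum(abs(e) for e in errors) / len(errors)  # mean absolute error
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}


# Made-up illustrative values, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Lower values on all three metrics indicate closer agreement between predictions and observations. RMSE is on the same scale as the response, and MAE is less sensitive to a few large errors than MSE.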