TRANSLATIONAL PSYCHIATRY, cilt.9, 2019 (SCI-Expanded)
In the present study, to improve the predictive performance of a model and its reproducibility when applied to an independent data set, we investigated the use of multimodel inference to predict the probability of having a complex psychiatric disorder. We formed training and test sets using proteomic data (147 peptides from 77 proteins) from two-independent collections of first-onset drug-naive schizophrenia patients and controls. A set of prediction models was produced by applying lasso regression with repeated tenfold cross-validation to the training set. We used feature extraction and model averaging across the set of models to form two prediction models. The resulting models clearly demonstrated the utility of a multimodel based approach to make good (training set AUC > 0.80) and reproducible predictions (test set AUC > 0.80) for the probability of having schizophrenia. Moreover, we identified four proteins (five peptides) whose effect on the probability of having schizophrenia was modified by sex, one of which was a novel potential biomarker of schizophrenia, foetal haemoglobin. The evidence of effect modification suggests that future schizophrenia studies should be conducted in males and females separately. Future biomarker studies should consider adopting a multimodel approach and going beyond the main effects of features.