17th International Symposium on Health Informatics and Bioinformatics, İstanbul, Türkiye, 18 December 2024, p. 107 (Abstract)
Deciphering Sequence Variations and Splicing Sensitivity: Predictive Analysis of PSI in SRRM4 Response Groups
Ümit Sude Böler1, Burçak Otlu1,*
1Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey
Presenting Author: sude.boler@metu.edu.tr
*Corresponding Author: burcako@metu.edu.tr
Microexons, short exonic sequences between 3 and 42 nucleotides, are essential for regulating protein function, particularly in the nervous system. Their abnormal inclusion has been associated with diseases like autism and cancer. This study explores how sequence variations in the upstream intron, microexon variant, and downstream intron affect splicing efficiency under the regulation of SRRM4, a splicing factor crucial for neural tissue specificity. Leveraging data from a Massively Parallel Alternative Splicing Assay (MaPSy) with over 17,500 variants, we assessed splicing sensitivity by measuring Percent Spliced In (PSI) across four SRRM4 expression conditions. We examined key sequence features, including exon and intron length, UGC mutations, and splice site strength, using them to build predictive models for splicing outcomes. Our findings show that deep learning models, particularly those using Conv1D and LSTM layers, outperform traditional methods, with the best model explaining up to 99% of PSI variance. Motif analysis further revealed specific sequence motifs likely to influence PSI, adding to our understanding of SRRM4 responsiveness. In future work, we aim to expand motif discovery to identify significant k-mers across different SRRM4 conditions, refine LSTM models, and explore additional deep learning architectures. To address potential overfitting, we plan to perform rigorous hyperparameter tuning, apply cross-validation, and incorporate regularization techniques to improve model robustness.
These combined efforts deepen our understanding of splicing regulation’s molecular underpinnings and offer potential for therapeutic interventions in splicing-related neurological disorders.
Keywords: Microexon, Massively Parallel Splicing assays
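The PSI measure and the sequence features named in the abstract can be illustrated with a minimal sketch. It assumes the usual read-count definition of PSI (inclusion reads divided by inclusion plus exclusion reads, as a percentage); the abstract does not state how PSI was derived from the MaPSy data. The function names, the feature names, and the choice to count UGC motifs in the upstream intron are hypothetical and are not taken from the study.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads):
    """Return PSI as a percentage (0-100), or None when no reads are available.

    Assumption: PSI = inclusion / (inclusion + exclusion) * 100.
    """
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None
    return 100.0 * inclusion_reads / total


def count_motif(sequence, motif="UGC"):
    """Count occurrences of a motif (overlaps allowed) in an RNA or DNA sequence.

    DNA input is converted to RNA letters (T -> U) before matching.
    """
    seq = sequence.upper().replace("T", "U")
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


def sequence_features(upstream_intron, microexon, downstream_intron):
    """Build a small, illustrative feature dictionary for one variant."""
    return {
        "exon_length": len(microexon),
        "upstream_intron_length": len(upstream_intron),
        "downstream_intron_length": len(downstream_intron),
        "upstream_ugc_count": count_motif(upstream_intron),
    }


if __name__ == "__main__":
    # Toy values only; these are not data from the assay.
    print(percent_spliced_in(30, 10))
    print(sequence_features("AUGCUGC", "GAAGAG", "GUAAGU"))
```

Feature dictionaries of this kind could serve as inputs to a conventional regression baseline, while sequence-based models such as those with Conv1D and LSTM layers would typically take the encoded nucleotide sequence directly.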