Modeling of various biological networks via LCMARS


AYYILDIZ DEMİRCİ E., PURUTÇUOĞLU GAZİ V.

JOURNAL OF COMPUTATIONAL SCIENCE, vol.28, pp.148-154, 2018 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 28
  • Publication Date: 2018
  • DOI: 10.1016/j.jocs.2018.08.009
  • Journal Name: JOURNAL OF COMPUTATIONAL SCIENCE
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.148-154
  • Keywords: Conic multivariate adaptive regression splines, Gaussian graphical model, Sparse networks
  • Affiliated with Middle East Technical University: Yes

Abstract

In systems biology, the interactions between components such as genes and proteins can be represented by a network. Constructing these networks plays a crucial role in understanding the molecular mechanisms of complex biological systems. However, estimating these biological networks is challenging because of their high-dimensional and sparse structures, and several statistical methods have been proposed to overcome this issue. Conic Multivariate Adaptive Regression Splines (CMARS) is one of the recent nonparametric methods developed for high-dimensional and correlated data. This model was suggested to improve the performance of the Multivariate Adaptive Regression Splines (MARS) approach, which is a complex model within the class of generalized additive models. Previous studies have shown that MARS can be a promising model for describing the steady-state activations of biological networks if it is modified as a lasso-type regression using the main effects. In this study, we convert the full description of CMARS into a loop-based approach, called LCMARS, by including both main and second-order interaction effects, since this description has performed better on benchmark real datasets. We generate various scenarios based on distinct distributions and dimensions to compare the performance of LCMARS with that of MARS and the Gaussian Graphical Model (GGM) in terms of accuracy measures via Monte Carlo runs. Additionally, different real biological datasets are used to assess the performance of these methods. (C) 2018 Elsevier B.V. All rights reserved.
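
The abstract does not state which accuracy measures are used to compare the estimated networks with the true ones. As an illustration only, the sketch below assumes the common choices of precision, recall, F-measure, and accuracy, computed over the undirected edges of a true and an estimated adjacency matrix. The function names and the 4-node example are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def edge_confusion(true_adj, est_adj):
    """Count (TP, FP, FN, TN) over undirected edges, i.e. the upper triangle."""
    p = len(true_adj)
    tp = fp = fn = tn = 0
    for i, j in combinations(range(p), 2):
        truth = bool(true_adj[i][j])
        guess = bool(est_adj[i][j])
        if truth and guess:
            tp += 1
        elif guess:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy_measures(true_adj, est_adj):
    """Assumed measures: precision, recall, F-measure, and accuracy."""
    tp, fp, fn, tn = edge_confusion(true_adj, est_adj)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Hypothetical true network: the chain 0-1-2-3.
true_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Hypothetical estimate: misses edge 2-3 and adds a spurious edge 0-3.
est_adj = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
]

scores = accuracy_measures(true_adj, est_adj)
print(scores)
```

In a Monte Carlo comparison of the kind the abstract describes, such scores would typically be averaged over many simulated datasets for each method, although the paper's exact protocol is not given here.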