Transformations of Data in Deterministic Modelling of Biological Networks


Agraz M., PURUTÇUOĞLU GAZİ V.

3rd International Conference on Applied Mathematics and Approximation Theory (AMAT), Ankara, Türkiye, 18 - 21 May 2015, vol.441, pp.343-356

  • Publication Type: Conference Paper / Full-Text Paper
  • Volume: 441
  • DOI: 10.1007/978-3-319-30322-2_24
  • City of Publication: Ankara
  • Country of Publication: Türkiye
  • Pages: pp.343-356
  • Affiliated with Middle East Technical University: Yes

Abstract

The Gaussian graphical model (GGM) is a probabilistic modelling approach used in systems biology to represent the relationships between genes with an undirected graph. In graphical models, the genes and their interactions are denoted by nodes and by the edges between nodes, respectively. In this model, it is assumed that the structure of the system can be described by the inverse of the covariance matrix, Theta, also called the precision matrix, when the observations are formulated via a lasso regression under the assumption that the states are multivariate normal. There are several approaches to estimating Theta in GGM. The best known are the neighborhood selection algorithm and the graphical lasso (glasso) approach. On the other hand, multivariate adaptive regression splines (MARS) is a non-parametric regression technique that can successfully model nonlinear and highly dependent data. Previous simulation studies have found that MARS can be a strong alternative to GGM if the model is constructed similarly to a lasso model and the interaction terms in the optimal model are ignored, so that the results are comparable with the GGM findings. Moreover, it has been observed that the major challenge in both modelling approaches is the high sparsity of Theta due to the possible non-linear interactions between genes, in particular when the dimensions of the networks are realistically large. In this study, as the novelty, we suggest applying Bernstein-type operators, namely Bernstein and Szász polynomials, to the raw data before any lasso-type modelling and the associated inference approaches. This choice is motivated by findings via GGM with small and moderately large systems, where we have observed that the Bernstein polynomials can increase the accuracy of the estimates. Hence, in this work, we first apply these operators within the best-known inference approaches used in GGM under realistically large networks.
We then assess these transformations for MARS modelling, as an alternative to GGM, under the same level of complexity. In this way, we aim to propose these transformation techniques for all sorts of modelling under the steady-state condition of protein-protein interaction networks, in order to obtain more accurate estimates without any additional computational cost. In the evaluation of the results, we compare the precision and F-measure values obtained on the simulated datasets.
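As a rough illustration of the two ingredients named in the abstract, the Python sketch below applies the classical Bernstein operator to one column of data and computes the precision and F-measure of an estimated edge set against a true one. It is only a sketch under stated assumptions, not the authors' procedure: the abstract does not say how the operators are applied to the raw data, so the example assumes that the n + 1 observations of a gene are treated as samples f(k/n) at equally spaced nodes and that the smoothed values are read off at the same nodes. The function names `bernstein_smooth` and `precision_f_measure` are hypothetical.

```python
from math import comb


def bernstein_smooth(values):
    """Bernstein-transform one data column (assumed usage, see lead-in).

    Treats values[k] as f(k/n) for k = 0..n and evaluates
    B_n(f; x) = sum_k f(k/n) * C(n, k) * x**k * (1 - x)**(n - k)
    at the same nodes x = j/n. Intended for short columns only, since
    C(n, k) becomes too large for floats when n is in the thousands.
    """
    n = len(values) - 1
    if n < 1:
        return list(values)
    smoothed = []
    for j in range(n + 1):
        x = j / n
        total = 0.0
        for k, fk in enumerate(values):
            total += fk * comb(n, k) * x**k * (1 - x) ** (n - k)
        smoothed.append(total)
    return smoothed


def precision_f_measure(true_edges, estimated_edges):
    """Precision and F-measure of an estimated undirected edge set."""
    # Edges are undirected, so (i, j) and (j, i) count as the same edge.
    true_set = {frozenset(e) for e in true_edges}
    est_set = {frozenset(e) for e in estimated_edges}
    tp = len(true_set & est_set)   # edges correctly recovered
    fp = len(est_set - true_set)   # edges estimated but not present
    fn = len(true_set - est_set)   # edges present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, 0.0
    return precision, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    column = [0.2, 1.5, 0.7, 2.1, 1.0]            # made-up toy observations
    print(bernstein_smooth(column))
    truth = [(1, 2), (2, 3), (3, 4)]              # made-up true network
    estimate = [(2, 1), (3, 4), (1, 4)]           # made-up estimated network
    print(precision_f_measure(truth, estimate))
```

In a workflow of the kind the abstract describes, each column of the data matrix would be transformed before Theta is estimated, and the non-zero off-diagonal entries of the estimated Theta would supply the estimated edge set for the accuracy measures.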