Parallel computing in linear mixed models


Yavuz F., Schloerke B.

COMPUTATIONAL STATISTICS, vol.35, no.3, pp.1273-1289, 2020 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 35 Issue: 3
  • Publication Date: 2020
  • Doi Number: 10.1007/s00180-019-00950-7
  • Journal Name: COMPUTATIONAL STATISTICS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, ABI/INFORM, zbMATH, Civil Engineering Abstracts
  • Page Numbers: pp.1273-1289
  • Middle East Technical University Affiliated: Yes

Abstract

In this study, we propose a parallel programming method for linear mixed models (LMM) generated from big data. The expectation maximization (EM) algorithm is commonly used to obtain maximum likelihood estimates because its estimates are stable and the algorithm is simple. However, EM has a high computational cost. Our proposed method uses a divide-and-recombine approach: it splits the data into smaller subsets, runs the algorithm steps in parallel on multiple local cores, and combines the results. The proposed method is used to fit LMM with dense and sparse parameters and with a large number of observations. It is faster than the classical approach and generalizes to big data. Supplementary sources for the proposed method are available in the R package lmmpar.
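
The divide-and-recombine pattern described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' EM algorithm and does not use the lmmpar package: as a simplifying assumption, it fits an ordinary straight line rather than a linear mixed model, because the least-squares estimates depend on the data only through sums that can be computed per subset and then added. The names `split`, `chunk_stats`, and `fit_line` are hypothetical. Threads are used only to keep the sketch self-contained; a process pool would be the more likely choice for CPU-bound work on multiple cores.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_chunks):
    """Divide: deal the rows into n_chunks roughly equal subsets."""
    return [data[i::n_chunks] for i in range(n_chunks)]


def chunk_stats(chunk):
    """Per-subset step: partial sums over one subset of (x, y) pairs."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return n, sx, sy, sxx, sxy


def fit_line(data, n_chunks=4):
    """Fit y = intercept + slope * x from partial sums computed per subset."""
    chunks = split(data, n_chunks)
    # Run the per-subset step concurrently, one task per subset.
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(chunk_stats, chunks))
    # Recombine: the partial sums are additive across subsets.
    n, sx, sy, sxx, sxy = (sum(col) for col in zip(*partials))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Illustrative data lying exactly on y = 2 + 3x (made up for this sketch).
data = [(float(x), 2.0 + 3.0 * x) for x in range(100)]
intercept, slope = fit_line(data)
```

Because the combined sums equal the sums over the full data set, the result here is the same as a single-pass fit. In an iterative algorithm such as EM, a split-and-combine step of this kind would be repeated at each iteration.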