SOFT COMPUTING, vol. 28, no. 1, pp. 205-215, 2024 (SCI-Expanded)
Parallelization has been a growing trend in statistical computing for several decades, and researchers have put significant effort into adapting established statistical methods and algorithms to run in parallel. The main drivers of this shift are the rapid growth in data volume and accelerating hardware development. Divide and (re)combine (DnR) is a parallelization approach in which the data, or the computation on it, is split into smaller pieces that are processed separately and then recombined. DnR can be applied to most regression methods used to reveal relationships between variables. Although libraries exist in current programming languages for many regression methods, the DnR approach has not yet been used for kernel regression. Kernel regression, however, is relatively time-consuming to compute, which makes parallelization a useful strategy for reducing its calculation time. In this study, we demonstrate how time efficiency can be achieved by applying DnR methods to kernel regression with several parallelization strategies in R. The results indicate that the computation time can be reduced proportionally, with a trade-off between time and accuracy.
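The divide and recombine idea can be illustrated with a minimal sketch. It is written in Python rather than the R used in the study, and everything in it is an assumption made for illustration: the Nadaraya-Watson estimator with a Gaussian kernel, the round-robin split, the simple averaging used to recombine, the thread pool, and the names nw_estimate and dnr_estimate. The study's own estimator, recombination rule, and parallel backend may differ.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def nw_estimate(xs, ys, x0, bandwidth):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel (assumed estimator)."""
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        return float("nan")  # no observation carries weight at x0
    return sum(w * y for w, y in zip(weights, ys)) / total


def dnr_estimate(xs, ys, grid, bandwidth, n_chunks, max_workers=4):
    """Hypothetical DnR wrapper: fit each subset separately, then average."""
    # Divide: deal the observations round-robin into n_chunks subsets.
    chunks = [(xs[i::n_chunks], ys[i::n_chunks]) for i in range(n_chunks)]

    def fit_chunk(chunk):
        cx, cy = chunk
        return [nw_estimate(cx, cy, x0, bandwidth) for x0 in grid]

    # Apply: the subsets are independent, so each can go to its own worker.
    # Threads keep the sketch simple; a process pool would be the usual
    # choice for CPU-bound work in Python.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_chunk = list(pool.map(fit_chunk, chunks))

    # Recombine: average the subset estimates at every grid point
    # (an assumed rule, which only approximates the full-data estimate).
    return [sum(col) / len(col) for col in zip(*per_chunk)]


# Synthetic data: a noisy sine curve (illustrative only, not the study's data).
random.seed(0)
n = 2000
xs = [random.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
ys = [math.sin(x) + random.gauss(0.0, 0.2) for x in xs]
grid = [0.5 + 0.5 * k for k in range(11)]

full = [nw_estimate(xs, ys, x0, 0.3) for x0 in grid]
dnr = dnr_estimate(xs, ys, grid, 0.3, n_chunks=4)

# Largest disagreement between the full-data and the DnR estimate.
max_gap = max(abs(a - b) for a, b in zip(full, dnr))
print(f"max |full - DnR| over the grid: {max_gap:.4f}")
```

Under these assumptions, each subset costs roughly 1/n_chunks of the full-data kernel evaluations, while the averaged result only approximates the full-data estimate. That gap is one way to picture the trade-off between time and accuracy.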