COVID-19 forecasting using shifted Gaussian Mixture Model with similarity-based estimation

Külah E., Çetinkaya Y. M., Özer A. G., Alemdar H.

EXPERT SYSTEMS WITH APPLICATIONS, vol.214, 2023 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 214
  • Publication Date: 2023
  • Doi Number: 10.1016/j.eswa.2022.119034
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Computer & Applied Sciences, INSPEC, Metadex, Public Affairs Index, Civil Engineering Abstracts
  • Keywords: COVID-19, Gaussian mixture models, Time-series data, Similarity-based estimation, Trend similarity score, WAVES
  • Middle East Technical University Affiliated: Yes


The COVID-19 pandemic has severely disrupted social life and the economies of many countries worldwide. Reliable methods for forecasting the pandemic's progress can help countries control the spread of the disease and reduce the number of severe cases. This study presents a novel approach, the Shifted Gaussian Mixture Model with Similarity-based Estimation (SGSE), which forecasts a country's daily new case values by examining similar behavior in other countries. The model uses the daily new case values collected since the pandemic began and finds countries with similar trends at a specific time offset. The daily new case values between the first day and the (today-N)th day are transformed using a Gaussian Mixture Model (GMM) to obtain a feature vector for each country. Using these feature vectors, the countries that showed similar statistics in the past are identified for each forecasted country. The future of the forecasted country is then estimated by taking the mean of the similar countries' time series after their offset points. The study also introduces a new metric, the trend similarity score, which measures the similarity between forecasted and actual values. The median trend similarity score of SGSE ranges from 0.903 to 0.947, depending on the choice of distance metric, whereas the ARIMA model yields only 0.642. The performance of SGSE was compared, for seven European countries, with four public projects submitted to the European COVID-19 Forecast Hub, and SGSE gave the most accurate forecasts among the compared models. The test-set results show that trends and plateaus are predicted accurately for many countries.
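
The pipeline described in the abstract (GMM features, similarity search at an offset, averaging the similar countries' later values) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature layout, the number of mixture components, or the distance metric, so everything below is an assumption. In particular, the function names `gmm_features` and `sgse_like_forecast`, the parameters `k`, `iters`, `horizon`, and `n_similar`, the use of Euclidean distance, and the choice to express component means relative to the end of each window are all hypothetical choices made for illustration.

```python
import math

def gmm_features(counts, k=2, iters=50):
    """Fit a 1-D Gaussian mixture over day indices with a small weighted EM,
    using non-negative daily case counts as sample weights (assumed reading
    of the GMM step). Returns a flat vector of (mean, sd, weight) per
    component, with means measured relative to the last day of the window."""
    n = len(counts)
    total = float(sum(counts)) or 1.0
    # Initialise components evenly along the timeline.
    means = [(j + 0.5) * n / k for j in range(k)]
    sds = [n / (2.0 * k)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibilities, scaled by the case count of each day.
        resp = []
        for t, c in enumerate(counts):
            dens = [w * math.exp(-0.5 * ((t - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
                    for w, m, s in zip(weights, means, sds)]
            z = sum(dens) or 1e-300
            resp.append([c * d / z for d in dens])
        # M-step: update each component from its weighted responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            means[j] = sum(r[j] * t for t, r in enumerate(resp)) / nj
            var = sum(r[j] * (t - means[j]) ** 2 for t, r in enumerate(resp)) / nj
            sds[j] = max(math.sqrt(max(var, 0.0)), 1.0)  # floor avoids collapse
            weights[j] = nj / total
    comps = sorted(zip(means, sds, weights))
    return [v for m, s, w in comps for v in (m - (n - 1), s, w)]

def sgse_like_forecast(target, others, horizon, n_similar=2, k=2):
    """Forecast `horizon` days for `target` by averaging what the most similar
    countries did in their own last `horizon` days (the assumed offset).
    Raw counts are averaged; population scaling is ignored in this sketch."""
    f_target = gmm_features(target, k)
    scored = []
    for name, series in others.items():
        # Hold back the last `horizon` days of each candidate as its "future".
        past, future = series[:-horizon], series[-horizon:]
        scored.append((math.dist(f_target, gmm_features(past, k)), name, future))
    scored.sort(key=lambda item: item[:2])  # by distance, then name
    nearest = scored[:n_similar]
    forecast = [sum(f[i] for _, _, f in nearest) / len(nearest) for i in range(horizon)]
    return forecast, [name for _, name, _ in nearest]
```

Calling `sgse_like_forecast(target, {"A": series_a, "B": series_b}, horizon=10)` returns the averaged forecast together with the names of the countries it was built from. Measuring the component means from the end of the window is one simple way to make a truncated candidate history comparable with the target's full history; other alignments are equally plausible given the abstract.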