Chocolate Sample Classification by Principal Component Analysis of Preprocessed Terahertz Transmission Spectra


Khodasevich M., Lyakhnovich A., ERİKLİOĞLU H.

Journal of Applied Spectroscopy, vol.89, no.2, pp.251-255, 2022 (Journal Indexed in SCI Expanded)

  • Publication Type: Article
  • Volume: 89 Issue: 2
  • Publication Date: 2022
  • DOI Number: 10.1007/s10812-022-01351-3
  • Title of Journal: Journal of Applied Spectroscopy
  • Page Numbers: pp.251-255
  • Keywords: baseline, cluster analysis, principal component analysis, support vector machine, terahertz time-domain spectroscopy

Abstract

© 2022, Springer Science+Business Media, LLC, part of Springer Nature.

The efficiency of classifying chocolate samples by type and manufacturer is demonstrated using a spectralprint method and THz transmission spectra. The baselines of the spectra are determined using the adaptive iteratively reweighted penalized least squares (airPLS) method to suppress noise and the Fabry–Perot effect. The classification is carried out by constructing a low-dimensional space of the principal components of the baselines and applying cluster analysis methods in this space. The precision and recall values of the classification of chocolate samples are 0.85 and 0.83 for the k-means method, 0.91 and 0.90 for the classification and regression tree method, and 0.94 and 0.93 for hierarchical cluster analysis. The support vector machine is successfully applied to two cases where pairwise classification is most problematic.
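
To make the workflow in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It projects baseline-like curves onto their leading principal components and scores a classification by precision and recall. Everything specific is an assumption made for illustration: the curves are synthetic (two decaying exponentials plus noise stand in for airPLS baselines, which are not computed here), a simple nearest-centroid rule stands in for the k-means, tree, and hierarchical methods named above, and precision and recall are macro-averaged over classes, which the abstract does not specify. The function names `pca_scores` and `macro_precision_recall` are hypothetical.

```python
import numpy as np


def pca_scores(spectra, n_components=2):
    """Project row-wise spectra onto their leading principal components."""
    centered = spectra - spectra.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


def macro_precision_recall(y_true, y_pred):
    """Macro-averaged precision and recall (averaging is an assumption)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    precisions, recalls = [], []
    for label in np.unique(y_true):
        tp = np.sum((y_pred == label) & (y_true == label))
        predicted = np.sum(y_pred == label)
        actual = np.sum(y_true == label)
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    return float(np.mean(precisions)), float(np.mean(recalls))


# Synthetic stand-in for baselines of two sample classes (not real data).
rng = np.random.default_rng(0)
freq = np.linspace(0.2, 2.0, 200)  # hypothetical frequency axis, THz
class_a = np.exp(-1.0 * freq) + 0.01 * rng.standard_normal((10, freq.size))
class_b = np.exp(-1.5 * freq) + 0.01 * rng.standard_normal((10, freq.size))
spectra = np.vstack([class_a, class_b])
y_true = np.array([0] * 10 + [1] * 10)

scores, explained = pca_scores(spectra, n_components=2)

# Nearest-centroid assignment in the principal component space.
centroids = np.array([scores[y_true == k].mean(axis=0) for k in (0, 1)])
distances = np.linalg.norm(scores[:, None, :] - centroids[None, :, :], axis=2)
y_pred = distances.argmin(axis=1)

precision, recall = macro_precision_recall(y_true, y_pred)
print(f"explained variance ratios: {explained}")
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

On real measurements, the synthetic arrays would be replaced by baselines extracted from the measured spectra, and the nearest-centroid step by whichever cluster analysis method is being evaluated.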