Chocolate Sample Classification by Principal Component Analysis of Preprocessed Terahertz Transmission Spectra

Khodasevich M., Lyakhnovich A., ERİKLİOĞLU H.

Journal of Applied Spectroscopy, vol.89, no.2, pp.251-255, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 89 Issue: 2
  • Publication Date: 2022
  • DOI Number: 10.1007/s10812-022-01351-3
  • Journal Name: Journal of Applied Spectroscopy
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Chemical Abstracts Core, INSPEC
  • Page Numbers: pp.251-255
  • Keywords: terahertz time-domain spectroscopy, principal component analysis, baseline, cluster analysis, support vector machine
  • Middle East Technical University Affiliated: Yes


© 2022, Springer Science+Business Media, LLC, part of Springer Nature.

The efficiency of classifying chocolate samples by type and manufacturer is demonstrated using a spectralprint method applied to THz transmission spectra. The baselines of the spectra are determined with the adaptive iteratively reweighted penalized least squares (airPLS) method to suppress noise and the Fabry–Perot effect. Classification is carried out by constructing a low-dimensional space from the principal components of the baselines and applying cluster analysis methods in this space. The precision and recall of the classification are 0.85 and 0.83 for k-means, 0.91 and 0.90 for the classification and regression tree, and 0.94 and 0.93 for hierarchical cluster analysis. A support vector machine is successfully applied to two cases in which pairwise classification is most problematic.
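
The abstract reports precision and recall for each classifier but does not say how the per-class values are combined. As a minimal sketch, assuming macro-averaging (equal weight per class), the two figures could be computed from true and predicted labels as follows. The function name `macro_precision_recall` and the label lists are hypothetical illustrations, not data or code from the paper.

```python
from collections import Counter


def macro_precision_recall(y_true, y_pred):
    """Return (precision, recall), each averaged with equal weight over classes.

    Assumption: macro-averaging. The paper may use a different scheme.
    """
    classes = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # samples that really belong to each class
    pred_count = Counter(y_pred)   # samples assigned to each class
    true_pos = Counter()           # samples assigned to their correct class
    for t, p in zip(y_true, y_pred):
        if t == p:
            true_pos[t] += 1

    # Per-class precision = TP / predicted; per-class recall = TP / actual.
    # A class with an empty denominator contributes 0.0.
    precisions = [
        true_pos[c] / pred_count[c] if pred_count[c] else 0.0 for c in classes
    ]
    recalls = [
        true_pos[c] / true_count[c] if true_count[c] else 0.0 for c in classes
    ]
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n


# Hypothetical labels for six samples (not taken from the paper).
y_true = ["dark", "dark", "milk", "milk", "white", "white"]
y_pred = ["dark", "milk", "milk", "milk", "white", "white"]

precision, recall = macro_precision_recall(y_true, y_pred)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this toy example one "dark" sample is mislabeled as "milk", which lowers the recall of "dark" and the precision of "milk" while leaving "white" unaffected. This is the kind of pairwise confusion the abstract says is handled separately with a support vector machine.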