A robust machine learning framework for predicting the higher heating value of poultry litter using proximate analysis


Eren B., Uzun S., Özdemir S.

BIOMASS & BIOENERGY, vol.211, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 211
  • Publication Date: 2026
  • DOI: 10.1016/j.biombioe.2026.109153
  • Journal Name: BIOMASS & BIOENERGY
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, BIOSIS, Compendex, Environment Index, Geobase, INSPEC
  • Affiliated with Middle East Technical University: Yes

Abstract

Accurate estimation of the higher heating value (HHV) of poultry litter is critical for its valorization as a renewable energy feedstock. However, the heterogeneity of poultry litter and data scarcity pose challenges for predictive modeling. In this study, a robust machine learning framework was developed to predict HHV using dry-basis proximate analysis parameters (volatile matter, fixed carbon, and ash). Unlike previous studies that rely on standard datasets, this work introduces a systematic comparison of four modeling scenarios to isolate the effects of data augmentation and hyperparameter optimization. Gaussian noise-based augmentation was formulated to expand the training space, while GridSearch cross-validation optimized model parameters. Five algorithms (KNN, RF, Extra Trees, LGBM, XGBoost) were evaluated and compared against a Multiple Linear Regression (MLR) baseline. Results indicated that models on the original dataset suffered from overfitting (Test R² < 0.50) and failed to significantly outperform the MLR baseline (R² = 0.42). However, data augmentation significantly boosted generalization. Crucially, evaluation was performed on a non-augmented, held-out test set to ensure real-world reliability. The optimized Extra Trees (ET) model achieved the best performance (Test R² > 0.89, RMSE = 0.80 MJ/kg), outperforming the MLR baseline (R² = 0.64) and other ML models. Feature importance and SHAP interaction analysis confirmed that ash content is the dominant inhibitor of energy density, aligning with thermochemical principles. The proposed framework offers a rapid, low-cost alternative to expensive elemental analysis for industrial energy assessment.
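
The abstract does not give the augmentation formula, so the following is only a minimal sketch of one way the described protocol could look: split first, then add zero-mean Gaussian noise to the training rows only, so that the held-out test set stays non-augmented. The function names, the relative noise scale `rel_sigma`, the number of `copies`, the choice to perturb only the features and leave HHV unchanged, and the sample values are all assumptions made for illustration and are not taken from the study.

```python
import random

# Placeholder rows: (volatile_matter, fixed_carbon, ash, hhv).
# Illustrative values only, NOT measurements from the study.
sample_rows = [
    (62.0, 13.0, 25.0, 14.1), (58.5, 12.5, 29.0, 13.2),
    (65.2, 14.8, 20.0, 15.3), (55.0, 11.0, 34.0, 12.0),
    (60.1, 13.9, 26.0, 13.9), (63.3, 15.7, 21.0, 15.0),
    (57.4, 12.6, 30.0, 12.9), (66.0, 15.0, 19.0, 15.6),
    (59.8, 12.2, 28.0, 13.4), (61.5, 14.5, 24.0, 14.4),
]


def augment_with_gaussian_noise(rows, copies=3, rel_sigma=0.02, seed=0):
    """Return the original rows plus `copies` noisy replicas of each row.

    Assumed scheme: each feature gets zero-mean Gaussian noise with a
    standard deviation of rel_sigma * |value|; the HHV target is copied
    unchanged. The noisy features are not renormalized to sum to 100.
    """
    rng = random.Random(seed)
    augmented = list(rows)
    for *features, hhv in rows:
        for _ in range(copies):
            noisy = [v + rng.gauss(0.0, rel_sigma * abs(v)) for v in features]
            augmented.append((*noisy, hhv))
    return augmented


def split_then_augment(rows, test_fraction=0.2, seed=0, **aug_kwargs):
    """Hold out a test set first, then augment only the training rows."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    test = shuffled[:n_test]    # left untouched (non-augmented)
    train = shuffled[n_test:]   # augmented below
    return augment_with_gaussian_noise(train, seed=seed, **aug_kwargs), test


train_rows, test_rows = split_then_augment(sample_rows, copies=3)
```

Splitting before augmenting is the design point here: if noisy replicas were generated before the split, near-duplicates of test samples could leak into the training data and inflate the test score. The augmented training rows would then be passed to whatever regressor and hyperparameter search is being compared.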