Occupancy modeling using population statistics and machine learning for urban residential built environment


Iseri O. K., GÜRSEL DİNO İ., KALKAN S.

Energy and Buildings, vol.357, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 357
  • Publication Date: 2026
  • DOI: 10.1016/j.enbuild.2026.117155
  • Journal Name: Energy and Buildings
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, Environment Index, INSPEC, Public Affairs Index, Urban Studies Abstracts
  • Keywords: Building occupancy modeling, Building performance modeling, Data-driven models, Deep learning modeling, LSTM, Population statistics, Time-series datasets, Transformers
  • Middle East Technical University Affiliated: Yes

Abstract

Occupancy modeling aims to represent the diversity of occupant behavior in buildings. Accurate occupancy models are therefore essential for understanding household dynamics and energy-related interactions in residential buildings, as well as for use in building performance applications. The central contribution of this research is CENTUS, a novel, data-driven methodology for generating high-fidelity occupancy data. Our approach synthesizes official population statistics from ISTAT with nuanced behavioral patterns analyzed using deep learning architectures (LSTM and Transformer). This framework enables comprehensive, year-long classification of household occupancy across both temporal and non-temporal attributes. Through unified multitask learning that integrates sequential columns with demographic attributes, these models can simultaneously classify multiple occupancy attributes with superior accuracy and broader coverage compared to traditional deterministic and stochastic approaches. Our approach delivers three key advantages: it protects privacy by relying on ethically sourced public institutional data; it enables cross-national compatibility; and it supports flexible scaling from individual residential units to neighborhood-level analysis via multiple modeling strategies, including argmax classification, softmax distributions, and temperature-controlled sampling.
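
The three modeling strategies named at the end of the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the state labels, the logit values, and the function names are hypothetical, and the code only shows how argmax, softmax, and temperature-controlled sampling are commonly applied to a classifier's raw output scores for a single time step.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities.

    A temperature below 1 sharpens the distribution; above 1 flattens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def argmax_class(logits):
    """Deterministic choice: index of the highest-scoring class."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sample_class(logits, temperature=1.0, rng=random):
    """Stochastic choice: draw one class index from the tempered softmax."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical occupancy states and scores for one time step (assumed values).
STATES = ["away", "home_active", "home_asleep"]
logits = [0.2, 1.5, 0.9]

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs are comparable
    print("argmax:", STATES[argmax_class(logits)])
    print("softmax:", [round(p, 3) for p in softmax(logits)])
    draws = [STATES[sample_class(logits, temperature=0.7, rng=rng)] for _ in range(5)]
    print("sampled:", draws)
```

In this sketch, argmax yields a single most likely profile per unit, the softmax output can be read as a distribution over states, and sampling draws varied profiles from that distribution, which is one plausible way to obtain diversity when many households are simulated together.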