A Bayesian Approach to Learning Scoring Systems


ERTEKİN BOLELLİ Ş., Rudin C.

BIG DATA, vol.3, no.4, pp.267-276, 2015 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 3 Issue: 4
  • Publication Date: 2015
  • DOI: 10.1089/big.2015.0033
  • Journal Name: BIG DATA
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Pages: pp.267-276
  • Keywords: data mining, machine learning, predictive analytics, SUPPORT VECTOR MACHINES, ACUTE PHYSIOLOGY SCORE, TIMI RISK SCORE, CLASSIFICATION-SYSTEM, ATRIAL-FIBRILLATION, HOSPITAL MORTALITY, PREDICTING STROKE, RULE EXTRACTION, MODELS, APACHE
  • Affiliated with Middle East Technical University: Yes

Abstract

We present a Bayesian method for building scoring systems, which are linear models with coefficients that have very few significant digits. Usually, the construction of scoring systems involves manual effort: humans invent the full scoring system without using data, or they choose how logistic regression coefficients should be scaled and rounded to produce a scoring system. These kinds of heuristics lead to suboptimal solutions. Our approach is different in that humans need only specify the prior over what the coefficients should look like, and the scoring system is learned from data. For this approach, we provide a Metropolis-Hastings sampler that tends to pull the coefficient values toward their natural scale. Empirically, the proposed method achieves a high degree of interpretability of the models while maintaining competitive generalization performance.
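As a rough illustration of the general idea (a prior over coefficient shapes plus Metropolis-Hastings sampling), the following is a minimal sketch and not the sampler described in the paper. It assumes a logistic likelihood, integer-valued coefficients, a simple Laplace-style prior that favors small integers, and a symmetric random-walk proposal of plus or minus 1 on one coefficient at a time. The names `log_posterior`, `mh_scoring_system`, and `prior_scale` are hypothetical.

```python
import math
import random


def log_posterior(coefs, X, y, prior_scale=2.0):
    """Unnormalized log-posterior: logistic log-likelihood plus an
    assumed Laplace-style log-prior that penalizes large integers."""
    log_lik = 0.0
    for row, label in zip(X, y):
        z = sum(c * v for c, v in zip(coefs, row))
        s = z if label == 1 else -z
        # Numerically stable log(sigmoid(s)).
        if s >= 0:
            log_lik += -math.log1p(math.exp(-s))
        else:
            log_lik += s - math.log1p(math.exp(s))
    log_prior = -sum(abs(c) for c in coefs) / prior_scale
    return log_lik + log_prior


def mh_scoring_system(X, y, n_iter=2000, seed=0):
    """Random-walk Metropolis-Hastings over integer coefficient vectors.
    Returns the highest-posterior vector visited and its log-posterior."""
    rng = random.Random(seed)
    d = len(X[0])
    current = [0] * d
    current_lp = log_posterior(current, X, y)
    best, best_lp = list(current), current_lp
    for _ in range(n_iter):
        # Symmetric proposal: move one coefficient up or down by 1.
        proposal = list(current)
        j = rng.randrange(d)
        proposal[j] += rng.choice((-1, 1))
        proposal_lp = log_posterior(proposal, X, y)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
            if current_lp > best_lp:
                best, best_lp = list(current), current_lp
    return best, best_lp
```

Because the proposal is symmetric, the acceptance ratio reduces to the posterior ratio. A prior that encodes the paper's notion of coefficients with few significant digits would replace the Laplace-style term used here.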