Sample size determination for logistic regression


Motrenko A., Strijov V., Weber G.

JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, vol.255, pp.743-752, 2014 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 255
  • Publication Date: 2014
  • Doi Number: 10.1016/j.cam.2013.06.031
  • Journal Name: JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.743-752
  • Keywords: Logistic regression, Sample size, Feature selection, Bayesian inference, Kullback-Leibler divergence
  • Middle East Technical University Affiliated: Yes

Abstract

Sample size estimation is important in medical applications, especially when measurements of immune biomarkers are expensive. This paper addresses sample size determination for logistic regression analysis and reviews several families of algorithms: methods based on univariate statistics, logistic regression, cross-validation, and Bayesian inference. Treating the regression model parameters as a multivariate random variable, the authors propose to estimate the sample size using the distance between parameter distribution functions on cross-validated data sets. The work thereby contributes to data mining and statistical learning, supported by applied mathematics. © 2013 Elsevier B.V. All rights reserved.
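
As a rough illustration of the idea of comparing parameter distributions across subsamples, the following Python sketch may help. It is not the authors' algorithm. It assumes a Gaussian (Laplace) approximation of the logistic regression parameter posterior under a Gaussian prior, and it uses the Kullback-Leibler divergence (listed among the keywords) as the distance between the approximations obtained on nested subsamples of synthetic data. The helper names `gaussian_kl` and `laplace_posterior`, the prior precision `alpha`, the subsample step, and the data-generating weights are all hypothetical choices made for this example.

```python
import numpy as np


def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ) for multivariate normals."""
    k = mu0.shape[0]
    diff = mu1 - mu0
    trace_term = np.trace(np.linalg.solve(cov1, cov0))
    maha_term = diff @ np.linalg.solve(cov1, diff)
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (trace_term + maha_term - k + logdet1 - logdet0)


def laplace_posterior(X, y, alpha=1.0, n_iter=25):
    """Laplace approximation of the logistic-regression posterior.

    Assumes a N(0, I / alpha) prior on the weights (an assumption of this
    sketch). Returns the MAP estimate and the inverse Hessian of the
    negative log-posterior, used as the approximate covariance.
    """
    d = X.shape[1]
    w = np.zeros(d)
    for _ in range(n_iter):
        # Newton step; logits are clipped for numerical stability.
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ w, -30, 30)))
        grad = X.T @ (p - y) + alpha * w
        hess = X.T @ (X * (p * (1 - p))[:, None]) + alpha * np.eye(d)
        w = w - np.linalg.solve(hess, grad)
    # hess comes from the last Newton iteration, which is close to the
    # Hessian at the MAP estimate once the iterations have converged.
    return w, np.linalg.inv(hess)


# Synthetic data (hypothetical weights, for illustration only).
rng = np.random.default_rng(0)
n, step = 1000, 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
w_true = np.array([0.5, 1.0, -2.0])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

# Compare the posterior on the first m points with the posterior on the
# first m + step points; a small divergence suggests that adding more data
# no longer changes the parameter distribution much.
sizes = list(range(50, n, step))
kls = []
for m in sizes:
    mu0, cov0 = laplace_posterior(X[:m], y[:m])
    mu1, cov1 = laplace_posterior(X[:m + step], y[:m + step])
    kls.append(gaussian_kl(mu0, cov0, mu1, cov1))

for m, kl in zip(sizes, kls):
    print(m, kl)
```

One possible stopping rule under these assumptions is to take the smallest `m` for which the divergence stays below a chosen tolerance. The tolerance, the subsampling scheme, and the use of nested rather than cross-validated subsets are design choices of this sketch and would need to be checked against the paper.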