Assessing genetic heterogeneity within bacterial species isolated from gastrointestinal and environmental samples: How many isolates does it take?


Dopfer D., Buist W., Soyer Y., Munoz M. A., Zadoks R. N., Geue L., ...

APPLIED AND ENVIRONMENTAL MICROBIOLOGY, vol.74, no.11, pp.3490-3496, 2008 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 74 Issue: 11
  • Publication Date: 2008
  • Doi Number: 10.1128/aem.02789-07
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.3490-3496
  • Middle East Technical University Affiliated: No


Strain typing of bacterial isolates is increasingly used to identify sources of infection or product contamination and to elucidate routes of transmission of pathogens or spoilage organisms. Usually, the number of isolates of the same species analyzed per sample is determined by convention, convenience, laboratory capacity, or financial resources. Statistical considerations and knowledge of the heterogeneity of bacterial populations in various sources can instead be used to determine the number of isolates per sample that is actually needed to address specific research questions. We present data for intestinal Escherichia coli, Listeria monocytogenes, Klebsiella pneumoniae, and Streptococcus uberis from gastrointestinal, fecal, or soil samples characterized by ribotyping, pulsed-field gel electrophoresis, and PCR-based strain-typing methods. In contrast to previous studies, all calculations were performed with a single, freely available computer program, and the choice and derivation of prior distributions are explained in depth. In addition, some of the model assumptions were relaxed to allow analysis of the special case of two (groups of) strains that are observed with different probabilities. Sample size calculations based on a Bayesian method of inference show that from 2 to 20 isolates per sample need to be characterized to detect all strains that are present in a sample with 95% certainty. Such high numbers of isolates per sample are rarely typed in practice because of financial or logistic constraints. This implies that investigators are not gaining maximal information on strain heterogeneity and that sources and transmission pathways may go undetected.
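To give a feel for the special case of two (groups of) strains observed with different probabilities, the following is a minimal sketch of a simplified, non-Bayesian calculation. It is not the authors' program or method: it assumes the strain frequencies are known and fixed, and that isolates are picked independently, whereas the paper places prior distributions on those frequencies. The function names `prob_both_strains_seen` and `min_isolates` and the parameter `p_minor` are hypothetical, introduced only for illustration.

```python
def prob_both_strains_seen(p_minor: float, n: int) -> float:
    """Probability that n independently picked isolates include both strains.

    Assumption: exactly two (groups of) strains with known, fixed
    frequencies p_minor and 1 - p_minor. Both are seen unless every
    isolate belongs to the same strain.
    """
    p_major = 1.0 - p_minor
    return 1.0 - p_minor ** n - p_major ** n


def min_isolates(p_minor: float, certainty: float = 0.95, n_max: int = 1000) -> int:
    """Smallest n for which both strains are detected with the given certainty."""
    if not 0.0 < p_minor < 1.0:
        raise ValueError("p_minor must lie strictly between 0 and 1")
    # At least two isolates are needed to observe two different strains.
    for n in range(2, n_max + 1):
        if prob_both_strains_seen(p_minor, n) >= certainty:
            return n
    raise ValueError("certainty not reached within n_max isolates")


if __name__ == "__main__":
    for p in (0.5, 0.2):
        print(f"minor strain frequency {p}: {min_isolates(p)} isolates")
```

Under these simplifying assumptions, two equally frequent strains require 6 isolates and a 20% minor strain requires 14. The numbers in the abstract come from the Bayesian model and need not coincide with this plug-in calculation.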