JOURNAL OF CHEMICAL INFORMATION AND MODELING, vol.54, no.8, pp.2200-2213, 2014 (SCI-Expanded)
As a first step toward an efficient and accurate protocol for estimating amino acid pKa values in proteins, we show how to reproduce the pKa values of alcohol- and thiol-based residues (namely tyrosine, serine, and cysteine) in aqueous solution from the experimental pKa values of phenols, alcohols, and thiols. Our protocol is based on the linear relationship between the computed atomic charges of the anionic form of the molecules (phenolates, alkoxides, or thiolates) and their respective experimental pKa values. It is tested with different treatments of the environment (gas phase or continuum solvent models), with five distinct atomic charge models (Mulliken, Löwdin, NPA, Merz-Kollman, and CHelpG), and with nine different DFT functionals combined with 16 different basis sets. The ability of semiempirical methods (AM1, RM1, PM3, and PM6) to predict the pKa values of thiols, phenols, and alcohols is also analyzed. In our benchmarks, the best agreement with experimental pKa values is obtained by computing NPA atomic charges with the CPCM model at the B3LYP/3-21G level for alcohols (R² = 0.995) and at the M062X/6-311G level for thiols (R² = 0.986). The applicability of the suggested protocol is tested on the amino acids tyrosine and cysteine, and precise pKa predictions are obtained. The stability of the amino acid pKa values with respect to geometrical changes is also tested by MM-MD and DFT-MD calculations. Given its accuracy and high computational efficiency, this atomic-charge-based approach is a promising method for predicting amino acid pKa values in a protein environment.
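The linear charge-to-pKa relationship at the core of the protocol can be illustrated with a minimal sketch. It assumes a simple least-squares fit of the form pKa = slope * q + intercept, where q is the computed atomic charge on the deprotonated O or S atom. The function names (`fit_charge_pka`, `predict_pka`) are hypothetical, and the numbers are synthetic placeholders generated from a made-up line; they are not charges, pKa values, or fit parameters from the paper.

```python
import numpy as np


def fit_charge_pka(charges, pkas):
    """Least-squares fit of pKa = slope * q + intercept.

    Returns (slope, intercept, r2), where r2 is the coefficient of
    determination of the fit on the training set.
    """
    q = np.asarray(charges, dtype=float)
    y = np.asarray(pkas, dtype=float)
    slope, intercept = np.polyfit(q, y, 1)  # degree-1 polynomial fit
    residuals = y - (slope * q + intercept)
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return float(slope), float(intercept), r2


def predict_pka(charge, slope, intercept):
    """Predict a pKa from a computed atomic charge using the fitted line."""
    return slope * charge + intercept


# SYNTHETIC placeholder data (assumption): charges on the anionic O/S atom
# and pKa values generated from an arbitrary exact line, not real results.
demo_charges = np.array([-0.95, -0.90, -0.85, -0.80, -0.75])
demo_pkas = -20.0 * demo_charges - 8.0

slope, intercept, r2 = fit_charge_pka(demo_charges, demo_pkas)
predicted = predict_pka(-0.88, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.4f}")
print(f"predicted pKa for q=-0.88: {predicted:.2f}")
```

In practice the training set would pair experimental pKa values of reference phenols, alcohols, or thiols with charges computed at a single, consistent level of theory, and the fitted line would then be applied to the charge computed for the residue of interest.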