Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling


Gupta H. V., Kling H., Yilmaz K. K., Martinez G. F.

Journal of Hydrology, vol. 377, pp. 80-91, 2009 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 377
  • Publication Date: 2009
  • DOI: 10.1016/j.jhydrol.2009.08.003
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp. 80-91
  • Keywords: Mean squared error, Nash-Sutcliffe efficiency, Model performance evaluation, Calibration, Multiple criteria, Criteria decomposition, AUTOMATIC CALIBRATION, PARAMETER-ESTIMATION, NASH VALUES, CATCHMENT, REGIONALIZATION, VALIDATION, EFFICIENCY, MULTIPLE, INDEX
  • Middle East Technical University Affiliated: Yes


The mean squared error (MSE) and its related normalization, the Nash-Sutcliffe efficiency (NSE), are the two criteria most widely used for calibration and evaluation of hydrological models with observed data. Here, we present a diagnostically interesting decomposition of NSE (and hence MSE), which facilitates analysis of the relative importance of its different components in the context of hydrological modelling, and show how model calibration problems can arise due to interactions among these components. The analysis is illustrated by calibrating a simple conceptual precipitation-runoff model to daily data for a number of Austrian basins having a broad range of hydro-meteorological characteristics. Evaluation of the results clearly demonstrates the problems that can be associated with any calibration based on the NSE (or MSE) criterion. While we propose and test an alternative criterion that can help to reduce model calibration problems, the primary purpose of this study is not to present an improved measure of model performance. Instead, we seek to show that there are systematic problems inherent in any optimization based on formulations related to the MSE. The analysis and results have implications for the manner in which we calibrate and evaluate environmental models; we discuss these and suggest possible ways forward that may move us towards an improved and diagnostically meaningful approach to model performance evaluation and identification. (C) 2009 Elsevier B.V. All rights reserved.
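The decomposition described in the abstract expresses NSE in terms of three components: the linear correlation r between simulation and observation, the variability ratio alpha = sigma_s/sigma_o, and the normalized bias beta_n = (mu_s - mu_o)/sigma_o, giving NSE = 2*alpha*r - alpha^2 - beta_n^2. The alternative criterion tested in the paper (the Kling-Gupta efficiency) measures the Euclidean distance of (r, alpha, beta) from the ideal point (1, 1, 1), with beta = mu_s/mu_o. The sketch below illustrates both under these definitions; function names are illustrative, not from the paper.

```python
import numpy as np

def nse_decomposition(obs, sim):
    """Return NSE and its components (r, alpha, beta_n).

    Uses the decomposition NSE = 2*alpha*r - alpha**2 - beta_n**2,
    where alpha = sigma_s / sigma_o and beta_n = (mu_s - mu_o) / sigma_o.
    Population standard deviations (ddof=0) make the identity exact.
    """
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = sim.std() / obs.std()
    beta_n = (sim.mean() - obs.mean()) / obs.std()
    nse = 2 * alpha * r - alpha**2 - beta_n**2
    return nse, r, alpha, beta_n

def kge(obs, sim):
    """Kling-Gupta efficiency: 1 minus the Euclidean distance of
    (r, alpha, beta) from the ideal point (1, 1, 1), beta = mu_s/mu_o."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return 1.0 - np.sqrt((r - 1)**2 + (alpha - 1)**2 + (beta - 1)**2)
```

A quick check of the identity: the decomposed NSE matches the direct definition 1 - sum((sim - obs)^2) / sum((obs - mean(obs))^2) for any paired series, and a perfect simulation yields KGE = 1.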