In this paper, we consider the problem of sampling a Wiener process, where the samples are forwarded to a remote estimator over a channel modeled as a queue with random delay. The estimator reconstructs a real-time estimate of the signal from causally received samples. Motivated by recent research on the age of information, we study the optimal sampling strategy that minimizes the mean-square estimation error subject to a sampling-frequency constraint. We prove that the optimal sampling strategy is a threshold policy and find the optimal threshold. This threshold is determined by the sampling-frequency constraint and by how much the Wiener process varies during the channel delay. An interesting consequence is that, even in the absence of the sampling-frequency constraint, the optimal strategy is not zero-wait sampling, in which a new sample is taken as soon as the previous sample is delivered; rather, it is optimal to wait for a non-zero amount of time after the previous sample is delivered before taking the next sample. Further, if the sampling times are independent of the observed Wiener process, the optimal sampling problem reduces to an age-of-information optimization problem that has recently been solved. Our comparisons show that the estimation error of the optimal sampling policy is much smaller than those of age-optimal sampling, zero-wait sampling, and classic uniform sampling.
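To make the contrast between threshold sampling and zero-wait sampling concrete, the following is a minimal simulation sketch, not the construction used in the paper. It assumes that the threshold is applied to the estimator's instantaneous error while the channel is idle, that the channel delays are i.i.d. exponential, and that the Wiener process is approximated by Euler steps of size `dt`. The names `should_sample` and `simulate_mse` and all parameter values are hypothetical. Setting `threshold=0` recovers zero-wait sampling.

```python
import math
import random


def should_sample(error, threshold, channel_idle):
    """Threshold rule: sample only when the channel is idle and the
    estimator's current error has reached the threshold."""
    return channel_idle and abs(error) >= threshold


def simulate_mse(threshold, mean_delay=1.0, dt=0.01, steps=200_000, seed=0):
    """Time-average squared estimation error of the threshold rule on a
    discretized Wiener process (Euler steps of size dt)."""
    rng = random.Random(seed)
    w = 0.0           # current value of the Wiener process
    estimate = 0.0    # estimator output: value of the last delivered sample
    in_flight = None  # sample value currently in the channel, if any
    remaining = 0     # steps until the in-flight sample is delivered
    sq_err = 0.0
    for _ in range(steps):
        # Deliver the in-flight sample once its delay has elapsed.
        if in_flight is not None:
            remaining -= 1
            if remaining == 0:
                estimate = in_flight
                in_flight = None
        # Take a new sample if the threshold rule allows it.
        if should_sample(w - estimate, threshold, in_flight is None):
            in_flight = w
            # Assumed delay model: i.i.d. exponential, at least one step.
            remaining = max(1, round(rng.expovariate(1.0 / mean_delay) / dt))
        sq_err += (w - estimate) ** 2 * dt
        w += rng.gauss(0.0, math.sqrt(dt))  # Wiener increment, variance dt
    return sq_err / (steps * dt)


if __name__ == "__main__":
    # threshold 0.0 is zero-wait sampling; the other values are arbitrary.
    for threshold in (0.0, 0.5, 1.0, 1.5):
        print(threshold, simulate_mse(threshold))
```

Sweeping `threshold` in this way gives a rough empirical view of how waiting affects the estimation error under the assumed delay model. The estimates are noisy for short runs, so `steps` should be increased before drawing conclusions.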