Optical Flow (OF) information is used in higher level vision tasks in a variety of computer vision applications. However, its use in resource constrained applications such as small-scale mobile robotic platforms is limited because of the high computational complexity involved. The inability to compute the OF vector field in real-time is the main drawback which prevents these applications to efficiently utilize some successful techniques from the computer vision literature. In this work, we present the design and implementation of a high performance FPGA hardware with a small footprint and low power consumption that computes OF at a speed exceeding real-time performance. A well known OF algorithm by Horn and Schunck is selected for this baseline implementation. A detailed multiple-criteria performance analysis of the proposed hardware is presented with respect to computation speed, resource usage, power consumption and accuracy compared to a PC based floating-point implementation. The implemented hardware computes OF vector field on 256 x 256 pixels images in 3.89 ms i.e. 257 fps. Overall, the proposed implementation achieves a superior performance in terms of speed, power consumption and compactness while there is minimal loss of accuracy. We also make the FPGA design source available in full for research and academic use. (c) 2013 Elsevier B.V. All rights reserved.