In this study, an efficient, robust algorithm for automatic target detection and tracking is introduced. The procedure starts with a detection phase, for which the proposed method offers two alternatives: the maximally stable extremal regions (MSER) detector and the Canny edge detector. After detection, regions of interest are evaluated and eliminated according to their compactness and effective saliency. The detection process is repeated for a predetermined number of pyramid levels, where each level processes a downsampled version of the input image to achieve scale invariance. Then, the temporal consistency of the detections from all scales is evaluated, and a target likelihood map is constructed using kernel density estimation in order to merge all target hypotheses. Finally, outstanding targets are selected from the target likelihood map, and tracking is achieved by minimizing the spatial distance between the selected targets in consecutive frames.
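The last two stages, merging hypotheses into a target likelihood map and tracking by spatial distance, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes detections have already been mapped back to full-resolution coordinates as `(x, y, weight)` tuples, uses an isotropic Gaussian kernel with a hypothetical `bandwidth`, selects targets as thresholded local maxima, and substitutes a greedy nearest-neighbour rule for the distance minimization. The function names, parameters, and thresholds are all illustrative assumptions.

```python
import math


def likelihood_map(detections, width, height, bandwidth=2.0):
    """Kernel density estimate on a pixel grid.

    Each detection (x, y, weight) adds an isotropic Gaussian bump, so
    hypotheses from different scales that agree reinforce each other.
    The Gaussian kernel and bandwidth are assumptions for illustration.
    """
    grid = [[0.0] * width for _ in range(height)]
    for cx, cy, w in detections:
        for y in range(height):
            for x in range(width):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                grid[y][x] += w * math.exp(-d2 / (2.0 * bandwidth ** 2))
    return grid


def select_targets(grid, threshold):
    """Return (x, y) cells that are at least `threshold` and strictly
    greater than all of their 8-connected neighbours."""
    height, width = len(grid), len(grid[0])
    peaks = []
    for y in range(height):
        for x in range(width):
            v = grid[y][x]
            if v < threshold:
                continue
            neighbours = [
                grid[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (nx, ny) != (x, y)
            ]
            if all(v > n for n in neighbours):
                peaks.append((x, y))
    return peaks


def associate(prev, curr, max_dist):
    """Greedy nearest-neighbour matching between consecutive frames.

    Returns {index in prev: index in curr}. Pairs are accepted in order
    of increasing distance; this is a simple stand-in and does not
    guarantee a globally minimal total distance.
    """
    pairs = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(prev)
        for j, c in enumerate(curr)
    )
    used_prev, used_curr, matches = set(), set(), {}
    for d, i, j in pairs:
        if d > max_dist:
            break
        if i in used_prev or j in used_curr:
            continue
        matches[i] = j
        used_prev.add(i)
        used_curr.add(j)
    return matches


# Hypothetical detections pooled from several pyramid levels.
detections = [(5, 5, 1.0), (5, 5, 1.0), (6, 5, 1.0), (15, 10, 1.0), (15, 10, 1.0)]
targets = select_targets(likelihood_map(detections, 20, 16), threshold=1.0)
```

Here nearby hypotheses collapse into a single peak per target, and `associate(targets, next_frame_targets, max_dist)` then links each peak to its closest counterpart in the following frame. If a globally optimal assignment is required, the greedy step could be replaced by an assignment solver.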