This study proposes a fully automatic surveillance system for indoor environments that tracks multiple objects using both visible and thermal band images. The two modalities are fused to track people and the objects they carry separately, using their heat signatures, and the owners of the belongings are determined. Fusing complementary information from the different modalities (for example, thermal images are not affected by shadows, and visible images exhibit no thermal reflection or halo effect) is shown to improve object detection performance. We use adaptive background modeling and a local intensity operation for object detection, and the mean-shift tracking algorithm for fully automatic tracking. Trackers are refreshed to handle changes in an object's size and shape, to handle occlusion and split events, and to detect newly emerging objects as well as objects that leave the scene. The proposed scheme is applied to the abandoned object detection problem, and the results are compared with state-of-the-art methods. The results show that the proposed method facilitates individual tracking of objects for various applications and provides lower false alarm rates than state-of-the-art methods when applied to the abandoned object detection problem.
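As a rough illustration of the adaptive background modeling step, the following minimal sketch uses an exponential running average, which is one common form of adaptive background model and is an assumption here rather than the model used in this study. The function names `update_background` and `foreground_mask`, the learning rate `alpha`, and the threshold value are all hypothetical choices made for the example.

```python
import numpy as np


def update_background(background, frame, alpha=0.05):
    """Exponential running average: B <- (1 - alpha) * B + alpha * I.

    alpha is an assumed learning rate; larger values adapt faster.
    """
    return (1.0 - alpha) * background + alpha * frame


def foreground_mask(background, frame, threshold=25.0):
    """Mark pixels whose absolute difference from the background exceeds
    an assumed fixed threshold."""
    return np.abs(frame - background) > threshold


# Toy single-channel example: a flat background with a bright 2x2 region
# standing in for a warm object in a thermal frame.
background = np.full((4, 4), 100.0)
frame = background.copy()
frame[1:3, 1:3] = 200.0

mask = foreground_mask(background, frame)          # detect before updating
background = update_background(background, frame)  # then adapt the model
```

The same two functions could be applied to the visible and the thermal frame independently, with the resulting masks combined afterwards; how the masks are combined is left open here because it depends on the fusion rule.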