Multimedia sensors enable monitoring applications to obtain more accurate and detailed information. However, the development of efficient and lightweight solutions for managing data traffic over wireless multimedia sensor networks (WMSNs) has become vital because of the excessive volume of data produced by multimedia sensors. As part of this motivation, this paper proposes a fusion-based WMSN framework that reduces the amount of data to be transmitted over the network by intra-node processing. This framework explores three main issues: 1) the design of a wireless multimedia sensor (WMS) node to detect objects using machine learning techniques; 2) a method for increasing the accuracy while reducing the amount of information transmitted by the WMS nodes to the base station, and; 3) a new cluster-based routing algorithm for the WMSNs that consumes less power than the currently used algorithms. In this context, a WMS node is designed and implemented using commercially available components. In order to reduce the amount of information to be transmitted to the base station and thereby extend the lifetime of a WMSN, a method for detecting and classifying objects on three different layers has been developed. A new energy-efficient cluster-based routing algorithm is developed to transfer the collected information/data to the sink. The proposed framework and the cluster-based routing algorithm are applied to our WMS nodes and tested experimentally. The results of the experiments clearly demonstrate the feasibility of the proposed WMSN architecture in the real-world surveillance applications.