In this paper, we first discuss the essential requirements for a fingerprint (perceptual hash)-based distributed video identification system in peer-to-peer (P2P) networks, in comparison with traditional central-database implementations of fingerprinting. This discussion reveals four shortcomings: first, the fingerprint sizes of existing video fingerprinting methods are not compatible with the cache sizes of current P2P clients; second, fingerprint extraction during a query takes longer than a user in the network can tolerate; third, repetitive patterns in the extracted fingerprints prevent a uniform distribution of storage and traffic load among the peers; and finally, existing methods provide no means of synchronizing fingerprint extraction between the shared video and the queried video. To meet these requirements, we propose a baseline method that uses only the differences of video frame means and reduces fingerprint sizes to typical cache sizes by increasing the granularity from seconds to minutes. We then develop a novel algorithm that uses reference points on the one-dimensional frame-mean sequence to synchronize fingerprint extraction. This algorithm is extended with a hierarchical decoding approach based on Gaussian scales, which decodes only a subset of the video frames rather than the full video. Finally, we analyze the effect of the design parameters on the fingerprint probability distribution in order to avoid repetitive patterns. Our final solution reduces fingerprint sizes to kilobytes, extraction time to seconds, and search duration to milliseconds, and achieves detection rates of about 90% at granularities of 1-4 min, while also enabling a fair distribution of storage load among the peers.
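To make the baseline idea concrete, the following is a minimal sketch of a fingerprint built from differences of frame means. It is illustrative only and is not the method as specified in the paper: the function names (`frame_mean`, `mean_difference_fingerprint`, `hamming_distance`), the `window` parameter, the sign-based binarization, and the use of Hamming distance for matching are all assumptions introduced here. Frames are assumed to be grayscale images given as rows of pixel values.

```python
from typing import List, Sequence


def frame_mean(frame: Sequence[Sequence[float]]) -> float:
    """Mean luminance of one grayscale frame given as rows of pixel values."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return total / count


def mean_difference_fingerprint(frame_means: Sequence[float], window: int) -> List[int]:
    """Hypothetical binarization: one bit per pair of adjacent windows.

    Each window averages `window` consecutive frame means. A bit is 1 when
    the next window is brighter than the current one, and 0 otherwise.
    A larger `window` gives a coarser granularity and a shorter fingerprint.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    n_windows = len(frame_means) // window  # trailing partial window is dropped
    window_means = [
        sum(frame_means[i * window:(i + 1) * window]) / window
        for i in range(n_windows)
    ]
    return [1 if nxt > cur else 0 for cur, nxt in zip(window_means, window_means[1:])]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of differing bits between two equal-length fingerprints (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have equal length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    means = [1, 1, 5, 5, 3, 3, 3, 3]  # toy frame-mean sequence
    print(mean_difference_fingerprint(means, window=2))  # [1, 0, 0]
```

In this sketch, doubling `window` roughly halves the number of bits, which is the sense in which a coarser granularity yields a smaller fingerprint. The sketch does not address synchronization: if the shared and queried videos start at different offsets, their windows will not align.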