Transformers are among the most important pieces of equipment in power system transmission and distribution networks. Partial discharge (PD) measurement and phase-resolved partial discharge (PRPD) pattern interpretation are powerful tools for monitoring the condition of power transformers and evaluating their risk of failure. However, multiple simultaneously active discharge sources reduce the accuracy of PRPD interpretation. This paper presents a PD signal separation algorithm based on the extraction of high-level image features. The time-frequency S transform (ST) is applied to PD signal waveforms acquired by digital detection instruments at 100 MS/s. The resulting ST matrix is converted to a grayscale image, from which high-level features are extracted using the Bag of Words (BoW) model. Principal component analysis (PCA) is then applied to the BoW features to reduce their dimensionality. The Calinski-Harabasz criterion is calculated to identify the number of active sources, and Gaussian mixture model (GMM) clustering is used to discover clusters in the feature space. The proposed separation algorithm is tested on mixed current impulse signals acquired from PD experiments on artificial multi-defect models. The results indicate that the algorithm is effective at identifying the number of active sources and at separating mixed PD signals originating from multiple sources.
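As an illustration of the time-frequency step, the following is a minimal sketch, not the authors' implementation, of a discrete S transform computed with the FFT, followed by a mapping of its magnitude to an 8-bit grayscale image. The function names `s_transform` and `to_grayscale`, the synthetic damped pulse with its 10 MHz oscillation and 0.3 µs decay constant, and the min-max scaling are all assumptions made for illustration; only the 100 MS/s sampling rate is taken from the text.

```python
import numpy as np


def s_transform(signal):
    """Discrete S transform via the FFT.

    Returns a complex matrix whose rows are frequency bins 0..N//2
    and whose columns are time samples.
    """
    x = np.asarray(signal, dtype=float)
    n_samples = x.size
    spectrum = np.fft.fft(x)
    # Signed frequency offsets (0, 1, ..., -2, -1) so the Gaussian wraps correctly.
    m = np.fft.fftfreq(n_samples) * n_samples
    n_freqs = n_samples // 2 + 1
    st = np.empty((n_freqs, n_samples), dtype=complex)
    # The zero-frequency row is defined as the signal mean.
    st[0, :] = x.mean()
    for n in range(1, n_freqs):
        # Frequency-domain Gaussian window whose width scales with n.
        window = np.exp(-2.0 * np.pi**2 * m**2 / n**2)
        # shifted[k] == spectrum[(k + n) % n_samples]
        shifted = np.roll(spectrum, -n)
        st[n, :] = np.fft.ifft(shifted * window)
    return st


def to_grayscale(st_matrix):
    """Min-max scale |ST| to an 8-bit image (assumed mapping, not from the paper)."""
    mag = np.abs(st_matrix)
    span = mag.max() - mag.min()
    if span == 0:
        return np.zeros(mag.shape, dtype=np.uint8)
    return np.round(255.0 * (mag - mag.min()) / span).astype(np.uint8)


# Hypothetical PD-like pulse: a damped 10 MHz oscillation sampled at 100 MS/s.
fs = 100e6
t = np.arange(256) / fs
pulse = np.exp(-t / 0.3e-6) * np.sin(2.0 * np.pi * 10e6 * t)

image = to_grayscale(s_transform(pulse))
print(image.shape, image.dtype)
```

Each row of the image corresponds to one frequency bin (spacing fs/N) and each column to one time sample, so a short oscillating pulse shows up as a localized bright patch. Images of this kind would then be passed to the BoW feature extraction stage.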