We motivate and describe the application of Hierarchical Dirichlet Process (HDP) models to the "soft" biclustering of gene expression data, in which we obtain modules (biclusters) where the affiliation of genes and samples with the modules are weighted, instead of being hard memberships. As a distinct contribution, we propose a method which HDP is informed with prior beliefs, significantly increasing the quality of the biclustering in terms of both the correctness of the number of modules inferred, and the precision of these modules, especially when evidence is sparse. We outline two such informed priors; one based on co-expression relationships inherent in the data, the other based on an externally provided regulatory network. We validate these results and compare the performance of our approach to Weighted Gene Correlation Network Analysis (WGCNA), another model that features weighted modules. We have, to this end, performed experiments on semi-synthetic data. The results show that HDP, with the addition of a well-informed prior, is able to capture the correct number of modules with increased accuracy. Furthermore, the model becomes robust to changes in the strength of the prior. We conclude by discussing these results and the benefits provided by our approach for gene expression analysis and network validation.