BIOCHEMICAL ENGINEERING JOURNAL, vol. 188, 2022 (SCI-Expanded)
Designing promoter architectures hinges on genomic and functional annotation. Saccharomyces cerevisiae is the first model yeast whose databases host genomic and functional annotation information. To predict transcription factors (TFs) regulating central pathways in yeasts, we first introduce the S. cerevisiae cis-acting DNA sequences/sites (cADSs) curation pipeline (Sc-cADSs-CP). The promoters of the genes involved in the central pathways of S. cerevisiae were retrieved from the genome sequences. We processed the binding frequency matrices of TFs according to the following two criteria. First, we extracted cADSs based on the TF motifs in the TRANSFAC database; then, if there was more than one frequency matrix for a TF, the longest one with the maximum sensitivity was used. Next, we developed the direct scanning algorithm ScanAlgo (which uses the Biopython library), which scans DNA motifs for pairing cross-species alignments. We used the tools Sc-cADSs-CP and ScanAlgo to predict master TFs for Pichia pastoris, which lacks extensive functional annotation studies. The phylogenetic footprinting results were obtained by aligning the scanned S. cerevisiae promoters against orthologous P. pastoris promoters. The predicted cADSs were summed into position weight matrices unique to P. pastoris. We annotated 116 TFs based on the phylogenetic footprinting predictions of cADSs regulating the central pathways in P. pastoris. The presented methodology with the tools Sc-cADSs-CP and ScanAlgo enables the prediction of master TFs and cADSs in yeasts.
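The matrix selection and motif scanning steps described above can be illustrated with a minimal sketch. This is not the authors' ScanAlgo, which is built on Biopython; it is a plain-Python illustration under stated assumptions. The function names (`select_matrix`, `log_odds`, `scan`), the pseudocount of 0.5, the uniform background of 0.25 per base, and the toy count matrix are all hypothetical. The selection rule here uses matrix length only, because the abstract does not define how sensitivity is measured.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def select_matrix(matrices):
    """Pick the longest count matrix among the alternatives for one TF.

    Assumption: only length is considered; the sensitivity criterion
    mentioned in the abstract is not modeled here.
    """
    return max(matrices, key=lambda m: len(m["A"]))


def log_odds(counts, pseudocount=0.5, background=0.25):
    """Convert a count matrix into a log2-odds position weight matrix."""
    pwm = []
    for i in range(len(counts["A"])):
        total = sum(counts[b][i] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2((counts[b][i] + pseudocount) / total / background)
            for b in BASES
        })
    return pwm


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def scan(seq, pwm, threshold):
    """Slide the PWM over both strands and report windows scoring >= threshold.

    Each hit is (start on the forward strand, strand, matched site, score).
    """
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - width + 1):
            window = s[i:i + width]
            if any(ch not in BASES for ch in window):
                continue  # skip windows containing ambiguous bases such as N
            score = sum(pwm[j][window[j]] for j in range(width))
            if score >= threshold:
                start = i if strand == "+" else len(s) - width - i
                hits.append((start, strand, window, score))
    return hits


# Toy count matrix for the site TGACTC (hypothetical counts, 10 sites).
toy_counts = {
    "A": [0, 0, 10, 0, 0, 0],
    "C": [0, 0, 0, 10, 0, 10],
    "G": [0, 10, 0, 0, 0, 0],
    "T": [10, 0, 0, 0, 10, 0],
}
toy_pwm = log_odds(toy_counts)
forward_hits = scan("AAATGACTCAAA", toy_pwm, threshold=8.0)
reverse_hits = scan("AAAGAGTCAAAA", toy_pwm, threshold=8.0)
```

In a cross-species setting, the same scan would be run on an S. cerevisiae promoter and its orthologous P. pastoris promoter, and hits shared by both would be candidates for conserved cADSs. The threshold of 8.0 is chosen here only so that the toy matrix accepts exact matches and rejects single mismatches.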