RNA-Seq of Phytoplasmas Reveals Clues about Transcription Start Sites and Consensus Promoter Sequences

Transcription initiation is the principal step in the regulation of bacterial gene expression. It primarily depends on the interaction between bacterial transcription factors, sigma factors, and promoter sequence elements located upstream of transcription start sites (TSSs). Most bacteria have multiple sigma factors and corresponding promoter sequence elements to regulate gene expression in response to various environmental factors (Gruber and Gross, 2003). Recent high-throughput RNA sequencing (RNA-Seq) technology has enabled genome-wide identification of TSSs and prediction of the consensus promoter sequences in many culturable bacteria (Mendoza-Vargas et al., 2009; Sharma et al., 2010; Filiatrault et al., 2011; Schlüter et al., 2013). This approach is also useful for finding unknown open reading frames (ORFs) (Mendoza-Vargas et al., 2009; Filiatrault et al., 2011) and for detecting non-coding RNAs (ncRNAs), which might have roles in the regulation of gene expression (Sharma et al., 2010; Filiatrault et al., 2011; Schlüter et al., 2013). However, the RNA-Seq approach for TSS annotation has not been applied to unculturable bacteria, because it is difficult to obtain sufficient reads from the RNA of these bacteria among large pools of host or environmental RNA.

Phytoplasmas (class Mollicutes, genus “Candidatus [Ca.] Phytoplasma”) are bacterial plant pathogens that cause yield losses in various crops (The IRPCM Phytoplasma/Spiroplasma Working Team–Phytoplasma taxonomy group, 2004; Oshima et al., 2013; Maejima et al., 2014). Phytoplasmas are obligate intracellular parasites that reside in the phloem tissues of infected plants and are transmitted by insect vectors in a persistent manner. Phytoplasma genomes are small (600–1,000 kb) and AT-rich (GC content, 21–28%), and contain a limited number of genes (500–1,100 genes) (Oshima et al., 2004; Bai et al., 2006; Kube et al., 2008; Tran-Nguyen et al., 2008; Andersen et al., 2013). Phytoplasmas possess slightly larger genomes than mycoplasmas, which are closely related bacteria, due to repeated gene sequences called potential mobile units (PMUs) (Bai et al., 2006; Arashida et al., 2008). While phytoplasmas have lost more metabolic pathway genes than mycoplasmas, they possess multiple copies of transporter-related genes in PMUs for the absorption of nutrients from their host cells and adaptation to two different host cell environments (Oshima et al., 2004, 2013). We previously revealed that at least one-third of the genes of “Ca. P. asteris” onion yellows strain (OY-M) are differentially expressed in plants and insects (Oshima et al., 2011). However, the regulatory mechanisms of gene expression in phytoplasmas are poorly understood.

Complete genome sequence analyses of phytoplasmas revealed the presence of two types of sigma factors, RpoD and FliA (Oshima et al., 2004; Bai et al., 2006; Kube et al., 2008; Tran-Nguyen et al., 2008; Andersen et al., 2013). Using an in vitro transcription assay, we previously reported that RpoD of OY-M regulates several housekeeping, virulence, and host–phytoplasma interaction genes, and we determined the consensus promoter sequence for RpoD (Miura et al., 2015). Although hundreds of candidate promoter sequences regulated by RpoD were predicted in the OY-M genome, their promoter activities in vivo remain to be elucidated. Moreover, whether RpoD-independent promoter elements exist is unknown. In this study, we applied RNA-Seq technology for genome-wide identification of TSSs and promoter elements of OY-M phytoplasma.

* Abstract

Phytoplasmas are obligate intracellular parasitic bacteria that infect both plants and insects. We previously identified the sigma factor RpoD-dependent consensus promoter sequence of phytoplasma. However, the genome-wide landscape of RNA transcripts, including non-coding RNAs (ncRNAs) and RpoD-independent promoter elements, was still unknown. In this study, we performed an improved RNA sequencing analysis for genome-wide identification of the transcription start sites (TSSs) and the consensus promoter sequences. We constructed cDNA libraries using a random adenine/thymine hexamer primer, in addition to a conventional random hexamer primer, for efficient sequencing of 5′-termini of AT-rich phytoplasma RNAs. We identified 231 TSSs, which were classified into four categories: mRNA TSSs, internal sense TSSs, antisense TSSs (asTSSs), and orphan TSSs (oTSSs). The presence of asTSSs and oTSSs indicated the genome-wide transcription of ncRNAs, which might act as regulatory ncRNAs in phytoplasmas. This is the first description of genome-wide phytoplasma ncRNAs. Using a de novo motif discovery program, we identified two consensus motif sequences located upstream of the TSSs. While one was almost identical to the RpoD-dependent consensus promoter sequence, the other was an unidentified novel motif, which might be recognized by another transcription initiation factor. These findings are valuable for understanding the regulatory mechanism of phytoplasma gene expression.
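The four TSS categories named above can be illustrated with a minimal sketch. The abstract does not give the classification criteria, so everything below is an assumption made for illustration: the `Gene` record, the `classify_tss` function, the 300-nt upstream window, and the priority order (mRNA, then internal sense, then antisense, then orphan) are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical annotation record; coordinates are 1-based and inclusive."""
    name: str
    start: int   # leftmost coordinate, regardless of strand
    end: int     # rightmost coordinate, regardless of strand
    strand: str  # "+" or "-"


# Assumed cutoff: a TSS this close upstream of a gene's 5' end is called an mRNA TSS.
UPSTREAM_WINDOW = 300


def classify_tss(pos, strand, genes, window=UPSTREAM_WINDOW):
    """Assign one of four labels to a TSS at `pos` on `strand`.

    Assumed rules (not from the source):
      mRNA           - same strand, 0..window nt upstream of a gene's 5' end
      internal_sense - same strand, inside a gene
      antisense      - opposite strand, inside a gene
      orphan         - none of the above
    """
    labels = set()
    for g in genes:
        inside = g.start <= pos <= g.end
        if strand == g.strand:
            # Distance from the TSS to the gene's 5' end; non-negative when upstream.
            dist = g.start - pos if g.strand == "+" else pos - g.end
            if 0 <= dist <= window:
                labels.add("mRNA")
            elif inside:
                labels.add("internal_sense")
        elif inside:
            labels.add("antisense")
    # Resolve multiple hits with an assumed priority order.
    for label in ("mRNA", "internal_sense", "antisense"):
        if label in labels:
            return label
    return "orphan"


# Toy annotation with made-up coordinates.
toy_genes = [Gene("geneA", 1000, 2000, "+"), Gene("geneB", 3000, 4000, "-")]
for pos, strand in [(950, "+"), (1500, "+"), (1500, "-"), (2500, "+")]:
    print(pos, strand, classify_tss(pos, strand, toy_genes))
```

In a real analysis the window size and the handling of TSSs that match several genes would need to follow the criteria stated in the full article.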


DNA and Cell Biology, published by Mary Ann Liebert, Inc., is the trusted source for authoritative, peer-reviewed reporting on the latest research in the field of molecular biology. The above article was first published in the December 2017 issue of DNA and Cell Biology with the title “Genome-Wide Analysis of the Transcription Start Sites and Promoter Motifs of Phytoplasmas”. The views expressed here are those of the authors and are not necessarily those of DNA and Cell Biology, Mary Ann Liebert, Inc., publishers, or their affiliates. No endorsement of any entity or technology is implied.