Global Analysis of RNA Secondary Structure in Two Metazoans

Fan Li1,2,3,*, Qi Zheng1,2,*, Paul Ryvkin3, Isabelle Dragomir1, Yaanik Desai4, Subhadra Aiyer4, Otto Valladares5, Jamie Yang1,2, Shelly Bambina6, Leah R. Sabin6, John L. Murray2,7, Todd Lamitina2,8, Arjun Raj2,4, Sara Cherry2,6, Li-San Wang2,3,5,9,10, and Brian D. Gregory1,2,3 [link][pdf]

1Department of Biology
2Penn Genome Frontiers Institute
3Genomics and Computational Biology Graduate Group
4Department of Bioengineering
5Department of Pathology and Laboratory Medicine, School of Medicine
6Department of Microbiology, School of Medicine
7Department of Genetics, School of Medicine
8Department of Physiology, School of Medicine
9Institute on Aging, School of Medicine
10Penn Center for Bioinformatics, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
*These authors contributed equally to this work.


The secondary structure of RNA is necessary for its maturation, regulation, processing, and function. However, the global influence of RNA folding in eukaryotes is still unclear. Here, we use a high-throughput, sequencing-based, structure-mapping approach to identify the paired (double-stranded RNA [dsRNA]) and unpaired (single-stranded RNA [ssRNA]) components of the Drosophila melanogaster and Caenorhabditis elegans transcriptomes, which allows us to identify conserved features of RNA secondary structure in metazoans. From this analysis, we find that ssRNAs and dsRNAs are significantly correlated with specific epigenetic modifications. Additionally, we find key structural patterns across protein-coding transcripts that indicate that RNA folding demarcates regions of protein translation and likely affects microRNA-mediated regulation of mRNAs in animals. Finally, we identify and characterize 546 mRNAs whose folding pattern is significantly correlated between these metazoans, suggesting that their structure has some function. Overall, our findings provide a global assessment of RNA folding in animals.

Genome browser

All of our sequencing data, structure models, and ortholog comparisons are available for browsing using the AnnoJ genome browser.

Supplemental data

Structure scores for all protein-coding transcripts

Description: These files contain normalized structure scores (log-ratio of dsRNA to ssRNA reads) for all protein-coding transcripts.
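As a rough illustration of what a log-ratio structure score means, the sketch below computes one score from dsRNA and ssRNA read counts at a single position. The exact normalization used for the files is not described on this page, so the log base (2), the pseudocount, the scaling by library size, and the function name `structure_score` are all assumptions made for illustration only.

```python
import math


def structure_score(ds_reads, ss_reads, ds_total, ss_total, pseudocount=1.0):
    """Hypothetical structure score for one transcript position.

    Assumptions (not taken from the paper): log base 2, a pseudocount to
    avoid division by zero, and scaling each count by its library size.
    """
    ds_norm = (ds_reads + pseudocount) / ds_total
    ss_norm = (ss_reads + pseudocount) / ss_total
    return math.log2(ds_norm / ss_norm)
```

Under these assumptions, a positive score indicates more dsRNA than ssRNA coverage at a position (more likely paired), and a negative score indicates the opposite.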

File format: Gzip archive of individual *.ratio files for each transcript. Each file contains the structure scores at each position in the mature mRNA transcript (that is, without introns).
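The following is a minimal sketch of how such an archive might be read. It assumes the archive is a gzip-compressed tar file and that each *.ratio file holds one position per line with the score as the last whitespace-separated field; neither detail is specified above, and the function name `load_ratio_archive` is hypothetical.

```python
import os
import tarfile


def load_ratio_archive(path):
    """Return {transcript_name: [score, ...]} from a .tar.gz of *.ratio files.

    Assumptions: the archive is a gzipped tarball, and the score is the
    last whitespace-separated field on each non-empty line.
    """
    scores = {}
    with tarfile.open(path, "r:gz") as archive:
        for member in archive.getmembers():
            # Skip directories and anything that is not a *.ratio file.
            if not (member.isfile() and member.name.endswith(".ratio")):
                continue
            text = archive.extractfile(member).read().decode("utf-8")
            values = []
            for line in text.splitlines():
                fields = line.split()
                if fields:
                    values.append(float(fields[-1]))
            # Use the file name without directory or extension as the key.
            name = os.path.basename(member.name)[: -len(".ratio")]
            scores[name] = values
    return scores
```

Because the files describe mature mRNAs, index i in each returned list would correspond to position i + 1 of the spliced transcript, not to a genomic coordinate.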



If you have any questions or comments, please contact:

Mailing addresses:

204G Carolyn Lynch Laboratory
433 S. University Ave.
Department of Biology
University of Pennsylvania
Philadelphia, PA 19104 USA
Gregory Lab homepage

1424 Blockley Hall
423 Guardian Drive
Penn Center for Bioinformatics
University of Pennsylvania
Philadelphia, PA 19104 USA
Wang Lab homepage