Biology:DIMPL

DIMPL (Discovery of Intergenic Motifs PipeLine) is a bioinformatics pipeline that enables the extraction and selection of GC-rich bacterial intergenic regions (IGRs) enriched for structured non-coding RNAs (ncRNAs).[1] The approach of enriching bacterial IGRs for ncRNA motif discovery was first reported in the study "Genome-wide discovery of structured noncoding RNAs in bacteria".[2] The DIMPL pipeline automates whole-genome analysis by extracting IGRs, filtering them by length and nucleotide composition, and collecting the data needed to identify candidate motifs and assign their possible functions. It uses support vector machine (SVM) classifiers to provide a reproducible way of identifying genomic regions enriched for ncRNAs. It can be used to look for nucleic acid and protein motifs, including riboswitch-like elements, upstream open reading frames (uORFs), short open reading frames (sORFs), ribosomal protein leader sequences, selfish genetic elements and other structured RNA motifs of unknown function.
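The extraction and filtering steps described above can be illustrated with a minimal Python sketch. It is not DIMPL's own code: the function names, the 0-based half-open gene coordinates, the linear (non-circular) treatment of the genome, and the fixed length and GC cutoffs are all assumptions made for illustration. In particular, the simple thresholds stand in for the SVM classifiers that DIMPL is reported to use.

```python
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def intergenic_regions(genome_length: int,
                       genes: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the gaps between annotated genes as (start, end) intervals.

    Assumption: coordinates are 0-based and half-open, and the genome is
    treated as linear, so a region spanning the origin is not joined.
    """
    regions = []
    prev_end = 0
    for start, end in sorted(genes):
        if start > prev_end:
            regions.append((prev_end, start))
        # max() keeps overlapping or nested genes from reopening a gap
        prev_end = max(prev_end, end)
    if prev_end < genome_length:
        regions.append((prev_end, genome_length))
    return regions


def select_igrs(genome: str,
                genes: List[Tuple[int, int]],
                min_len: int,
                min_gc: float) -> List[Tuple[int, int, str]]:
    """Keep IGRs that meet hypothetical length and GC-content cutoffs.

    Assumption: fixed thresholds replace the SVM classifier here.
    """
    selected = []
    for start, end in intergenic_regions(len(genome), genes):
        seq = genome[start:end]
        if len(seq) >= min_len and gc_fraction(seq) >= min_gc:
            selected.append((start, end, seq))
    return selected


# Toy example: two genes separated by a short GC-rich gap
toy_genome = "AAAA" + "ATGCCCTAA" + "GCGCGCGC" + "ATGTTTTAA" + "AT"
toy_genes = [(21, 30), (4, 13)]
candidates = select_igrs(toy_genome, toy_genes, min_len=5, min_gc=0.5)
```

In this toy input only the gap between the two genes is both long enough and GC-rich enough to be kept. A real analysis would read gene coordinates from a genome annotation and would score regions with a trained classifier rather than hard cutoffs.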

DIMPL uses various sequence analysis resources, including:

  • Rfam database,[3] as a reference of known RNA families
  • BLASTX search tool,[4] to eliminate unannotated protein coding regions
  • INFERNAL package,[5][6] to search the IGR sequences
  • CMfinder,[7] to look for possible RNA secondary structure features
  • R-scape software[8] and R2R drawing algorithm,[9] to generate the consensus model
  • RNAcode,[10] to look for the presence of coding regions
  • GenomeView,[11] to visualize the genetic context of the RNA motif

RNA motifs discovered using DIMPL include the HMP-PP riboswitch and the icd-II, carA, and ldh2 ncRNA motifs, among others.[12]

References

  1. Brewer, Kenneth I.; Gaffield, Glenn J.; Puri, Malavika; Breaker, Ronald R. (2021-09-15). "DIMPL: a bioinformatics pipeline for the discovery of structured noncoding RNA motifs in bacteria". Bioinformatics 38 (2): 533–535. doi:10.1093/bioinformatics/btab624. ISSN 1367-4811. PMID 34524415. 
  2. Stav, Shira; Atilho, Ruben M.; Mirihana Arachchilage, Gayan; Nguyen, Giahoa; Higgs, Gadareth; Breaker, Ronald R. (2019-03-22). "Genome-wide discovery of structured noncoding RNAs in bacteria". BMC Microbiology 19 (1): 66. doi:10.1186/s12866-019-1433-7. ISSN 1471-2180. PMID 30902049. 
  3. Kalvari, Ioanna; Nawrocki, Eric P.; Argasinska, Joanna; Quinones-Olvera, Natalia; Finn, Robert D.; Bateman, Alex; Petrov, Anton I. (2018-06-05). "Non-Coding RNA Analysis Using the Rfam Database". Current Protocols in Bioinformatics 62 (1): e51. doi:10.1002/cpbi.51. ISSN 1934-340X. PMID 29927072. 
  4. Camacho, Christiam; Coulouris, George; Avagyan, Vahram; Ma, Ning; Papadopoulos, Jason; Bealer, Kevin; Madden, Thomas L. (2009-12-15). "BLAST+: architecture and applications". BMC Bioinformatics 10: 421. doi:10.1186/1471-2105-10-421. ISSN 1471-2105. PMID 20003500. 
  5. Mandiwanza, Tafadzwa; Kaliaperumal, Chandrasekaran; Mulligan, Linda; Ryan, Elizabeth; Looby, Seamus; Caird, John; Brett, Francesca (2017-02-20). "Child with radiologically recurrent thalamic tumor". Brain Pathology 27 (2): 239–240. doi:10.1111/bpa.12490. ISSN 1015-6305. PMID 28217956. 
  6. Nawrocki, Eric P.; Eddy, Sean R. (2013-11-15). "Infernal 1.1: 100-fold faster RNA homology searches". Bioinformatics 29 (22): 2933–2935. doi:10.1093/bioinformatics/btt509. ISSN 1367-4811. PMID 24008419. 
  7. Yao, Zizhen; Weinberg, Zasha; Ruzzo, Walter L. (2006-02-15). "CMfinder--a covariance model based RNA motif finding algorithm". Bioinformatics 22 (4): 445–452. doi:10.1093/bioinformatics/btk008. ISSN 1367-4803. PMID 16357030. 
  8. Rivas, Elena; Clements, Jody; Eddy, Sean R. (2016-11-07). "A statistical test for conserved RNA structure shows lack of evidence for structure in lncRNAs". Nature Methods 14 (1): 45–48. doi:10.1038/nmeth.4066. ISSN 1548-7105. PMID 27819659. 
  9. Weinberg, Zasha; Breaker, Ronald R. (2011-01-04). "R2R--software to speed the depiction of aesthetic consensus RNA secondary structures". BMC Bioinformatics 12: 3. doi:10.1186/1471-2105-12-3. ISSN 1471-2105. PMID 21205310. 
  10. Washietl, Stefan; Findeiss, Sven; Müller, Stephan A.; Kalkhof, Stefan; von Bergen, Martin; Hofacker, Ivo L.; Stadler, Peter F.; Goldman, Nick (2011-02-28). "RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data". RNA 17 (4): 578–594. doi:10.1261/rna.2536111. ISSN 1469-9001. PMID 21357752. 
  11. Abeel, Thomas; Van Parys, Thomas; Saeys, Yvan; Galagan, James; Van de Peer, Yves (2011-11-18). "GenomeView: a next-generation genome browser". Nucleic Acids Research 40 (2): e12. doi:10.1093/nar/gkr995. ISSN 1362-4962. PMID 22102585. 
  12. Brewer, Kenneth I.; Greenlee, Etienne B.; Higgs, Gadareth; Yu, Diane; Mirihana Arachchilage, Gayan; Chen, Xi; King, Nicholas; White, Neil et al. (2021-05-10). "Comprehensive discovery of novel structured noncoding RNAs in 26 bacterial genomes". RNA Biology 18 (12): 2417–2432. doi:10.1080/15476286.2021.1917891. ISSN 1555-8584. PMID 33970790.