Serratus (virology)
Stable release | v210110 / January 10, 2023 |
---|---|
Operating system | Linux, web-based |
Type | Bioinformatics |
License | Code: GPLv3; data: CC0 |
Website | serratus |
Serratus is a large-scale viroinformatics platform for uncovering the total genetic diversity of Earth's virome. The project began with the goal of finding novel coronaviruses[1] that other researchers may have sequenced incidentally, and later expanded to cover all RNA viruses, defined here as viruses that encode a viral RNA-dependent RNA polymerase (RdRp).
By the end of 2020, approximately 15,000 distinct RNA virus sequences were known from public databases, where two RdRp sequences count as distinct if they differ by more than 10% in amino acid sequence. Using a bioinformatics workflow optimized for large-scale cloud computing, the research team analyzed 5.7 million freely available sequencing datasets (20.4 petabytes of raw data) from the Sequence Read Archive (SRA) in only 11 days, at a computing cost of US$23,900.[2] The analysis yielded 132,000 novel viral RdRp sequences, nearly an order of magnitude increase in the known genetic diversity of RNA viruses.[3]
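The figures quoted above can be turned into a few derived quantities with simple arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative, and the conversion of 1 petabyte to 1,000 terabytes is an assumption (decimal units).

```python
# Figures quoted in the text.
known_rdrp = 15_000        # distinct RdRp known by the end of 2020
novel_rdrp = 132_000       # novel RdRp reported by the analysis
datasets = 5_700_000       # SRA sequencing datasets analyzed
raw_petabytes = 20.4       # raw data processed
days = 11                  # wall-clock duration of the analysis
cost_usd = 23_900          # reported computing cost

# Assumption: decimal units, 1 PB = 1,000 TB.
raw_terabytes = raw_petabytes * 1000

# Growth in known RdRp diversity (known + novel, relative to known).
fold_increase = (known_rdrp + novel_rdrp) / known_rdrp

# Unit costs and throughput implied by the totals.
cost_per_dataset = cost_usd / datasets
cost_per_terabyte = cost_usd / raw_terabytes
petabytes_per_day = raw_petabytes / days

print(f"Fold increase in known RdRp: {fold_increase:.1f}x")
print(f"Cost per dataset: ${cost_per_dataset:.4f}")
print(f"Cost per terabyte: ${cost_per_terabyte:.2f}")
print(f"Throughput: {petabytes_per_day:.2f} PB/day")
```

The total grows from 15,000 to 147,000 distinct RdRp, a 9.8-fold increase, which is consistent with the "nearly an order of magnitude" description. The implied cost is well under one US cent per dataset.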
Within the database, RNA viruses are classified according to their RdRp palmprint,[4] a type of molecular barcode. The palmprint can serve as a computationally efficient index for identifying which SRA sequencing runs contain a particular RNA virus. Such an index allows targeted analysis of raw sequencing datasets, from which novel RNA viruses can be characterized.[5]
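The idea of using a barcode as an index can be illustrated with a simple inverted index that maps each palmprint to the sequencing runs in which it was detected. This is a minimal sketch and not the Serratus implementation: the function names, palmprint identifiers, and run labels below are all hypothetical placeholders.

```python
from collections import defaultdict


def build_palmprint_index(run_to_palmprints):
    """Invert a run -> palmprints mapping into palmprint -> runs."""
    index = defaultdict(set)
    for run, palmprints in run_to_palmprints.items():
        for palmprint_id in palmprints:
            index[palmprint_id].add(run)
    return dict(index)


def runs_containing(index, palmprint_id):
    """Return the sorted runs in which a palmprint was detected."""
    return sorted(index.get(palmprint_id, set()))


# Hypothetical toy data: which palmprints were found in which run.
toy_hits = {
    "RUN_A": {"pp1", "pp2"},
    "RUN_B": {"pp2"},
    "RUN_C": {"pp1", "pp3"},
}

toy_index = build_palmprint_index(toy_hits)
print(runs_containing(toy_index, "pp1"))  # ['RUN_A', 'RUN_C']
```

Once the index is built, finding the runs that contain a given virus is a single dictionary lookup, with no need to rescan the raw sequencing data. Only the selected runs then need to be retrieved for targeted analysis.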
All Serratus data are freely available under the INSDC release policy.
References
- Pennisi, Elizabeth. "New dangers? Computers uncover 100,000 novel viruses in old genetic data". Science. https://www.science.org/content/article/new-dangers-computers-uncover-100-000-novel-viruses-old-genetic-data.
- Pelley, Lauren. "Supercomputer helps Canadian researcher uncover thousands of viruses that could cause human diseases". CBC. https://www.cbc.ca/news/health/supercomputer-virus-study-disease-1.6345158.
- Edgar RC, Taylor J, Lin V, Altman T, Barbera P, Meleshko D (2022). "Petabase-scale sequence alignment catalyses viral discovery". Nature 602 (7895): 142–147. doi:10.1038/s41586-021-04332-2. PMID 35082445. Bibcode: 2022Natur.602..142E.
- Babaian, Artem; Edgar, Robert (13 October 2022). "Ribovirus classification by a polymerase barcode sequence". PeerJ 10: e14055. doi:10.7717/peerj.14055. ISSN 2167-8359. PMID 36258794.
- Cabrera Mederos, Dariel; Debat, Humberto; Torres, Carolina; Portal, Orelvis; Jaramillo Zapata, Margarita; Trucco, Verónica; Flores, Ceferino; Ortiz, Claudio et al. (October 2022). "An Unwanted Association: The Threat to Papaya Crops by a Novel Potexvirus in Northwest Argentina". Viruses 14 (10): 2297. doi:10.3390/v14102297. ISSN 1999-4915. PMID 36298852.