Software:List of gene prediction software

From HandWiki

This is a list of software tools and web portals used for gene prediction.

Name Description Species References
FINDER Automated software package to annotate eukaryotic genes from RNA-Seq data and associated protein sequences Eukaryotes [1]
FragGeneScan Predicts genes in complete genomes and in short, error-prone sequencing reads Prokaryotes, Metagenomes [2]
ATGpr Identifies translational initiation sites in cDNA sequences [3]
PRODIGAL Prokaryotic Dynamic Programming Genefinding Algorithm; based on log-likelihood functions and does not use hidden or interpolated Markov models Prokaryotes, Metagenomes (metaProdigal) [4]
AUGUSTUS Eukaryote gene predictor Eukaryotes [5]
BGF Hidden Markov model (HMM) and dynamic programming based ab initio gene prediction program
DIOGENES Fast detection of coding regions in short genome sequences
Dragon Promoter Finder Program to recognize vertebrate RNA polymerase II promoters
EUGENE Integrative gene finding Eukaryotes [6]
FGENESH HMM-based gene structure prediction: multiple genes, both strands Eukaryotes [7]
FRAMED Finds genes and frameshifts in G+C-rich prokaryote sequences Prokaryotes [8]
GeMoMa Homology-based gene prediction based on amino acid and intron position conservation as well as RNA-Seq data [9][10]
GENIUS Links ORFs in complete genomes to protein 3D structures
geneid Program to predict genes, exons, splice sites, and other signals along DNA sequences Eukaryotes [11]
GENEPARSER Parse DNA sequences into introns and exons
GeneMark Family of self-training gene prediction programs Prokaryotes, Eukaryotes, Metagenomes [12][13][14][15]
GeneTack Predicts genes with frameshifts in prokaryote genomes Prokaryotes [16]
GENOMESCAN Predicts locations and exon-intron structures of genes in genome sequences from a variety of organisms
GENSCAN Predicts complete gene structures in genomic DNA using a generalized hidden Markov model [17]
GLIMMER Finds genes in microbial DNA Prokaryotes
GLIMMERHMM Eukaryotic gene-finding system Eukaryotes [18]
GrailEXP Predicts exons, genes, promoters, poly(A) sites, CpG islands, EST similarities, and repeat elements in DNA sequences
mGene Support-vector machine (SVM) based system to find genes Eukaryotes [19]
mGene.ngs SVM based system to find genes using heterogeneous information: RNA-seq, tiling arrays Eukaryotes [20]
MORGAN Decision tree system to find genes in vertebrate DNA Eukaryotes
NIX Web tool to combine results from different programs: GRAIL, FEX, HEXON, MZEF, GENEMARK, GENEFINDER, FGENE, BLAST, POLYAH, REPEATMASKER, TRNASCAN
NNPP Neural network promoter prediction
NNSPLICE Neural network splice site prediction
ORF FINDER Graphical analysis tool to find all open reading frames
Regulatory Sequence Analysis Tools Series of modular computer programs to detect regulatory signals in non-coding sequences
PHANOTATE Tool to annotate phage genomes Phages [21]
SPLICEPREDICTOR Method to identify potential splice sites in (plant) pre-mRNA by sequence inspection using Bayesian statistical models Eukaryotes
VEIL Hidden Markov model to find genes in vertebrate DNA Eukaryotes
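
Several entries above, such as ORF FINDER, start from the simplest signal of a protein-coding region: an open reading frame (ORF) that runs from a start codon to an in-frame stop codon. The following is a minimal sketch of a naive six-frame ORF scan. It is not the algorithm of any listed tool; the function names, the `min_codons` parameter, and the use of the standard start and stop codons only are assumptions made for illustration.

```python
# Naive six-frame ORF scan; an illustrative sketch only, not the method of
# any tool in the table above. All names here are hypothetical.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq, min_codons):
    """Yield (start, end) of ATG...stop ORFs on one strand.

    Coordinates are 0-based with an exclusive end; the stop codon is included
    in both the span and the codon count.
    """
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    yield start, i + 3
                start = None


def find_orfs(seq, min_codons=3):
    """Return sorted (strand, start, end) tuples for ORFs on both strands.

    Coordinates for the "-" strand refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    hits = [("+", s, e) for s, e in orfs_in_strand(seq, min_codons)]
    rc = reverse_complement(seq)
    hits += [("-", s, e) for s, e in orfs_in_strand(rc, min_codons)]
    return sorted(hits)


if __name__ == "__main__":
    print(find_orfs("CCATGAAATAGCC"))
```

A scan like this reports every start-to-stop stretch and cannot tell real genes from chance ORFs, which is why the gene finders listed above add statistical models (for example hidden Markov models or support-vector machines) or homology and RNA-Seq evidence.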

References

  1. "FINDER: an automated software package to annotate eukaryotic genes from RNA-Seq data and associated protein sequences". BMC Bioinformatics 22 (1): 205. April 2021. doi:10.1186/s12859-021-04120-9. PMID 33879057. 
  2. "FragGeneScan: predicting genes in short and error-prone reads". Nucleic Acids Research 38 (20): e191. November 2010. doi:10.1093/nar/gkq747. PMID 20805240. 
  3. "Prediction of Translation Initiation ATG". http://atgpr.dbcls.jp. 
  4. "Prodigal: prokaryotic gene recognition and translation initiation site identification". BMC Bioinformatics 11: 119. March 2010. doi:10.1186/1471-2105-11-119. PMID 20211023. 
  5. "A novel hybrid gene prediction method employing protein multiple sequence alignments". Bioinformatics 27 (6): 757–63. March 2011. doi:10.1093/bioinformatics/btr010. PMID 21216780. 
  6. "Genome annotation in plants and fungi: EuGene as a model platform". Current Bioinformatics 3 (2): 87–97. May 2008. doi:10.2174/157489308784340702. https://www.ingentaconnect.com/content/ben/cbio/2008/00000003/00000002/art00003. 
  7. "Ab initio gene finding in Drosophila genomic DNA". Genome Research 10 (4): 516–22. April 2000. doi:10.1101/gr.10.4.516. PMID 10779491. 
  8. "FrameD: A flexible program for quality check and gene prediction in prokaryotic genomes and noisy matured eukaryotic sequences". Nucleic Acids Research 31 (13): 3738–41. July 2003. doi:10.1093/nar/gkg610. PMID 12824407. 
  9. "Using intron position conservation for homology-based gene prediction". Nucleic Acids Research 44 (9): e89. May 2016. doi:10.1093/nar/gkw092. PMID 26893356. 
  10. "Combining RNA-seq data and homology-based gene prediction for plants, animals and fungi". BMC Bioinformatics 19 (1): 189. May 2018. doi:10.1186/s12859-018-2203-5. PMID 29843602. 
  11. Blanco, Enrique; Parra, Genís; Guigó, Roderic (June 2007), "Using geneid to Identify Genes" (in en), Current Protocols in Bioinformatics (John Wiley & Sons, Inc.) Chapter 4: 4.3.1–4.3.28, doi:10.1002/0471250953.bi0403s18, ISBN 978-0471250951, PMID 18428791 
  12. "GeneMark.hmm: new solutions for gene finding". Nucleic Acids Research 26 (4): 1107–15. February 1998. doi:10.1093/nar/26.4.1107. PMID 9461475. 
  13. "GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions". Nucleic Acids Research 29 (12): 2607–18. June 2001. doi:10.1093/nar/29.12.2607. PMID 11410670. 
  14. "Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm". Nucleic Acids Research 42 (15): e119. September 2014. doi:10.1093/nar/gku557. PMID 24990371. 
  15. "Ab initio gene identification in metagenomic sequences". Nucleic Acids Research 38 (12): e132. July 2010. doi:10.1093/nar/gkq275. PMID 20403810. 
  16. "Genetack: frameshift identification in protein-coding sequences by the Viterbi algorithm". Journal of Bioinformatics and Computational Biology 8 (3): 535–51. June 2010. doi:10.1142/S0219720010004847. PMID 20556861. 
  17. "Prediction of complete gene structures in human genomic DNA". Journal of Molecular Biology 268 (1): 78–94. April 1997. doi:10.1006/jmbi.1997.0951. PMID 9149143. 
  18. "TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders". Bioinformatics 20 (16): 2878–9. November 2004. doi:10.1093/bioinformatics/bth315. PMID 15145805. 
  19. "mGene: accurate SVM-based gene finding with an application to nematode genomes". Genome Research 19 (11): 2133–43. November 2009. doi:10.1101/gr.090597.108. PMID 19564452. 
  20. "Multiple reference genomes and transcriptomes for Arabidopsis thaliana". Nature 477 (7365): 419–23. August 2011. doi:10.1038/nature10414. PMID 21874022. Bibcode: 2011Natur.477..419G. 
  21. McNair, Katelyn; Zhou, Carol; Dinsdale, Elizabeth A.; Souza, Brian; Edwards, Robert A. (2019-11-01). "PHANOTATE: a novel approach to gene identification in phage genomes" (in en). Bioinformatics 35 (22): 4537–4542. doi:10.1093/bioinformatics/btz265. ISSN 1367-4803. https://academic.oup.com/bioinformatics/article/35/22/4537/5480131.