Biology:G-value paradox

From HandWiki
Short description: Genetic paradox

The G-value paradox arises from the lack of correlation between the number of protein-coding genes among eukaryotes and their relative biological complexity. The microscopic nematode Caenorhabditis elegans, for example, is composed of only about a thousand cells but has about the same number of genes as a human.[1][2] Researchers suggest that the resolution of the paradox may lie in mechanisms such as alternative splicing and complex gene regulation, which make the genes of humans and other complex eukaryotes relatively more productive.[3]

DNA and biological complexity

The lack of correlation between the morphological complexity of eukaryotes and the amount of genetic information they carry has long puzzled researchers.[4] The sheer amount of DNA in an organism, measured by the mass of DNA present in the nucleus or the number of constituent nucleotide pairs, varies by several orders of magnitude among eukaryotes and is often unrelated to an organism’s size or developmental complexity.[5] One amoeba has 200 times more DNA per cell than humans,[6] and even insects and plants within the same genus can vary dramatically in their quantity of DNA.[7] This C-value paradox troubled genome scientists for many years.

Eventually, researchers recognized that not all DNA contributes directly to the production of proteins and other biological functions.[8] Susumu Ohno coined the phrase “junk DNA” to describe these nonfunctional swaths of DNA.[9] They include introns, sequences that are transcribed but spliced out of the pre-mRNA and thus are not translated into proteins;[4][10] transposable elements, mobile fragments of DNA, most of which are nonfunctional in humans;[8][11] and pseudogenes, nonfunctional DNA sequences that originated from functional genes.[12] The share of the human genome that may be considered “junk” remains controversial. Estimates of the functional fraction range from as low as 8%[13] to as high as 80%,[14] with one researcher arguing that the genome’s genetic load imposes an upper limit of 15% on that fraction.[15] (Prokaryotes, which have little “junk” DNA by comparison, exhibit a fairly close relationship between genome size and biological functionality.)[16]

In any case, the assumption was that once the C-value paradox was swept away and the focus shifted to the number of protein-coding genes, the anticipated correlation between genetic information and biological complexity in eukaryotes would emerge.[3] Instead, the G-value paradox simply picked up where the C-value paradox left off, because the discrepancy persisted when comparisons were narrowed to just protein-coding genes.[3][17]

G-value paradox

Estimates of the number of protein-coding genes in the human genome ran as high as 100,000 before the Human Genome Project,[18] but have since fallen to as low as 19,000 following completion of that massive sequencing effort and subsequent refinements.[1] By comparison, the microscopic water flea Daphnia pulex has about 31,000 genes;[19] the nematode C. elegans about 19,700;[2] the fruit fly (Drosophila melanogaster) about 14,000;[20] the zebrafish (Danio rerio), 26,000;[21] and the small flowering plant Arabidopsis thaliana, 27,000.[22] Plants in general tend to have more genes than other eukaryotes.[23] One explanation is their higher incidence of gene and whole genome duplication and retention of those additional genes, due in part to their development of a large collection of defensive secondary metabolites.[23]
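The comparison above can be made concrete with a short Python sketch. It only tabulates the approximate gene counts quoted in this section, using the low-end estimate of 19,000 for humans, and ranks the species by gene count. The names GENE_COUNTS, ratios_to_human and ranked are illustrative and do not come from any cited source, and the figures are rounded estimates rather than authoritative annotations.

```python
# Approximate protein-coding gene counts as quoted in this section.
# The human figure is the low-end estimate; all values are rounded.
GENE_COUNTS = {
    "Homo sapiens": 19_000,
    "Daphnia pulex": 31_000,
    "Caenorhabditis elegans": 19_700,
    "Drosophila melanogaster": 14_000,
    "Danio rerio": 26_000,
    "Arabidopsis thaliana": 27_000,
}


def ratios_to_human(counts, reference="Homo sapiens"):
    """Return each species' gene count as a multiple of the reference species."""
    ref = counts[reference]
    return {species: n / ref for species, n in counts.items()}


def ranked(counts):
    """Return (species, count) pairs sorted from most to fewest genes."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    ratios = ratios_to_human(GENE_COUNTS)
    for species, n in ranked(GENE_COUNTS):
        print(f"{species:<26}{n:>8,}  {ratios[species]:.2f}x human")
```

On these figures humans rank fifth of the six species, ahead of only the fruit fly, which is the mismatch the paradox describes.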

The apparent disconnect between the number of genes in a species and its biological complexity was dubbed the G-value paradox.[3] While the C-value paradox unraveled with the discovery of massive sequences of noncoding DNA, resolution of the G-value paradox appears to rest on differences in genome productivity. Humans and other complex eukaryotes simply may be able to do more with what they have, genetically speaking.

Among the mechanisms cited for this greater productivity are more sophisticated transcriptional controls,[24] multifunctional proteins, more interaction between protein products, alternative splicing[25] and post-translational modifications that may produce several protein products from the same genetic raw material.[3][24] In addition, thousands of non-coding RNAs that are transcribed from DNA but not translated into protein have emerged as important regulators of gene expression and development in humans and other eukaryotes.[26] They include short RNA sequences, such as microRNAs (miRNAs), small interfering RNAs (siRNAs) and Piwi-interacting RNAs (piRNAs),[26] and long non-coding RNAs (lncRNAs) that may regulate gene expression at different stages of development.[27] Some researchers suggest that the focus should now shift from the number of genes to gene interactions and the network of genetic regulatory mechanisms that allow genes to support a variety of biological activities.[24][28] These transitions have taken analysis of genetic complexity from the C-value to the G-value to what some refer to as the I-value, a measure of the total information contained in a genome.[3]
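How alternative splicing can yield several products from one gene can be illustrated with a deliberately simplified Python sketch. It assumes a toy model in which each optional (cassette) exon is independently included or skipped, so a gene with k such exons gives up to 2^k distinct transcripts. The function splice_isoforms and the exon names E1 to E4 are hypothetical, and real splicing is regulated and far less combinatorially free than this model suggests.

```python
from itertools import product


def splice_isoforms(exons):
    """Enumerate transcript isoforms for a toy gene.

    `exons` is a list of (name, is_optional) pairs in genomic order,
    with unique names. Constitutive exons are always kept; each optional
    (cassette) exon is independently included or skipped.
    """
    optional = [name for name, is_optional in exons if is_optional]
    isoforms = []
    for choices in product([True, False], repeat=len(optional)):
        keep = dict(zip(optional, choices))
        isoforms.append(
            tuple(name for name, is_optional in exons
                  if not is_optional or keep[name])
        )
    return isoforms


# Hypothetical four-exon gene with two cassette exons (E2 and E3).
toy_gene = [("E1", False), ("E2", True), ("E3", True), ("E4", False)]
for isoform in splice_isoforms(toy_gene):
    print("-".join(isoform))
```

Under this assumption the toy gene yields four transcripts from a single locus, and each additional cassette exon doubles the count, which is one way a modest number of genes could support a much larger repertoire of proteins.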

Defining complexity

One of the challenges in the long debate over the mismatch between genome size and biological complexity has been ambiguity in defining complexity. Is it the number of cell types in an organism, the sophistication of its nervous system or the number of different proteins it produces?[17] By some definitions, the greater complexity of humans compared to other organisms may be illusory.[29] Even once complexity is defined, some researchers argue complexity in function does not necessarily require the same complexity in process. Evolution is not a paragon of efficiency but travels a crooked path that leads to a more cumbersome genome than is necessary in some species.[30]

References

  1. "Multiple evidence strands suggest that there may be as few as 19,000 human protein-coding genes". Human Molecular Genetics 23 (22): 5866–78. November 2014. doi:10.1093/hmg/ddu309. PMID 24939910.
  2. "Genomics in C. elegans: so many genes, such a little worm". Genome Research 15 (12): 1651–60. December 2005. doi:10.1101/gr.3729105. PMID 16339362.
  3. "The g-value paradox". Evolution & Development 4 (2): 73–5. 2002. doi:10.1046/j.1525-142X.2002.01069.x. PMID 12004964.
  4. "Chromosome structure and the C-value paradox". The Journal of Cell Biology 91 (3 Pt 2): 3s–14s. December 1981. doi:10.1083/jcb.91.3.3s. PMID 7033242.
  5. "Nuclear volume control by nucleoskeletal DNA, selection for cell volume and cell growth rate, and the solution of the DNA C-value paradox". Journal of Cell Science 34: 247–78. December 1978. doi:10.1242/jcs.34.1.247. PMID 372199.
  6. "Algae: amounts of DNA and organic carbon in single cells". Science 163 (3862): 87–8. January 1969. doi:10.1126/science.163.3862.87. PMID 5812598. Bibcode: 1969Sci...163...87H.
  7. "The genetic organization of chromosomes". Annual Review of Genetics 5 (1): 237–56. 1971. doi:10.1146/annurev.ge.05.120171.001321. PMID 16097657.
  8. "Synergy between sequence and size in large-scale genomics". Nature Reviews. Genetics 6 (9): 699–708. September 2005. doi:10.1038/nrg1674. PMID 16151375.
  9. Ohno, S. (1972). "So much 'junk' DNA in our genome". Brookhaven Symp. Biol. 23: 366–370. PMID 5065367.
  10. "Genes-in-pieces revisited". Science 228 (4701): 823–4. May 1985. doi:10.1126/science.4001923. PMID 4001923. Bibcode: 1985Sci...228..823G.
  11. "Selfish DNA: the ultimate parasite". Nature 284 (5757): 604–7. April 1980. doi:10.1038/284604a0. PMID 7366731. Bibcode: 1980Natur.284..604O.
  12. "Pseudogenes: are they 'junk' or functional DNA?". Annual Review of Genetics 37 (1): 123–51. 2003. doi:10.1146/annurev.genet.37.040103.103949. PMID 14616058.
  13. Schierup, Mikkel H., ed (July 2014). "8.2% of the Human genome is constrained: variation in rates of turnover across functional element classes in the human lineage". PLOS Genetics 10 (7): e1004525. doi:10.1371/journal.pgen.1004525. PMID 25057982.
  14. "An integrated encyclopedia of DNA elements in the human genome". Nature 489 (7414): 57–74. September 2012. doi:10.1038/nature11247. PMID 22955616. Bibcode: 2012Natur.489...57T.
  15. Martin, Bill, ed (July 2017). "An Upper Limit on the Functional Fraction of the Human Genome". Genome Biology and Evolution 9 (7): 1880–1885. doi:10.1093/gbe/evx121. PMID 28854598.
  16. "The relationship between non-protein-coding DNA and eukaryotic complexity". BioEssays 29 (3): 288–99. March 2007. doi:10.1002/bies.20544. PMID 17295292.
  17. "Gene number. What if there are only 30,000 human genes?". Science 291 (5507): 1255–7. February 2001. doi:10.1126/science.1058969. PMID 11233450.
  18. "How many genes in the human genome?". Nature Genetics 7 (3): 345–6. July 1994. doi:10.1038/ng0794-345. PMID 7920649.
  19. "The ecoresponsive genome of Daphnia pulex". Science 331 (6017): 555–61. February 2011. doi:10.1126/science.1197761. PMID 21292972. Bibcode: 2011Sci...331..555C.
  20. "Genetics on the Fly: A Primer on the Drosophila Model System". Genetics 201 (3): 815–42. November 2015. doi:10.1534/genetics.115.183392. PMID 26564900.
  21. "The zebrafish reference genome sequence and its relationship to the human genome". Nature 496 (7446): 498–503. April 2013. doi:10.1038/nature12111. PMID 23594743. Bibcode: 2013Natur.496..498H.
  22. "The Arabidopsis Information Resource (TAIR): gene structure and function annotation". Nucleic Acids Research 36 (Database issue): D1009–14. January 2008. doi:10.1093/nar/gkm965. PMID 17986450.
  23. "How many genes are there in plants (... and why are they there)?". Current Opinion in Plant Biology 10 (2): 199–203. April 2007. doi:10.1016/j.pbi.2007.01.004. PMID 17289424.
  24. "The evolution of transcriptional regulation in eukaryotes". Molecular Biology and Evolution 20 (9): 1377–419. September 2003. doi:10.1093/molbev/msg140. PMID 12777501.
  25. "Species-specific variation of alternative splicing and transcriptional initiation in six eukaryotes". Gene 364: 53–62. December 2005. doi:10.1016/j.gene.2005.07.027. PMID 16219431.
  26. "Origin and evolution of the metazoan non-coding regulatory genome". Developmental Biology 427 (2): 193–202. July 2017. doi:10.1016/j.ydbio.2016.11.013. PMID 27880868.
  27. "Challenges in the analysis of long noncoding RNA functionality". FEBS Letters 590 (15): 2342–53. August 2016. doi:10.1002/1873-3468.12308. PMID 27417130.
  28. "Molecular biology and evolution. Can genes explain biological complexity?". Science 292 (5520): 1315–6. May 2001. doi:10.1126/science.1060852. PMID 11360989.
  29. "Perspective Metazoan Complexity and Evolution: Is There a Trend?". Evolution; International Journal of Organic Evolution 50 (2): 477–492. April 1996. doi:10.1111/j.1558-5646.1996.tb03861.x. PMID 28568940.
  30. "Evolution and tinkering". Science 196 (4295): 1161–6. June 1977. doi:10.1126/science.860134. PMID 860134. Bibcode: 1977Sci...196.1161J.