Biology:C5orf34

Short description: Protein-coding gene in the species Homo sapiens

C5orf34 (chromosome 5 open reading frame 34) is a protein that in humans is encoded by the C5orf34 gene (5p12).[1][2]

C5orf34 is conserved in mammals, birds and reptiles, with the most distant ortholog found in the Burmese python, Python bivittatus. The C5orf34 protein contains two domains conserved in mammals: DUF 4520 and DUF 4524. The protein is also predicted to contain a polo-box domain (PBD) like that of polo-like kinase 4 (plk4), which is predicted to be conserved in distant orthologs from the clade Aves.[3][4]

Gene

Human chromosomal position of C5orf34 gene on the short arm of chromosome 5.[5]

C5orf34 is located on the negative DNA strand of the short arm of chromosome 5 at band 12 (5p12). The gene is 28,744 base pairs long, spanning base pairs 43,486,701 to 43,515,445. It produces a single transcript 2,540 base pairs long that encodes a protein of 638 amino acids.[1][2][6]
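The figures in this section can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the quoted coordinates come from a single genome assembly, that the 2,540-base-pair figure refers to the mature transcript, and that the coding sequence ends with a single stop codon. All function and variable names are hypothetical.

```python
# Values quoted in the text; all names here are illustrative.
GENE_START = 43_486_701
GENE_END = 43_515_445
TRANSCRIPT_LENGTH = 2_540
PROTEIN_LENGTH = 638


def coordinate_difference(start: int, end: int) -> int:
    """Return the distance between two genomic coordinates (end minus start)."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start


gene_length = coordinate_difference(GENE_START, GENE_END)

# Assumption: one codon per amino acid plus a single stop codon.
coding_length = (PROTEIN_LENGTH + 1) * 3
untranslated_length = TRANSCRIPT_LENGTH - coding_length

print(f"gene length: {gene_length:,} bp")
print(f"coding sequence: {coding_length:,} nt")
print(f"untranslated remainder: {untranslated_length:,} nt")
```

The quoted gene length matches the plain coordinate difference; counting both boundary bases would give one base pair more.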

Gene neighborhood

The gene PAIP1, a member of the polyadenylate-binding family, lies on the negative strand just upstream of C5orf34 and extends from base pairs 43,526,267 to 43,557,419.[7] CCL28 lies downstream on the negative strand and extends from base pairs 43,378,052 to 43,413,837.[8]
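Whether a neighbouring gene counts as upstream or downstream depends on the strand of the reference gene: for a gene on the negative strand, the 5' (upstream) side lies at higher coordinates. The following is a minimal sketch of that rule using the coordinates quoted in this article; the `Gene` class and helper functions are hypothetical names introduced for illustration, and the coordinates are assumed to share one genome assembly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # lower genomic coordinate
    end: int     # higher genomic coordinate
    strand: str  # "+" or "-"


def relative_position(reference: Gene, other: Gene) -> str:
    """Classify `other` as upstream or downstream of `reference`.

    Upstream means the 5' side of the reference gene, which is the
    higher-coordinate side when the reference is on the negative strand.
    """
    if other.start > reference.end:
        on_higher_side = True
    elif other.end < reference.start:
        on_higher_side = False
    else:
        return "overlapping"
    if reference.strand == "-":
        return "upstream" if on_higher_side else "downstream"
    return "downstream" if on_higher_side else "upstream"


def intergenic_gap(a: Gene, b: Gene) -> int:
    """Number of bases lying strictly between two non-overlapping genes."""
    if a.end < b.start:
        return b.start - a.end - 1
    if b.end < a.start:
        return a.start - b.end - 1
    return 0


C5ORF34 = Gene("C5orf34", 43_486_701, 43_515_445, "-")
NEIGHBOURS = [
    Gene("PAIP1", 43_526_267, 43_557_419, "-"),
    Gene("CCL28", 43_378_052, 43_413_837, "-"),
]

for gene in NEIGHBOURS:
    position = relative_position(C5ORF34, gene)
    gap = intergenic_gap(C5ORF34, gene)
    print(f"{gene.name}: {position}, {gap:,} bp from C5orf34")
```

With these coordinates, PAIP1 falls on the higher-coordinate (upstream) side and CCL28 on the lower-coordinate (downstream) side.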

Gene expression

Multiple sources indicate that, in humans, C5orf34 protein is expressed non-ubiquitously at low to moderate levels in select tissues, with the most abundant expression in the stomach, small intestine, testis, skeletal muscle and heart muscle.[9][10] A study of the effect of a Rho kinase inhibitor on primary cell lines also showed that C5orf34 is expressed in dermal fibroblasts from normal human tissue samples.[11]

Promoter

The promoter region for C5orf34 is predicted to lie between base pairs 43,515,079 and 43,515,773, a span of 695 base pairs.[12]
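The promoter coordinates can be related to the gene coordinates given in the Gene section. The sketch below is a rough check, assuming both sets of coordinates refer to the same genome assembly and that the 5' end of the gene is its higher coordinate (43,515,445) because the gene is on the negative strand. The variable names are illustrative.

```python
# Coordinates quoted in the article; names are illustrative.
PROMOTER_START, PROMOTER_END = 43_515_079, 43_515_773
GENE_START, GENE_END = 43_486_701, 43_515_445

# On the negative strand the gene is read from high to low coordinates,
# so its 5' end is the higher coordinate.
gene_five_prime_end = GENE_END

# Counting both boundary bases reproduces the quoted promoter size.
promoter_length = PROMOTER_END - PROMOTER_START + 1

# Part of the promoter beyond the 5' end of the gene (upstream of it).
upstream_bases = PROMOTER_END - gene_five_prime_end

# Part of the promoter that overlaps the annotated gene span.
overlapping_bases = gene_five_prime_end - PROMOTER_START + 1

print(f"promoter length: {promoter_length} bp")
print(f"upstream of the gene: {upstream_bases} bp")
print(f"overlapping the gene: {overlapping_bases} bp")
```

Under these assumptions the predicted promoter straddles the 5' end of the gene rather than lying entirely upstream of it.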

Protein

In humans, C5orf34 consists of 638 amino acids and has a molecular weight of 72.7 kDa and an isoelectric point of 7.77.[1][13][14]

Function

Although the precise function of C5orf34 in humans remains unknown, its predicted structure suggests that it is involved in kinase-related cellular functions.[15] In addition, C5orf34 is predicted to be nuclear, so it may be involved in gene regulation and cell proliferation, two primary signal transduction pathways that involve nuclear kinase proteins.[16][17]

A schematic representation of conserved domains and phosphorylated amino acid residues in human C5orf34. The red diamond projections are conserved phosphoserine sites and the grey diamond projections are conserved phosphothreonine sites.[5]

Structure

In humans, C5orf34 contains two domains of unknown function, DUF 4520 (pfam 15016) and DUF 4524 (pfam 15025), located at residues 6–153 and 444–539, respectively. The protein is rich in serine and threonine. Its charge is evenly distributed, with no positive or negative charge clusters.[13]

The secondary structure of the human protein was predicted with multiple bioinformatic tools. All of the programs predicted a structure consisting of alpha helices, extended strands, random coils and beta turns. The Phyre2 server modelled part of the human protein on the polo-box domain of the serine/threonine-protein kinase plk4, with 96.8% confidence over 20% of the sequence (130 residues). The modelled region included residues of the conserved polo-box domain and the two DUF domains. The protein was predicted to be predominantly soluble, with an average hydrophobicity of -0.478.[15][18][19]
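The domain boundaries and the Phyre2 coverage quoted in this section can be expressed as fractions of the 638-residue protein. This is a small illustrative calculation, assuming 1-based inclusive residue numbering; the dictionary and function names are hypothetical.

```python
PROTEIN_LENGTH = 638

# Residue ranges quoted in the text (assumed 1-based and inclusive).
DOMAINS = {
    "DUF 4520": (6, 153),
    "DUF 4524": (444, 539),
}
PHYRE2_MODELLED_RESIDUES = 130


def domain_length(start: int, end: int) -> int:
    """Number of residues in an inclusive residue range."""
    return end - start + 1


def percent_of_protein(residues: int) -> float:
    """Share of the full-length protein, in percent, rounded to one decimal."""
    return round(100 * residues / PROTEIN_LENGTH, 1)


lengths = {name: domain_length(*bounds) for name, bounds in DOMAINS.items()}
for name, length in lengths.items():
    print(f"{name}: {length} residues ({percent_of_protein(length)}%)")

total_domain_residues = sum(lengths.values())
print(f"both DUFs: {total_domain_residues} residues "
      f"({percent_of_protein(total_domain_residues)}%)")
print(f"Phyre2 model: {percent_of_protein(PHYRE2_MODELLED_RESIDUES)}% coverage")
```

The 130 modelled residues correspond to roughly one fifth of the protein, in line with the coverage reported above.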

Post-translational modifications

C5orf34 is predicted to be extensively phosphorylated, with 32 phosphoserines and 7 phosphothreonines conserved in orthologs of the human protein. This suggests that C5orf34 is a phosphoprotein, consistent with the structural predictions linking it to kinases. Only one residue, leucine 481, was predicted to belong to a nuclear export signal (NES), and its NES score was low at 0.515. Analysis of the protein predicted that it is retained in the nucleus with 87% probability.[17][20][21]

Interacting proteins

Databases of protein interactions (MINT, STRING, IntAct, and BioGRID) have not identified any interactions with C5orf34.

Homology and evolution

C5orf34 is highly conserved in primates and other mammals and moderately conserved in reptiles. The most distant ortholog identified is in the Burmese python, Python bivittatus. The table below gives a selected list of orthologs to illustrate the homology of this gene relative to the reference sequence in Homo sapiens.

Orthologous space

Orthologs of C5orf34 have been predicted in 151 organisms.[2] The most distant ortholog is in the Burmese python, which diverged from humans 324.5 million years ago, suggesting that C5orf34 predates the divergence of mammals from reptiles and birds.[3][22]

Table of C5orf34 orthologs

| Scientific Name | Common Name | Date of Divergence from Humans (MYA)[23] | NCBI Protein Accession # | Protein Length (amino acids) | Sequence Similarity (%) |
| --- | --- | --- | --- | --- | --- |
| Homo sapiens | Human | 0 | NP_001076895.1 | 638 | 100 |
| Gorilla gorilla | Gorilla | 8.8 | XP_004058945.1 | 636 | 92 |
| Camelus ferus | Bactrian Camel | 97.4 | XP_006191979.1 | 640 | 84 |
| Panthera tigris altaica | Siberian Tiger | 97.4 | XP_007095478.1 | 638 | 83 |
| Sus scrofa | Wild Boar | 97.4 | XP_003133971.3 | 441 | 80 |
| Bos taurus | Cattle | 97.4 | NP_001076895.1 | 638 | 80 |
| Erinaceus europaeus | European Hedgehog | 97.4 | XP_007517686.1 | 632 | 69 |
| Mus musculus | House Mouse | 91 | BAE28742.1 | 382 | 75 |
| Monodelphis domestica | Gray Short-tailed Opossum | 176.1 | XP_007487459.1 | 512 | 62 |
| Chelonia mydas | Green Turtle | 324.5 | XP_007052886.1 | 638 | 51 |
| Aptenodytes forsteri | Emperor Penguin | 324.5 | XP_009272830.1 | 647 | 48 |
| Gallus gallus | Chicken | 324.5 | XP_424782.3 | 669 | 48 |
| Python bivittatus | Burmese Python | 324.5 | XP_007430528.1 | 649 | 46 |

[3]
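The relationship between divergence time and sequence similarity in the ortholog table can be summarised numerically. The sketch below copies the divergence dates and similarity values from the table and computes a Pearson correlation by hand. It is only a rough illustration: the selected orthologs are not an unbiased sample, and the helper names are hypothetical.

```python
import math

# (scientific name, divergence from humans in MYA, sequence similarity in %)
# Values copied from the ortholog table.
ORTHOLOGS = [
    ("Homo sapiens", 0.0, 100),
    ("Gorilla gorilla", 8.8, 92),
    ("Camelus ferus", 97.4, 84),
    ("Panthera tigris altaica", 97.4, 83),
    ("Sus scrofa", 97.4, 80),
    ("Bos taurus", 97.4, 80),
    ("Erinaceus europaeus", 97.4, 69),
    ("Mus musculus", 91.0, 75),
    ("Monodelphis domestica", 176.1, 62),
    ("Chelonia mydas", 324.5, 51),
    ("Aptenodytes forsteri", 324.5, 48),
    ("Gallus gallus", 324.5, 48),
    ("Python bivittatus", 324.5, 46),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


divergence = [row[1] for row in ORTHOLOGS]
similarity = [row[2] for row in ORTHOLOGS]

correlation = pearson(divergence, similarity)
least_similar = min(ORTHOLOGS, key=lambda row: row[2])

print(f"Pearson r (divergence vs. similarity): {correlation:.2f}")
print(f"Least similar ortholog: {least_similar[0]} ({least_similar[2]}%)")
```

A strongly negative coefficient is what one would expect if similarity to the human sequence falls as divergence time increases.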

Paralogous space

There are no predicted paralogs for C5orf34 in either humans or mice.[3]

Conserved regions

Multiple sequence alignments indicated amino acid conservation throughout the C5orf34 protein across an array of orthologs, with the most highly conserved regions at the N-terminus and C-terminus, where the DUFs are located. DUF 4520 (pfam 15016) was conserved at the N-terminus and DUF 4524 (pfam 15025) at the C-terminus. The polo-box domain of plk4 was also conserved at the C-terminus in multiple sequence alignments of both strict and distant orthologs.[22]

References

  1. "NCBI Protein". https://www.ncbi.nlm.nih.gov/protein/Q96MH7.2. Retrieved 2015-05-09. 
  2. "NCBI Gene". https://www.ncbi.nlm.nih.gov/gene/375444. Retrieved 2015-05-09. 
  3. "NCBI Blast". http://blast.ncbi.nlm.nih.gov/Blast.cgi. Retrieved 2015-05-09. 
  4. Sillibourne, James E.; Bornens, Michel (2010-09-29). "Polo-like kinase 4: the odd one out of the family". Cell Division 5 (1): 25. doi:10.1186/1747-1028-5-25. ISSN 1747-1028. PMID 20920249. 
  5. Castro, Edouard. "PROSITE". http://prosite.expasy.org/cgi-bin/prosite/mydomains/. Retrieved 2015-05-10. 
  6. "Ensembl Genome Browser". http://www.ensembl.org/index.html. Retrieved 2015-05-09. 
  7. "NCBI Gene". https://www.ncbi.nlm.nih.gov/gene/10605. Retrieved 2015-05-09. 
  8. "NCBI Gene". https://www.ncbi.nlm.nih.gov/gene/56477. Retrieved 2015-05-09. 
  9. "Tissue expression of C5orf34 - Summary - The Human Protein Atlas". http://www.proteinatlas.org/ENSG00000172244-C5orf34/tissue. Retrieved 2015-05-09. 
  10. "NCBI GeoProfile". https://www.ncbi.nlm.nih.gov/geo/tools/profileGraph.cgi?ID=GDS424:54364_at. Retrieved 2015-05-09. 
  11. Boerma, Marjan; Fu, Qiang; Wang, Junru; Loose, David S.; Bartolozzi, Alessandra; Ellis, James L.; McGonigle, Sharon; Paradise, Elsa et al. (2008). "Comparative gene expression profiling in three primary human cell lines after treatment with a novel inhibitor of Rho kinase or atorvastatin". Blood Coagulation & Fibrinolysis 19 (7): 709–718. doi:10.1097/MBC.0b013e32830b2891. PMID 18832915. 
  12. "Genomatix: Annotation & Analysis". https://www.genomatix.de/cgi-bin//eldorado/eldorado.pl?s=3d3c68311f8e7379e43e09dce48b7eed;SHOW_ANNOTATION=TempSeq_P8qm9dKp;ELDORADO_VERSION=E28R1306. Retrieved 2015-05-09. 
  13. "Statistical Analysis of PS (SAPS)". Subramaniam, Shankar. http://workbench.sdsc.edu/CGI/BW.cgi#!. Retrieved 2015-05-05. [permanent dead link]
  14. "ExPASy - Compute pI/Mw tool". http://web.expasy.org/compute_pi/. Retrieved 2015-05-09. 
  15. "Phyre Investigator output for C5orf34__ with c1umwB_". http://www.sbg.bio.ic.ac.uk/phyre2/phyre2_output/18fc8a68c6a4caae/investigator/c1umwB_.1/summary.html. Retrieved 2015-05-09. [permanent dead link]
  16. Matthews, Harry R.; Huebner, Verena D. (1984-03-01). "Nuclear protein kinases". Molecular and Cellular Biochemistry 59 (1–2): 81–99. doi:10.1007/BF00231306. ISSN 0300-8177. PMID 6323962. 
  17. "PSORT II server". http://www.genscript.com/cgi-bin/tools/psort2.pl. Retrieved 2015-05-09. 
  18. UCBL, Institut. "NPS@ : SOPMA secondary structure prediction". https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_sopma.html. Retrieved 2015-05-09. 
  19. Sobhani, Armin. "PELE - Protein Energy Landscape Exploration - Web Server". https://pele.bsc.es/pele.wt. Retrieved 2015-05-09. 
  20. "NetPhos 2.0 Server". http://www.cbs.dtu.dk/services/NetPhos/. Retrieved 2015-05-09. 
  21. "NetNES 1.1 Server". http://www.cbs.dtu.dk/services/NetNES/. Retrieved 2015-05-09. 
  22. "CLUSTALW". Subramaniam, Shankar. 2015-05-05. http://workbench.sdsc.edu/CGI/BW.cgi#!. [permanent dead link]
  23. "TimeTree :: The Timescale of Life". http://www.timetree.org/. Retrieved 2015-05-10.