Biology:CCDC138

Short description: Protein found in humans

Coiled-coil domain-containing protein 138, also known as CCDC138, is a human protein encoded by the CCDC138 gene. The exact function of CCDC138 is unknown.




Gene

The CCDC138 gene is found on the positive strand of chromosome 2.[1]

Locus

The CCDC138 gene is located on the long (q) arm of chromosome 2 at locus 12.3,[2] or in short 2q12.3. It spans positions 108,786,752-108,876,591.[3] The DNA sequence is 89,840 bp long.
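
The stated length follows from the coordinates if they are read as a 1-based, inclusive interval. The short sketch below checks that arithmetic; the function name `span_length` is illustrative and not part of any database API.

```python
# Coordinates quoted in the text, assumed to be 1-based and inclusive.
START = 108_786_752
END = 108_876_591


def span_length(start: int, end: int) -> int:
    """Return the number of bases in a 1-based, inclusive interval."""
    return end - start + 1


length_bp = span_length(START, END)
print(f"{length_bp:,} bp")  # 89,840 bp
```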

The red line shows the CCDC138 locus on chromosome 2q12.3.

Common aliases

CCDC138 is the only established common alias.

Homology and evolution

Paralogs

No paralogs of CCDC138 have been identified.

Orthologs

CCDC138 is conserved in various organisms as shown in the table below.

Scientific name | Common name | Date of divergence from human lineage[4] | Sequence length | Sequence identity to human RNA/protein | Sequence similarity to human RNA/protein
Mus musculus | House mouse | 92.3 MYA | 2466 bp | 77% | 65.7%
Columba livia | Rock dove | 296 MYA | 1863 bp | 61% | 59.4%
Xenopus laevis | African clawed frog | 371.2 MYA | 2634 bp | 52% | 40.1%
Anolis carolinensis | Red-throated anole | 296 MYA | 9588 bp | 78% | 46.3%
Latimeria chalumnae | West Indian Ocean coelacanth | 414.9 MYA | 1838 bp | 71% | 38.1%
Strongylocentrotus purpuratus | Purple sea urchin | 742.9 MYA | 2047 bp | 59% | 14.5%
Ciona intestinalis | Vase tunicate | 722.5 MYA | 2420 bp | 56% | 23.8%
Aplysia californica | California sea slug | 782.7 MYA | 2103 bp | 49% | 17.2%
Hydra vulgaris | Fresh-water polyp | 855.3 MYA | 1482 bp | 46% | 13.7%
Chrysemys picta bellii | Western painted turtle | 296 MYA | 700 bp | 36% | 15.9%
Alligator mississippiensis | American alligator | 296 MYA | 2089 bp | 76% | 38.4%
Melopsittacus undulatus | Budgerigar | 296 MYA | 1764 bp | 73% | 40.3%
Taeniopygia guttata | Zebra finch | 296 MYA | 1980 bp | 75% | 44.2%
Lepisosteus oculatus | Spotted gar | 400.1 MYA | 1269 bp | 65% | 30.9%
Saccoglossus kowalevskii | Acorn worm | 661.2 MYA | 2515 bp | 42% | 24.1%
Branchiostoma floridae | Lancelet | 713.2 MYA | 1758 bp | 55% | 27.0%
Maylandia zebra | Zebra mbuna fish | 400.1 MYA | 4815 bp | 58% | 21.4%
Trichoplax adhaerens | Trichoplax | 800 MYA | 1605 bp | 56% | 10.3%
Pelodiscus sinensis | Chinese softshell turtle | 296 MYA | 2895 bp | 77% | 33.9%
Falco cherrug | Saker falcon | 296 MYA | 1866 bp | 73% | 48.9%
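
One way to read the table is to order the orthologs by divergence date and compare that with the similarity column. The sketch below does this for a subset of rows copied from the table; the tuple layout and variable names are illustrative choices, and the last column of the table is taken as the similarity value.

```python
# (common name, divergence from human lineage in MYA, similarity to human in %)
# Subset of rows copied from the ortholog table.
orthologs = [
    ("House mouse", 92.3, 65.7),
    ("Rock dove", 296.0, 59.4),
    ("African clawed frog", 371.2, 40.1),
    ("West Indian Ocean coelacanth", 414.9, 38.1),
    ("Vase tunicate", 722.5, 23.8),
    ("Purple sea urchin", 742.9, 14.5),
    ("Trichoplax", 800.0, 10.3),
    ("Fresh-water polyp", 855.3, 13.7),
]

# Order from most recently to most anciently diverged.
by_divergence = sorted(orthologs, key=lambda row: row[1])
for name, mya, similarity in by_divergence:
    print(f"{name:30s} {mya:7.1f} MYA  {similarity:5.1f}%")

most_similar = max(orthologs, key=lambda row: row[2])
least_similar = min(orthologs, key=lambda row: row[2])
print("Most similar: ", most_similar[0])
print("Least similar:", least_similar[0])
```

In this subset, similarity generally falls as divergence time increases, although the trend is not strictly monotonic.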

Distant homologs

The most distant homolog detected or predicted is in Trichoplax adhaerens, which has a conserved CCDC138 gene and diverged from the human lineage about 800 MYA.

Homologous domains

Among the orthologs listed above, several homologous regions are conserved, as shown in the figures below.

CCDC138 multiple sequence alignment showing conserved regions.

Green shows completely conserved residues, yellow shows identical residues, cyan shows similar residues, and white shows different residues.

Phylogeny

The observed phylogeny of the CCDC138 gene in the above-mentioned orthologs recapitulates their evolutionary history.[5]

CCDC138 rooted phylogeny tree

The figure above shows the evolutionary relationship of CCDC138 in the orthologs.

Protein

The CCDC138 protein is predicted to have a molecular weight of 76.2 kDa[6] and an isoelectric point of 8.614.[7] Compositional analysis shows low usage of the AGP grouping in CCDC138 and no positive, negative, or mixed charge clusters. The protein has no ER retention motif at the C-terminus and no RNA-binding motif.[8] It is also predicted to be a soluble nuclear protein with a leucine zipper pattern (PS00029) starting at position 205, with the sequence LQKRERFLLEREQLLFRHENAL.[8]
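
The PROSITE leucine zipper pattern PS00029 is L-x(6)-L-x(6)-L-x(6)-L, that is, four leucines each separated by six arbitrary residues. As a minimal sketch, the pattern can be written as a regular expression and checked against the segment quoted above. The mapping to protein coordinates assumes that position 205 is the first residue of the quoted segment.

```python
import re

# PROSITE PS00029 (leucine zipper): L-x(6)-L-x(6)-L-x(6)-L
LEUCINE_ZIPPER = re.compile(r"L.{6}L.{6}L.{6}L")

# Segment quoted in the text.
segment = "LQKRERFLLEREQLLFRHENAL"

# Assumption: the quoted segment begins at residue 205 of the protein.
SEGMENT_START = 205

match = LEUCINE_ZIPPER.search(segment)
if match:
    # 0-based offsets of the four pattern leucines within the segment.
    leucine_offsets = [match.start() + i for i in (0, 7, 14, 21)]
    protein_positions = [SEGMENT_START + off for off in leucine_offsets]
    print("Matched:", match.group())
    print("Leucine offsets in segment:", leucine_offsets)
    print("Assumed protein positions:", protein_positions)
```

The 22-residue segment matches the pattern over its full length, with the pattern leucines at every seventh residue.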

Primary sequence and variants/isoforms

There are two isoforms of the CCDC138 protein. The primary isoform has 665 amino acids[9] while the secondary isoform has 577 amino acids,[9] and is missing 88 amino acids at the C-terminus.

Pairwise sequence alignment comparing isoforms 1 and 2 of the CCDC138 protein.

The figure shows the pairwise sequence alignment comparing the primary isoform (Isoform 1) to the secondary isoform (Isoform 2).

Domain and motifs

A domain of unknown function (DUF2317), located at residues 212–315 of the protein, has been characterized in bacteria. TMHMM[10] and TMAP[11] predict no transmembrane domains. SOSUI[12] likewise predicts that CCDC138 is a soluble protein with no transmembrane domain.

Post-translational modifications

According to the SUMOplot Analysis Program,[13] there are 7 predicted sumoylation sites, at lysine residues K7, K207, K336, K374, K383, K521, and K591. NetPhos[14] predicts 44 phosphorylation sites: 29 serine residues, 10 threonine residues, and 5 tyrosine residues. No further post-translational modifications are predicted by NetNGlyc,[15] NetOGlyc,[16] SignalP,[17] Sulfinator,[18] or Myristoylator.[19]
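
The predicted counts can be kept in a small data structure and checked for internal consistency. The sketch below uses only the values quoted in the text; the variable names are illustrative, and the length of 665 is that of the primary isoform described earlier in the article.

```python
# Predictions quoted in the text (SUMOplot and NetPhos).
sumoylation_sites = ["K7", "K207", "K336", "K374", "K383", "K521", "K591"]
phospho_counts = {"serine": 29, "threonine": 10, "tyrosine": 5}

ISOFORM_1_LENGTH = 665  # primary isoform length given in the article

# Strip the residue letter to get numeric positions.
sumo_positions = [int(site[1:]) for site in sumoylation_sites]
total_phospho = sum(phospho_counts.values())

# Every sumoylation site should be a lysine within the primary isoform.
all_lysine = all(site.startswith("K") for site in sumoylation_sites)
all_in_range = all(1 <= pos <= ISOFORM_1_LENGTH for pos in sumo_positions)

print(len(sumo_positions), "sumoylation sites")
print(total_phospho, "phosphorylation sites")
print("All lysines:", all_lysine, "| all within isoform 1:", all_in_range)
```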

Secondary structure

The CCDC138 protein contains multiple alpha helices, beta sheets, and coiled coils as predicted by PELE, CHOFAS, and GOR4.

CCDC138 secondary structure as predicted by PELE

Yellow shows coiled coil, blue shows alpha helix, and red shows beta sheet. Most of the sequence consists of coiled coils and alpha helices.

3° and 4° structures

There are no predicted 3° and 4° structures for the CCDC138 protein. However, there is a similar structure with 29% identity.[20] This structure is Chain A of the crystal structure of ClpB, a protein described as an ATP-dependent protease and chaperone. The alignment spans 144 amino acids and falls within the domain of unknown function of CCDC138.

Chain A, crystal structure analysis of ClpB

Expression

The gene is expressed at low levels in almost all human tissues, but higher levels have been seen in certain cancer tissues. CCDC138 is a soluble protein that is predicted to localise to the nucleus.

Promoter

The promoter region of CCDC138 is shown in the figure below.

Promoter region of CCDC138 with labeled transcription factor binding sites

Expression

Microarray-assessed tissue expression patterns from GEO profiles show that CCDC138 is expressed at moderate levels in various tissues, including peripheral blood lymphocytes, fetal thymus, thymus, testis, ovary, fetal brain, colon, mammary gland, and bone marrow.[21]

Microarray-assessed tissue expression patterns shown in GEO profile.

Transcript variants

CCDC138 mRNA has two significant alternative transcript variants. The first variant, shown in the figure below, has been found in lung, blood, and human embryonic stem cells.[22] The second variant has been found in adenocarcinoma, prostate, lung, and primary lung epithelial cells.[23]

Transcript variants of CCDC138

The first transcript is the complete mRNA transcript. The second transcript is the first variant, and the third transcript is the second variant.[24]

Function and biochemistry

The exact function of CCDC138 is not yet known.

Interacting proteins

The CCDC138 protein has been found to interact with ubiquitin C,[25] a protein involved in ubiquitination and, eventually, protein degradation.

Transcription factors that might bind to regulatory sequence

The table below shows some transcription factors that Genomatix predicts to bind to the regulatory sequence of the CCDC138 gene.[26]

Detailed family information | Detailed matrix information | Tissue
GC-Box factors SP1/GC | Stimulating protein 1, ubiquitous zinc finger transcription factor | Ubiquitous
Peroxisome proliferator-activated receptor | Peroxisome proliferator-activated receptor gamma, DR1 sites | Adipose Tissue, Connective Tissue, Digestive System, Liver
MYT1 C2HC zinc finger protein | MyT1 zinc finger transcription factor involved in primary neurogenesis | Central Nervous System, Nervous System, Neuroglia, Neurons
NGFI-B response elements, nur subfamily of nuclear receptors | Monomers of the nur subfamily of nuclear receptors (nur77, nurr1, nor-1) | Brain, Central Nervous System, Endocrine System, Immune System, Leydig Cells, Nervous System, Neurons, Testis, Thymus Gland, Urogenital System
Krueppel-like transcription factors | Core promoter-binding protein (CPBP) with 3 Krueppel-type zinc fingers (KLF6, ZF9) | Blood cells, bone marrow cells, digestive system, embryonic structures, Erythrocytes, Hematopoietic System, liver
Grainyhead-like transcription factors | Grainyhead-like 3 (sister-of-mammalian grainyhead - SOM) | Embryonic Structures, Integumentary System
CTCF and BORIS gene family, transcriptional regulators with 11 highly conserved zinc finger domains | Insulator protein CTCF (CCCTC-binding factor) | Blood Cells, Embryonic Structures, Endocrine System, Erythrocytes, Germ Cells, Testis, Urogenital System
Core promoter motif ten elements | Human motif ten element | -
Abdominal-B type homeodomain transcription factors | Homeobox C13 / Hox-3gamma | Bone Marrow Cells, Bone and Bones, Central Nervous System, Connective Tissue, Embryonic Structures, Hematopoietic System, Integumentary System, Kidney, Nervous System, Neurons, Prostate, Skeleton, Spinal Cord, Urogenital System
E2F-myc activator/cell cycle regulator | E2F transcription factor 2 | Ubiquitous
PAX-3 binding sites | Pax-3 paired domain protein, expressed in embryogenesis, mutations correlate to Waardenburg Syndrome | Embryonic structures, muscle, skeletal, muscles
ZF5 POZ domain zinc finger | ZF5 POZ domain zinc finger, zinc finger protein 161 | -
Vertebrate TATA binding protein factor | Cellular and viral TATA box elements | -
CCAAT binding factors | Avian C-type LTR CCAAT box | Ubiquitous
Ccaat/Enhancer Binding Protein | CCAAT/enhancer binding protein alpha | Adipose Tissue, Bone Marrow Cells, Connective Tissue, Digestive System, Hematopoietic System, Immune System, Liver, Myeloid Cells, Phagocytes
Activator-, mediator- and TBP-dependent core promoter element for RNA polymerase II transcription from TATA-less promoters | X gene core promoter element 1 | -

Clinical significance

CCDC138 has been identified as one of the many genes involved in initiating term labor in myometrium.[27]

References

  1. "Homo sapiens coiled-coil domain containing 138 (CCDC138), mRNA". https://www.ncbi.nlm.nih.gov/nuccore/NM_144978.1. 
  2. "CCDC138 - GeneCards". https://www.genecards.org/cgi-bin/carddisp.pl?gene=CCDC138&search=CCDC138. 
  3. "CCDC138 coiled-coil domain containing 138 [Homo sapiens (human)]". https://www.ncbi.nlm.nih.gov/gene?Db=gene&Cmd=DetailsSearch&Term=165055. 
  4. "TimeTree :: The Timescale of Life". http://www.timetree.org/index.php. 
  5. "CLUSTALW". http://seqtool.sdsc.edu/CGI/BW.cgi#!. [permanent dead link]
  6. "SAPS". http://seqtool.sdsc.edu/CGI/BW.cgi#!. [permanent dead link]
  7. "pI". http://seqtool.sdsc.edu/CGI/BW.cgi#!. [permanent dead link]
  8. "PSORT WWW Server". http://psort.hgc.jp. 
  9. "Coiled-coil domain-containing protein 138 - CCDC138 - Homo sapiens (Human)". https://www.uniprot.org/uniprot/Q96M89. 
  10. "TMHMM". http://seqtool.sdsc.edu/CGI/BW.cgi#!. [permanent dead link]
  11. "TMAP". http://seqtool.sdsc.edu/CGI/BW.cgi#!. [permanent dead link]
  12. "Classification and Secondary Structure Prediction of Membrane Proteins". http://harrier.nagahama-i-bio.ac.jp/sosui/. 
  13. "SUMOplot™ Analysis Program". http://www.abgent.com/sumoplot. 
  14. "NetPhos 2.0 Server". http://www.cbs.dtu.dk/services/NetPhos/. 
  15. "NetNGlyc 1.0 Server". http://www.cbs.dtu.dk/services/NetNGlyc/. 
  16. "NetOGlyc 4.0 Server". http://www.cbs.dtu.dk/services/NetOGlyc/. 
  17. "SignalP 4.1 Server". http://www.cbs.dtu.dk/services/SignalP/. 
  18. "The Sulfinator". http://web.expasy.org/sulfinator/. 
  19. "Myristoylator". http://web.expasy.org/myristoylator/. 
  20. "CBLAST". https://www.ncbi.nlm.nih.gov/Structure/cblast/cblast.cgi. 
  21. "GDS3113 / 172084 / CCDC138". https://www.ncbi.nlm.nih.gov/geo/tools/profileGraph.cgi?ID=GDS3113:172084. 
  22. "Homo sapiens coiled-coil domain containing 138 (65.7 kD) (CCDC138) alternative variant dAug10, complete mRNA.". https://www.ncbi.nlm.nih.gov/ieb/research/acembly/av.cgi?db=human&term=CCDC138&submit=Go. 
  23. "Homo sapiens coiled-coil domain containing 138 (47.0 kD) (CCDC138) alternative variant fAug10, complete mRNA.". https://www.ncbi.nlm.nih.gov/ieb/research/acembly/av.cgi?db=human&term=CCDC138&submit=Go. 
  24. "AceView: Gene:CCDC138, a Comprehensive Annotation of Human, Mouse and Worm Genes with mRNAs or ESTs". https://www.ncbi.nlm.nih.gov/ieb/research/acembly/av.cgi?db=human&term=CCDC138&submit=Go. 
  25. "CCDC138 protein (Homo sapiens) - STRING network view". http://string-db.org/newstring_cgi/show_network_section.pl. 
  26. "Transcription Factors". http://www.genomatix.de/cgi-bin//gx_query/generate_detail_page.p1. [permanent dead link]
  27. "Human effector/initiator gene sets that regulate myometrial contractility during term and preterm labor". American Journal of Obstetrics and Gynecology 202 (5): 474.e1–20. May 2010. doi:10.1016/j.ajog.2010.02.034. PMID 20452493. 

External links