Biology:GOLGA8H

From HandWiki

Golgin subfamily A member 8H, also known as GOLGA8H, is a protein that in Homo sapiens is encoded by the GOLGA8H gene. Its function involves Golgi organization, a cellular-level process that results in the assembly, arrangement of constituent parts, or disassembly of the Golgi apparatus.

Gene

Aliases

The most common aliases for GOLGA8H are the following:[1]

  • Golgin A8 Family Member H
  • Golgi Autoantigen, Golgin Subfamily A, 6-Like 11
  • Golgin Subfamily A Member 8H
  • Golgin Subfamily A Member 8-Like Protein 1
  • GOLGA6L11

Prevalence and Location

Unlike most genes, GOLGA8H maps to many locations spanning multiple chromosomes.[2] NCBI lists the gene's location on the long (q) arm of chromosome 15 in the q13.2 region, from 30,604,030 to 30,617,827 (13,798 nt in length).[1]

Location of GOLGA8H on chromosome 15 q13.2 region

However, running the GOLGA8H protein sequence (in FASTA format) through BLAT (the BLAST-Like Alignment Tool) finds it at 85 or 87 different locations, depending on an individual's sex chromosomes.[2] Of these, 81 copies lie on chromosome 15, one copy each on chromosomes 7, 9, 10, and 12, and two copies on the Y chromosome.[2]

Copies of GOLGA8H in Homo sapiens

| Chromosome | Number of Copies |
|---|---|
| Chromosome 7 | 1 |
| Chromosome 9 | 1 |
| Chromosome 10 | 1 |
| Chromosome 12 | 1 |
| Chromosome 15 | 81 |
| Y Chromosome | 2 |
| TOTAL (XX chromosomes) | 85 |
| TOTAL (XY chromosomes) | 87 |
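
The two totals follow directly from the per-chromosome BLAT counts given in the text. A minimal sketch of that arithmetic, where the dictionary and function names are illustrative and not part of any tool's output:

```python
# BLAT hit counts per chromosome, as reported in the text above.
copies_per_chromosome = {"7": 1, "9": 1, "10": 1, "12": 1, "15": 81, "Y": 2}


def total_copies(counts, has_y):
    """Sum the per-chromosome counts, skipping Y when the genome lacks it."""
    return sum(n for chrom, n in counts.items() if has_y or chrom != "Y")


print("XX:", total_copies(copies_per_chromosome, has_y=False))
print("XY:", total_copies(copies_per_chromosome, has_y=True))
```

The only difference between the two totals is the pair of copies on the Y chromosome.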

Neighborhood

Because GOLGA8H maps to as many as 87 locations, only the gene neighborhood on chromosome 15 in the q13.2 region, as listed on NCBI, is given here:[1]

Gene Neighborhood of GOLGA8H on Chromosome 15 q13.2 in Homo sapiens:[3]

| Gene | Additional Information |
|---|---|
| LOC106736468 | Gamma inversion proximal recombination region |
| ARHGAP11B | Rho GTPase activating protein |
| LOC106736476 | Proximal CHRNA7 low-copy repeat recombination region |
| DNM1P50 | Pseudogene of DNM1, which is involved in producing microtubule bundles and is able to bind and hydrolyze GTP |
| LOC106736480 | Proximal microdeletion recombination region |
| ULK4P2 | Pseudogene of ULK4, which encodes a member of the unc-51-like serine/threonine kinase (STK) family, whose members play a role in neuronal growth and endocytosis |
| RN7SL628P | Pseudogene stemming from cytoplasmic 7SL, an RNA component of the SRP (signal recognition particle) |
| LOC106783506 | Nonconserved acetylation island sequence 49 enhancer, which can function as an enhancer in Jurkat T cells |

Transcript

There are no isoforms of GOLGA8H.[1]

Multiple Sequence Alignment

Paralogs

A multiple sequence alignment (MSA) of GOLGA8H and its top seven paralogs was created using Clustal Omega.[1] [Appendix A] All eight genes from the Golgin Subfamily A Member 8 group were 632 amino acids in length.[1] All 632 amino acids of GOLGA8H and its top seven paralogs were compared in an attempt to understand what makes Golgin Subfamily A Member 8H, GOLGA8H, a distinct entity. Two amino acids make GOLGA8H unique: valine at position 32 and cysteine at position 169.[4] In all seven paralogs, position 32 is isoleucine and position 169 is arginine.[4]
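
The comparison described above amounts to scanning the alignment column by column for positions where every paralog agrees but GOLGA8H differs. A minimal sketch of that scan follows. The function name and the toy sequences are made up for illustration (they are not real GOLGA8 sequences), and it assumes the sequences have already been aligned to equal length:

```python
def unique_positions(target, others):
    """Return (position, target_residue, shared_residue) for each column where
    all other sequences agree with each other but differ from the target.
    Positions are 1-based; sequences must be pre-aligned to equal length."""
    if any(len(s) != len(target) for s in others):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for i, residue in enumerate(target):
        column = {s[i] for s in others}  # distinct residues among the paralogs
        if len(column) == 1 and residue not in column:
            hits.append((i + 1, residue, column.pop()))
    return hits


# Toy, made-up aligned fragments (hypothetical, for illustration only).
toy_target = "MAVEC"
toy_paralogs = ["MAIER", "MAIER", "MAIER"]
print(unique_positions(toy_target, toy_paralogs))
```

Applied to the real eight-sequence alignment, a scan of this kind would be expected to report the two positions named in the text, though that has not been run here.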

Protein

The predicted molecular weight of GOLGA8H, rounded to three significant figures, is 71.3 kDa.[5] This is a theoretical value; predicted molecular weights are based solely on the amino acids present in the protein. The theoretical isoelectric point (pI) of GOLGA8H, rounded to one significant figure, is 8.[5]
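
To illustrate what "based solely on the amino acids present" means, the sketch below sums commonly tabulated average residue masses and adds one water molecule for the free termini. This is a simplified illustration, not necessarily the exact procedure of the cited tool. The function name and the toy peptide are hypothetical, and the GOLGA8H sequence itself is not reproduced here.

```python
# Average residue masses in daltons (amino acid minus one water molecule).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the N- and C-terminal groups


def molecular_weight(sequence):
    """Theoretical average molecular weight (Da) of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


toy_peptide = "MKNGSA"  # made-up example, not part of GOLGA8H
print(f"{molecular_weight(toy_peptide) / 1000:.3f} kDa")
```

Because the calculation ignores post-translational modifications, the value for a real protein can differ from its observed mass.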

Composition

Compared to other human proteins, GOLGA8H is moderately enriched in glutamine and glutamate.[6] In contrast, it is depleted in threonine, phenylalanine, and tyrosine.[6]

Amino Acid Multiplets in GOLGA8H retrieved via Statistical Analysis of Protein Sequences (SAPS)

There are no charge runs, hydrophobic segments, or transmembrane domains in the GOLGA8H protein.[6] The protein contains 62 amino acid multiplets, which is above the expected range.[6] It also has amino acid patterns with high periodicity.[6]

Motifs

There are 11 motifs present in GOLGA8H.[7] The single experimentally verified motif is a glutamine-rich region spanning amino acids 323-416.

GOLGA8H Motifs[7]

| Motif # | Motif Information |
|---|---|
| 1 | N-Glycosylation site |
| 2 | cAMP- and cGMP-dependent protein kinase phosphorylation site |
| 3 | Casein kinase II phosphorylation site |
| 4 | N-myristoylation site |
| 5 | Protein kinase C phosphorylation site |
| 6 | Alanine-rich region profile |
| 7 | Glutamine-rich region profile (experimentally verified) |
| 8 | K-box domain profile |
| 9 | Bipartite nuclear localization signal profile |
| 10 | HCaRG protein |
| 11 | Involucrin repeat |

Post-Translational Modifications

GOLGA8H is predicted to undergo phosphorylation at multiple serine, threonine, and tyrosine residues throughout its structure.[8] Phosphorylation is predicted most frequently on serine residues.[8] Furthermore, there is one predicted N-linked glycosylation site, at amino acid 39.[8] The sequence at this site is NGS.[8] N-linked glycosylation functions intrinsically and extrinsically to assist in regulating the migration patterns of cells.[9]
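
The NGS sequence fits the widely used N-X-S/T sequon for N-linked glycosylation, where X is any residue except proline. The sketch below scans a sequence for that pattern with a regular expression. It is a simplification (dedicated prediction tools apply further criteria), and the function name and toy sequence are made up for illustration:

```python
import re

# N-X-S/T sequon, where X is any residue except proline.
# The lookahead lets overlapping sequons be reported separately.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of the asparagine, matched tripeptide) pairs."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence.upper())]


toy_sequence = "MKNGSAANPSLLNNTT"  # hypothetical, not the GOLGA8H sequence
for position, tripeptide in find_sequons(toy_sequence):
    print(position, tripeptide)
```

In the toy sequence, the NPS tripeptide is skipped because proline occupies the middle position.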

Primary Sequence

The protein is 632 amino acids long.[1] The gene has 19 exons and two polyadenylation signals.[1] Its sequence only partially matches a Kozak consensus sequence.[1]

Amino Acid Periodicity in GOLGA8H retrieved via Statistical Analysis of Protein Sequences (SAPS)

Secondary Structure

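The predicted secondary structure of GOLGA8H is composed of 81% alpha helices, 25.6% beta sheets, and 17.2% turns.[10] These values sum to more than 100%, which suggests that the prediction assigns some residues to more than one class.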

Using Phyre2, 284 residues (45% of GOLGA8H) were modeled with 97.8% confidence by the single highest-scoring template.[11] This structure shows an extremely high proportion of alpha helices.[11]

284 residues (45% of GOLGA8H) were modeled with 97.8% confidence by the single highest-scoring template. The N-terminus begins on the red side and proceeds through the rainbow to the C-terminus (blue side).

Tertiary Structure

A predicted model of the tertiary structure of GOLGA8H was generated using I-TASSER.[12]

I-TASSER Predicted Tertiary Structure of GOLGA8H. The N-terminus begins on the red side and proceeds through the rainbow to the C-terminus (blue side).

Transcript level regulation

Promoter

There is one promoter for the GOLGA8H gene, GXP_2235212, which is 1197 nt long.[13] It spans base pairs 30,603,030 to 30,604,226 on the positive strand.[13]
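
As a consistency check on the quoted coordinates, the sketch below computes inclusive span lengths and the position of the promoter relative to the gene start given in the Prevalence and Location section. It assumes, which the source does not state, that both coordinate sets are 1-based, inclusive, and refer to the same genome assembly. The variable and function names are illustrative.

```python
# Coordinates quoted in this article (chromosome 15).
gene_start, gene_end = 30_604_030, 30_617_827          # NCBI gene location
promoter_start, promoter_end = 30_603_030, 30_604_226  # promoter GXP_2235212


def span_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


upstream = gene_start - promoter_start           # promoter bases before the gene start
overlap = span_length(gene_start, promoter_end)  # promoter bases inside the gene

print("gene length:", span_length(gene_start, gene_end))
print("promoter length:", span_length(promoter_start, promoter_end))
print("upstream of gene start:", upstream)
print("overlapping the gene:", overlap)
```

Under these assumptions, the lengths agree with the 13,798 nt and 1197 nt figures in the text, and most of the promoter lies upstream of the annotated gene start.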

Transcription Factor Binding Sites

Several transcription factors are predicted to bind to the promoter sequence. Some examples include:[13]

Homology and evolution

Paralogs

GOLGA8H has several dozen paralogs. Seven paralogs have sequence similarity above 90%; they are listed below, with GOLGA8H itself included as a reference point:[1]

GOLGA8H Paralogs (>90% Similarity)[1]

| # | Gene Name | Accession # | Similarity (%) |
|---|---|---|---|
| 1. | GOLGA8H | NP_001269419.1 | 100.0 |
| 2. | GOLGA8J | NP_001269401.1 | 97.8 |
| 3. | GOLGA8T | NP_001342398.1 | 97.2 |
| 4. | GOLGA8K | NP_001269422.1 | 93.5 |
| 5. | GOLGA8I pseudogene | A6NC78.2 | 96.0 |
| 6. | GOLGA8M | NP_001269397.1 | 95.9 |
| 7. | GOLGA8O | NP_001264237.1 | 90.7 |
| 8. | GOLGA8N | NP_001269423.1 | 90.4 |
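
The source does not say how the similarity percentages were calculated. One common definition is percent identity over the aligned columns, sketched below as an assumption rather than as the method actually used. The function name and toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of identical residues over aligned columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy, made-up aligned fragments (not real GOLGA8 sequences).
print(round(percent_identity("MAVEC-K", "MAIERQK"), 1))
```

Other conventions (for example, counting gap columns in the denominator, or scoring conservative substitutions as similar) give different values, so figures from different tools are not directly comparable.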

Orthologs

Running the amino acid sequence of GOLGA8H through a protein BLAST via NCBI does not yield any hits for orthologs.[1] However, running the same sequence through BLAT (the BLAST-Like Alignment Tool) yields multiple orthologs.[2]

GOLGA8H Orthologs (Inclusion criteria: 1+ Characterized Chromosome Locations or 50+ BLAT Hits)

| Organism Common Name | Scientific Name | Divergence (MYA)[14] | BLAT Hits | Main Chromosome Location | Other Chromosome Locations |
|---|---|---|---|---|---|
| Human | Homo sapiens | - | 87 | Chromosome 15 | Chromosomes 7, 9, 10, 12, Y |
| Rhesus Macaque | Macaca mulatta | 6 | 70 | Uncharacterized** | Chromosomes 2, 3, 7, 9, 11, 15 |
| Golden Snub-Nosed Monkey | Rhinopithecus roxellana | 6 | 54 | Uncharacterized** | - |
| Olive Baboon | Papio anubis | 9 | 45 | Uncharacterized** | Chromosomes 2, 7, 9, 11 |
| Gorilla | Gorilla gorilla | 15 | 37 | Chromosome 15 | Chromosomes 7, 10, 12 |
| Crab-Eating Macaque | Macaca fascicularis | 20 | 36 | Chromosome 7 | Chromosomes 2, 3, 6, 9, 11, 15 |
| Chimpanzee | Pan troglodytes | 29 | 35 | Chromosome 15 | Chromosomes 7, 8, 12, Y |
| Bornean Orangutan | Pongo pygmaeus | 29 | 34 | Chromosome 15 | Chromosomes 5, 7, 9, 10, 12, 19 |
| Bonobo | Pan paniscus | 29 | 25 | Chromosome 15 | Chromosomes 6, 7, 9, 10, 12 |
| Northern White-Cheeked Gibbon | Nomascus leucogenys | 29 | 24 | Chromosome 6 | Chromosomes 5, 8, 10, 16, 17, 18 |
| Green Monkey | Chlorocebus sabaeus | 29 | 21 | Chromosome 26 | Chromosomes 9, 11, 12, 21, 22, 29 |
| Proboscis Monkey | Nasalis larvatus | 29 | 19 | Chromosome 7 | Chromosomes 3, 9, 11, 15 |
| Common Marmoset | Callithrix jacchus | 42 | 4 | Chromosome 6 | Chromosome 9 |
| Horse (Domesticated) | Equus ferus caballus | 89 | 3 | Chromosome 29 | Chromosome 25 |
| Gray Short-Tailed Opossum | Monodelphis domestica | 94 | 3 | Chromosome 3 | - |
| Common House Mouse | Mus musculus | 94 | 3 | Chromosome 11 | - |
| Dog (Domesticated) | Canis familiaris | 94 | 3 | Chromosome 15 | - |
| Taurine Cow | Bos taurus | 160 | 1 | Chromosome 13 | - |

**Chromosomes labeled as 'uncharacterized' have clone contigs (an assembled set of overlapping DNA sequences) that cannot be confidently placed on a specific chromosome. Similar contigs are concatenated together into short pseudo-chromosomes.

Expression

Data from NCBI show that GOLGA8H in Homo sapiens is most strongly expressed in the thyroid and testis, with RPKMs of 12.2 and 12.1, respectively. It is also expressed at lower levels in 25 other tissues.[1] Data from a GEO DataSet show that tissue expression is highest in bone marrow and pancreas.[15] However, samples from all tissues were above the 90th percentile, indicating that the expression of GOLGA8H is much higher than that of most other genes on the array.[15]
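
RPKM (reads per kilobase of transcript per million mapped reads) normalizes a raw read count by both transcript length and sequencing depth. A minimal sketch of the standard formula follows. The input numbers are invented purely for illustration and are not GOLGA8H data.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    kilobases = transcript_length_bp / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions_of_reads)


# Hypothetical values: 500 reads on a 2,000 bp transcript
# in a library of 20 million mapped reads.
print(rpkm(500, 2_000, 20_000_000))
```

Because of this normalization, RPKM values can be compared across genes of different lengths within the same sample.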

Normal Human Tissue Expression Profiling for GOLGA8H. Two samples are used for each tissue type. Red bars represent count and blue squares represent relative percentile rank within the sample.


When GOLGA8H tissue expression under abnormal conditions is compared to normal human tissue levels, there is no significant deviation in its expression for any variable.[15] This supports the notion that GOLGA8H is ubiquitously expressed.

Interactions

GOLGA8H has been shown to interact with Ubiquitin C (UBC).[16] UBC is a polyubiquitin precursor: a chain of ubiquitin units that can be converted into an active form by post-translational modification. Ubiquitin can mark proteins for degradation, alter their cellular location, affect their activity, and promote or prevent protein interactions. Further research on the link between ubiquitin and the Golgi apparatus indicates that certain processes around the Golgi apparatus rely on ubiquitin.[17][18]

STRING lists the following genes as interacting with GOLGA8H:[19]

Genes Interacting with GOLGA8H:

| Gene Name | Full Name | Accession Number[3] | Experimentally Determined | Coexpression | Text Mining |
|---|---|---|---|---|---|
| STX5 | Syntaxin 5 | NC_000011.10 | | | |
| GORASP1 | Golgi reassembly stacking protein 1 | NC_000003.12 | | | |
| GOSR1 | Golgi SNAP receptor complex member 1 | NC_000017.1 | | | |
| USO1 | USO1 vesicle transport factor | NC_000004.12 | | | |
| GOLGB1 | Golgin B1 | NC_000003.12 | | | |

Splice variants

The Homo sapiens GOLGA8H gene has one splice variant.[20]

References

  1. "GOLGA8H golgin A8 family member H [Homo sapiens (human)] - Gene - NCBI". https://www.ncbi.nlm.nih.gov/gene/728498. 
  2. "BLAT Search: GOLGA8H". https://genome.ucsc.edu/cgi-bin/hgBlat. 
  3. "GOLGA8H golgin A8 family member H [Homo sapiens (human)]". National Center for Biotechnology Information. https://www.ncbi.nlm.nih.gov/gene/?term=GOLGA8H. 
  4. "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". https://www.ebi.ac.uk/Tools/msa/clustalo/. 
  5. "ExPASy". https://web.expasy.org/cgi-bin/compute_pi/pi_tool1?P0CJ92@1-632@average. 
  6. "SAPS < Sequence Statistics < EMBL-EBI". https://www.ebi.ac.uk/Tools/seqstats/saps/. 
  7. "PROSITE". https://prosite.expasy.org/PDOC50099. 
  8. "NetPhos 3.1 Server". http://www.cbs.dtu.dk/services/NetPhos/. 
  9. Drickamer, Kurt. Introduction to glycobiology (2nd ed.). Oxford: Oxford University Press. 2006. ISBN 0199282781. OCLC 62307306. 
  10. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". http://www.biogem.org/tool/chou-fasman/index.php. 
  11. "Phyre 2 Results for GOLGA8H". http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index. 
  12. "I-TASSER results". https://zhanglab.ccmb.med.umich.edu/I-TASSER/output/S461042/. 
  13. "Genomatix: Gene2Promoter". https://www.genomatix.de/cgi-bin/sessions/login.pl?s=9797c94d2d4215749ea3f9fb93846a37. 
  14. "TimeTree :: The Timescale of Life". http://timetree.org/. 
  15. "About GEO Profiles - GEO - NCBI". https://www.ncbi.nlm.nih.gov/geo/info/profiles.html. 
  16. "Gene Set - GOLGA8H". http://amp.pharm.mssm.edu/Harmonizome/gene_set/GOLGA8H/Pathway+Commons+Protein-Protein+Interactions. 
  17. Molecular Biology of the Cell (4th ed.). Garland Science. 2002. ISBN 9780815332183. https://www.ncbi.nlm.nih.gov/books/NBK21054/. 
  18. "The ubiquitin-proteasome proteolytic pathway: destruction for the sake of construction". Physiological Reviews 82 (2): 373–428. April 2002. doi:10.1152/physrev.00027.2001. PMID 11917093. 
  19. "STRING: functional protein association networks". https://string-db.org/. 
  20. "Gene: GOLGA8H (ENSG00000261794) - Splice variants - Homo sapiens - Ensembl genome browser 95". http://useast.ensembl.org/Homo_sapiens/Gene/Splice?db=core;g=ENSG00000261794;r=15:30604126-30614561;t=ENST00000566740.