Biology:ARMH1

Short description: Novel human gene


Generic protein structure example: the 3D structure of the protein myoglobin, with α-helices shown in turquoise.

Armadillo-like Helical Domain Containing 1 (ARMH1) is a protein that in humans is encoded by the ARMH1 gene, also known as chromosome 1 open reading frame 228 (C1orf228). The gene is expressed at significantly higher levels in bone marrow, lymph nodes, and testis than in other tissues.[1] The function of the gene and of its protein product remains uncertain.

Gene

The ARMH1 gene is found on the plus strand of chromosome 1 between base pairs 45,140,361 and 45,191,784. Known aliases include P40, NCRNA00082, and, most commonly, C1orf228. The gene has 13 exons; most are concentrated near the poly-A site at the end of the gene, and two are located upstream of the start codon. The gene is highly expressed in bone marrow and lymph nodes, suggesting an immunological function.[2]
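As a quick illustration of the coordinates above, the length of the locus can be computed directly. This is a minimal sketch that assumes the two positions are 1-based and inclusive (the source does not state the convention or the genome assembly); the function name `span_length` is chosen here for illustration only.

```python
# Coordinates quoted in the text. Assumption: 1-based, inclusive positions;
# the genome assembly is not stated in the source.
ARMH1_START = 45_140_361
ARMH1_END = 45_191_784


def span_length(start: int, end: int) -> int:
    """Return the length in base pairs of a 1-based, inclusive interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


gene_length = span_length(ARMH1_START, ARMH1_END)
print(f"ARMH1 locus spans {gene_length:,} bp (~{gene_length / 1000:.1f} kb)")
```

Under that assumption the locus covers 51,424 bp, or roughly 51.4 kb.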

Gene expression

RNA-seq data have been produced from multiple samples of human tissue at varying stages of development. One study of 20 separate human tissue samples showed significantly higher expression of ARMH1 in the thymus, trachea, and lungs.[3] A second study examined 27 different tissues from 95 individuals; expression was significantly higher in bone marrow, lymph nodes, and testis.[4] A third showed high expression in white blood cells and testis, corroborating the previous studies.[5] A temporal study of expression across developmental stages collected 35 human fetal samples from 6 distinct tissues, between 10 and 20 weeks of gestation, and sequenced them using Illumina TruSeq Stranded Total RNA. The data slightly favored expression in the adrenal glands throughout development. In the other tissues there were no stark changes in expression over time, only a small decline as development progressed.[6]

Gene transcripts

The ARMH1 gene can vary the size, and potentially the function, of its product through multiple isoforms. Gene isoforms are mRNAs produced from the same locus that differ in their transcription start sites, protein-coding sequences, and/or untranslated regions, potentially altering gene function. All known isoforms are listed below, with information gathered from NCBI Gene[7] and a bioinformatics tool for calculating molecular weight.[8]

Protein Isoform  Protein Accession  Protein Length  Molecular Weight  mRNA Isoform  mRNA Accession  mRNA Length
X1               XP_047275293       446 aa          49.58 kDa         X5            XM_011541340    1693 bp
X2               XP_011539647       433 aa          48.17 kDa         X7            XM_011541345    1909 bp
X3               XP_047275308       431 aa          47.39 kDa         X8            XM_047419352    1782 bp
X4               XP_047275309       419 aa          46.17 kDa         X9            XM_047419353    1507 bp
X5               XP_047275314       405 aa          44.49 kDa         X12           XM_047419358    1588 bp
X6               XP_016856631       391 aa          43.58 kDa         X13           XM_017001142    1546 bp
X7               XP_047275318       379 aa          41.32 kDa         X14           XM_047419362    1393 bp
X8               XP_011539651       376 aa          41.67 kDa         X15           XM_011541349    1645 bp
X9               XP_016856632       365 aa          40.47 kDa         X16           XM_017001143    1468 bp
X10              XP_047275323       364 aa          40.17 kDa         X17           XM_047419367    1342 bp
X11              XP_054192270       338 aa          37.06 kDa         X18           XM_054336295    1264 bp
X12              XP_054192271       336 aa          36.46 kDa         X19           XM_054336296    1207 bp
X13              XP_054192272       333 aa          36.84 kDa         X20           XM_054336297    1474 bp
X14              XP_047275327       332 aa          36.65 kDa         X21           XM_047419371    1262 bp
X15              XP_054192274       274 aa          30.61 kDa         X23           XM_054336299    1670 bp
X16              XP_016856635       263 aa          29.31 kDa         X24           XM_017001146    1146 bp
X17              XP_054192276       242 aa          27.05 kDa         X25           XM_054336301    2306 bp
X18              XP_054192277       213 aa          23.69 kDa         X26           XM_054336302    1380 bp
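The molecular weights in the table come from an online calculator. The general idea behind such tools can be sketched as summing average residue masses and adding one water molecule for the free termini. The sketch below is illustrative only: it is not the cited tool's implementation, the function name `molecular_weight_da` is hypothetical, and the peptide is a made-up toy sequence, not ARMH1.

```python
# Average residue masses in daltons (amino acid minus one water),
# using standard average-isotope values.
RESIDUE_MASS_DA = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_DA = 18.01528  # one water for the free N- and C-termini


def molecular_weight_da(sequence: str) -> float:
    """Approximate average molecular weight of an unmodified peptide, in Da."""
    seq = sequence.upper()
    unknown = sorted(set(seq) - set(RESIDUE_MASS_DA))
    if unknown:
        raise ValueError(f"unsupported residue codes: {unknown}")
    return sum(RESIDUE_MASS_DA[aa] for aa in seq) + WATER_DA


# Toy peptide for demonstration only (hypothetical, not an ARMH1 sequence).
toy_peptide = "MKTAYLEDR"
print(f"{len(toy_peptide)} aa, {molecular_weight_da(toy_peptide) / 1000:.2f} kDa")
```

For orientation, the table's values work out to about 111 Da per residue (for example, 49.58 kDa over 446 aa), which is in the range expected for typical proteins.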

mRNA

The mRNA for this gene can be spliced in many different ways, giving rise to approximately 20 known isoforms. The most common mRNA is spliced down to a transcript about 1,693 nucleotides long, which encodes a protein of 440 amino acids.[9] A comprehensive study of oral squamous cell carcinoma, the sixth most prevalent cancer worldwide, identified ARMH1 as a gene of interest by comparing mRNA from healthy subjects with mRNA from affected individuals. Through mRNA inhibition of ARMH1, researchers demonstrated significantly reduced leukemic cell proliferation (P = .0041) and leukemic cell migration (P = .0001), as well as decreased resistance to the chemotherapy drug cytarabine.[10][11]
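Returning to the transcript figures quoted above, a short arithmetic check relates protein length to coding sequence length. It is a sketch under two assumptions: three nucleotides per codon plus one stop codon, and that the 1,693-nucleotide figure refers to the full transcript. The names used are illustrative.

```python
NT_PER_CODON = 3
STOP_CODON_NT = 3  # one stop codon terminates the open reading frame


def coding_length_nt(protein_length_aa: int) -> int:
    """Nucleotides needed to encode a protein, including the stop codon."""
    if protein_length_aa < 0:
        raise ValueError("protein length must be non-negative")
    return protein_length_aa * NT_PER_CODON + STOP_CODON_NT


transcript_nt = 1693  # length quoted in the text
protein_aa = 440      # protein length quoted in the text

cds_nt = coding_length_nt(protein_aa)
untranslated_nt = transcript_nt - cds_nt
print(f"coding sequence: {cds_nt} nt; remaining untranslated: {untranslated_nt} nt")
```

On these assumptions a 440 amino acid protein needs 1,323 nucleotides of coding sequence, leaving about 370 nucleotides of the transcript as untranslated region.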

Protein

The protein encoded by the gene goes by the same name, armadillo-like helical domain-containing 1. The isoelectric point of the ARMH1 protein is around pH 5.5.[12] The protein has two known major domains: a transmembrane domain and a coiled coil.[13] Within the coiled-coil domain, the ARMH1 protein is predicted to have 24 alpha helices.[14][15][16][17] The European Bioinformatics Institute's SAPS analysis of ARMH1 reveals a significantly enriched lysine content and a significantly deficient proline content.[18] The protein has one reported major interaction, with the human protein ABAT.[19] Gamma-aminobutyric acid transaminase (ABAT) catalyzes the conversion of gamma-aminobutyric acid (GABA) into succinic semialdehyde. Additionally, ABAT expression has been associated with glycolysis-related genes, infiltrating immune cells, immunoinhibitors, and immunostimulators in hepatocellular carcinoma (HCC).[20]

AlphaFold, the state-of-the-art AI system developed by DeepMind, is able to computationally predict protein structures in 3D space.[21]

Homology and evolution

The ARMH1 gene is widely distributed and is found in thousands of different species. From primates to fungi, this gene has been evolutionarily relevant for hundreds of millions of years. In near relatives such as cows, the similarity to the human sequence is 91%, whereas in species of fungi the similarity ranges between roughly 20% and 30%.[22] Searches for homologs in round worms, flat worms, single-celled eukaryotes, prokaryotes, plants, and fungi other than chytrids found no significantly similar genes. Below is a table of orthologous genes, ordered by estimated date of divergence from humans, with sequence similarity and identity measured against the human ARMH1 isoform X1.

Species                       Common name          Accession number  Date of divergence  Sequence length (AA)  Sequence similarity  Sequence identity
Homo sapiens                  Human                NP_001139108      0 mya               440                   100%                 100%
Microcebus murinus            Grey Mouse Lemur     XP_012631405.1    74 mya              441                   88%                  82%
Rattus norvegicus             Brown Rat            NP_001119769.2    87 mya              441                   80%                  78%
Bos taurus                    Cow                  XP_005204913.1    94 mya              442                   91%                  83%
Ornithorhynchus anatinus      Platypus             XP_028938784.1    180 mya             459                   75%                  60%
Apteryx rowi                  Okarito Kiwi         XP_025942684      319 mya             419                   73%                  59%
Haliaeetus leucocephalus      Bald Eagle           XP_010581029      319 mya             418                   70%                  56%
Gopherus flavomarginatus      Bolson Tortoise      XP_050817160      319 mya             421                   78%                  65%
Xenopus tropicalis            Western Clawed Frog  XP_017949069      352 mya             409                   70%                  55%
Danio rerio                   Zebrafish            XP_001341083.1    429 mya             410                   71%                  53%
Leucoraja erinacea            Little Skate         XP_055497706      462 mya             406                   69%                  53%
Lytechinus pictus             Painted Urchin       XP_054764007      619 mya             406                   67%                  51%
Owenia fusiformis             Segmented Worm       CAH1776102.1      686 mya             410                   71%                  51%
Aplysia californica           California Sea Hare  XP_012936639.1    708 mya             410                   69%                  52%
Adineta steineri              Rotifer              CAF4083605.1      708 mya             420                   56%                  37%
Pocillopora verrucosa         Colonial Coral       XP_058955966.1    708 mya             404                   67%                  49%
Geodia barretti               Sea Sponge           CAI8036895.1      758 mya             404                   50%                  35%
Blastocladiella britannica    Chytrids             KAI9218662        1275 mya            423                   34%                  22%
Borealophlyctis nickersoniae  Rhizophlyctidales    KAJ3289137        1275 mya            453                   19%                  11%
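The similarity and identity columns above are standard pairwise-alignment statistics. A minimal sketch of how such percentages can be derived from two already-aligned sequences follows. Several things here are assumptions rather than the method used for the table: the residue groups that count as "similar" are a simplified illustration (alignment tools usually count positions with a positive substitution-matrix score), the denominator is the full alignment length including gap columns, and the two aligned strings are invented toy data.

```python
# Illustrative groups of chemically similar residues (an assumption made for
# this sketch; real tools typically use a substitution matrix instead).
SIMILAR_GROUPS = ("AVLIM", "FYW", "KRH", "DE", "ST", "NQ")
GAP = "-"


def _same_group(a: str, b: str) -> bool:
    """True if both residues fall in one of the illustrative groups."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Percent identity and percent similarity over the alignment length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == GAP or b == GAP:
            continue  # gap columns count toward the length but never match
        if a == b:
            identical += 1
            similar += 1  # identical positions are also counted as similar
        elif _same_group(a, b):
            similar += 1
    length = len(aligned_a)
    return 100 * identical / length, 100 * similar / length


# Toy alignment (hypothetical sequences, not ARMH1 orthologs).
identity, similarity = identity_and_similarity("MKTAYLE-DR", "MRTAFLEQDK")
print(f"identity {identity:.0f}%, similarity {similarity:.0f}%")
```

Because every identical position also counts as similar, similarity is always at least as high as identity, which matches the pattern in each row of the table.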

Clinical significance

The ARMH1 gene and its protein product have been linked to leukemia, specifically T-cell acute lymphoblastic leukemia (T-ALL).[23] In cell lines derived mostly from lymphatic tissue, T-ALL showed dramatically increased expression of the ARMH1 gene. Bone marrow samples were taken at initial diagnosis and at the conclusion of treatment, and ARMH1, along with 5 other genes, was found to be dramatically changed in expression. Corroborating these findings, ARMH1 showed a 1.8-fold increase in expression in samples taken after a diagnosis of leukemia. Higher ARMH1 expression was significantly associated with poor overall survival.[24]

References

  1. Fagerberg, Linn; Hallström, Björn M.; Oksvold, Per; Kampf, Caroline; Djureinovic, Dijana; Odeberg, Jacob; Habuka, Masato; Tahmasebpoor, Simin et al. (February 2014). "Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics". Molecular & Cellular Proteomics 13 (2): 397–406. doi:10.1074/mcp.M113.035600. ISSN 1535-9484. PMID 24309898. 
  2. https://www.genecards.org/cgi-bin/carddisp.pl?gene=ARMH1
  3. Duff, Michael O.; Olson, Sara; Wei, Xintao; Garrett, Sandra C.; Osman, Ahmad; Bolisetty, Mohan; Plocik, Alex; Celniker, Susan E. et al. (2015-05-21). "Genome-wide identification of zero nucleotide recursive splicing in Drosophila". Nature 521 (7552): 376–379. doi:10.1038/nature14475. ISSN 1476-4687. PMID 25970244. Bibcode: 2015Natur.521..376D. 
  4. Fagerberg, Linn; Hallström, Björn M.; Oksvold, Per; Kampf, Caroline; Djureinovic, Dijana; Odeberg, Jacob; Habuka, Masato; Tahmasebpoor, Simin et al. (February 2014). "Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics". Molecular & Cellular Proteomics 13 (2): 397–406. doi:10.1074/mcp.M113.035600. ISSN 1535-9484. PMID 24309898. 
  5. "Illumina bodyMap2 transcriptome (ID 204271) - BioProject - NCBI". https://www.ncbi.nlm.nih.gov/bioproject/PRJEB2445/. 
  6. Szabo, Linda; Morey, Robert; Palpant, Nathan J.; Wang, Peter L.; Afari, Nastaran; Jiang, Chuan; Parast, Mana M.; Murry, Charles E. et al. (2015-06-16). "Statistically based splicing detection reveals neural enrichment and tissue-specific induction of circular RNA during human fetal development". Genome Biology 16 (1): 126. doi:10.1186/s13059-015-0690-5. ISSN 1474-760X. PMID 26076956. 
  7. "ARMH1 armadillo like helical domain containing 1 [Homo sapiens (human)] - Gene - NCBI". https://www.ncbi.nlm.nih.gov/gene/339541. 
  8. "Protein Molecular Weight". https://www.bioinformatics.org/sms/prot_mw.html. 
  9. https://www.ncbi.nlm.nih.gov/gene/339541
  10. Huang, Su-Ning; Li, Guo-Sheng; Zhou, Xian-Guo; Chen, Xiao-Yi; Yao, Yu-Xuan; Zhang, Xiao-Guohui; Liang, Yao; Li, Ming-Xuan et al. (2020-06-12). "Identification of an Immune Score-Based Gene Panel with Prognostic Power for Oral Squamous Cell Carcinoma". Medical Science Monitor: International Medical Journal of Experimental and Clinical Research 26: e922854. doi:10.12659/MSM.922854. ISSN 1643-3750. PMID 32529991. 
  11. Bhasin, Swati S.; Thomas, Beena E.; Summers, Ryan J.; Sarkar, Debasree; Mumme, Hope; Pilcher, William; Emam, Mohamed; Raikar, Sunil S. et al. (2023-08-02). "Pediatric T-cell acute lymphoblastic leukemia blast signature and MRD associated immune environment changes defined by single cell transcriptomics analysis". Scientific Reports 13 (1): 12556. doi:10.1038/s41598-023-39152-z. ISSN 2045-2322. PMID 37532715. Bibcode: 2023NatSR..1312556B. 
  12. "ARMH1 (human)". https://www.phosphosite.org/proteinAction.action?id=19080406&showAllSites=true. 
  13. https://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/av.cgi?c=geneid&org=9606&l=339541
  14. "Bioinformatics Toolkit". https://toolkit.tuebingen.mpg.de/tools/ali2d. 
  15. "JPred Secondary Structure Prediction". http://www.jalview.org/help/html/webServices/jnet.html. 
  16. Jumper, John; Evans, Richard; Pritzel, Alexander; Green, Tim; Figurnov, Michael; Ronneberger, Olaf; Tunyasuvunakool, Kathryn; Bates, Russ et al. (August 2021). "Highly accurate protein structure prediction with AlphaFold" (in en). Nature 596 (7873): 583–589. doi:10.1038/s41586-021-03819-2. ISSN 1476-4687. PMID 34265844. Bibcode: 2021Natur.596..583J. 
  17. Rost, B. (2003-07-01). "The PredictProtein server". Nucleic Acids Research 31 (13): 3300–3304. doi:10.1093/nar/gkg508. ISSN 1362-4962. PMID 12824312. PMC 168915. http://dx.doi.org/10.1093/nar/gkg508. 
  18. "SAPS Results". https://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=saps-I20231130-182532-0692-96162528-p1m. 
  19. Huttlin, Edward L.; Bruckner, Raphael J.; Paulo, Joao A.; Cannon, Joe R.; Ting, Lily; Baltier, Kurt; Colby, Greg; Gebreab, Fana et al. (May 2017). "Architecture of the human interactome defines protein communities and disease networks" (in en). Nature 545 (7655): 505–509. doi:10.1038/nature22366. ISSN 1476-4687. PMID 28514442. Bibcode: 2017Natur.545..505H. 
  20. Gao, Xiaoqiang; Jia, Xiaodong; Xu, Moyan; Xiang, Jiao; Lei, Jin; Li, Yinyin; Lu, Yinying; Zuo, Shi (2022-06-24). "Regulation of Gamma-Aminobutyric Acid Transaminase Expression and Its Clinical Significance in Hepatocellular Carcinoma". Frontiers in Oncology 12: 879810. doi:10.3389/fonc.2022.879810. ISSN 2234-943X. PMID 35847853. 
  21. Laura Howes (2020-12-05). "DeepMind AI predicts protein structures". Chemical & Engineering News: 4. doi:10.47287/cen-09847-leadcon. ISSN 1520-605X. http://dx.doi.org/10.47287/cen-09847-leadcon. 
  22. "ARMH1 armadillo like helical domain containing 1 [Homo sapiens (human)] - Gene - NCBI". https://www.ncbi.nlm.nih.gov/gene/339541. 
  23. Bhasin, Swati S.; Thomas, Beena E.; Summers, Ryan J.; Sarkar, Debasree; Mumme, Hope; Pilcher, William; Emam, Mohamed; Raikar, Sunil S. et al. (2023-08-02). "Pediatric T-cell acute lymphoblastic leukemia blast signature and MRD associated immune environment changes defined by single cell transcriptomics analysis". Scientific Reports 13 (1): 12556. doi:10.1038/s41598-023-39152-z. ISSN 2045-2322. PMID 37532715. Bibcode: 2023NatSR..1312556B. 
  24. Bakhtiarigheshlaghbakhtiar, Mojtaba; Bhasin, Swati; Thomas, Beena. "Single-Cell Profiling of Acute Myeloid Leukemia Identified ARMH1, a Novel Protein Associated with Proliferation, Migration, and Drug Resistance". https://ashpublications.org/blood/article/140/Supplement%201/3017/491869/Single-Cell-Profiling-of-Acute-Myeloid-Leukemia.