Biology:Protein FAM46B

Generic protein structure example

Protein FAM46B, also known as family with sequence similarity 46 member B, is a protein that in humans is encoded by the FAM46B gene.[1] FAM46B contains one protein domain of unknown function, DUF1693.[2] Yeast two-hybrid screening has identified three proteins that physically interact with FAM46B: ataxin-1 (encoded by ATXN1), PEPP2 (encoded by RHOXF2) and DAZAP2.[3][4]

Gene

Overview

FAM46B is the most common name used for the gene encoding FAM46B. The aliases MGC16491 and RP11-344H11 have also been used to describe the same gene.[3] FAM46B is a 7,283 base pair gene located on the antisense strand of DNA on the short arm of chromosome 1, at locus 1p36.11. Because it is on the antisense strand, FAM46B is transcribed in the direction opposite to the standard numbering of nucleotides along the chromosome. FAM46B starts at base 27,339,333 and ends at 27,331,522.

The El Dorado program through Genomatix predicts the promoter region to be 1028 bases long, spanning bases 27,339,962 to 27,338,935.[5]
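The coordinate arithmetic for a reverse-strand feature can be sketched as below. This is only an illustration: it assumes the quoted coordinates are 1-based and inclusive, and it treats the stated gene start (27,339,333) as the reference point. The function names are hypothetical and not part of any Genomatix tool.

```python
def span_length(first: int, last: int) -> int:
    """Number of bases in an inclusive interval, in either orientation."""
    return abs(first - last) + 1


def offset_from_gene_start(pos: int, gene_start: int, strand: str) -> int:
    """Signed distance from the gene start; negative means upstream.

    On the minus strand, upstream positions have larger chromosomal
    coordinates, so the subtraction is reversed.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    return pos - gene_start if strand == "+" else gene_start - pos


# Coordinates quoted in the text (assumed 1-based and inclusive).
GENE_START = 27_339_333
PROMOTER_FIRST, PROMOTER_LAST = 27_339_962, 27_338_935

print(span_length(PROMOTER_FIRST, PROMOTER_LAST))               # 1028
print(offset_from_gene_start(PROMOTER_FIRST, GENE_START, "-"))  # -629
print(offset_from_gene_start(PROMOTER_LAST, GENE_START, "-"))   # 398
```

Under these assumptions the predicted promoter runs from 629 bases upstream of the gene start to 398 bases downstream of it, which is consistent with the stated length of 1028 bases.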

Exon structure and splice variants

FAM46B is composed of two exons with no alternative splicing. As shown by the direction of the arrows on the exons relative to the base pair numbers on the chromosome, FAM46B is on the reverse strand of chromosome 1.

The FAM46B gene contains two exons, both of which are represented in the FAM46B protein. There is one main protein isoform, indicating no alternative splicing of FAM46B mRNA.[6]

Homology

Paralogs

FAM46B has three paralogs in Homo sapiens: FAM46A, FAM46C, and FAM46D.[3] Multiple sequence alignments of the four members of the FAM46 family show high levels of conservation, particularly toward the C-terminus. Amino acids conserved in all four paralogs indicate residues that make up the core of the FAM46 family.
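Identifying positions conserved across all aligned paralogs can be sketched as follows. This is a minimal illustration, assuming the alignment is already available as equal-length strings with "-" marking gaps. The toy sequences are invented for the example and are not FAM46 sequences, and the function name is hypothetical.

```python
def conserved_columns(aligned_seqs):
    """Return 0-based alignment columns where every sequence has the same residue.

    Columns consisting of gaps are not counted as conserved.
    """
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    if len({len(seq) for seq in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [
        index
        for index, column in enumerate(zip(*aligned_seqs))
        if len(set(column)) == 1 and column[0] != "-"
    ]


# Invented toy alignment of four sequences (not real FAM46 data).
toy_alignment = [
    "MKV-LLE",
    "MRVALLE",
    "MKVGLLD",
    "MQV-LLE",
]

print(conserved_columns(toy_alignment))  # [0, 2, 4, 5]
```

A real analysis would read the alignment from the output of an alignment program and would usually also score conservative substitutions rather than only exact identity.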

Multiple Sequence Alignment of FAM46 Paralogs
FAM46B orthologs in vertebrates and more distant homologs in invertebrates

Orthologs

FAM46B appears to have been present in the common ancestor of animals and is found only in eukaryotes. Although strict orthologs of FAM46B are found in only a relatively small range of animals, such as insects and vertebrates, orthologs of FAM46 paralogs have been identified in a broader range of species. Within vertebrates, FAM46B is highly conserved in fish, amphibians and mammals. Common model organisms in which FAM46B has been identified are Danio rerio, Xenopus tropicalis, and Mus musculus. A strict ortholog of FAM46B is not found in reptiles or birds; however, both the FAM46A and the FAM46C paralogs are found in Anolis carolinensis, and the FAM46C paralog is found in birds such as Gallus gallus.[2]

Distant homologs

Distant homologs of FAM46B are present in Drosophila and nematodes such as Caenorhabditis elegans. There are no orthologs of FAM46B in plants, protists, or fungi.[7]

Phylogeny

This unrooted phylogenetic tree shows the relationship between human FAM46B and selected orthologs and homologs.

The phylogenetic tree of FAM46B mirrors a standard phylogenetic tree. As should be expected, the mammals are grouped together with the primates clustered most tightly. The more distant homologs such as Drosophila and Caenorhabditis are on the left, representing greater divergence between the gene sequences.

Protein

The function of FAM46B has not yet been determined. The information below is based on bioinformatic analyses and predictions.

Properties/characteristics

The human form of FAM46B contains 425 amino acid residues, has an isoelectric point of 8.093,[8] and a molecular mass of 46,888 Daltons.[3] FAM46B is a soluble protein predicted to be located in the cytosol.[9][10]

Domains and motifs

FAM46B contains only one identified domain: Domain of Unknown Function 1693 (DUF1693). DUF1693 has been identified as part of the nucleotidyltransferase superfamily and contains four nematode prion-like proteins, but the exact function remains unknown.[11] A SAPS protein analysis does not predict any unusual protein characteristics based on amino acid composition, internal repeats, charge clusters, or periodicities.[12]

Post-translational modifications

This diagram summarizes the major post-translational modifications of FAM46B. All of the individual images were generated using tools available through ExPASy.

FAM46B is not predicted to contain a signal peptide cleavage site,[13] glycosylphosphatidylinositol (GPI) anchors, or transmembrane regions. The absence of a signal peptide supports the prediction that FAM46B is located in the cytosol.

Tools at ExPASy were used to predict phosphorylation sites, O-linked glycosylation sites, and N-linked glycosylation sites. Although two sites in FAM46B are predicted as potential sites of N-linked glycosylation, FAM46B lacks a signal peptide and thus does not enter the lumen of the endoplasmic reticulum, where N-linked glycosylation occurs. Five sites were identified as possible O-linked glycosylation sites.[14] These are marked in the Conceptual translation section below.

The most common post-translational modification predicted for FAM46B is phosphorylation. NetPhos 2.0 predicts 23 phosphorylation sites: 14 on serine residues, 6 on threonine and 3 on tyrosine.[15] These sites tend to be clustered together within the protein sequence. A comparison of predicted phosphorylation sites in human, mouse and zebrafish shows that all three species have approximately the same number and distribution of sites across serines, threonines and tyrosines.

Secondary structure

The exact structure of FAM46B has not been characterized. Predictive programs available through Biology Workbench,[16] such as GOR4, PELE and CHOFAS, were used to predict secondary structure. The results obtained through programs at Biology Workbench were compared to the results obtained using Phyre2.[17] Since these programs are predictive and rely on different algorithms, each provides slightly different output. Consensus between programs suggests that FAM46B consists mainly of alpha helices and random coils, with only a few small sections predicted to form beta sheets. Annotated results of both the PELE and Phyre2 secondary structure predictions are outlined in the figure below.
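One simple way to combine per-residue predictions from several programs is a majority vote at each position, sketched below. This is an illustration rather than the procedure used for the figure: it assumes each program's output has already been converted to a three-state string of equal length (H for helix, E for strand, C for coil), and the example strings are invented. Ties are resolved to coil, which is an arbitrary choice.

```python
from collections import Counter


def consensus_structure(predictions, tie_state="C"):
    """Majority-vote consensus over equal-length three-state prediction strings."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("predictions must all have the same length")

    consensus = []
    for states in zip(*predictions):
        ranked = Counter(states).most_common()
        best_state, best_count = ranked[0]
        # A tie exists when the runner-up has as many votes as the leader.
        tied = len(ranked) > 1 and ranked[1][1] == best_count
        consensus.append(tie_state if tied else best_state)
    return "".join(consensus)


# Invented outputs from three hypothetical predictors for a 9-residue stretch.
predictor_outputs = [
    "HHHHCCEEC",
    "HHHCCCEEC",
    "HHHHCCCCC",
]

print(consensus_structure(predictor_outputs))  # HHHHCCEEC
```

Counting the letters in the consensus string then gives a rough estimate of the proportion of helix, strand and coil.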

Conceptual translation

Conceptual translation and key for FAM46B

Expression

Microarray based gene expression of FAM46B in a variety of tissues. Image obtained from BioGPS

Expression can be assessed in several ways. Expressed sequence tag (EST) counts and GEO profiles estimate how many transcripts of a gene are present in a given tissue relative to the total transcripts sampled from that tissue. Microarrays are also useful for quantifying gene expression. Because these methods all measure mRNA or cDNA, protein-level methods such as immunohistochemistry, in which labeled antibodies bind the protein itself, give a more direct measure of protein abundance.

Expression of FAM46B, broken down by tissue type and health state. Data obtained from the NCBI UniGene page

According to available microarray data, FAM46B is highly expressed in the tongue (levels 10x above mean gene expression for the tissue).[18] Outside of the tongue, FAM46B seems to be uniformly expressed across most tissues. In addition to gene expression in healthy tissues, EST data also report gene expression by health state. FAM46B expression appears to be elevated in skin cancer and gliomas.[19]
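EST-based comparisons of this kind rest on normalising the number of ESTs for a gene by the total number of ESTs sampled from each tissue or health state, commonly expressed as transcripts per million. The sketch below illustrates that normalisation only. The pool names and counts are invented and are not FAM46B data, and the function name is hypothetical.

```python
def transcripts_per_million(gene_ests: int, total_ests: int) -> float:
    """Scale a gene's EST count to a rate per million ESTs in the pool."""
    if total_ests <= 0:
        raise ValueError("total_ests must be positive")
    if gene_ests < 0:
        raise ValueError("gene_ests cannot be negative")
    return gene_ests * 1_000_000 / total_ests


# Invented EST pools: (ESTs assigned to the gene, total ESTs in the pool).
pools = {
    "pool_a": (3, 150_000),
    "pool_b": (1, 200_000),
}

for name, (gene_count, pool_size) in pools.items():
    print(name, transcripts_per_million(gene_count, pool_size))
# pool_a 20.0
# pool_b 5.0
```

Because EST counts for a single gene are often very small, differences between pools computed this way should be read as suggestive rather than conclusive.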

Interacting proteins

Transcription factors that bind to regulatory sequences

The El Dorado program through Genomatix was used to predict transcription factors that are likely to bind to the promoter region of FAM46B. Numerous E2F sites are predicted, in addition to numerous zinc finger transcription factor sites, several E-box binding factors and TWIST homologs. The binding sites are not evenly distributed within the promoter region. The largest cluster of binding sites is located around base 177 of the promoter, which is about 600 base pairs upstream from the start of transcription for FAM46B.[5] The image below shows selected transcription factor binding sites for the top twenty matches identified by El Dorado that are on the antisense strand.

Transcription factor binding sites with high matrix match scores and located on the antisense strand. Data obtained from El Dorado

Confirmed protein-protein interactions and possible clinical significance

Yeast two-hybrid screening indicates that FAM46B physically interacts with the ataxin-1 protein, which is encoded by ATXN1.[4] The exact function of ataxin-1 is not known, but it is thought to be involved in regulating aspects of protein production, particularly transcription. Since FAM46B physically interacts with ataxin-1, it is possible that FAM46B also plays a role in regulating transcription and protein production.[20]

A second protein shown to physically interact with FAM46B is DAZAP2, a proline-rich, brain-expressed protein.[4] Together with the ataxin-1 interaction described above, this suggests that FAM46B may interact with brain-expressed proteins. A third protein identified by yeast two-hybrid screening as a physical interactor of FAM46B is PEPP2,[4] a paired-like homeobox protein. If this interaction is significant, the interaction between FAM46B and PEPP2 may play a role in development and morphogenesis.

However, the protein interactome is not yet well understood, and different programs do not identify the same interacting proteins. For example, STRING identified ATXN1 as a strong interaction partner of FAM46B, but did not identify either PEPP2 or DAZAP2. The prediction network from STRING is shown in the adjacent image.

References

  1. "NCBI Gene: FAM46B family with sequence similarity 46, member B". https://www.ncbi.nlm.nih.gov/gene/115572. Retrieved 23 April 2013. 
  2. "NCBI BLAST". National Library of Medicine. National Center for Biotechnology Information. http://blast.ncbi.nlm.nih.gov/Blast.cgi. Retrieved 11 May 2013. 
  3. "family with sequence similarity 46, member B". https://www.genecards.org/cgi-bin/carddisp.pl?gene=FAM46B#snp. Retrieved 23 April 2013. 
  4. "FAM46B Interaction Summary". BioGRID. Tyers Lab. http://thebiogrid.org/125440/summary/homo-sapiens/fam46b.html. Retrieved 11 May 2013. 
  5. "Annotation and Analysis". El Dorado. Genomatix. http://www.genomatix.de. Retrieved 4 May 2013. 
  6. "Homo sapiens family with sequence similarity 46, member B (FAM46B), mRNA". https://www.ncbi.nlm.nih.gov/nuccore/NM_052943.3. Retrieved 23 April 2013. 
  7. "family with sequence similarity 46, member B". https://www.genecards.org/cgi-bin/carddisp.pl?gene=FAM46B&ortholog=all#orthologs. Retrieved 23 April 2013. 
  8. "Big-PI". IMP Bioinformatics. http://mendel.imp.ac.at/sat/gpi/gpi_server.html. Retrieved 11 May 2013. 
  9. "SOSUI Prediction". http://bp.nuap.nagoya-u.ac.jp/sosui/sosui_submit.html. Retrieved 4 May 2013. 
  10. "PSORT II". http://psort.hgc.jp/form2.html. Retrieved 4 May 2013. 
  11. "NCBI Conserved Domains: DUF1693 Superfamily". https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=RE2DV55J014&mode=all. Retrieved 23 April 2013. 
  12. "Methods and algorithms for statistical analysis of protein sequences". Proc. Natl. Acad. Sci. U.S.A. 89 (6): 2002–6. March 1992. doi:10.1073/pnas.89.6.2002. PMID 1549558. 
  13. "SignalP 4.0: discriminating signal peptides from transmembrane regions". Nat. Methods 8 (10): 785–6. 2011. doi:10.1038/nmeth.1701. PMID 21959131. 
  14. "NetOGlyc". CBS Prediction Servers. http://www.cbs.dtu.dk/services/NetOGlyc/. Retrieved 11 May 2013. 
  15. "NetPhos". CBS Prediction Servers. http://www.cbs.dtu.dk/services/NetPhos/. Retrieved 11 May 2013. 
  16. "GOR4, CHOFAS, PELE". Protein Tools. San Diego Supercomputer Center. http://seqtool.sdsc.edu/CGI/BW.cgi. Retrieved 12 May 2013. [dead link]
  17. "Protein structure prediction on the Web: a case study using the Phyre server". Nat Protoc 4 (3): 363–71. 2009. doi:10.1038/nprot.2009.2. PMID 19247286. 
  18. "SymAtlas Expression FAM46B". BioGPS. The Scripps Research Institute. http://biogps.org/#goto=genereport&id=115572. Retrieved 12 May 2013. 
  19. "UniGene Data, FAM46B". EST Profile. National Center for Biotechnology Information. https://www.ncbi.nlm.nih.gov/UniGene/ESTProfileViewer.cgi?uglist=Hs.632378. Retrieved 12 May 2013. 
  20. "ATXN1 - ataxin 1". Genetic Home Reference, National Library of Medicine. http://ghr.nlm.nih.gov/gene/ATXN1. Retrieved 11 May 2013.