Biology:C7orf50
C7orf50 (Chromosome 7, Open Reading Frame 50) is a gene in humans (Homo sapiens) that encodes a protein known as C7orf50 (uncharacterized protein C7orf50). The gene is ubiquitously expressed in the kidneys, brain, fat, prostate, spleen, and 22 other tissues, and demonstrates low tissue specificity.[1][2] C7orf50 is conserved in chimpanzees, Rhesus monkeys, dogs, cows, mice, rats, and chickens, along with 307 other organisms from mammals to fungi.[3] The protein is predicted to be involved in the import of ribosomal proteins into the nucleus, where they are assembled into ribosomal subunits as part of rRNA processing.[4][5] Additionally, the gene is predicted to be a microRNA (miRNA) protein-coding host gene, meaning that it may contain miRNA genes in its introns and/or exons.[6][7]
Gene
Background
C7orf50, also known as YCR016W, MGC11257, and LOC84310, is a poorly characterized protein-coding gene in need of further research. The gene is listed in NCBI under accession number NC_000007.14, in HGNC under ID 22421, in Ensembl under ID ENSG00000146540, in GeneCards under GCID:GC07M000996, and in UniProtKB under ID Q9BRJ6.
Location
C7orf50 is located on the short arm of chromosome 7 (7p22.3), starting at base pair (bp) 977,964 and ending at bp 1,138,325. The gene spans 160,361 bp on the minus (-) strand and contains a total of 13 exons.[1]
Gene Neighborhood
Genes within the neighborhood of C7orf50 are the following: LOC105375120, GPR146, LOC114004405, LOC107986755, ZFAND2A, LOC102723758, LOC106799841, COX19, ADAP1, CYP2W1, MIR339, GPER1, and LOC101927021. This neighborhood extends from bp 89700 to bp 1165958 on chromosome 7.[1]
mRNA
Alternative Splicing
C7orf50 has a total of 7 experimentally curated mRNA transcripts.[1] These transcripts are maintained independently of annotated genomes and were not generated computationally from a specific genome build such as the GRCh38.p13 primary assembly; therefore, they are typically more reliable. The longest and most complete of these transcripts (transcript variant 4) is 2138 bp long, consists of 5 exons, and encodes a 194-amino-acid (aa) protein.[8] Four of the transcripts encode the same 194 aa protein (isoform a)[9] and differ only in their 5' and 3' untranslated regions (UTRs). The other three transcripts encode isoforms b, c, and d, respectively. The table below lists these transcripts.
C7orf50 Experimentally Determined NCBI Reference Sequences (RefSeq) mRNA Transcripts
Name | NCBI Accession # | Transcript Length | # of Exons | Protein Length | Isoform |
Transcript Variant 1 | NM_032350.5 | 1311bp | 5 | 194aa | a |
Transcript Variant 2 | NM_001134395.1 | 1301bp | 5 | 194aa | a |
Transcript Variant 3 | NM_001134396.1 | 1282bp | 5 | 194aa | a |
Transcript Variant 4 | NM_001318252.2 | 2138bp | 5 | 194aa | a |
Transcript Variant 7 | NM_001350968.1 | 1081bp | 6 | 193aa | b |
Transcript Variant 8 | NM_001350969.1 | 1500bp | 5 | 180aa | c |
Transcript Variant 9 | NM_001350970.1 | 1448bp | 3 | 60aa | d |
Alternatively, when the primary genomic assembly, GRCh38.p13, is used for annotation (NCBI: NC_000007.14), there are 10 computationally predicted mRNA transcripts.[1] The most complete and best supported of these transcripts (transcript variant X6) is 1896 bp long and encodes a 225 aa protein.[10] In total, 6 different isoforms are predicted for C7orf50. Five of the transcripts encode the same isoform (X3).[11] The remaining transcripts encode isoforms X2, X4, X5, X6, and X7, as shown below.
C7orf50 Computationally Determined NCBI Reference Sequences (RefSeq) mRNA Transcripts
Name | NCBI Accession # | Transcript Length | Protein Length | Isoform |
Transcript Variant X2 | XM_017012719.1 | 1447bp | 375aa | X2 |
Transcript Variant X3 | XM_011515582.3 | 1192bp | 225aa | X3 |
Transcript Variant X4 | XM_024446977.1 | 1057bp | 193aa | X4 |
Transcript Variant X5 | XM_011515581.3 | 1240bp | 225aa | X3 |
Transcript Variant X6 | XM_011515584.2 | 1896bp | 225aa | X3 |
Transcript Variant X7 | XM_017012720.2 | 1199bp | 225aa | X3 |
Transcript Variant X8 | XM_011515583.2 | 1215bp | 225aa | X3 |
Transcript Variant X9 | XM_017012721.2 | 2121bp | 211aa | X5 |
Transcript Variant X10 | XM_024446978.1 | 2207bp | 180aa | X6 |
Transcript Variant X11 | XM_024446979.1 | 933bp | 93aa | X7 |
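The isoform counts quoted for the computationally predicted transcripts can be re-derived from the table. The following is a minimal sketch, assuming the rows are copied by hand into a Python list; the tuple layout and variable names are illustrative and are not part of any NCBI tool.

```python
from collections import Counter

# (variant, accession, transcript length in bp, protein length in aa, isoform)
# Values copied from the table of computationally predicted transcripts.
PREDICTED = [
    ("X2", "XM_017012719.1", 1447, 375, "X2"),
    ("X3", "XM_011515582.3", 1192, 225, "X3"),
    ("X4", "XM_024446977.1", 1057, 193, "X4"),
    ("X5", "XM_011515581.3", 1240, 225, "X3"),
    ("X6", "XM_011515584.2", 1896, 225, "X3"),
    ("X7", "XM_017012720.2", 1199, 225, "X3"),
    ("X8", "XM_011515583.2", 1215, 225, "X3"),
    ("X9", "XM_017012721.2", 2121, 211, "X5"),
    ("X10", "XM_024446978.1", 2207, 180, "X6"),
    ("X11", "XM_024446979.1", 933, 93, "X7"),
]

# Number of transcripts that encode each predicted isoform.
isoform_counts = Counter(row[4] for row in PREDICTED)

# Sanity check: a coding sequence (codons plus one stop codon)
# must fit inside its transcript.
cds_fits = all(3 * aa + 3 <= bp for _, _, bp, aa, _ in PREDICTED)

print(len(PREDICTED), "transcripts,", len(isoform_counts), "isoforms")
print("transcripts encoding X3:", isoform_counts["X3"])
print("every CDS fits in its transcript:", cds_fits)
```

The same tally applied to the experimentally curated table would group the four isoform a transcripts in the same way.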
5' and 3' UTR
Based on the experimentally determined C7orf50 mRNA transcript variant 4, the 5' UTR of C7orf50 is 934 nucleotides (nt) long, while the 3' UTR is 619 nt long. The coding sequence (CDS) of this transcript spans nt 935..1519, for a total length of 585 nt (194 codons plus a stop codon), and is encoded in reading frame 2.[8] Interestingly, the 5' UTR of C7orf50 contains an upstream open reading frame (uORF) in need of further study, ranging from nt 599 to nt 871, also in the second reading frame.[12]
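The coordinate arithmetic behind these figures can be checked directly. The sketch below assumes 1-based, inclusive coordinates (the GenBank-style `935..1519` notation), under which the range 935..1519 contains 585 positions; the helper names `span_length` and `reading_frame` are hypothetical.

```python
# Transcript variant 4 (NM_001318252.2); all values are taken from the text.
TRANSCRIPT_LENGTH = 2138
CDS_START, CDS_END = 935, 1519
UORF_START, UORF_END = 599, 871


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def reading_frame(start: int) -> int:
    """Forward reading frame (1, 2 or 3) of a feature starting at `start`."""
    return (start - 1) % 3 + 1


cds_length = span_length(CDS_START, CDS_END)
five_prime_utr = CDS_START - 1                 # everything before the CDS
three_prime_utr = TRANSCRIPT_LENGTH - CDS_END  # everything after the CDS
protein_length = cds_length // 3 - 1           # the stop codon is not translated

print("CDS length (nt):", cds_length)
print("5' UTR (nt):", five_prime_utr, "| 3' UTR (nt):", three_prime_utr)
print("protein length (aa):", protein_length)
print("CDS frame:", reading_frame(CDS_START), "| uORF frame:", reading_frame(UORF_START))
print("uORF length (nt):", span_length(UORF_START, UORF_END))
```

Because the uORF and the CDS start positions leave the same remainder modulo 3, both fall in the same reading frame.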
Protein
General Properties
The 194 aa protein sequence of C7orf50 isoform a from NCBI[9] is as follows:
>NP_001127867.1 uncharacterized protein C7orf50 isoform a [Homo sapiens]
MAKQKRKVPEVTEKKNKKLKKASAEGPLLGPEAAPSGEGAGSKGEAVLRPGLDAEPELSPEEQRVLERKL 70
KKERKKEERQRLREAGLVAQHPPARRSGAELALDYLCRWAQKHKNWRFQKTRQTWLLLHMYDSDKVPDEH 140
FSTLLAYLEGLQGRARELTVQKAEALMRELDEEGSDPPLPGRAQRIRQVLQLLS 194
The sequence includes a domain known as DUF2373 ("domain of unknown function 2373"), which is found in isoforms a, b, and c.
Isoform a of C7orf50 has a predicted molecular weight (Mw) of 22 kDa, making it smaller than the average protein (52 kDa).[13] The theoretical isoelectric point (pI) of this isoform is 9.7, meaning that C7orf50 is basic.[14][15] As for charge runs and patterns within isoform a, there is a significant mixed charge (*) run (-++0++-+++--+) from aa67 to aa79 and an acidic (-) run from aa171 to aa173. This mixed charge run may correspond to a protein-protein interaction (PPI) site of C7orf50.[16][17]
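The charge pattern quoted above can be reproduced from the isoform a sequence. This is a minimal sketch, not the SAPS algorithm itself: it assumes the usual convention that K and R count as positive, D and E as negative, and every other residue (including H) as neutral. The function name `charge_pattern` is hypothetical.

```python
# Isoform a (NP_001127867.1), copied from the sequence given in this article.
SEQ = (
    "MAKQKRKVPEVTEKKNKKLKKASAEGPLLGPEAAPSGEGAGSKGEAVLRPGLDAEPELSPEEQRVLERKL"
    "KKERKKEERQRLREAGLVAQHPPARRSGAELALDYLCRWAQKHKNWRFQKTRQTWLLLHMYDSDKVPDEH"
    "FSTLLAYLEGLQGRARELTVQKAEALMRELDEEGSDPPLPGRAQRIRQVLQLLS"
)

# Assumed charge classes: K/R positive, D/E negative, everything else neutral.
CHARGE = {"K": "+", "R": "+", "D": "-", "E": "-"}


def charge_pattern(seq: str, start: int, end: int) -> str:
    """Return the +/-/0 pattern for residues start..end (1-based, inclusive)."""
    return "".join(CHARGE.get(aa, "0") for aa in seq[start - 1:end])


print(len(SEQ))                      # 194
print(charge_pattern(SEQ, 67, 79))   # -++0++-+++--+
print(SEQ[170:173], charge_pattern(SEQ, 171, 173))  # DEE ---
```

Only the residue-level pattern is reproduced here; the statistical significance of a run is a separate calculation that SAPS performs against the composition of the whole sequence.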
Domains and Motifs
DUF2373 is a domain of unknown function found in the C7orf50 protein. It is a highly conserved C-terminal region found from fungi to humans.[18] As for motifs, a bipartite nuclear localization signal (NLS) is predicted from aa6 to aa21, suggesting that C7orf50 localizes to the nucleus.[19] Interestingly, a nuclear export signal (NES) is also predicted within the C7orf50 protein at amino acids 150 and 153-155, suggesting that C7orf50 functions both inside and outside the nucleus.[20][21]
Structure
Secondary Structure
The majority of C7orf50 (isoform a) secondary structure is made up of alpha helices, with the remainder being small portions of random coils, beta turns, or extended strands.[22][23]
Tertiary Structure
The tertiary structure of C7orf50 consists primarily of alpha helices, as predicted by I-TASSER.[5][24][25]
Quaternary Structure
The interaction network (quaternary structure) involving the C7orf50 protein has significantly more interactions (p < 1.0e-16) than would be expected for a randomly selected set of proteins. This indicates that these proteins are at least partially connected biologically as a group and likely act within a shared biological pathway.[26] Therefore, although the function of C7orf50 is uncharacterized, it is most likely associated with the same processes and functions as the proteins within its network.
Biological Processes | rRNA processing | maturation of 5.8S, LSU, and SSU rRNA |
Molecular Functions | catalytic activity, acting on RNA | ATP-dependent RNA helicase activity |
Cellular Components | nucleolus | preribosomes |
Reactome Pathways | major pathway of rRNA processing in the nucleolus and cytosol | rRNA modification in the nucleus and cytosol |
Protein Domains and Motifs | helicase conserved C-terminal domain | DEAD/DEAH box helicase |
The closest predicted functional partners of C7orf50 are the following proteins: DDX24, DDX52, PES1, EBNA1BP2, RSL1D1, NOP14, FTSJ3, KRR1, LYAR, and PWP1. These proteins are predicted to be co-expressed with C7orf50 and with each other rather than to bind one another directly.
Regulation
Gene Regulation
Promoter
C7orf50 has 6 predicted promoter regions. The promoter with the greatest number of transcripts and CAGE tags overall is promoter set 6 (GXP_6755694) on ElDorado by Genomatix. This promoter region is on the minus (-) strand and has a start position of 1,137,965 and an end position of 1,139,325, making this promoter 1,361bp long. It has 16 coding transcripts and the transcript with the greatest identity to C7orf50 transcript 4 is transcript GXT_27788039 with 98746 CAGE tags.[27]
Promoter ID | Start Position | End Position | Length | # of Coding Transcripts | Greatest # of CAGE Tags in Transcripts |
GXP_9000582 | 1013063 | 1014163 | 1101bp | 0 | N/A |
GXP_6755691 | 1028239 | 1030070 | 1832bp | 4 | 169233 |
GXP_6053282 | 1055206 | 1056306 | 1101bp | 1 | 449 |
GXP_3207505 | 1127288 | 1128388 | 1101bp | 1 | 545 |
GXP_9000584 | 1130541 | 1131641 | 1101bp | 0 | N/A |
GXP_6755694 | 1137965 | 1139325 | 1361bp | 16 | 100070 |
The CpG island associated with this promoter has 75 CpGs (22% of island), and is 676bp long. The C count plus G count is 471, the percentage C or G is 70% within this island, and the ratio of observed to expected CpG is 0.91.[28][29]
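The island statistics above are related by simple arithmetic. The sketch below uses the observed/expected formula of Gardiner-Garden and Frommer, CpG count × length / (C count × G count). The text reports only the combined C plus G count, so the even split between C and G used here is an assumption made purely for illustration.

```python
# Values reported for the CpG island of the promoter.
ISLAND_LENGTH = 676   # bp
CPG_COUNT = 75        # number of CpG dinucleotides
GC_COUNT = 471        # C count plus G count

# Each CpG dinucleotide covers two bases of the island.
cpg_fraction = 2 * CPG_COUNT / ISLAND_LENGTH
gc_fraction = GC_COUNT / ISLAND_LENGTH


def obs_exp_cpg(cpg_count: float, c_count: float, g_count: float, length: int) -> float:
    """Observed/expected CpG ratio: CpG * N / (C * G)."""
    return cpg_count * length / (c_count * g_count)


# ASSUMPTION: C and G are split evenly; the individual counts are not given.
ratio = obs_exp_cpg(CPG_COUNT, GC_COUNT / 2, GC_COUNT / 2, ISLAND_LENGTH)

print(f"CpG share of island: {cpg_fraction:.0%}")
print(f"GC content: {gc_fraction:.0%}")
print(f"obs/exp CpG (even C/G split assumed): {ratio:.2f}")
```

An uneven split between C and G would make the product C × G smaller and the ratio correspondingly larger, so the even split gives the lowest ratio compatible with the reported totals.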
Transcription Factor Binding Sites
As determined by MatInspector at Genomatix, the following transcription factor (TF) families are the most strongly predicted to bind in the C7orf50 promoter region.[27]
Transcription Factor | Detailed Family Information |
NR2F | Nuclear receptor subfamily 2 factors |
PERO | Peroxisome proliferator-activated receptor |
HOMF | Homeodomain transcription factors |
PRDM | PR (PRDI-BF1-RIZ1 homologous) domain transcription factor |
VTBP | Vertebrate TATA binding protein factor |
HZIP | Homeodomain-leucine zipper transcription factors |
ZTRE | Zinc transcriptional regulatory element |
XBBF | X-box binding factors |
SP1F | GC-Box factors SP1/GC |
CAAT | CCAAT binding factors |
ZF57 | KRAB domain zinc finger protein 57 |
CTCF | CTCF and BORIS gene family, transcriptional regulators with highly conserved zinc finger domains |
MYOD | Myoblast determining factors |
KLFS | Krueppel like transcription factors |
Expression Pattern
C7orf50 shows ubiquitous expression in the kidneys, brain, fat, prostate, spleen, and 22 other tissues, with low tissue and immune cell specificity.[1][2] Its expression is very high, about 4 times that of the average gene; there is therefore more C7orf50 mRNA in a cell than mRNA of the average gene.[30] There does not appear to be a definitive cell type in which this gene is not expressed.[31]
Transcript Regulation
Splice Enhancers
The mRNA of C7orf50 is predicted to have exonic splicing enhancers, to which SR proteins can bind, at bp positions 45 (SRSF1 (IgM-BRCA1)), 246 (SRSF6), 703 (SRSF5), 1301 (SRSF1), and 1308 (SRSF2).[32][33]
Stem Loop Prediction
Both the 5' and 3' UTRs of the mRNA of C7orf50 are predicted to fold into structures such as bulge loops, internal loops, multibranch loops, hairpin loops, and double helices. The 5'UTR has a predicted free energy of -416 kcal/mol with an ensemble diversity of 238. The 3' UTR has a predicted free energy of -279 kcal/mol with an ensemble diversity of 121.[34]
miRNA Targeting
There are many poorly conserved miRNA binding sites predicted within the 3' UTR of C7orf50 mRNA. The notable miRNA families that are predicted to bind C7orf50 mRNA and repress its expression post-transcriptionally are the following: miR-138-5p, miR-18-5p, miR-129-3p, miR-124-3p.1, miR-10-5p, and miR-338-3p.[35][36][37]
Protein Regulation
Subcellular Localization
The C7orf50 protein is predicted to localize intracellularly in both the nucleus and cytoplasm, but primarily within the nucleoplasm and nucleoli.[38][39][19][40]
Post-Translational Modification
The C7orf50 protein is predicted to be mucin-type GalNAc O-glycosylated at the following amino acid sites: 12, 23, 36, 42, 59, and 97.[41][42] Additionally, this protein is predicted to be SUMOylated at aa71, with SUMO protein binding predicted from aa189 through aa193.[43][44][45] C7orf50 is also predicted to undergo kinase-specific phosphorylation at the following amino acids: 12, 23, 36, 42, 59, 97, 124, 133, 159, and 175.[46][47][48][49][50] Interestingly, all six predicted O-glycosylation sites coincide with predicted phosphorylation sites. Of these phosphorylation sites, the majority are serines (53%) with the remainder being either tyrosines or threonines. The kinase groups most associated with these sites are AGC, CAMK, TKL, and STE. Finally, this protein is predicted to have 8 glycations of the ε-amino groups of lysines at the following sites: aa3, 5, 14, 15, 17, 21, 76, and 120.[51][52]
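The predicted modification sites can be cross-checked against the isoform a sequence, since each modification type requires a particular residue. The following is a small sketch using only the positions listed above; the variable names and the `residues_at` helper are hypothetical, and the check says nothing about whether a site is actually modified in vivo.

```python
# Isoform a (NP_001127867.1), copied from the sequence given in this article.
SEQ = (
    "MAKQKRKVPEVTEKKNKKLKKASAEGPLLGPEAAPSGEGAGSKGEAVLRPGLDAEPELSPEEQRVLERKL"
    "KKERKKEERQRLREAGLVAQHPPARRSGAELALDYLCRWAQKHKNWRFQKTRQTWLLLHMYDSDKVPDEH"
    "FSTLLAYLEGLQGRARELTVQKAEALMRELDEEGSDPPLPGRAQRIRQVLQLLS"
)

# Predicted sites (1-based positions) as listed in the text.
O_GLYCOSYLATION = [12, 23, 36, 42, 59, 97]
PHOSPHORYLATION = [12, 23, 36, 42, 59, 97, 124, 133, 159, 175]
GLYCATION = [3, 5, 14, 15, 17, 21, 76, 120]
SUMOYLATION = [71]


def residues_at(seq: str, positions: list[int]) -> dict[int, str]:
    """Map each 1-based position to the residue found there."""
    return {pos: seq[pos - 1] for pos in positions}


# Sites predicted for both O-glycosylation and phosphorylation.
shared_sites = sorted(set(O_GLYCOSYLATION) & set(PHOSPHORYLATION))

print("shared O-glyc/phospho sites:", shared_sites)
print("phosphorylation residues:", residues_at(SEQ, PHOSPHORYLATION))
print("glycation residues:", residues_at(SEQ, GLYCATION))
print("SUMOylation residue:", residues_at(SEQ, SUMOYLATION))
```

Glycation and SUMOylation both target lysine, while O-GalNAc glycosylation and phosphorylation compete for hydroxyl-bearing residues, which is why the same serine or threonine can appear in both lists.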
Homology
Paralogs
No paralogs of C7orf50 have been detected in the human genome; however, there is slight evidence (58% similarity) of a paralogous DUF2373 domain in the protein of KIDINS220.[53]
Orthologs
Below is a table of a variety of orthologs of the human C7orf50 gene.[54][3] The table includes closely, moderately, and distantly related orthologs. C7orf50 is highly evolutionarily conserved from mammals to fungi. When these ortholog sequences are compared, the most conserved portions are those of DUF2373, highlighting this domain's importance in the functioning of C7orf50. C7orf50 has evolved moderately and evenly over time, with a divergence rate greater than hemoglobin but less than cytochrome c.
Genus and Species | Common Name | Taxon Class | Date of Divergence (MYA) | Accession # | Length (AA) | % identity w/ human |
---|---|---|---|---|---|---|
Homo sapiens | Human | Mammalia | N/A | NM_001318252.2 | 194aa | 100% |
Tupaia chinensis | Chinese Tree Shrew | Mammalia | 82 | XP_006167949.1 | 194aa | 76% |
Dasypus novemcinctus | Nine-banded Armadillo | Mammalia | 105 | XP_004483895.1 | 198aa | 70% |
Miniopterus natalensis | Natal Long-fingered Bat | Mammalia | 96 | XP_016068464.1 | 199aa | 69% |
Protobothrops mucrosquamatus | Brown-spotted Pit Viper | Reptilia | 312 | XP_015673296.1 | 196aa | 64% |
Balearica regulorum gibbericeps | Grey-crowned Crane | Aves | 312 | XP_010302837.1 | 194aa | 61% |
Falco peregrinus | Peregrine Falcon | Aves | 312 | XP_027635198.1 | 193aa | 59% |
Xenopus laevis | African Clawed Frog | Amphibia | 352 | XP_018094637.1 | 198aa | 50% |
Electrophorus electricus | Electric Eel | Actinopterygii | 435 | XP_026880604.1 | 195aa | 53% |
Rhincodon typus | Whale Shark | Chondrichthyes | 465 | XP_020372968.1 | 195aa | 52% |
Ciona intestinalis | Sea Vase | Ascidiacea | 676 | XP_026696561.1 | 282aa | 37% |
Octopus bimaculoides | California Two-spot Octopus | Cephalopoda | 797 | XP_014772175.1 | 221aa | 40% |
Priapulus caudatus | Priapulus | Priapulida | 797 | XP_014663190.1 | 333aa | 39% |
Bombus terrestris | Buff-tailed Bumblebee | Insecta | 797 | XP_012171653.1 | 260aa | 32% |
Actinia tenebrosa | Australian Red Waratah Sea Anemone | Anthozoa | 824 | XP_031575029.1 | 330aa | 43% |
Trichoplax adhaerens | Trichoplax | Trichoplacidae | 948 | XP_002110193.1 | 137aa | 44% |
Spizellomyces punctatus | Branching Chytrid Fungi | Fungi | 1105 | XP_016610491.1 | 412aa | 29% |
Eremothecium cymbalariae | Fungi | Fungi | 1105 | XP_003644395.1 | 266aa | 25% |
Quercus suber | Cork Oak Tree | Plantae | 1496 | XP_023896156.1 | 508aa | 30% |
Plasmopara halstedii | Downy Mildew of Sunflower | Oomycetes | 1768 | XP_024580369.1 | 179aa | 26% |
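One common way to turn percent identity into a divergence rate is a Poisson correction for multiple substitutions at the same site, d = -ln(identity), divided by the time since divergence. The sketch below applies that correction to a few rows of the table. It is an assumed method chosen for illustration, not necessarily the one used for the comparison with hemoglobin and cytochrome c, and the function names are hypothetical.

```python
import math

# (species, divergence from human in MYA, fractional identity with human)
# A few rows copied from the ortholog table.
ORTHOLOGS = [
    ("Tupaia chinensis", 82, 0.76),
    ("Xenopus laevis", 352, 0.50),
    ("Rhincodon typus", 465, 0.52),
    ("Spizellomyces punctatus", 1105, 0.29),
]


def corrected_divergence(identity: float) -> float:
    """Poisson-corrected substitutions per site: d = -ln(identity)."""
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be in (0, 1]")
    return -math.log(identity)


def rate_per_site_per_my(identity: float, mya: float) -> float:
    """Corrected divergence divided by time since divergence (per million years)."""
    return corrected_divergence(identity) / mya


for species, mya, identity in ORTHOLOGS:
    d = corrected_divergence(identity)
    rate = rate_per_site_per_my(identity, mya)
    print(f"{species}: d = {d:.3f}, rate = {rate:.5f} per site per MY")
```

Plotting corrected divergence against divergence date for all rows, alongside the same values for reference proteins, is the usual way such rate comparisons are made.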
Function
The consensus prediction of C7orf50 function (GO terms) by I-TASSER[55][24][25] gives protein binding as the molecular function, protein import (specifically into the nucleus) as the biological process, and a pore complex (specifically of the nuclear envelope) as the associated cellular component. C7orf50 may therefore import ribosomal proteins into the nucleus for assembly into ribosomal subunits, but further research is needed to confirm this function.
Interacting Proteins
Name of Protein | Name of Gene | Function | UniProt Accession # |
---|---|---|---|
THAP domain-containing protein 1 | THAP1 | DNA-binding transcription regulator that regulates endothelial cell proliferation and G1/S cell-cycle progression.[58] | Q9NVV9 |
Protein Tax-2 | tax | Transcriptional activator that activates both the viral long terminal repeat (LTR) and cellular promoters via activation of CREB, NF-kappa-B, SRF and AP-1 pathways.[59] | P03410 |
Major Prion Protein | PRNP | Its primary physiological function is unclear. May play a role in neuronal development and synaptic plasticity. May be required for neuronal myelin sheath maintenance. May promote myelin homeostasis through acting as an agonist for ADGRG6 receptor. May play a role in iron uptake and iron homeostasis.[60] | P04156 |
Aldehyde dehydrogenase X, mitochondrial | ALDH1B1 | Aldehyde dehydrogenases play a major role in the detoxification of alcohol-derived acetaldehyde. They are involved in the metabolism of corticosteroids, biogenic amines, neurotransmitters, and lipid peroxidation.[61] | P30837 |
Cell growth-regulating nucleolar protein | LYAR | Plays a role in the maintenance of the appropriate processing of 47S/45S pre-rRNA to 32S/30S pre-rRNAs and their subsequent processing to produce 18S and 28S rRNAs.[62][63] | Q9NX58 |
Coiled-coil domain-containing protein 85B | CCDC85B | Functions as a transcriptional repressor.[64][65] | Q15834 |
Nucleolar protein 56 | NOP56 | Involved in the early to middle stages of 60S ribosomal subunit biogenesis. Core component of box C/D small nucleolar ribonucleoprotein (snoRNP) particles. Required for the biogenesis of box C/D snoRNAs such as U3, U8 and U14 snoRNAs.[66] | O00567 |
rRNA 2'-O-methyltransferase fibrillarin | FBL | Has the ability to methylate both RNAs and proteins. Involved in pre-rRNA processing by catalyzing the site-specific 2'-hydroxyl methylation of ribose moieties in pre-ribosomal RNA.[67][68][69] | P22087 |
40S ribosomal protein S6 | RPS6 | May play an important role in controlling cell growth and proliferation through the selective translation of particular classes of mRNA.[70] | P62753 |
Clinical Significance
C7orf50 has been noted in a variety of genome-wide association studies (GWAS), including epigenome-wide studies, and has been associated with type 2 diabetes among sub-Saharan Africans,[71] daytime sleepiness in African-Americans,[72] prenatal exposure to particulate matter,[73] heritable DNA methylation marks associated with breast cancer,[74] and DNA methylation in relation to plasma carotenoids and lipid profile.[75] The protein has also been reported to interact significantly with prion proteins.[76]
References
1. "C7orf50 chromosome 7 open reading frame 50 [Homo sapiens (human)] - Gene - NCBI". https://www.ncbi.nlm.nih.gov/gene?cmd=retrieve&list_uids=84310.
2. "C7orf50 protein expression summary - The Human Protein Atlas". https://www.proteinatlas.org/ENSG00000146540-C7orf50.
3. "C7orf50 orthologs". https://www.ncbi.nlm.nih.gov/gene/84310/ortholog/.
4. Alberts, Bruce; Johnson, Alexander; Lewis, Julian; Raff, Martin; Roberts, Keith; Walter, Peter (2002). "The Transport of Molecules between the Nucleus and the Cytosol". Molecular Biology of the Cell (4th ed.). https://www.ncbi.nlm.nih.gov/books/NBK26932/.
5. "I-TASSER server for protein structure and function prediction". https://zhanglab.ccmb.med.umich.edu/I-TASSER/.
6. "Protein coding genes as hosts for noncoding RNA expression". Seminars in Cell & Developmental Biology 75: 3–12. March 2018. doi:10.1016/j.semcdb.2017.08.016. PMID 28811264.
7. HUGO Gene Nomenclature Committee. "MicroRNA protein coding host genes". https://www.genenames.org/data/genegroup/#!/group/1691.
8. "Homo sapiens chromosome 7 open reading frame 50 (C7orf50), transcript variant 4, mRNA". 2020-04-25. http://www.ncbi.nlm.nih.gov/nuccore/NM_001318252.2.
9. "uncharacterized protein C7orf50 isoform a [Homo sapiens] - Protein - NCBI". https://www.ncbi.nlm.nih.gov/protein/970919379.
10. "PREDICTED: Homo sapiens chromosome 7 open reading frame 50 (C7orf50), transcript variant X6, mRNA". 2020-03-02. http://www.ncbi.nlm.nih.gov/nuccore/XM_011515584.2.
11. "uncharacterized protein C7orf50 isoform X3 [Homo sapiens] - Protein - NCBI". https://www.ncbi.nlm.nih.gov/protein/767945960.
12. "ORF Finder". https://www.bioinformatics.org/sms2/orf_find.html.
13. "Average protein size - Various - BNID 113349". https://bionumbers.hms.harvard.edu/bionumber.aspx?s=n&v=0&id=113349.
14. Kozlowski, Lukasz P. "Proteome-pI - Proteome Isoelectric Point Database statistics". http://isoelectricpointdb.org/statistics.html.
15. "ExPASy - Compute pI/Mw tool". https://web.expasy.org/compute_pi/.
16. "SAPS < Sequence Statistics < EMBL-EBI". https://www.ebi.ac.uk/Tools/seqstats/saps/.
17. "Clusters of charged residues in protein three-dimensional structures". Proceedings of the National Academy of Sciences of the United States of America 93 (16): 8350–5. August 1996. doi:10.1073/pnas.93.16.8350. PMID 8710874. Bibcode: 1996PNAS...93.8350Z.
18. "Pfam: Family: DUF2373 (PF10180)". http://pfam.xfam.org/family/PF10180.
19. "Motif Scan". https://myhits.isb-sib.ch/cgi-bin/motif_scan.
20. "NetNES 1.1 Server". http://www.cbs.dtu.dk/services/NetNES/.
21. "Analysis and prediction of leucine-rich nuclear export signals". Protein Engineering, Design & Selection 17 (6): 527–36. June 2004. doi:10.1093/protein/gzh062. PMID 15314210.
22. "NPS@ : CONSENSUS secondary structure prediction". https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_seccons.html.
23. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". https://www.biogem.org/tool/chou-fasman/index.php.
24. "COFACTOR: improved protein function prediction by combining structure, sequence and protein-protein interaction information". Nucleic Acids Research 45 (W1): W291–W299. July 2017. doi:10.1093/nar/gkx366. PMID 28472402.
25. "I-TASSER server: new development for protein structure and function predictions". Nucleic Acids Research 43 (W1): W174-81. July 2015. doi:10.1093/nar/gkv342. PMID 25883148.
26. "C7orf50 protein (human) - STRING interaction network". https://string-db.org/cgi/network.pl?taskId=hbSXbSZovONr.
27. "Genomatix - NGS Data Analysis & Personalized Medicine". https://www.genomatix.de/?s=435a27c29323011e3aef6e3522bf640d.
28. "CpG Island Info". https://genome.ucsc.edu/cgi-bin/hgc?hgsid=831382483_oz24NTGzgAG5H4k6gGBN0YDsSK42&c=chr7&l=1064094&r=1205336&o=1137943&t=1138619&g=cpgIslandExt&i=CpG:+75.
29. "CpG islands in vertebrate genomes". Journal of Molecular Biology 196 (2): 261–82. July 1987. doi:10.1016/0022-2836(87)90689-9. PMID 3656447.
30. "AceView: Gene:C7orf50, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". https://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/av.cgi?db=human&term=C7orf50&submit=Go.
31. "2895856 - GEO Profiles - NCBI". https://www.ncbi.nlm.nih.gov/geoprofiles/2895856.
32. "An increased specificity score matrix for the prediction of SF2/ASF-specific exonic splicing enhancers". Human Molecular Genetics 15 (16): 2490–508. August 2006. doi:10.1093/hmg/ddl171. PMID 16825284.
33. "ESEfinder: A web resource to identify exonic splicing enhancers". Nucleic Acids Research 31 (13): 3568–71. July 2003. doi:10.1093/nar/gkg616. PMID 12824367.
34. "RNAfold web server". http://rna.tbi.univie.ac.at//cgi-bin/RNAWebSuite/RNAfold.cgi.
35. "TargetScanHuman 7.2". http://www.targetscan.org/vert_72/.
36. "miRNA Targeting: Growing beyond the Seed". Trends in Genetics 35 (3): 215–222. March 2019. doi:10.1016/j.tig.2018.12.005. PMID 30638669.
37. "Most mammalian mRNAs are conserved targets of microRNAs". Genome Research 19 (1): 92–105. January 2009. doi:10.1101/gr.082701.108. PMID 18955434.
38. "C7orf50 protein expression summary - The Human Protein Atlas". https://www.proteinatlas.org/ENSG00000146540-C7orf50.
39. "PSORT II Prediction". https://psort.hgc.jp/form2.html.
40. "Better prediction of protein cellular localization sites with the k nearest neighbors classifier". Proceedings. International Conference on Intelligent Systems for Molecular Biology 5: 147–52. 1997. PMID 9322029.
41. "NetOGlyc 4.0 Server". http://www.cbs.dtu.dk/services/NetOGlyc/.
42. "Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology". The EMBO Journal 32 (10): 1478–88. May 2013. doi:10.1038/emboj.2013.79. PMID 23584533.
43. "GPS-SUMO: a tool for the prediction of sumoylation sites and SUMO-interaction motifs". Nucleic Acids Research 42 (Web Server issue): W325-30. July 2014. doi:10.1093/nar/gku383. PMID 24880689.
44. "Systematic study of protein sumoylation: Development of a site-specific predictor of SUMOsp 2.0". Proteomics 9 (12): 3409–3412. June 2009. doi:10.1002/pmic.200800646. PMID 29658196.
45. "GPS-SUMO: Prediction of SUMOylation Sites & SUMO-interaction Motifs". http://sumosp.biocuckoo.org/.
46. "GPS 5.0 - Kinase-specific Phosphorylation Site Prediction". http://gps.biocuckoo.cn/online.php.
47. "NetPhos 3.1 Server". http://www.cbs.dtu.dk/services/NetPhos/.
48. "Sequence and structure-based prediction of eukaryotic protein phosphorylation sites". Journal of Molecular Biology 294 (5): 1351–62. December 1999. doi:10.1006/jmbi.1999.3310. PMID 10600390.
49. "Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence". Proteomics 4 (6): 1633–49. June 2004. doi:10.1002/pmic.200300771. PMID 15174133.
50. "GPS 5.0: An Update on the Prediction of Kinase-specific Phosphorylation Sites in Proteins". Genomics, Proteomics & Bioinformatics 18 (1): 72–80. March 2020. doi:10.1016/j.gpb.2020.01.001. PMID 32200042.
51. "NetGlycate 1.0 Server". http://www.cbs.dtu.dk/services/NetGlycate/.
52. "Analysis and prediction of mammalian protein glycation". Glycobiology 16 (9): 844–53. September 2006. doi:10.1093/glycob/cwl009. PMID 16762979.
53. "Protein BLAST: search protein databases using a protein query". https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins.
54. "BLAST: Basic Local Alignment Search Tool". https://blast.ncbi.nlm.nih.gov/Blast.cgi.
55. "I-TASSER results". https://zhanglab.ccmb.med.umich.edu/I-TASSER/output/S533933/.
56. "IntAct Portal". https://www.ebi.ac.uk/intact/.
57. "CCSB Interactome Database". http://interactome.dfci.harvard.edu/index.php?page=home.
58. "THAP1 - THAP domain-containing protein 1 - Homo sapiens (Human) - THAP1 gene & protein". https://www.uniprot.org/uniprot/Q9NVV9.
59. "tax - Protein Tax-2 - Human T-cell leukemia virus 2 (HTLV-2) - tax gene & protein". https://www.uniprot.org/uniprot/P03410.
60. "PRNP - Major prion protein precursor - Homo sapiens (Human) - PRNP gene & protein". https://www.uniprot.org/uniprot/P04156.
61. "ALDH1B1 - Aldehyde dehydrogenase X, mitochondrial precursor - Homo sapiens (Human) - ALDH1B1 gene & protein". https://www.uniprot.org/uniprot/P30837.
62. "LYAR - Cell growth-regulating nucleolar protein - Homo sapiens (Human) - LYAR gene & protein". https://www.uniprot.org/uniprot/Q9NX58.
63. "Human cell growth regulator Ly-1 antibody reactive homologue accelerates processing of preribosomal RNA". Genes to Cells 19 (4): 273–86. April 2014. doi:10.1111/gtc.12129. PMID 24495227.
64. "DIPA, which can localize to the centrosome, associates with p78/MCRS1/MSP58 and acts as a repressor of gene transcription". Experimental and Molecular Pathology 81 (3): 184–90. December 2006. doi:10.1016/j.yexmp.2006.07.008. PMID 17014843.
65. "CCDC85B - Coiled-coil domain-containing protein 85B - Homo sapiens (Human) - CCDC85B gene & protein". https://www.uniprot.org/uniprot/Q15834.
66. "NOP56 - Nucleolar protein 56 - Homo sapiens (Human) - NOP56 gene & protein". https://www.uniprot.org/uniprot/O00567.
67. "FBL - rRNA 2'-O-methyltransferase fibrillarin - Homo sapiens (Human) - FBL gene & protein". https://www.uniprot.org/uniprot/P22087.
68. "Glutamine methylation in histone H2A is an RNA-polymerase-I-dedicated modification". Nature 505 (7484): 564–8. January 2014. doi:10.1038/nature12819. PMID 24352239. Bibcode: 2014Natur.505..564T.
69. "SIRT7-Dependent Deacetylation of Fibrillarin Controls Histone H2A Methylation and rRNA Synthesis during the Cell Cycle". Cell Reports 25 (11): 2946–2954.e5. December 2018. doi:10.1016/j.celrep.2018.11.051. PMID 30540930.
70. "RPS6 - 40S ribosomal protein S6 - Homo sapiens (Human) - RPS6 gene & protein". https://www.uniprot.org/uniprot/P62753.
71. "Epigenome-wide association study in whole blood on type 2 diabetes among sub-Saharan African individuals: findings from the RODAM study". International Journal of Epidemiology 48 (1): 58–70. February 2019. doi:10.1093/ije/dyy171. PMID 30107520.
72. "Epigenome-wide association analysis of daytime sleepiness in the Multi-Ethnic Study of Atherosclerosis reveals African-American-specific associations". Sleep 42 (8): zsz101. August 2019. doi:10.1093/sleep/zsz101. PMID 31139831.
73. "Prenatal Particulate Air Pollution and DNA Methylation in Newborns: An Epigenome-Wide Meta-Analysis". Environmental Health Perspectives 127 (5): 57012. May 2019. doi:10.1289/EHP4522. PMID 31148503.
74. "Heritable DNA methylation marks associated with susceptibility to breast cancer". Nature Communications 9 (1): 867. February 2018. doi:10.1038/s41467-018-03058-6. PMID 29491469. Bibcode: 2018NatCo...9..867J.
75. "Network Analysis of the Potential Role of DNA Methylation in the Relationship between Plasma Carotenoids and Lipid Profile". Nutrients 11 (6): 1265. June 2019. doi:10.3390/nu11061265. PMID 31167428.
76. "Protein microarray analysis identifies human cellular prion protein interactors". Neuropathology and Applied Neurobiology 35 (1): 16–35. February 2009. doi:10.1111/j.1365-2990.2008.00947.x. PMID 18482256.
Original source: https://en.wikipedia.org/wiki/C7orf50.