esmGFP

esmGFP is an artificial green fluorescent protein designed using the AI model ESM3, developed by EvolutionaryScale.[1][2] The protein does not exist in nature and was generated through a simulation of 500 million years of molecular evolution.[3][4]

Development

Scientists at EvolutionaryScale and the Arc Institute developed esmGFP using ESM3, which was trained on a dataset of about 2.78 billion proteins comprising roughly 771 billion unique tokens. The model, designed to predict and generate protein sequences and structures, produced a protein whose distance from known natural proteins was estimated to be equivalent to more than 500 million years of evolution, yielding a functional protein beyond those found in nature.[5]

EvolutionaryScale, founded by former researchers from Meta, developed ESM3 as one of the largest AI models applied to protein design. The model's ability to generate new fluorescent proteins has attracted significant investment.[1]

ESM3 functions similarly to a language model, predicting protein sequences and structures. Scientists prompted ESM3 to generate a fluorescent protein by specifying the residues responsible for fluorescence. ESM3 designed a protein that shares 58% sequence identity with its closest known counterpart, a fluorescent protein derived from the bubble-tip sea anemone (Entacmaea quadricolor).[3][5]
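
A percentage such as the 58% figure above is commonly computed from a pairwise alignment as the share of aligned positions at which the two sequences carry the same residue. The following is a minimal sketch of that arithmetic, not the procedure used in the study. It assumes the two sequences have already been aligned (gaps written as "-") and that gapped columns are excluded from the denominator; other conventions divide by the full alignment length or by the shorter sequence. The function name and the toy sequences are illustrative and are not real esmGFP or sea anemone protein sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns with identical residues.

    Assumes the inputs are already aligned, so both strings have the
    same length and use `gap` to mark insertions or deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where neither sequence has a gap
    matches = 0   # compared columns with the same residue
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned fragments used only to exercise the function.
toy_a = "MSKGEELF-TGV"
toy_b = "MSELIKENMHGV"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Because the result depends on the alignment and on the choice of denominator, identity values reported by different tools for the same pair of proteins can differ slightly.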

The resulting protein was synthesized and tested, exhibiting fluorescence.[5]

Applications

Green fluorescent proteins are widely used in biological research, particularly for tagging and tracking cellular processes. AI-designed proteins like esmGFP could lead to advancements in medicine, environmental science, and synthetic biology. Potential applications include enzyme development for plastic degradation, novel disease treatments, and tools for exploring protein evolution.[5]

Scientific Review

The research on esmGFP was initially released as a preprint and later peer-reviewed and published in Science on January 16, 2025.[3] Independent scientists have noted the potential of AI-driven protein engineering while also cautioning that such methods do not replicate the full complexity of natural selection.[3]

References