
AI Writes Genomes from Scratch: Evo 2

  • Writer: Ditto Mohan
  • Mar 24
  • 2 min read

A new AI model called Evo 2 is making waves in biology. Developed by scientists at the Arc Institute, Stanford University, and Nvidia, Evo 2 is the largest generative AI model for biology to date. It was trained on roughly 128,000 genomes containing 9.3 trillion DNA base pairs. Evo 2 can write whole chromosomes and even small genomes from scratch. This ability stems from what the model has learned about how DNA mutations affect proteins, RNA, and overall health. Evo 2 is also shedding light on the non-coding regions of DNA, which have puzzled scientists for a very long time.


Evo 2 is the largest generative AI model for biology to date

The potential applications of Evo 2 are vast. The model can predict how mutations will affect a protein's function and how genes operate differently across various cell types. This knowledge is invaluable for researchers all over the world working to understand and treat diseases. Evo 2 can also be a game-changer for synthetic biology, helping scientists design entirely new genomes.


Unlike the original Evo model, which focused on microbial genomes, Evo 2 includes genes from humans, plants, yeast, and other complex eukaryotic organisms. This matters because eukaryotic genomes are more intricate: they contain regulatory elements that control gene expression and are essential for multicellular life. Evo 2 was trained on a dataset called OpenGenome2, comprising 9.3 trillion DNA base pairs from 128,000 genomes across all domains of life. This extensive training enables Evo 2 to predict the functional impact of genetic variations and to generate genomic sequences more accurately than earlier models. The model can handle up to 1 million nucleotides at once, which lets it capture long-range patterns in DNA sequences, something previous models struggled with.
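To get a feel for what a 1-million-nucleotide limit means in practice, here is a small Python sketch that cuts a long DNA string into consecutive windows that each fit within that limit. It only illustrates the size constraint; it is not how Evo 2 itself prepares its input, and the names `CONTEXT_LIMIT` and `split_into_windows` are made up for this example.

```python
# Illustrative only: the 1,000,000 figure comes from the article;
# the function and constant names are hypothetical, not part of Evo 2.
CONTEXT_LIMIT = 1_000_000  # maximum nucleotides per window


def split_into_windows(sequence: str, window: int = CONTEXT_LIMIT) -> list[str]:
    """Cut a DNA string into consecutive, non-overlapping windows.

    Every window has at most `window` characters; the last one may be shorter.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [sequence[i:i + window] for i in range(0, len(sequence), window)]


if __name__ == "__main__":
    toy = "ACGT" * 5  # a 20-nucleotide toy sequence
    for chunk in split_into_windows(toy, window=8):
        print(len(chunk), chunk)
```

A sequence shorter than the limit comes back as a single window, while a 2.5-million-nucleotide sequence would be cut into three. Plain cutting like this breaks any pattern that straddles a boundary, which is one reason a larger window helps with long-range signals.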


Evo 2 has already demonstrated its capabilities. It outperformed existing models in predicting the effects of mutations in BRCA1, a gene associated with breast cancer, achieving an accuracy of over 90 percent. The original Evo model was used to design new CRISPR gene-editing tools and even a full-length bacterial genome from scratch. Evo 2 has continued this trend, generating novel mitochondrial genomes and a minimal bacterial genome using the protein prediction tool AlphaFold 3.
