Artificial intelligence (AI) is known for enabling deeper insights into drug development, identifying patterns and molecules that may otherwise go unnoticed. Now it is poised to make similar contributions to gene editing. A few companies are using AI to develop gene editing tools that are more specific and more efficacious.
CRISPR systems such as CRISPR-Cas9 revolutionized gene editing, but genomic rearrangements are becoming a real concern for in vivo therapies, and nonspecific editing has been a longstanding issue that affects subsequent generations of cells. Zinc finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) also have challenges, thus underscoring the need for improvements.
Using rational design to find new gene editors, however, hasn’t yielded anything notably different from CRISPR-Cas9, says Chelsea Trengrove, PhD, CEO of Neoclease, and a platform approach to their development tends to limit efficacy and specificity.
AI technology is emerging as a possible solution to enhance the precision of multiple types of gene editors. And the addition of generative AI lets scientists look beyond what exists in nature.
GenAI-created editors
Neoclease’s custom AI model develops gene-specific editors in silico. Eventually, top candidates may direct gene editing in humans in vivo, using CRISPR nucleases, ZFNs, TALENs, and other gene editing nucleases.
“We’re using a generative AI model,” Trengrove says. “This is a large language model that’s trained on millions of known proteins that cut DNA.” The idea, she adds, isn’t to create a workhorse enzyme that can do everything, but to optimize every editor for a specific gene of interest.
Trengrove explains that generative AI enables Neoclease to create a knowledge network of variables to understand how editors can be optimized, and to make a virtue of hallucination such that truly novel sequences can be generated. The goal, she stresses, is to generate additional editors that are “optimized and weighted in the direction we want to push them toward.”
“It’s almost like ChatGPT for proteins,” Trengrove remarks. “While some associate hallucinations with errors, we leverage them… as an innovation tool to generate novel and effective protein designs.”
Generating potential gene editors is just the first step. After tens of thousands of novel sequences have been generated that can be optimized toward specific features—certain binding energies, degrees of polarity, or domains, for example—the features are fed through a series of computational checkpoints. Those checkpoints identify which editors are best suited to advance into in vitro validation based upon their features and functionality. Of the tens of thousands of nucleases the company has created in silico, it has, to date, advanced about 7,000.
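The checkpoint idea can be illustrated with a minimal Python sketch. It is not Neoclease's pipeline: the `Candidate` fields, the thresholds, and the function `run_checkpoints` are all hypothetical, chosen only to show how generated candidates might pass through a series of feature-based filters before in vitro validation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    """A generated nuclease candidate with illustrative (made-up) features."""
    name: str
    binding_energy: float  # hypothetical score; more negative = tighter binding
    polarity: float        # hypothetical fraction of polar residues, 0..1
    length_aa: int         # protein length in amino acids


# A checkpoint is any function that accepts or rejects a candidate.
Checkpoint = Callable[[Candidate], bool]


def run_checkpoints(candidates: List[Candidate],
                    checkpoints: List[Checkpoint]) -> List[Candidate]:
    """Apply each checkpoint in order, keeping only candidates that pass."""
    survivors = list(candidates)
    for check in checkpoints:
        survivors = [c for c in survivors if check(c)]
    return survivors


# Hypothetical thresholds, for illustration only.
checkpoints: List[Checkpoint] = [
    lambda c: c.binding_energy <= -8.0,   # binding must be tight enough
    lambda c: 0.2 <= c.polarity <= 0.6,   # polarity within a target window
    lambda c: c.length_aa <= 700,         # small enough for compact delivery
]

candidates = [
    Candidate("cand_a", -9.5, 0.4, 650),
    Candidate("cand_b", -6.0, 0.4, 650),   # fails the binding checkpoint
    Candidate("cand_c", -9.0, 0.9, 650),   # fails the polarity checkpoint
    Candidate("cand_d", -8.5, 0.3, 1400),  # fails the size checkpoint
]

shortlist = run_checkpoints(candidates, checkpoints)
print([c.name for c in shortlist])
```

Ordering the checkpoints from cheapest to most expensive is a common design choice for this kind of funnel, since each stage shrinks the pool the next one has to score.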
Some of these editors are about half the size of the CRISPR-Cas9 system, Trengrove notes. They include the miniaturized nucleases developed by Jin Liu, PhD, chief technology officer and co-founder of Neoclease and tenured professor of pharmaceutical sciences in Texas. According to Trengrove, Liu “has shown that some of her miniaturized editors have comparable cleavage, energy, and efficacy in vitro, and have reduced off-target effects by sixfold.”
These small editors can be packaged into adeno-associated virus vectors or similar vehicles to deliver them to tissues throughout the body. “We’re actually looking at targeting the brain for Parkinson’s disease,” Trengrove says.
Currently, most of the testing has been done in silico, with only limited in vitro validation. Trengrove says that Neoclease plans to take these editors into mouse and zebrafish models in mid-2025, and that the company anticipates Investigational New Drug–enabling studies will begin in 2026. Trengrove adds that the company is also “working on a small deal with a pharma company to evaluate thousands of nucleases in vitro.”
AI CRISPR-like technology
In 2017, Wayne Danter, MD, CEO of 123Genetix, pioneered the development of artificial human stem cells and organoids for medical research. This work led to the development of aiHumanoid simulations for virtual drug trials. “To produce a specific type of cell, I had to… alter the cell’s genetic makeup,” Danter says. “I did that by creating a symbolic representation of a gene and then adding to it or deleting it.” The AI system he developed to do that, DeepNEU, simulates the CRISPR-Cas9 enzyme.
DeepNEU is built around an intelligent database. It functions like a text editor for genes to enable rapid prototyping and quality checks. It is fully developed and is already in use as a complement to CRISPR-Cas9 gene editing.
The advantage of AI-enabled gene editing is specificity. Off-target effects are avoided in the simulation, so when the virtual results are compared to those of CRISPR-Cas9 experiments, any differences can be identified and, perhaps, minimized or eliminated.
Rather than train algorithms on data and outcomes, DeepNEU makes use of a healthcare-oriented Wise Learning process. As Danter indicated in a recent bioRxiv preprint (DOI: 10.1101/2022.06.18.496679), Wise Learning “combines fuzzy cognitive map simulations, with data from multiple experts and a generic decision-making system.” He added that the Wise Learning process “should also explore available learning algorithms including deep learning methods when available.” Essentially, Wise Learning uses an unsupervised (untrained) approach based on experiences. According to Danter, AI technology that incorporates Wise Learning can emulate human thought more closely.
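For readers unfamiliar with fuzzy cognitive maps, the following Python sketch shows the general technique on a toy network. It is a generic textbook-style illustration, not DeepNEU: the three concepts, the weights, and the function names `step` and `simulate` are assumptions made up for this example. Each concept holds an activation between 0 and 1, signed weights encode how strongly one concept pushes another up or down, and the map is iterated until it settles. A simulated gene deletion is modeled here by clamping that gene's activation to 0.

```python
import math
from typing import Dict, List, Optional


def sigmoid(x: float) -> float:
    """Squash a weighted sum into the 0..1 activation range."""
    return 1.0 / (1.0 + math.exp(-x))


def step(state: List[float], weights: List[List[float]],
         clamped: Dict[int, float]) -> List[float]:
    """One synchronous update; weights[j][i] is the influence of j on i."""
    n = len(state)
    new_state = []
    for i in range(n):
        if i in clamped:
            new_state.append(clamped[i])  # held fixed (e.g. a knocked-out gene)
        else:
            total = sum(weights[j][i] * state[j] for j in range(n))
            new_state.append(sigmoid(total))
    return new_state


def simulate(state: List[float], weights: List[List[float]],
             clamped: Optional[Dict[int, float]] = None,
             max_iters: int = 100, tol: float = 1e-6) -> List[float]:
    """Iterate the map until activations stop changing (or max_iters)."""
    clamped = clamped or {}
    state = [clamped.get(i, v) for i, v in enumerate(state)]
    for _ in range(max_iters):
        nxt = step(state, weights, clamped)
        if max(abs(a - b) for a, b in zip(nxt, state)) < tol:
            return nxt
        state = nxt
    return state


# Toy concepts (hypothetical): 0 = gene_A, 1 = gene_B, 2 = protein_P.
# gene_A promotes protein_P; gene_B represses it.
weights = [
    [0.0, 0.0, 0.9],   # gene_A -> protein_P
    [0.0, 0.0, -0.7],  # gene_B -> protein_P
    [0.0, 0.0, 0.0],   # protein_P has no outgoing influence here
]

baseline = simulate([1.0, 1.0, 0.0], weights, clamped={0: 1.0, 1: 1.0})
knockout = simulate([1.0, 1.0, 0.0], weights, clamped={0: 1.0, 1: 0.0})
print("protein_P, both genes present:", round(baseline[2], 3))
print("protein_P, gene_B deleted:    ", round(knockout[2], 3))
```

In this toy map, removing the repressor raises the simulated protein level, which is the kind of add-or-delete experiment the text describes, on a vastly smaller scale.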
DeepNEU applications yield “a very large matrix of relationships and weights,” Danter says. “The basic information includes gene–gene and gene–protein relationships.” 123Genetix’s gene relationship network has approximately 65 million neurons.
Danter’s passion is to find effective treatments for rare diseases, and he is proud that DeepNEU has been used for multiple studies of rare diseases. He indicates that access to DeepNEU has been free for rare disease organizations, and that he is currently “bringing on board a number of pharma partners interested in using the technology.”
This fall, 123Genetix plans to release a version of the aiHumanoid that includes Serious Second Look. This addition enables the AI system to pause to consider whether it accurately answered the question before presenting results. If the results fall short, the AI reoptimizes on subsequent attempts.
Danter is also validating an AI system that is designed to use simulated sentience to make ethical decisions, specifically, decisions in line with the “first do no harm” principle of the Hippocratic Oath. Danter notes that the system is not self-aware.
Zinc finger improvements
Marcus Noyes, PhD, co-founder of newly formed TBG Tx and assistant professor of biochemistry and molecular pharmacology at New York University Langone Health, is developing an AI-enabled gene editor for ZFNs with his collaborator and co-founder, Philip Kim, PhD, professor of molecular genetics and professor of computer science at the University of Toronto. This gene editor, ZFDesign, is ready for commercial use.
Since publishing ZFDesign in 2023, Noyes and his team have increased the editor’s precision. “The first version of the model was trained to understand how to design an array of ZFs, but it didn’t really know which of the thousands of designs returned for each target would be the most specific,” Noyes says. “We needed to teach the model which target sequences and which proteins will provide the most precise activity genome wide.”
The latest iteration of ZFDesign incorporates several improvements. “We added more interface data to increase our understanding of compatibility,” Noyes details. “We also screened the specificity of hundreds of ZFNs to train the model.” As a result of this work, ZFDesign can identify the most precise options and thus reduce off-targeting. “We’ve also modified the model to express all the ZFNs in the array continuously, rather than skipping bases between pairs,” Noyes adds. “This reduces the modularity in the design.” He says he expects to publish the updated model in 2025.
The most notable aspect of ZFDesign, Noyes says, is the gene editor’s ability to understand whether trends regarding modifications to a ZFN could be generalized to subsequent designs: “In the past, you could ask questions about how modifications of a designed ZFN array might change its on- or off-target activity, but it was never really clear if the trends were generalizable or were specific to just that protein, because you would need to design, validate, and test several arrays. By contrast, ZFDesign allows the simple design of any number of proteins for any number of target sequences, making the confirmation of generalizability a trivial process.”
How well this model works depends on function and precision requirements. Regarding activation and repression, the areas for which he has the most data, Noyes says, “In general, about 80% of the designs will produce a change in target gene expression.”
About 30% of the designs have more than fivefold activation and more than 70% repression when assayed by transient transfection. Precision for highly functional designs appears high. However, Noyes cautions, “We have only tested off-target activity for around 20 constructs designed with the new model.” About half have shown minimal to no off-target activity without optimization. And according to Noyes, even better results are obtained with optimization: “Typically, we can develop a candidate for any target gene with single-target resolution. … If we design 10, we expect about 8 will do something, 3 will be really good, and those 3 should have limited off-target activity.”
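The arithmetic behind the “design 10” rule of thumb can be checked with a few lines of Python. This is a back-of-the-envelope sketch, not an analysis from Noyes's team: it takes the reported rates (about 80% producing any change, about 30% highly functional) at face value and assumes, purely for illustration, that each design succeeds independently of the others. The function names are hypothetical.

```python
def expected_counts(n_designs: int, p_any: float = 0.80,
                    p_high: float = 0.30) -> tuple:
    """Expected number of designs with any effect and with a strong effect."""
    return n_designs * p_any, n_designs * p_high


def prob_at_least_one(n_designs: int, p_success: float) -> float:
    """Chance that at least one design succeeds, assuming independence."""
    return 1.0 - (1.0 - p_success) ** n_designs


any_change, highly_functional = expected_counts(10)
p_hit = prob_at_least_one(10, 0.30)

print(f"Expected designs with any change: {any_change:.0f}")
print(f"Expected highly functional designs: {highly_functional:.0f}")
print(f"Chance of at least one highly functional design: {p_hit:.1%}")
```

Under these assumptions, the expected counts for 10 designs match the quoted figures of about 8 and about 3. The same function can be rerun with smaller batch sizes to see how quickly the odds of at least one strong design fall off.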
ZFDesign is being used in the research community now. “One scientist tried three activators in cardiomyocytes, and two worked very well,” Noyes reports. “Another group created a nearly complete set of precise probes that bind each of the human centromeres, allowing them to be labeled in live cells. Yet another group found four potent repressors in neurons from a screen of 12 candidates.
“We are finding that the amount of off-target activity is often tied to the mechanism. For example, activation, repression, labeling, and cutting all seem to have different optimal affinity regimes. Moving forward, we hope our model will be precise enough that users will only need to test a few designs, and that any optimization will be a straightforward affinity adjustment to match the mechanism.”
Additional tools
Several other companies are creating tools that support the use of AI for gene editing. In October, Shape Therapeutics published two preprints. One detailed how it engineered guide RNA to fit into adeno-associated viruses. The other discussed how Shape’s system, which is based on the company’s DeepREAD technology, allows therapeutic guide RNA to be expressed within cells.
Last spring, Profluent announced that its open-source, AI-created gene editor, OpenCRISPR-1, successfully edited the human genome. The company reported that its AI models generate “millions of diverse CRISPR-like proteins that do not occur in nature.”
AI tools for gene editing are helping scientists enact more precise edits, which lowers off-target effects for multiple gene editing technologies. Ultimately, this may help make gene editing more accessible.