Researchers at Gladstone Institutes, the Broad Institute of MIT and Harvard, and Dana-Farber Cancer Institute have turned to artificial intelligence (AI) to help them understand how large networks of interconnected human genes control the function of cells, and how disruptions in those networks cause disease.

Foundation models, a category that includes large language models, are AI systems that learn fundamental knowledge from massive amounts of general data and then apply that knowledge to accomplish new tasks—a process called transfer learning. These systems have recently gained mainstream attention with the release of ChatGPT, a chatbot built on a model from OpenAI.
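The transfer-learning idea described above—learn a general representation from abundant data, then reuse it for a task with little data—can be illustrated with a minimal sketch. This toy example uses PCA as a stand-in for pretraining and a least-squares classifier head as a stand-in for fine-tuning; all dimensions, variable names, and tasks here are illustrative assumptions, not Geneformer's actual architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretraining" phase: learn a shared feature map from a large, generic dataset.
# Here that map is simply the top-k principal directions (a toy stand-in for a
# pretrained network's learned representation).
n_big, n_small, d, k = 1000, 20, 16, 4

X_big = rng.normal(size=(n_big, d))                  # abundant unlabeled data
_, _, Vt = np.linalg.svd(X_big - X_big.mean(0), full_matrices=False)
W = Vt[:k].T                                         # d x k "pretrained" feature map

# "Fine-tuning" phase: a small labeled task reuses the frozen features W.
X_small = rng.normal(size=(n_small, d))              # scarce task-specific data
y_small = (X_small @ W)[:, 0] > 0                    # toy labels tied to feature 0

Z = X_small @ W                                      # transfer: encode with W
# Fit a simple least-squares classifier head on the small encoded dataset.
targets = y_small.astype(float) * 2 - 1              # labels as +/-1
w_head, *_ = np.linalg.lstsq(Z, targets, rcond=None)
preds = (Z @ w_head) > 0

accuracy = float((preds == y_small).mean())
```

Because the features were learned once from the large dataset, the small task only needs to fit a lightweight head—the same economy that lets a pretrained model make useful predictions in data-limited settings like rare diseases.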

In the new work, published in the journal Nature, Gladstone Assistant Investigator Christina Theodoris, MD, PhD, developed a foundation model for understanding how genes interact. The new model, dubbed Geneformer, learns from massive amounts of data on gene interactions from a broad range of human tissues and transfers this knowledge to make predictions about how things might go wrong in disease.

Theodoris and her team used Geneformer to shed light on how heart cells go awry in heart disease. The method itself is general, however, and can be applied to many other cell types and diseases.

“Geneformer has vast applications across many areas of biology, including discovering possible drug targets for disease,” says Theodoris, who is also an assistant professor in the Department of Pediatrics at UC San Francisco. “This approach will greatly advance our ability to design network-correcting therapies in diseases where progress has been obstructed by limited data.”

Theodoris designed Geneformer during a postdoctoral fellowship with X. Shirley Liu, PhD, former director of the Center for Functional Cancer Epigenetics at Dana-Farber Cancer Institute, and Patrick Ellinor, MD, PhD, director of the Cardiovascular Disease Initiative at the Broad Institute—both authors of the new study.