AI model 'popEVE' predicts the likelihood of unknown human gene mutations causing disease, beating Google DeepMind's AlphaMissense

Harvard Medical School researchers have unveiled popEVE, a new AI model designed to more accurately diagnose rare genetic diseases, in a bid to challenge
Proteome-wide model for human disease genetics | Nature Genetics
https://www.nature.com/articles/s41588-025-02400-1

Harvard's popEVE AI Model Identifies 123 Novel Disease Genes - WinBuzzer
https://winbuzzer.com/2025/11/24/harvards-popeve-ai-model-identifies-123-novel-disease-genes-xcxwbn/
New AI model enhances diagnosis of rare diseases
https://www.ft.com/content/bc49e334-776b-41d0-a9be-fb0c29c54853
On November 24, 2025 (local time), a research team at Harvard Medical School published a paper in the academic journal Nature Genetics on 'popEVE,' an AI model for diagnosing rare genetic diseases. By calibrating the severity of mutations across the entire proteome, popEVE identified 123 new candidate genes for developmental disorders.
popEVE also significantly reduces false positives, a persistent flaw in existing models such as AlphaMissense, the Google DeepMind AI that predicts the harmfulness of genetic mutations.

Despite the rapid expansion of genome sequencing in clinical practice, diagnosis rates for rare genetic diseases remain low, with as few as 25% of patients in some cohorts receiving a definitive genetic diagnosis.
Clinicians frequently face a large number of variants of unknown clinical significance (VUS), genetic mutations whose impact on human health is unknown. VUS create a diagnostic bottleneck, hindering the identification of disease-causing variants. Furthermore, until now it has often been impossible to distinguish between variants that cause severe childhood disease and those that cause milder disease that only manifests later in life, a critical challenge in pediatric care.
popEVE separates severe variants from milder ones by setting a stricter pathogenicity threshold. In tests, popEVE dramatically reduces false-positive predictions in the general population, flagging only 11% of individuals as carriers of severe mutations. In comparison, Google DeepMind's AlphaMissense classifies approximately 44% of the general population as having a severe mutation. By filtering out the noise, popEVE allows clinicians to focus on the mutations most likely to cause genetic disease.
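To make the effect of a stricter threshold concrete, here is a minimal sketch in Python. It is not popEVE's actual scoring or thresholding code: the scores, the cutoffs, the "higher means more severe" convention, and the function name `fraction_flagged` are all assumptions chosen for illustration. It only shows how the share of flagged individuals falls as the cutoff is raised.

```python
from typing import Sequence


def fraction_flagged(cohort: Sequence[Sequence[float]], threshold: float) -> float:
    """Share of individuals carrying at least one variant scored at or above threshold.

    Assumption: higher score = more severe. Real predictors may use a
    different scale or direction.
    """
    if not cohort:
        raise ValueError("cohort must contain at least one individual")
    flagged = sum(1 for scores in cohort if any(s >= threshold for s in scores))
    return flagged / len(cohort)


# Toy cohort: each inner list holds one individual's variant scores (made-up values).
toy_cohort = [
    [0.10, 0.35, 0.62],
    [0.20, 0.91],
    [0.05],
    [0.55, 0.58, 0.97],
]

lenient = fraction_flagged(toy_cohort, threshold=0.5)  # looser cutoff flags more people
strict = fraction_flagged(toy_cohort, threshold=0.9)   # stricter cutoff flags fewer

print(f"lenient cutoff: {lenient:.0%} flagged, strict cutoff: {strict:.0%} flagged")
```

In this toy data the lenient cutoff flags three of four individuals and the strict cutoff flags two of four. The 11% and 44% figures reported for popEVE and AlphaMissense come from real population data, not from a calculation like this one.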
The efficacy of popEVE was validated on a meta-cohort of 31,058 patients with severe developmental disorders, drawn from the Deciphering Developmental Disorders (DDD) study, GeneDx, and Radboud University Medical Center.

In this massive dataset, popEVE's high-confidence severity threshold revealed a 15-fold enrichment of pathogenic variants, five times greater than other leading AI models.
Perhaps most importantly for the field of genetics, popEVE's ability to discover entirely novel disease associations has led to the identification of 123 novel candidate genes associated with developmental disorders, 119 of which are identifiable at the single mutation level.
Notably, of the 123 novel candidate genes, 31 were detected using only missense mutations, suggesting that popEVE can detect pathogenic signals that conventional enrichment analysis misses.
Additionally, popEVE's discoveries are already being validated: 25 of the 123 novel candidate genes have been independently confirmed by other laboratories and formally deposited in the Developmental Disorders Genotype-to-Phenotype database (DDG2P).
Furthermore, when popEVE was applied to de novo mutations (DNMs) in patients, 7% of variants were determined to be severe, compared with only 0.5% in healthy controls, demonstrating a clear separation between pathogenic and benign mutations.

'Our goal was to develop an AI model that could rank mutations according to disease severity and provide prioritized, clinically meaningful personal genomic information,' said Deborah Marks, a professor of systems biology at Harvard Medical School. She emphasized that popEVE is an AI model designed to translate statistical findings into concrete clinical outcomes.
While existing AI models such as EVE and AlphaMissense excel at ranking mutations within a single gene, they struggle to compare the severity of mutations across different genes, resulting in high scores being assigned to mutations that disrupt protein function but do not necessarily cause severe disease in humans.
popEVE solves this problem by combining deep evolutionary information from the EVE and ESM-1v models with human population constraint. The research team uses data from the UK Biobank and gnomAD v2 to identify variants that are naturally tolerated in the human population.
Using a latent Gaussian process, popEVE calibrates the evolutionary scores against variants observed in humans to create a unified deleteriousness score. This calibration enables singleton analysis, a major clinical advance, in which causative variants can be prioritized using only the child's exome.
Identifying de novo mutations typically requires trio sequencing (the affected child and both parents), which is often cost-prohibitive or logistically impossible.
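The idea behind singleton analysis can be sketched as follows. This is an illustrative simplification, not the method from the paper: it assumes each variant already carries a severity score that is comparable across genes (the property the calibration step is meant to provide), and the `Variant` class, the `prioritize` function, the gene labels, and the score values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    gene: str        # placeholder gene label, not a real gene
    change: str      # placeholder description of the variant
    severity: float  # hypothetical cross-gene score; higher = more severe


def prioritize(variants: List[Variant], threshold: float, top_n: int = 5) -> List[Variant]:
    """Rank one individual's variants by severity, without any parental data.

    Keeps variants at or above the threshold and returns the top_n,
    most severe first.
    """
    candidates = [v for v in variants if v.severity >= threshold]
    return sorted(candidates, key=lambda v: v.severity, reverse=True)[:top_n]


# Variants from a single (made-up) child exome.
child_variants = [
    Variant("GENE_A", "missense_1", 0.95),
    Variant("GENE_B", "missense_2", 0.40),
    Variant("GENE_C", "missense_3", 0.85),
    Variant("GENE_D", "missense_4", 0.99),
]

ranked = prioritize(child_variants, threshold=0.8, top_n=2)
for v in ranked:
    print(v.gene, v.change, v.severity)
```

Because the scores are assumed to share one scale, a single sort across all genes is enough to produce a shortlist. With scores that are only meaningful within a gene, this kind of ranking would not be valid, which is why cross-gene calibration matters for the child-only setting.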

Rose Orenbuch, a researcher in Marks' lab, expressed optimism about popEVE's integration into clinical workflows. 'We feel we're one step closer to popEVE being a valuable part of the daily pipeline for more rapid diagnosis of genetic diseases,' she said.