Researchers at Mount Sinai used artificial intelligence to identify 17 genes with rare coding variants that help explain the molecular basis of coronary artery disease.
Summary: Researchers at the Icahn School of Medicine at Mount Sinai used AI tools to identify rare coding variants in 17 genes related to coronary artery disease. Published in Nature Genetics, this study leverages an in silico score developed from electronic health records to reveal new genetic insights and potential personalized treatments for coronary artery disease. The study aims to overcome limitations of traditional methods and highlights the potential of machine learning in uncovering novel genetic factors in complex diseases.
Three Key Takeaways:
- Identification of Genetic Variants: Researchers identified rare coding variants in 17 genes that contribute to the understanding of the molecular basis of coronary artery disease, potentially leading to new treatment avenues.
- Use of In Silico Score: The study utilized an in silico score derived from electronic health records to represent coronary artery disease, incorporating diverse clinical features and leveraging machine learning models trained on data from over 600,000 individuals.
- Potential for Personalized Medicine: The findings emphasize the role of machine learning in uncovering genetic insights that traditional methods might miss, opening doors to personalized approaches in cardiovascular care and other complex diseases.
Using an advanced artificial intelligence tool, researchers at the Icahn School of Medicine at Mount Sinai have identified rare coding variants in 17 genes that shed light on the molecular basis of coronary artery disease, the leading cause of morbidity and mortality worldwide.
The discoveries, detailed in Nature Genetics, reveal genetic factors impacting heart disease that open new avenues for targeted treatments and personalized approaches to cardiovascular care.
The investigators used an in silico, or computer-derived, score for coronary artery disease (ISCAD) that holistically represents the disease, as described in a previous paper by the team in The Lancet. The ISCAD score incorporates hundreds of different clinical features from the electronic health record, including vital signs, laboratory test results, medications, symptoms, and diagnoses.
To build the score for this meta-analysis, they trained machine learning models on the electronic health records of 604,914 individuals across the UK Biobank, All of Us Research Program, and BioMe Biobank. The score was then tested for association with rare and ultra-rare coding variants found in the exome sequences of these individuals.
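As a rough illustration of what testing a continuous score for association with rare variants can look like, the sketch below compares a score between carriers and non-carriers of rare variants in a single gene. It is a simplified stand-in, not the study's actual method: the article does not describe the statistical model used, and the function name, the toy data, and the choice of a Welch-style statistic are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def burden_association(scores, carrier_flags):
    """Compare a continuous disease score between carriers and non-carriers.

    Hypothetical helper: `scores` holds one in silico score per individual,
    and `carrier_flags` marks whether that individual carries any rare
    coding variant in the gene of interest.
    """
    carriers = [s for s, c in zip(scores, carrier_flags) if c]
    non_carriers = [s for s, c in zip(scores, carrier_flags) if not c]
    if len(carriers) < 2 or len(non_carriers) < 2:
        raise ValueError("need at least two individuals in each group")

    # Effect size: difference in mean score between the two groups.
    effect = mean(carriers) - mean(non_carriers)

    # Welch-style standard error (does not assume equal group variances).
    se = sqrt(
        variance(carriers) / len(carriers)
        + variance(non_carriers) / len(non_carriers)
    )
    if se == 0:
        raise ValueError("scores have no variation within groups")
    return effect, effect / se


# Toy data, invented for illustration only.
scores = [0.10, 0.20, 0.15, 0.25, 0.70, 0.80]
carrier_flags = [False, False, False, False, True, True]
effect, t_stat = burden_association(scores, carrier_flags)
print(f"mean score difference: {effect:.3f}, t statistic: {t_stat:.2f}")
```

A real analysis would typically adjust for covariates such as age, sex, and ancestry, and would combine results across cohorts; this sketch leaves all of that out.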
Further Investigation of Discovered Genes
In addition, the research team investigated the discovered genes further, studying their roles in causal coronary artery disease risk factors and in clinical manifestations of the disease, as well as their connections with coronary artery disease status in traditional large-scale genome-wide association studies, among other analyses.

“Our findings help us understand how these 17 genes are involved in coronary artery disease. Some of these genes are already known to influence heart disease development, while others have never been linked to it before,” says Ron Do, PhD, senior study author and the Charles Bronfman Professor in Personalized Medicine at Icahn Mount Sinai, in a release. “Our study shows how machine learning tools can uncover genetic insights that traditional methods might miss when comparing cases and controls. This could lead to new ways to identify biological mechanisms of heart disease or gene targets for treatment.”
Rare coding variants occur in only a small percentage of individuals, but they may have a significant impact on disease risk or susceptibility when present, according to the researchers. Studying these variants is therefore essential to understanding the genetic basis of diseases and can inform therapeutic targets.
Overcoming Traditional Method Challenges
The study was motivated by the difficulty, over the last decade, of identifying rare coding variants associated with coronary artery disease using traditional methods that rely on diagnosed cases and controls. The limitations of diagnostic codes in capturing the complexity of coronary artery disease prompted the researchers to explore new approaches.
“Our previous Lancet paper showed that a machine learning model trained with electronic health records can generate an in silico score for coronary artery disease, capturing disease across its spectrum,” says lead author Ben Omega Petrazzini, BS, associate bioinformatician in Do’s lab at Icahn Mount Sinai, in a release. “Based on these findings, we hypothesized that the in-silico score for [coronary artery disease] could reveal novel rare coding variants related to [coronary artery disease] by offering a more holistic view of the disease.”
Next, the investigators plan to further investigate the role of the identified genes in coronary artery disease biology and to explore potential applications of machine learning in the genetic study of other complex diseases. The work is part of their ongoing efforts to advance understanding of disease mechanisms, discover new treatments, and improve patient outcomes.