A study evaluated AI-generated explanations of echocardiogram results, finding that 73% were accurate, relevant, and understandable enough to be sent to patients without any modifications.
Summary: A recent study published in JACC Cardiovascular Imaging found that an AI program generated explanations of echocardiogram results that were largely accurate, relevant, and easy for patients to understand; 73% of the AI-generated reports were deemed suitable for direct patient communication without modification. The study highlights the potential of AI to improve patient comprehension and reduce anxiety related to complex medical reports, though it also underscores the need for human oversight due to occasional inaccuracies.
Three Key Takeaways:
- High Accuracy and Clarity: The study found that 73% of AI-generated explanations for echocardiogram results were accurate, relevant, and understandable enough to be sent to patients without any edits, suggesting that AI can play a significant role in improving patient communication.
- Human Oversight Required: While the majority of AI-generated reports were accurate, 16% contained inaccuracies, such as incorrect interpretations of medical conditions, highlighting the importance of human oversight to ensure patient safety.
- Potential to Reduce Patient Anxiety: The use of AI to generate patient-friendly explanations of complex test results could help reduce patient anxiety and the burden on clinicians; the study indicated that AI explanations were generally well received and more understandable than traditional reports.
An artificial intelligence (AI) program created explanations of heart test results that were in most cases accurate, relevant, and easy for patients to understand, a new study finds.
The study, published in JACC Cardiovascular Imaging, focused on the echocardiogram (echo), which uses sound waves to create pictures of blood flowing through the heart’s chambers and valves.
Echo reports include machine-generated numerical measures of function, as well as comments from the interpreting cardiologist on the heart’s size, the pressure in its vessels, and tissue thickness, which can signal the presence of disease. In the form typically generated by doctors, the reports are difficult for patients to understand, often resulting in unnecessary worry, say the study authors.
Addressing Patient Comprehension Challenges
To address the issue, NYU Langone Health has been testing the capabilities of a form of AI that generates likely options for the next word in a sentence based on how people use words in context across the internet. A result of this next-word prediction is that such generative AI “chatbots” can reply to questions in simple language. However, AI programs, which work on probabilities instead of “thinking” and may produce inaccurate summaries, are meant to assist, not replace, human providers.
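For readers curious about the mechanics, the sketch below is a toy illustration of next-word prediction, not the study’s software: it counts which words follow which in a tiny corpus and samples a continuation by frequency. Production models learn such probabilities at vastly larger scale, using neural networks over subword tokens, but the core idea is the same.

```python
import random
from collections import Counter, defaultdict

# Toy next-word predictor: tally which words follow which in a small
# corpus, then sample the next word in proportion to those counts.
corpus = (
    "the heart pumps blood through the valves "
    "the valves control blood flow through the heart"
).split()

follows = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    follows[word][nxt] += 1

def predict_next(word: str) -> str:
    counts = follows[word]
    return random.choices(list(counts), weights=list(counts.values()))[0]

# Generate a short continuation, one predicted word at a time.
text = ["the"]
for _ in range(6):
    text.append(predict_next(text[-1]))
print(" ".join(text))
```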
In March 2023, NYU Langone requested access from OpenAI, the company that created the ChatGPT chatbot, to its latest generative AI tool, GPT-4. NYU Langone Health licensed one of the first “private instances” of the tool, which freed clinicians to experiment with AI on real patient data while adhering to privacy rules.
Building on that effort, the current study analyzed 100 doctor-written reports on a common type of echo test to see whether GPT-4 could efficiently generate patient-friendly explanations of the results. Five board-certified echocardiographers evaluated the AI-generated explanations on five-point scales for accuracy, relevance, and understandability, and either agreed or strongly agreed that 73% were suitable to send to patients without any changes.
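As a rough sketch of how such a pipeline might be wired up, the snippet below uses the standard OpenAI Python client to ask GPT-4 for a plain-language rewrite of a report. The prompt wording and the function itself are hypothetical illustrations; the study’s actual prompts and private-instance configuration are not described here.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def draft_patient_explanation(echo_report: str) -> str:
    """Ask GPT-4 to rewrite an echo report in plain language.

    Illustrative only: the prompt and settings are assumptions,
    not the configuration used in the study.
    """
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {
                "role": "system",
                "content": (
                    "Rewrite the following echocardiogram report for a "
                    "patient with no medical background. Be accurate and "
                    "easy to understand."
                ),
            },
            {"role": "user", "content": echo_report},
        ],
    )
    return response.choices[0].message.content

# Example use (fictional report text):
# print(draft_patient_explanation("LVEF 60%. No valvular abnormality."))
```

In a clinical workflow like the one the study envisions, such a draft would go to a clinician for review and correction before anything reached a patient.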
AI Performance and Accuracy
All AI explanations were rated either “all true” (84%) or “mostly correct” (16%). In terms of relevance, 76% of explanations were judged to contain “all of the important information,” 15% “most of it,” 7% “about half,” and 2% “less than half.” None of the explanations with missing information were rated as “potentially dangerous,” the authors say.
“Our study, the first to evaluate GPT-4 in this way, shows that generative AI models can be effective in helping clinicians to explain echocardiogram results to patients,” says corresponding author Lior Jankelson, MD, PhD, associate professor of medicine at the NYU Grossman School of Medicine and artificial intelligence leader for cardiology at NYU Langone, in a release. “Fast, accurate explanations may lessen patient worry and reduce the sometimes overwhelming volume of patient messages to clinicians.”
The federal mandate for the immediate release of test results to patients, introduced under the 21st Century Cures Act of 2016, has been linked to dramatic increases in the number of inquiries to clinicians, say the study authors. Patients receive raw test results, do not understand them, and grow anxious while they wait for clinicians to reach them with explanations, the researchers say.
The Role of Human Oversight
Ideally, clinicians would advise patients about their echocardiogram results the instant they are released, but such guidance is delayed as providers struggle to manually enter large amounts of related information into the electronic health record.
“If dependable enough, AI tools could help clinicians explain results at the moment they are released,” says first study author Jacob Martin, MD, a cardiology fellow at NYU Langone. “Our plan moving forward is to measure the impact of explanations drafted by AI and refined by clinicians on patient anxiety, satisfaction, and clinician workload.”
The new study also found that 16% of the AI explanations contained inaccurate information. In one example, the AI-generated explanation stated that “a small amount of fluid, known as a pleural effusion, is present in the space surrounding your right lung.” The tool had mistakenly concluded that the effusion was small, an error known in the industry as an AI “hallucination.” The researchers emphasized that human oversight is important to refine AI drafts, including correcting any inaccuracies before they reach patients.
Patient Perspectives and Future Directions
The research team also surveyed participants without clinical backgrounds to get lay people’s perspective on the clarity of the AI explanations. The explanations were well received, according to the authors: non-clinical participants found 97% of the AI-generated rewrites more understandable than the original reports, and the rewrites reduced worry in many cases.
“This added analysis underscores the potential of AI to improve patient understanding and ease anxiety,” Martin says in a release. “Our next step will be to integrate these refined tools into clinical practice to enhance patient care and reduce clinician workload.”