A team led by Dr. Hana Haver from the University of Maryland Medical Intelligent Imaging Center in Baltimore found that ChatGPT gave mostly appropriate responses when presented with scenarios that address fundamental breast cancer prevention and screening concepts. ChatGPT is built on top of OpenAI's GPT-3.5 and GPT-4 families of large language models and has been fine-tuned using both supervised and reinforcement learning techniques.
"Radiologists can use this information to counsel patients who might be seeking healthcare information on the Internet and through new technologies like chatbots," corresponding author Dr. Paul Yi told AuntMinnie.com.
ChatGPT has made waves in recent months with its large language model, and radiologists have been debating whether it can be useful for imaging, weighing the benefits and risks to the profession. A recent study showed that ChatGPT could generate appropriate recommendations for common questions regarding cardiovascular disease prevention, for example.
Haver's group sought to assess the appropriateness of ChatGPT's responses to common questions about breast cancer prevention and screening. They developed a 25-question test for the AI to answer. The questions were informed by the BI-RADS Atlas and the researchers' clinical experience in tertiary care breast imaging departments.
The team submitted each question to ChatGPT three times, and three fellowship-trained breast radiologists graded each set of responses based on clinical judgment as appropriate, inappropriate, or unreliable. The reviewers also evaluated how the responses would fare as patient-facing material, such as on hospital websites, and as chatbot responses to patient questions.
ChatGPT generated appropriate responses for 22 of the 25 questions (88%). The reviewers deemed one response inappropriate and the other two unreliable because their content was inconsistent across the repeated submissions. The inappropriate response concerned scheduling mammography around COVID-19 vaccination, which Yi said was "outdated." The unreliable responses had to do with breast cancer prevention and where one could obtain breast cancer screening.
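For readers who want to check the arithmetic, the reported breakdown can be reproduced with a short Python sketch. The article gives only the totals per category, so the list of grades below is rebuilt from those totals and is not the study's per-question data. The function name `tally` is a hypothetical helper chosen for illustration.

```python
from collections import Counter

# Assumption: grades rebuilt from the totals reported in the article
# (22 appropriate, 1 inappropriate, 2 unreliable out of 25 questions).
# The per-question grades themselves are not given in the source.
grades = ["appropriate"] * 22 + ["inappropriate"] * 1 + ["unreliable"] * 2


def tally(grades):
    """Return {category: (count, percent of all questions)}."""
    counts = Counter(grades)
    total = len(grades)
    return {label: (n, round(100 * n / total)) for label, n in counts.items()}


summary = tally(grades)
for label, (n, pct) in summary.items():
    print(f"{label}: {n} of {len(grades)} ({pct}%)")
```

Under these assumptions the appropriate category comes to 22 of 25, or 88%, which matches the figure reported above.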
"When we asked ChatGPT the same question three times, it gave different answers each time with the content changing not infrequently," Yi said for the responses categorized as unreliable.
Despite the promise shown in the study, the authors wrote that physician oversight when using these tools is "critical, given the presence of inappropriate and inconsistent responses, consistent with previously cautioned pitfalls of ChatGPT in the context of radiology."
They added that further work is needed to understand how prompts impact recommendations for medical advice provided by ChatGPT, but they also encouraged future studies involving large language models in healthcare education and counseling.
Yi told AuntMinnie.com that while the use of ChatGPT and other large language models is "still very early," radiologists may see rapid introduction of this tool over the next year.
"We have started looking at ChatGPT's use for other imaging use cases including lung cancer screening, as well as how patients might comprehend the recommendations made by ChatGPT and other chatbots," he said.
Copyright © 2023 AuntMinnieEurope.com