(C) PLOS One
This story was originally published by PLOS One and is unaltered.



Diverse patients’ attitudes towards Artificial Intelligence (AI) in diagnosis [1]

Authors: Christopher Robertson, Andrew Woods, Kelly Bergstrand. Affiliations: University of Arizona, Tucson, Arizona, United States of America; Boston University, Boston, Massachusetts.

Date: 2023-06

Artificial intelligence (AI) has the potential to improve diagnostic accuracy. Yet people are often reluctant to trust automated systems, and some patient populations may be particularly distrusting. We sought to determine how diverse patient populations feel about the use of AI diagnostic tools, and whether framing and informing the choice affect uptake. To construct and pretest our materials, we conducted structured interviews with a diverse set of actual patients. We then conducted a pre-registered (osf.io/9y26x), randomized, blinded survey experiment in factorial design. A survey firm provided n = 2675 responses, oversampling minoritized populations. Clinical vignettes were randomly manipulated in eight variables with two levels each: disease severity (leukemia versus sleep apnea), whether AI is proven more accurate than human specialists, whether the AI clinic is personalized to the patient through listening and/or tailoring, whether the AI clinic avoids racial and/or financial biases, whether the primary care physician (PCP) promises to explain and incorporate the advice, and whether the PCP nudges the patient towards AI as the established, recommended, and easy choice. Our main outcome measure was selection of the AI clinic or the human physician specialist clinic (binary, “AI uptake”). With weighting representative of the U.S. population, respondents were almost evenly split (52.9% chose the human doctor and 47.1% chose the AI clinic). In unweighted experimental contrasts of respondents who met pre-registered criteria for engagement, a PCP’s explanation that AI has proven superior accuracy increased uptake (OR = 1.48, CI: 1.24–1.77, p < .001), as did a PCP’s nudge towards AI as the established choice (OR = 1.25, CI: 1.05–1.50, p = .013) and reassurance that the AI clinic had trained counselors to listen to the patient’s unique perspectives (OR = 1.27, CI: 1.07–1.52, p = .008). Disease severity (leukemia versus sleep apnea) and the other manipulations did not significantly affect AI uptake. Compared to White respondents, Black respondents selected AI less often (OR = .73, CI: .55–.96, p = .023) and Native American respondents selected it more often (OR = 1.37, CI: 1.01–1.87, p = .041). Older respondents were less likely to choose AI (OR = .99, CI: .987–.999, p = .03), as were those who identified as politically conservative (OR = .65, CI: .52–.81, p < .001) or viewed religion as important (OR = .64, CI: .52–.77, p < .001). For each unit increase in education, the odds of selecting an AI provider were 1.10 times greater (OR = 1.10, CI: 1.03–1.18, p = .004). While many patients appear resistant to the use of AI, accuracy information, nudges, and a listening patient experience may help increase acceptance. To ensure that the benefits of AI are secured in clinical practice, future research is needed on the best methods of physician incorporation and patient decision making.

Funding: This study was funded by the National Institutes of Health (3R25HL126140-05S1 to CR and AW). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Copyright: © 2023 Robertson et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Introduction

Artificial intelligence (AI) is poised to transform healthcare. Today, AI is used to analyze tumors in chest images [1], regulate implanted devices [2], and select personalized courses of care [3]. Despite the promise of AI, there is broad public skepticism about AI in a range of domains from transportation to criminal justice to healthcare [4, 5]. Doctors and patients tend to rely primarily on doctors’ clinical judgment, even when it is at odds with statistical judgment [6].

Research shows that patients prefer human doctors to AI-powered machines in diagnosis, screening, and treatment [7–10]. In an early study, patients were more likely to follow medical advice from a physician than from a computer and were less trustful of computers as providers of medical advice [7]. Other work shows that patients are less trusting of doctors who rely on non-human decision aids [8, 9]. More recently, in a series of studies, patients were less willing to schedule an appointment to be diagnosed by a robot, and they were willing to pay significantly more money for a human provider, with a reported perception that AI providers are less able to account for patients’ unique characteristics [10].

Yet acceptance of AI may depend on specific features of the system and how the choice is framed, and there may be differences among groups of patients [11]. Outside of healthcare, consumers have been shown to more often trust AI systems for objective tasks, while subjective tasks are viewed as more appropriate for humans [12]. Some qualitative research suggests that lower levels of patient education are associated with lower trust in computerization [13]. Other small studies suggest that AI may be acceptable if human physicians ultimately remain in control [14]. And, although patients may prefer human physicians, some work suggests that they may better adhere to advice coming from algorithms [15].

More generally, research suggests that patients’ trust in their physicians is an essential component of effective healing [16, 17]. Black, Hispanic, and Native American patients reportedly have lower levels of trust in their physicians [18, 19, 20]. These communities have historically experienced harm from parts of the medical system (e.g., the Tuskegee Syphilis Study). Yet trust can be enhanced when patients and their providers share a similar background, geography, or ethnicity [21, 22, 23]. Patients may be concerned about human physicians being biased by their financial relationships with pharmaceutical companies [24] or by implicit racial stereotypes [25]. Although initial forays into algorithmic decisions gave rise to similar concerns [26], AI systems may be rigorously designed, tested, and continuously monitored to minimize racial or financial biases in healthcare [27, 28].

There are many drivers of the development and uptake of AI in healthcare, including commercial incentives and physician attitudes [29, 30]. Trustworthiness may depend on the relationships established among users, infrastructures, technologies, and practitioners, rather than on the certainty and accuracy of the technology [31, 32].

To the extent that patients retain the right of informed consent with regard to their own healthcare, much will depend on patient attitudes towards AI. To that end, we used qualitative and quantitative methods to study diverse patient populations’ views about AI in medicine.

To our knowledge, this is the first large-scale population-based survey experiment with random assignment to realistic clinical vignettes systematically manipulated to analyze a range of factors that could influence AI uptake among patients. Moreover, our study is enriched to allow sufficient sample size to compare AI uptake across five different racial/ethnic groups, including those who have historically shown lower levels of trust in the healthcare system.

Methods

We conducted two study phases, one qualitative and one quantitative. In the qualitative phase (February to December 2020), we conducted structured interviews with 24 patients recruited for racial and ethnic diversity to understand their reactions to current and future AI technologies. In the quantitative phase (January and February 2021), we used an internet-based survey experiment oversampling Black, Hispanic, Asian, and Native American populations. Both phases placed respondents as mock patients into clinical vignettes to explore whether they would prefer to have an AI system versus a doctor for diagnosis and treatment and under what circumstances.

We chose this mixed-methods design for a few reasons. First, because this is the first study of its kind, we wanted to ensure that the vignettes driving our quantitative survey were realistic and intuitive; the qualitative pre-study helped us gauge participant reactions. Second, large-scale quantitative surveys often raise questions about why people respond the way they do, and the qualitative interviews help contextualize those responses. The mixed-methods design thus allows us to accomplish something that neither a purely quantitative nor a purely qualitative approach would achieve on its own.

Development and oversight of the survey

To develop the clinical vignettes, we consulted physicians specializing in cardiology, pulmonology, hematology, and sleep medicine; the vignettes were reviewed for authenticity by a physician co-author (MJS). This study was determined to be exempt by the Human Subjects Protection Program (Institutional Review Board) at the University of Arizona, and all subjects consented.

Qualitative pre-study

Our qualitative study pre-tested the vignettes and generated hypotheses. For 30–60 minute qualitative interviews in Spanish or English, we recruited 24 individuals from clinics in Tucson, Arizona, including 10 White, 8 Hispanic, 3 Black, 2 Native American, and 1 Asian patient. In total, 16 were female and 8 were male. Ages ranged from 19 to 92 years, with most over 50, as would be expected given our recruitment from a cardiac clinic. Educational achievement was relatively high, with 7 of the subjects having a graduate degree and 5 having a bachelor’s degree.

The interviews were semi-structured. All participants were given the same script describing the topic, the research design, and the sample vignettes. The prompts were deliberately open-ended to allow participants to share reactions and feedback and to make sure the vignettes were understood. We recorded all of our interviews and had them transcribed to better inform the design of the quantitative survey.

After obtaining informed consent, each interview began with an open-ended question asking the participant to think back to a difficult medical decision, and in particular “who or what influenced your decision?” This prompted a wide array of responses. The most common influence on participants’ decision-making was their primary care physician, though several also noted family and friends influencing their decisions. We asked another open-ended question: “generally how do you feel about doctors relying on computer systems to make treatment decisions for you?” This too prompted a broad array of responses, with some participants expressing fear or anxiety (and occasionally humor) about the increasing use of machines in everyday life. We often probed to distinguish routine use of electronic health record (EHR) systems, which were quite familiar to respondents, from computerized diagnostic tools, which were less familiar.

The core of our interviews was the vignettes, which asked participants to imagine themselves in particular medical scenarios. We started by asking participants to imagine their primary care physician recommending a change to their diet and exercise, based on a family history of leukemia and advice from “the MedX computer system with data from millions of other patients.” This provoked a mild reaction, with most participants noting they were already being told to mind their diet and exercise habits. Then we raised the stakes. Participants were told to suppose they start to feel tired and achy, and their primary care physician wants a second opinion from either an oncologist or a new AI-driven lab that “is more accurate than the oncologist.” The participants were asked whether they would choose one or the other, or pay $1000 to visit both. Tellingly, the majority of patients said they would prefer to see only the oncologist, despite being told the oncologist was less accurate than the AI lab. We presented another vignette involving sleep apnea, in which participants were asked whether they would rather visit a traditional sleep clinic requiring an overnight stay away from home and interpretation by a human physician, or use an at-home device that relies on self-placed sensors and AI diagnostic interpretation. We saw a broad range of views in response to this vignette, with several participants having strong and perhaps idiosyncratic reactions based on their personal experiences dealing with sleep apnea and visiting sleep clinics.
Overall, while some patients expressed confidence that AI systems could achieve greater accuracy in diagnosis and treatment than a physician, several patients drew on their own experiences with technology to suggest that an AI system could be fallible. Other patients, especially those who were non-White, expressed a lack of trust in the healthcare system more generally and recounted anecdotes in which they felt unheard or mistreated. Patients nearly uniformly said they would rely heavily on their physicians to guide their choice of whether an AI system would be used for their diagnosis or treatment, but most nonetheless emphasized that they would generally want to know of such use, suggesting that it is material for their informed consent. Several patients indicated that they had greater confidence in their human physicians than in an AI system to personalize treatment decisions to the patient’s own situation. The technology was more attractive to younger and more educated patients. Several patients invoked their belief in God as being important to their healthcare decisions, and some suggested confidence that God would work through human physicians.

Quantitative survey experimental design and materials

We designed our quantitative phase as a blinded, randomized survey experiment in factorial design, manipulating eight variables by substituting text in or out of the base vignette. Respondents were block randomized by race/ethnicity to experimental conditions. We counterbalanced whether respondents answered certain covariate questions before or after our vignettes and primary outcomes. The full text of the vignettes and manipulations is shown at osf.io/9y26x. These materials were based on the vignettes that we tested in the qualitative phase, with refinements and clarifications based on feedback from those participants.

The base clinical vignette was split into two parts, with an initial segment laying out the patient’s history and the primary care physician’s (PCP’s) initial impressions. As one of the experimental manipulations, all respondents saw either a leukemia or a sleep apnea version of the base case, with the PCP explaining that leukemia could be fatal if not properly diagnosed and treated, while sleep apnea was described as interfering with the patient’s comfort and lifestyle. Respondents were asked to explain their “reactions and feelings at this point in the story.” A final segment of the vignette explained that, “your doctor would like to get a second opinion on whether you have leukemia, and if so get the best treatment plan,” or “to determine whether you actually have an apnea, determine its type, and determine the best course of treatment, your physician suggests a sleep study.” The PCP then presented the choice of AI versus physician specialist, followed by the experimental manipulations. Table 1 displays the description of the two providers. Table 2 summarizes the manipulations, each presented at either Level 1 or Level 2, which followed this presentation. In several of the manipulations, when Level 1 was randomly selected, the vignette was simply silent about the issue.

PNG larger image

TIFF original image Download: Table 1. Presentation of Provider Choice. https://doi.org/10.1371/journal.pdig.0000237.t001 PPT PowerPoint slide

PNG larger image

TIFF original image Download: Table 2. Experimental Conditions. https://doi.org/10.1371/journal.pdig.0000237.t002 Our primary outcome variable (“AI uptake”) was binary, “Which provider would you choose to diagnose your health problem? Dr. Williams, the specialist physician [or] The Med-X clinic, the AI computer system,” with presentation order randomized. Finally, we presented several debriefing questions. These included an attention-check question to test whether the respondent could identify the disease featured in the vignette read a few minutes prior and a self-assessment of the respondent’s understanding of the vignette, on a ten-point scale.
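To make the factorial structure concrete, the following is a minimal, illustrative sketch (not the study’s actual code) of how respondents could be block randomized by race/ethnicity across the 2^8 = 256 combinations of the eight two-level manipulations; the factor names, the balancing scheme, and the use of Python’s standard library are assumptions for illustration only.

# Illustrative sketch only: block randomization to a 2^8 factorial design.
import itertools
import random
from collections import defaultdict

FACTORS = [
    "disease_severity",           # leukemia vs. sleep apnea
    "accuracy_proven",            # AI proven more accurate than human specialists
    "personalized_listening",     # AI clinic has counselors who listen
    "personalized_tailoring",     # AI clinic tailors advice to the patient
    "avoids_racial_bias",
    "avoids_financial_bias",
    "pcp_explains_incorporates",  # PCP promises to explain and incorporate advice
    "pcp_nudges_toward_ai",       # PCP nudges patient towards AI
]

# All 2^8 = 256 combinations of Level 1 / Level 2.
CONDITIONS = list(itertools.product([1, 2], repeat=len(FACTORS)))

def make_assigner(seed=0):
    """Return a function that assigns each respondent, within their
    racial/ethnic block, to the next condition of a freshly shuffled
    cycle of all 256 conditions (one form of block randomization)."""
    rng = random.Random(seed)
    queues = defaultdict(list)

    def assign(race_ethnicity_block):
        queue = queues[race_ethnicity_block]
        if not queue:
            queue.extend(rng.sample(CONDITIONS, len(CONDITIONS)))
        levels = queue.pop()
        return dict(zip(FACTORS, levels))

    return assign

assign = make_assigner(seed=42)
example = assign("Black")
print(example)  # e.g. {'disease_severity': 2, 'accuracy_proven': 1, ...}

Cycling through a shuffled copy of all 256 conditions within each racial/ethnic block keeps the experimental cells approximately balanced within blocks, which is one common way to implement block randomization in a factorial survey experiment.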

[END]
---
[1] Url: https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000237
