This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Human Factors, is properly cited. The complete bibliographic information, a link to the original publication on https://humanfactors.jmir.org, as well as this copyright and license information must be included.
Artificial intelligence (AI), such as machine learning (ML), shows great promise for improving clinical decision-making in cardiac diseases by outperforming statistics-based models. However, few AI-based tools have been implemented in cardiology clinics because of sociotechnical challenges in the transition from algorithm development to real-world implementation.
This study explored how an ML-based tool for predicting ventricular tachycardia and ventricular fibrillation (VT/VF) could support clinical decision-making in the remote monitoring of patients with an implantable cardioverter defibrillator (ICD).
Seven experienced electrophysiologists participated in a near-live feasibility and qualitative study, which included walkthroughs of 5 blinded retrospective patient cases, use of the prediction tool, and questionnaires and interview questions. All sessions were video recorded, and sessions evaluating the prediction tool were transcribed verbatim. Data were analyzed through an inductive qualitative approach based on grounded theory.
The prediction tool was found to have potential for supporting decision-making in ICD remote monitoring by providing reassurance, increasing confidence, acting as a second opinion, reducing information search time, and enabling delegation of decisions to nurses and technicians. However, the prediction tool did not lead to changes in clinical action and was found less useful in cases where the quality of data was poor or when VT/VF predictions were found to be irrelevant for evaluating the patient.
When transitioning from AI development to testing its feasibility for clinical implementation, we need to consider the following: expectations must be aligned with the intended use of AI; trust in the prediction tool is likely to emerge from real-world use; and AI accuracy is relational and dependent on available information and local workflows. Addressing the sociotechnical gap between the development and implementation of clinical decision-support tools based on ML in cardiac care is essential for successful adoption. We suggest including clinical end-users, clinical contexts, and workflows throughout the overall iterative approach to design, development, and implementation.
Ventricular tachycardia and ventricular fibrillation (VT/VF) are potentially lethal cardiac arrhythmias, which constitute a growing challenge to health care systems worldwide [
Artificial intelligence (AI), such as machine learning (ML), shows great promise for improving clinical decision-making in cardiac diseases by outperforming statistical-based models [
However, few ML-based outcome prediction algorithms have been implemented in cardiology clinics because of challenges in the transition from algorithm development to real-world implementation. While studies of medical AI-based tools that undergo prospective clinical validation are emerging [
This study addresses the sociotechnical gap between the development and implementation of a clinical decision-support tool based on ML for the prediction of VT/VF in remote monitoring of ICD patients. The aim of this study was to explore the feasibility and clinician preimplementation perspectives of using a prediction tool for improved workflows. Therefore, this study does not provide algorithmic validation per se but instead answers questions about the clinical feasibility and workflow integration of a decision-support tool based on ML.
This study was conducted at the remote monitoring center at Rigshospitalet, Copenhagen University Hospital, Denmark, a large tertiary hospital that covers all aspects of treatment in cardiology. The center is among the largest in Europe, with more than 4000 patients with cardiac implantable electronic devices in remote follow-up. The study was organized in 3 stages (
Overall study design. ML: machine learning.
A prediction tool based on the random forest ML method was developed to improve support for clinical decision-making in ICD remote monitoring. It consisted of an algorithm predicting the risk of VT/VF within 30 days. The prediction tool was designed to show alarm status (yes/no), risk probability (%), and a ranking of up to 5 parameters that most increased and up to 5 that most decreased the predicted risk, using the local interpretable model-agnostic explanations (LIME) technique [
The random forest ML method [
Feature engineering was carried out in collaboration between 2 data scientists (MKHH and CV) and a cardiologist consultant (SZD) during 5 co-design workshops. A total of 48 features (referred to as parameters when discussed with the study participants) were developed, and the following 2 main principles were adopted: aggregating episodes by day and building a historic snapshot for days leading up to the arrhythmic event. To provide the clinical end-user with algorithm explainability, the LIME technique [
The prediction tool on a paper printout as shown to study participants (Case 3, see Table 2). The output shows the alarm (yes/no), risk probability (%), and up to 5 most important parameters for increasing and decreasing the likelihood of ventricular tachycardia and ventricular fibrillation within 30 days. To the right: example pictures of electrophysiologists conducting near-live case walkthroughs.
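The two feature-engineering principles described in the methods (aggregating episodes by day and building a historic snapshot for the days leading up to an event) can be illustrated with a minimal sketch. The episode record layout, the 7-day look-back window, and the function names are assumptions made for illustration only; they are not the study's 48 features or its actual pipeline.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical episode records: (day the episode was recorded, episode type).
# The real device data format used in the study is not described here.
EPISODES = [
    (date(2020, 1, 1), "VT-NS"),
    (date(2020, 1, 1), "VT-NS"),
    (date(2020, 1, 3), "VT/VF"),
    (date(2020, 1, 4), "AF"),
]


def aggregate_by_day(episodes):
    """Principle 1: count episodes of each type per calendar day."""
    daily = defaultdict(lambda: defaultdict(int))
    for day, episode_type in episodes:
        daily[day][episode_type] += 1
    return daily


def history_snapshot(daily, reference_day, window_days=7):
    """Principle 2: summarize the days leading up to reference_day.

    window_days is an assumed look-back length chosen for illustration.
    The reference day itself is excluded so only past data is used.
    """
    totals = defaultdict(int)
    for offset in range(1, window_days + 1):
        day = reference_day - timedelta(days=offset)
        for episode_type, count in daily.get(day, {}).items():
            totals[episode_type] += count
    return dict(totals)


daily_counts = aggregate_by_day(EPISODES)
snapshot = history_snapshot(daily_counts, reference_day=date(2020, 1, 5))
print(snapshot)
```

Excluding the reference day from the snapshot is a deliberate choice in this sketch: it keeps the features limited to information that would have been available before the day being predicted.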
Seven medical doctors specialized in electrophysiology (ie, cardiologists treating patients with cardiac arrhythmia) were selected for participation from a convenience sample (
A selection of 5 retrospective patient cases (
Table 1. Participating electrophysiologists.
Participant | Sex | Age (years) | Title | Years since obtaining specialist certification in cardiology |
1 | Female | 52 | Consultant cardiologist, MD, PhD | 11 |
2 | Male | 61 | Professor, consultant cardiologist, MD, DMSc | 23 |
3 | Male | 55 | Consultant cardiologist, MD, PhD | 14 |
4 | Male | 43 | Cardiologist, MD, PhD | 2 |
5 | Male | 62 | Consultant cardiologist, MD, DMSc | 28 |
6 | Male | 44 | Cardiologist, MD, PhD | 2 |
7 | Male | 47 | Consultant cardiologist, MD, DMSc | 9 |
Table 2. Case overview with patient summary, current implantable cardioverter defibrillator transmission information, and prediction tool information.
Case number | Patient summary | Transmission type | Primary episode type | ICDa treatment | Transmission summary | Predicted 30-day VTb/VFc risk probability (%) | Alarm raised (prediction outcome) |
1 | Male, age 63 years, ischemic heart failure, left ventricular assist device | Automated | VT/VF | ATPd | 3 VT/VF; 36 sensing episodes; 217 VT-NSe | 58.6 | Yes (true positive) |
2 | Female, age 67 years, dilated cardiomyopathy | Automated | VT/VF | Shock | 1 VT/VF; 1 VT-NS; 20 min of AFf since the last transmission | 14.4 | No (true negative) |
3 | Female, age 40 years, dilated cardiomyopathy | Automated | VT/VF | Shock | 2 VT/VF; 4 VT-NS | 35.4 | Yes (true positive) |
4 | Male, age 61 years, ischemic heart failure | Patient initiated | AF | None | 12 hours of AF since the last transmission | 1.2 | No (true negative) |
5 | Male, age 73 years, ischemic heart failure | Automated | AF | None | 14 hours of AF since the last session; 26 VT-NS | 7.8 | No (true negative) |
aICD: implantable cardioverter defibrillator.
bVT: ventricular tachycardia.
cVF: ventricular fibrillation.
dATP: antitachycardia pacing.
eVT-NS: nonsustained ventricular tachycardia.
fAF: atrial fibrillation.
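The relationship between risk probability, alarm status, and prediction outcome in the case overview can be sketched as follows. The alarm threshold is not reported in this article, so the 30% value below is purely hypothetical (it merely happens to be consistent with the 5 cases). The mapping from alarm status to an action uses the two example actions mentioned by participants and is likewise an assumption, as are all function names.

```python
# Risk probability (%) and alarm status for the 5 cases, copied from the case overview.
CASES = {
    1: (58.6, True),
    2: (14.4, False),
    3: (35.4, True),
    4: (1.2, False),
    5: (7.8, False),
}

# Hypothetical threshold: the study does not report the value it used.
ASSUMED_THRESHOLD_PERCENT = 30.0


def raise_alarm(risk_percent, threshold=ASSUMED_THRESHOLD_PERCENT):
    """Return True when the predicted risk reaches the configured threshold."""
    return risk_percent >= threshold


def prediction_outcome(alarm, event_occurred):
    """Label a prediction once the 30-day outcome is known."""
    if alarm:
        return "true positive" if event_occurred else "false positive"
    return "false negative" if event_occurred else "true negative"


def suggested_action(alarm):
    """Assumed mapping from alarm status to a locally configured action."""
    return "need to contact the patient" if alarm else "no need to take action"


for case, (risk, reported_alarm) in CASES.items():
    alarm = raise_alarm(risk)
    print(case, risk, alarm == reported_alarm, suggested_action(alarm))
```

Keeping the threshold as a parameter rather than a constant inside the function reflects the participants' wish that each clinic be able to tune it to local workflows and prioritization rules.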
A combined feasibility and qualitative interview study was undertaken based on a retrospective case study design. The primary aim of the study was to address the following 4 main questions about the feasibility of the prediction tool using quantitative measures: Does use of the tool lead to a change in clinical action? Does it support decision-making? Is visualizing parameters useful? Can it reduce time spent? The secondary aims were to understand the electrophysiologists’ immediate reactions to using the prediction tool, including qualifying the quantitative feasibility measures against qualitative dimensions based on interviews. Electrophysiologists were invited to conduct a “near-live” clinical simulation of decision-making based on walkthroughs of the 5 patient cases (
“Near-live” case walkthroughs were performed with inspiration from Li et al [
Data from electrophysiologists’ reactions to the interview study were analyzed using an inductive qualitative approach based on grounded theory [
Overall, the electrophysiologists did not change their decisions on clinical action when presented with the 30-day VT/VF arrhythmia prediction (
Table 3. Effect of the prediction tool on electrophysiologists’ decision-making.
Question and answer | Total (N=35), n (%) | Case 1 (N=7), n (%) | Case 2 (N=7), n (%) | Case 3 (N=7), n (%) | Case 4 (N=7), n (%) | Case 5 (N=7), n (%) |
Change in decision on clinical action after seeing the prediction
Yes | 1 (3) | 1 (14) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
No | 34 (97) | 6 (86) | 7 (100) | 7 (100) | 7 (100) | 7 (100) |
[Question text missing]
Strongly disagree/disagree | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
Neither agree nor disagree | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
Agree/strongly agree | 35 (100) | 7 (100) | 7 (100) | 7 (100) | 7 (100) | 7 (100) |
[Question text missing]
Strongly disagree/disagree | 3 (9) | 0 (0) | 1 (14) | 1 (14) | 0 (0) | 1 (14) |
Neither agree nor disagree | 6 (17) | 1 (14) | 0 (0) | 0 (0) | 4 (57) | 1 (14) |
Agree/strongly agree | 26 (74) | 6 (86) | 6 (86) | 6 (86) | 3 (43) | 5 (71) |
[Question text missing]
Strongly disagree/disagree | 19 (54) | 5 (71) | 3 (43) | 5 (71) | 2 (29) | 4 (57) |
Neither agree nor disagree | 5 (14) | 1 (14) | 1 (14) | 1 (14) | 2 (29) | 0 (0) |
Agree/strongly agree | 11 (31) | 1 (14) | 3 (43) | 1 (14) | 3 (43) | 3 (43) |
The prediction tool supported decision-making
Strongly disagree/disagree | 8 (23) | 2 (29) | 1 (14) | 0 (0) | 3 (43) | 2 (29) |
Neither agree nor disagree | 4 (11) | 1 (14) | 0 (0) | 1 (14) | 1 (14) | 1 (14) |
Agree/strongly agree | 23 (66) | 4 (57) | 6 (86) | 6 (86) | 3 (43) | 4 (57) |
Showing important parameters supported decision-making
Strongly disagree/disagree | 11 (31) | 3 (43) | 1 (14) | 2 (29) | 3 (43) | 2 (29) |
Neither agree nor disagree | 3 (9) | 1 (14) | 0 (0) | 0 (0) | 1 (14) | 1 (14) |
Agree/strongly agree | 21 (60) | 3 (43) | 6 (86) | 5 (71) | 3 (43) | 4 (57) |
The prediction tool can help reach a decision faster
Strongly disagree/disagree | 13 (37) | 4 (57) | 2 (29) | 2 (29) | 2 (29) | 3 (43) |
Neither agree nor disagree | 4 (11) | 1 (14) | 0 (0) | 0 (0) | 3 (43) | 0 (0) |
Agree/strongly agree | 18 (51) | 2 (29) | 5 (71) | 5 (71) | 2 (29) | 4 (57) |
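The n (%) cells in the table above follow from simple counts: 7 electrophysiologists per case, and 35 case walkthroughs in total. As a worked check, the sketch below recomputes the totals for the responses on whether the prediction tool supported decision-making. The per-case counts are copied from the table; the helper name and data layout are illustrative assumptions.

```python
RATERS_PER_CASE = 7

# Per-case counts (cases 1 to 5) for "the prediction tool supported decision-making".
SUPPORTED_DECISION_MAKING = {
    "Strongly disagree/disagree": [2, 1, 0, 3, 2],
    "Neither agree nor disagree": [1, 0, 1, 1, 1],
    "Agree/strongly agree": [4, 6, 6, 3, 4],
}


def n_percent(count, total):
    """Format a count as 'n (%)' with the percentage rounded to a whole number."""
    return f"{count} ({round(100 * count / total)})"


total_walkthroughs = RATERS_PER_CASE * 5  # 35 case walkthroughs overall

summary = {
    answer: n_percent(sum(counts), total_walkthroughs)
    for answer, counts in SUPPORTED_DECISION_MAKING.items()
}
per_case_agree = [
    n_percent(count, RATERS_PER_CASE)
    for count in SUPPORTED_DECISION_MAKING["Agree/strongly agree"]
]

print(summary)
print(per_case_agree)
```

Because each case has only 7 raters, a single response shifts a per-case percentage by about 14 points, which is worth keeping in mind when comparing cases.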
In 23 (66%) of the 35 case walkthroughs, the electrophysiologists agreed that the prediction tool supported their decision-making, whereas in 8 (23%) they disagreed. Support was most pronounced in patient cases 2 and 3, where 6 (86%) of the 7 electrophysiologists agreed and the prediction tool was found to assist decision-making by confirming their clinical evaluations and expectations of an increasing risk of VT/VF (
The prediction tool’s visualization of the most important parameters in the prediction of increased or decreased probability of VT/VF arrhythmia was found useful when the electrophysiologists agreed with the parameters presented. In patient cases 2 and 3, 6 (86%) and 5 (71%) of the electrophysiologists agreed that showing important parameters supported their decision-making. However, when the parameters represented poor data quality (
In general, presentation of important parameters provided explainability and supported decision-making by resembling the clinical interpretation process of what counts for or against the occurrence of VT/VF (
The electrophysiologists found that the prediction tool could reduce the time needed for decision-making in cases where they trusted the predictions. In cases 2 and 3, 5 (71%) of the electrophysiologists agreed that the prediction tool can help reach a decision faster. However, agreement was lower (29% in Case 1 and 57% in Case 5) when predictions were found to be uncertain or less useful for handling patients.
Several of the electrophysiologists expressed that once they become familiar with the system, they expect the AI tool will speed up decision-making and reduce the diagnostic workload. This indicates that establishing trust in AI predictions is essential. One of the electrophysiologists explained how time can be saved when personal trust in the prediction tool is developed (
Across all cases, several electrophysiologists found that the probability score and the presentation of important parameters can reduce information search time. Typically, electrophysiologists must retrieve valuable information by clicking through multiple webpages in the ICD manufacturer’s web-based system, which the prediction tool summarizes in a table. Some electrophysiologists also speculated that the tool could support decision-making when patient input is inaccessible, such as when a patient does not answer the phone (
Acceptability of the prediction tool was high when patient cases concerned VT/VF, as the risk predictions were found to be relevant. However, several electrophysiologists had expectations that the prediction tool would bring new and groundbreaking insights (
There was consensus that high precision is important for adoption of the prediction tool. Several of the electrophysiologists emphasized that the positive or negative predictive value should be as unambiguous as possible, with the tool clearly indicating either low or high risk when an alarm is or is not raised (
Several of the electrophysiologists emphasized that there is a high demand for workflow support in remote monitoring of cardiac device patients. They found the prediction tool useful for supporting more efficient prioritization and identification of important patient cases (
To ensure successful implementation, some electrophysiologists described how remote monitoring clinics may want to be able to adjust the threshold of the prediction tool to fit local workflows and prioritization rules. For example, technicians and electrophysiologists should be able to configure the prediction tool and decide on related actions, such as “no need to take action” or “need to contact the patient.” Relatedly, several electrophysiologists explained that indications of low-risk patients are especially useful in supporting clinicians in handling low-risk transmissions (
To help bridge the sociotechnical gap between the development of ML-based tools and clinical implementation, this study explored the feasibility and clinical perspectives of using a prediction tool for improved workflows in ICD remote monitoring. We found that the feasibility of the ML-based tool is promising when the intended use of the tool is aligned with expectations, that is, when it provides support for decision-making, visualizes useful information, and reduces time spent. The results also show that an actionable prediction tool is one that presents the reasons behind the algorithm’s output, as in this study, by highlighting the data most important for clinical evaluation and enabling clinicians to assess the algorithm’s outcome against their own evaluation [
However, the current prediction tool did not lead to change in clinical action, suggesting that ML and explainability techniques do not outperform specialized and experienced electrophysiologist evaluations, but at best confirm and support the interpretation of complex ICD device information along with a promise for a less time-consuming clinical workflow.
The contribution of this paper lies in the implications of the qualitative results suggesting that clinical end-users, clinical contexts, and workflows must be included throughout an overall iterative approach to design, development, and implementation. In the following sections, we will discuss the qualitative results concerning the sociotechnical challenges and implementation of ML-based tools for clinical decision support.
In cases where misalignment emerged between the electrophysiologists’ expectations and intended use, the prediction tool was considered less useful and at best “nice to have” for clinical decision-making. For example, the electrophysiologists expressed disappointment with the performance of the underlying AI algorithm when the ICD transmissions revolved around arrhythmias other than those the prediction tool was designed for and when they expected the prediction tool to outperform their own evaluation. This aligns with recent studies that reported on physicians’ high expectations and attitudes toward medical AI [
Trust is another key factor for user acceptance and adoption of AI technologies. Trust is typically considered an issue in creating transparent and understandable algorithmic behavior, as opposed to seeing the prediction tool as a black box [
While AI algorithms have been validated and have been shown to have similar or higher accuracy than humans, recent studies of AI deployment in clinical settings report that professional autonomy, workflow, and local sociotechnical factors have impacts on how accuracy is perceived and used in clinical practice [
The findings in this study are limited by the small number of study participants and patient cases. One electrophysiologist (PKJ) participated in the co-design workshops, which may have introduced a positive bias. Patient cases were selected to represent diversity in prediction capabilities, rather than the distribution in clinical practice, which may weaken the generalizability of the results. Only cases where the prediction tool provided true-positive and true-negative prediction outcomes were used, which means that the clinical feasibility of ML in cases with false-positive and false-negative outcomes [
This study shows that a tool based on ML for the prediction of VT/VF in remote monitoring of ICD patients has the potential to support electrophysiologists’ decision-making. While the prediction tool was regarded as “nice to have” rather than “need to have” in its current form, the tool demonstrated potential for supporting clinical decision-making, as it provided reassurance, increased confidence, and indicated the potential for reducing information search time, as well as enabled delegation of decisions to nurses and technicians. The findings also indicate that trust in the prediction tool, acceptable data quality, and clearly defined intended use are decisive for end-user acceptance and that adoption hinges on successful clinical implementation. This suggests that clinical end-users’ sociotechnical contexts and workflows need to be taken into consideration early on and continuously throughout a participatory design process to address the sociotechnical gap between the development and implementation of medical AI in cardiac care.
Questionnaire used before the electrophysiologists are presented with the prediction tool results.
Questionnaire after the electrophysiologists have received the prediction tool results.
The semistructured interview guide.
AI: artificial intelligence
ICD: implantable cardioverter defibrillator
LSTM: long short-term memory
ML: machine learning
VF: ventricular fibrillation
VT: ventricular tachycardia
We wish to thank the participating electrophysiologists at the Department of Cardiology, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark. The research and technology developed (working title: SafeHeart) were supported by the European Data Pitch Innovation Program H2020–732506 and led by TOA.
TOA is a co-founder of Vital Beats, which has commercial interests in the technology under investigation. MKHH and CV were full-time employees of Vital Beats. JHS, SZD, MCHL, and SM are affiliated with Vital Beats as advisors or independent researchers. The authors have no other conflicts of interest to disclose.