Original Paper
Abstract
Background: ChatGPT (OpenAI) is a powerful tool for a wide range of tasks, from entertainment and creativity to health care queries. There are potential risks and benefits associated with this technology. In the discourse concerning the deployment of ChatGPT and similar large language models, it is sensible to recommend their use primarily for tasks a human user can execute accurately. As we transition into the subsequent phase of ChatGPT deployment, establishing realistic performance expectations and understanding users’ perceptions of risk associated with its use are crucial in determining the successful integration of this artificial intelligence (AI) technology.
Objective: The aim of the study is to explore how perceived workload, satisfaction, performance expectancy, and risk-benefit perception influence users’ trust in ChatGPT.
Methods: A semistructured, web-based survey was conducted with 607 adults in the United States who actively use ChatGPT. The survey questions were adapted from constructs used in various models and theories such as the technology acceptance model, the theory of planned behavior, the unified theory of acceptance and use of technology, and research on trust and security in digital environments. To test our hypotheses and structural model, we used the partial least squares structural equation modeling method, a widely used approach for multivariate analysis.
Results: A total of 607 people responded to our survey. About one-third of the participants held at least a high school diploma (n=204, 33.6%), and the largest group had a bachelor’s degree (n=262, 43.1%). The primary motivations for participants to use ChatGPT were acquiring information (n=219, 36.1%), amusement (n=203, 33.4%), and addressing problems (n=135, 22.2%). Some participants used it for health-related inquiries (n=44, 7.2%), while a few others (n=6, 1%) used it for miscellaneous activities such as brainstorming, grammar verification, and blog content creation. Our model explained 64.6% of the variance in trust. Our analysis indicated a significant relationship between (1) workload and satisfaction, (2) trust and satisfaction, (3) performance expectancy and trust, and (4) risk-benefit perception and trust.
Conclusions: The findings underscore the importance of ensuring user-friendly design and functionality in AI-based applications to reduce workload and enhance user satisfaction, thereby increasing user trust. Future research should further explore the relationship between risk-benefit perception and trust in the context of AI chatbots.
doi:10.2196/55399
Introduction
ChatGPT (OpenAI) is a powerful tool for a wide range of tasks, from entertainment and creativity to health care queries. There are potential benefits associated with this technology. For instance, it can help summarize large amounts of text data or generate programming code. There is also the notion that ChatGPT may assist with health care tasks. However, the risks associated with using ChatGPT can hinder its adoption in various high-risk domains. These risks include the potential for inaccuracies and lack of citation relevance in scientific content generated by ChatGPT, ethical issues (copyright, attribution, plagiarism, and authorship), the risk of hallucination (inaccurate information that sounds scientifically plausible), and the possibility of biased content and inaccurate information due to the quality of training data sets generated prior to the year 2021.
In the discourse concerning the deployment of ChatGPT and similar artificial intelligence (AI) technologies, it is sensible to recommend their use primarily for tasks a human user can execute accurately. Few studies have advocated using the technology under human supervision. Encouraging users to rely on such tools for tasks beyond their competence is risky, as they may not be able to evaluate the AI’s output effectively without help. The strength of ChatGPT lies in its ability to automate more straightforward, mundane tasks, freeing human users to invest their time and cognitive resources into critical tasks (not vice versa). This approach to technology use maintains a necessary balance, leveraging AI for efficiency gains while ensuring that critical decision-making remains within the purview of human expertise.
As we transition into the subsequent phase of ChatGPT deployment, establishing realistic performance expectations and understanding users’ perceptions of risk associated with its use are crucial in determining the successful integration of this AI technology. Thus, understanding users’ perceptions of ChatGPT becomes essential, as these perceptions significantly influence their usage decisions. For example, if users believe that ChatGPT’s capabilities surpass human knowledge, they may be tempted to use it for tasks such as self-diagnosis, which could lead to potentially harmful outcomes if the generated information is mistaken or misleading. Conversely, a realistic appraisal of the limitations and strengths of the technology would encourage its use in low-risk, routine tasks and foster a safer, more effective integration into our everyday lives.
Building upon the importance of user perceptions and expectations, we must also consider that the extent to which users trust ChatGPT hinges mainly on the perception of its accuracy and reliability. As users witness the technology’s ability to perform tasks effectively and generate correct, helpful information, their trust in the system grows. This, in turn, allows them to offload routine tasks to the AI and focus their energies on more complex or meaningful endeavors. Conversely, instances where the AI generates inaccurate or misleading information can quickly erode users’ perception of the technology. Users may become dissatisfied and lose trust if they perceive the technology as unreliable or potentially harmful, particularly if they have previously overestimated its capabilities. This underlines the importance of setting realistic expectations and accurately understanding the strengths and limitations of ChatGPT, which can help foster a healthy level of trust and satisfaction among users. Ultimately, establishing and maintaining trust and satisfaction is not a one-time event but an ongoing process of validating the AI’s outputs, understanding and acknowledging its limitations, and making the best use of its capabilities within a framework of informed expectations and continuous learning. This dynamic balance is pivotal for the effective and safe integration of AI technologies such as ChatGPT into various sectors of human activity.
In our prior work, we explored the impact of trust on the actual use of ChatGPT. This study aims to explore a conceptual framework delving deeper into the aspects influencing user trust in ChatGPT.
As shown in the conceptual model figure, the proposed conceptual model is grounded in the well-established theories of technology acceptance and use, incorporating constructs such as performance expectancy, workload, satisfaction, risk-benefit perception, and trust to comprehensively evaluate user interaction with technology. Performance expectancy, derived from the core postulates of the technology acceptance model (TAM) and the unified theory of acceptance and use of technology (UTAUT), posits that the perceived usefulness of the technology significantly predicts usage intentions. Workload, akin to effort expectancy, reflects the perceived cognitive and physical effort required to use the technology, where a higher workload may reduce user satisfaction, a construct that encapsulates the fulfillment of user expectations and needs through technology interaction. The risk-benefit perception embodies the user’s assessment of the technology’s potential advantages against its risks, influencing both user satisfaction and trust. Trust, a pivotal determinant of technology acceptance, signifies the user’s confidence in the reliability and efficacy of the technology. This theoretical framework thus serves to elucidate the multifaceted process by which users come to accept and use a technological system, highlighting the critical role of both cognitive appraisals and affective responses in shaping technology adoption.
We explore the following hypotheses:
- Hypothesis 1: Perceived workload of using ChatGPT negatively correlates with user trust in ChatGPT.
- Hypothesis 2: Perceived workload of using ChatGPT negatively correlates with user satisfaction with ChatGPT.
- Hypothesis 3: User satisfaction with ChatGPT positively correlates with trust in ChatGPT.
- Hypothesis 4: User trust in ChatGPT is positively correlated with the performance expectancy of ChatGPT.
- Hypothesis 5: The risk-benefit perception of using ChatGPT is positively correlated with user trust in ChatGPT.
Methods
Ethical Considerations
The study obtained ethics approval from West Virginia University, Morgantown (protocol 2302725983). The study was performed in accordance with relevant guidelines and regulations. No identifiers were collected during the study, and all users were compensated for completing the survey through an audience paneling service. In compliance with ethical research practices, informed consent was obtained from all participants before initiating the survey. Attached to the survey was a comprehensive cover letter outlining the purpose of the study, the procedure involved, the approximate time to complete the survey, and assurances of anonymity and confidentiality. It also emphasized that participation was completely voluntary, and participants could withdraw at any time without any consequences. The cover letter also included the contact information of the researchers for any questions or concerns the participants might have regarding the study. Participants were asked to read through the cover letter information carefully and were instructed to proceed with the survey only if they understood and agreed to the terms described, effectively providing their consent to participate in the study.
Study Design
A semistructured, web-based questionnaire was disseminated to adult individuals within the United States who engaged with ChatGPT (version 3.5) at least once per month. Data collection took place between February and March 2023. The questionnaire was crafted using Qualtrics (Qualtrics LLC), and its circulation was handled by Centiment (Centiment LLC), a provider of audience-paneling services. Centiment’s services were used due to their extensive reach and ability to connect with a diverse and representative group via their network and social media. Their fingerprinting technology, which uses IP address, device type, screen size, and cookies, was used to guarantee the uniqueness of the survey respondents. Prior to the full-scale dissemination, a soft launch was carried out with 40 responses gathered. The purpose of a soft launch, a limited-scale trial of the survey, is to pinpoint any potential problems, such as ambiguity or confusion in questions, technical mishaps, or any other factors that might affect the quality of data obtained. The survey was made available to a larger audience following the successful soft launch.
The survey items table shows the descriptive statistics of the survey questions used in this study. We developed 3 latent constructs based on the questions (trust, workload, and performance expectancy) and 2 single-question variables (satisfaction and risk-benefit perception). Participant responses to all the questions were captured using a 4-point Likert scale ranging from 1=strongly disagree to 4=strongly agree. These questions were adapted from constructs used in various models and theories such as the TAM, the theory of planned behavior, UTAUT, and research on trust and security in digital environments.
- Trust: Questions T1-T7 related to trust in AI systems were adapted from the trust building model.
- Workload: Questions WL1 and WL2 were adapted from the National Aeronautics and Space Administration Task Load Index for measuring perceived workload.
- Performance expectancy: Questions PE1-PE4 concern the perceived benefits of using the system, which is a central concept in TAM and UTAUT.
- Satisfaction: The single item relates to overall user satisfaction, a common measure in information systems success models.
- Risk-benefit perception: The single item addresses the user’s assessment of benefits relative to potential risks, an aspect often discussed in the context of technology adoption and use.
These references provide a starting point for understanding the theoretical underpinnings of the survey used in this study. They are adapted from foundational works in information systems, human-computer interaction, and psychology that address trust, workload, performance expectancy, satisfaction, and the evaluation of benefits versus risks in technology use.
Survey items | Value, mean (SD)
Trust (T) |
T1: ChatGPT is competent in providing the information and guidance I need | 3.20 (0.83)
T2: ChatGPT is reliable in providing consistent and dependable information | 3.16 (0.80)
T3: ChatGPT is transparent | 3.12 (0.86)
T4: ChatGPT is trustworthy in the sense that it is dependable and credible | 3.17 (0.84)
T5: ChatGPT will not cause harm, manipulate its responses, or create negative consequences for me | 3.10 (0.88)
T6: ChatGPT will act with integrity and be honest with me | 3.19 (0.82)
T7: ChatGPT is secure and protects my privacy and confidential information | 3.27 (0.81)
Workload (WL) |
WL1: Using ChatGPT was mentally demanding | 3.21 (0.75)
WL2: I had to work hard to use ChatGPT | 2.20 (0.98)
Performance expectancy (PE) |
PE1: ChatGPT can help me achieve my goals | 3.24 (0.77)
PE2: ChatGPT can reduce my workload | 3.22 (0.78)
PE3: ChatGPT improves my work efficiency | 3.21 (0.84)
PE4: ChatGPT helps me make informed and timely decisions | 3.26 (0.79)
Satisfaction (S) |
S: I am satisfied with ChatGPT | 3.24 (0.76)
Risk-benefit perception (R) |
R: The benefits of using ChatGPT outweigh any potential risks | 3.20 (0.80)
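As a rough illustration of how a "mean (SD)" entry of this kind can be produced from 4-point Likert responses, the following Python sketch summarizes one item. The response values are hypothetical and are not the study's data, the function name is ours, and the use of the sample SD (n-1 denominator) is an assumption, since the paper does not state which SD was reported.

```python
from statistics import mean, stdev

# Hypothetical responses to a single survey item on the 4-point scale
# (1 = strongly disagree ... 4 = strongly agree). Not the study's data.
responses = [4, 3, 3, 2, 4, 3, 4, 1, 3, 4]


def summarize_item(values):
    """Return (mean, sample SD), each rounded to 2 decimals.

    Assumption: SD uses the n-1 denominator; the paper does not say.
    """
    if any(v not in (1, 2, 3, 4) for v in values):
        raise ValueError("responses must lie on the 1-4 Likert scale")
    return round(mean(values), 2), round(stdev(values), 2)


item_mean, item_sd = summarize_item(responses)
print(f"{item_mean:.2f} ({item_sd:.2f})")
```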
Statistical Analysis and Model Validation
To test our hypotheses and structural model, we used the partial least squares structural equation modeling (PLS-SEM) method, a widely used approach for multivariate analysis. PLS-SEM enables the estimation of complex models with multiple constructs, indicator variables, and structural paths, without making assumptions about the data’s distribution. This method is beneficial for studies with small sample sizes that involve many constructs and items. PLS-SEM is a suitable method because of its flexibility and ability to allow for interaction between theory and data in exploratory research. The analyses were performed using the SEMinR package in R (R Foundation for Statistical Computing). We started by loading the data set collected for this study using the readr package in R. We then defined the measurement model. This consisted of 5 composite constructs: trust, performance expectancy, workload, risk-benefit perception, and satisfaction. Trust was measured with 7 items (T1 through T7), performance expectancy with 4 items (PE1 through PE4), and workload with 2 items (WL1 and WL2), while risk-benefit perception and satisfaction were each measured with a single item. We evaluated the convergent validity of the latent constructs using 3 criteria: factor loadings (>0.50), composite reliability (>0.70), and average variance extracted (>0.50). We used the Heterotrait-Monotrait ratio (<0.90) to assess discriminant validity.
Next, we defined the structural model, which captured the hypothesized relationships between the constructs. The model included paths from risk-benefit perception, performance expectancy, workload, and satisfaction to trust, and a path from workload to satisfaction. We then estimated the model’s parameters using the partial least squares method. This was done with the estimate_pls function in the SEMinR package. The partial least squares method was preferred due to its ability to handle complex models and its robustness to violations of normality assumptions. We performed a bootstrap resampling procedure with 10,000 iterations to obtain robust parameter estimates and compute 95% CIs. The bootstrapped model was plotted to visualize the estimates and their 95% CIs.
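The bootstrap step described above can be illustrated with a minimal Python sketch. This is not the SEMinR procedure used in the study: it swaps in a single standardized regression slope in place of a full PLS path model, uses synthetic data, and runs fewer resamples than the 10,000 reported, purely to show how a percentile 95% CI is obtained by resampling respondents with replacement. All function and variable names are hypothetical.

```python
import math
import random


def standardized_slope(x, y):
    """Standardized slope of y on x (equals Pearson r with one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("a variable has zero variance in this sample")
    return cov / math.sqrt(var_x * var_y)


def percentile_bootstrap_ci(x, y, n_boot=2000, level=0.95, seed=1):
    """Percentile CI from resampling cases (rows) with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            standardized_slope([x[i] for i in idx], [y[i] for i in idx])
        )
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[int(tail * n_boot)]
    upper = estimates[int((1 - tail) * n_boot) - 1]
    return lower, upper


# Synthetic data standing in for two construct scores (assumption).
data_rng = random.Random(42)
predictor = [data_rng.gauss(0, 1) for _ in range(200)]
outcome = [0.6 * p + data_rng.gauss(0, 0.8) for p in predictor]

point_estimate = standardized_slope(predictor, outcome)
ci_low, ci_high = percentile_bootstrap_ci(predictor, outcome)
print(f"estimate={point_estimate:.3f}, 95% CI {ci_low:.3f} to {ci_high:.3f}")
```

A path is conventionally read as significant at the 5% level when an interval of this kind excludes 0.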
Results
Of 607 participants who completed the survey, 29.9% (n=182) used ChatGPT at least once per month, 26.1% (n=158) used it weekly, 24.5% (n=149) accessed it more than once per week, and 19.4% (n=118) interacted with it almost daily. About one-third of the participants held at least a high school diploma (n=204, 33.6%), and the largest group had a bachelor’s degree (n=262, 43.1%). The primary motivations for participants to use ChatGPT were acquiring information (n=219, 36.1%), amusement (n=203, 33.4%), and addressing problems (n=135, 22.2%). Some participants used it for health-related inquiries (n=44, 7.2%), while a few others (n=6, 1%) used it for miscellaneous activities such as brainstorming, grammar verification, and blog content creation.
The bootstrapped loadings table shows the factor loadings of the latent constructs in the model.
The model explained 2% and 64.6% of the variance in “satisfaction” and “trust,” respectively. Reliability estimates, as shown in the reliability table, indicated high levels of internal consistency for the 3 multi-item latent constructs, with Cronbach α and composite reliability (ρC) values exceeding the recommended threshold of 0.7. The average variance extracted for the latent constructs also exceeded the recommended threshold of 0.5, supporting convergent validity. Based on the root mean square error of approximation (RMSEA) fit index, our PLS-SEM model demonstrates an acceptable fit for the observed data: the calculated RMSEA value of 0.07 falls below the commonly accepted threshold of 0.08. The RMSEA estimates the average discrepancy per degree of freedom in the model, capturing how well the proposed model aligns with the population covariance matrix. A value below the threshold suggests that the proposed model adequately represents the relationships among the latent variables. This finding provides confidence in the model’s ability to explain the observed data and support the underlying theoretical framework.
The direct paths table shows the estimated paths in our model. Hypothesis 1 postulated that as the perceived workload of using ChatGPT increases, user trust in ChatGPT decreases. Our analysis indicated a negative estimate for the path from workload to trust (–0.047). However, the absolute value of the T statistic (–1.674) is below the critical value, and the 95% CI straddles 0 (–0.102 to 0.007), suggesting that the effect is not statistically significant. Therefore, we do not have sufficient evidence to support hypothesis 1.
Hypothesis 2 stated that perceived workload is negatively correlated with user satisfaction with ChatGPT. The results supported this hypothesis, as the path from workload to satisfaction showed a negative estimate (–0.142), a T statistic (–3.416) beyond the critical value, and a 95% CI that excludes 0 (–0.223 to –0.061).
Hypothesis 3 proposed a positive correlation between satisfaction with ChatGPT and trust in ChatGPT, and the data confirmed this relationship. The path from satisfaction to trust had a positive estimate (0.165), a T statistic (4.478) beyond the critical value, and a 95% CI that excludes 0 (0.093-0.237).
Hypothesis 4 proposed a positive correlation between performance expectancy and user trust in ChatGPT. The analysis supported this hypothesis. The path from performance expectancy to trust displayed a positive estimate (0.598), a large T statistic (15.554), and a 95% CI that excludes 0 (0.522-0.672). Finally, we examined hypothesis 5, which posited that user trust in ChatGPT increases as their risk-benefit perception of using the technology increases. The path from risk-benefit perception to trust showed a positive estimate (0.114). The T statistic (3.372) and the 95% CI (0.048-0.179) indicate that this relationship is significant, and the positive sign suggests that trust in ChatGPT increases as the perceived benefits increasingly outweigh the risks. Therefore, hypothesis 5 is supported.
The structural model figure illustrates the model with all path coefficients.
Bootstrapped loadings | Loadings | T statistic | 95% CI
Trust (T) | | |
T1 | 0.788 | 41.998 | 0.750-0.823
T2 | 0.753 | 33.795 | 0.706-0.794
T3 | 0.773 | 40.293 | 0.733-0.808
T4 | 0.732 | 28.772 | 0.679-0.779
T5 | 0.673 | 21.066 | 0.607-0.732
T6 | 0.799 | 46.065 | 0.763-0.831
T7 | 0.779 | 38.088 | 0.736-0.816
Performance expectancy (PE) | | |
PE1 | 0.809 | 49.231 | 0.775-0.839
PE2 | 0.733 | 29.360 | 0.681-0.779
PE3 | 0.802 | 44.968 | 0.766-0.835
PE4 | 0.777 | 34.198 | 0.729-0.818
Workload (WL) | | |
WL1 | 0.856 | 28.883 | 0.789-0.905
WL2 | 0.913 | 44.872 | 0.869-0.950
Construct | Cronbach α | ρC | AVE (a) | ρA
Performance expectancy | 0.787 | 0.862 | 0.610 | 0.610
Workload | 0.729 | 0.870 | 0.771 | 0.968
Trust | 0.876 | 0.904 | 0.575 | 0.880
(a) AVE: average variance extracted.
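To make the convergent validity criteria concrete, the Python sketch below recomputes the average variance extracted (AVE) and composite reliability (ρC) from the standardized loadings listed in the bootstrapped loadings table, using the standard formulas AVE = Σλ²/k and ρC = (Σλ)² / ((Σλ)² + Σ(1−λ²)). It is an illustration rather than the study's SEMinR output: because it starts from the rounded, bootstrapped loadings, its results can differ slightly from the reported values (most visibly for workload). Function names are ours.

```python
# Standardized loadings copied from the bootstrapped loadings table.
LOADINGS = {
    "Performance expectancy": [0.809, 0.733, 0.802, 0.777],
    "Workload": [0.856, 0.913],
    "Trust": [0.788, 0.753, 0.773, 0.732, 0.673, 0.799, 0.779],
}


def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """rho_C for standardized loadings; error variance is 1 - loading^2."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l * l for l in loadings)
    return squared_sum / (squared_sum + error_variance)


results = {
    name: (average_variance_extracted(ls), composite_reliability(ls))
    for name, ls in LOADINGS.items()
}

for name, (ave, rho_c) in results.items():
    # Thresholds stated in the Methods: AVE > 0.50 and rho_C > 0.70.
    verdict = "meets" if ave > 0.50 and rho_c > 0.70 else "fails"
    print(f"{name}: AVE={ave:.3f}, rho_C={rho_c:.3f} ({verdict} thresholds)")
```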
Direct path | Bootstrap mean standard estimate (SD) | T statistic | 95% CI |
Risk-benefit perception→trust | 0.114 (0.034) | 3.372 | 0.048 to 0.179 |
Performance expectancy→trust | 0.598 (0.038) | 15.554 | 0.522 to 0.672 |
Workload→satisfaction | –0.142 (0.041) | –3.416 | –0.223 to –0.061 |
Workload→trust | –0.047 (0.028) | –1.674 | –0.102 to 0.007 |
Satisfaction→trust | 0.165 (0.037) | 4.478 | 0.093 to 0.237 |
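The decision rule applied to each hypothesis can be sketched in Python from the values in the direct paths table. The sketch assumes a two-sided 5% critical value of 1.96 (the paper does not state the critical value it used) and approximates each T statistic as estimate divided by bootstrap SD, which only roughly matches the reported values because the table entries are rounded. The final lines compute the product of the workload→satisfaction and satisfaction→trust coefficients as a point estimate of the implied indirect effect; the reported numbers alone do not establish its significance.

```python
# (estimate, bootstrap SD, CI lower, CI upper) from the direct paths table.
PATHS = {
    "Risk-benefit perception -> trust": (0.114, 0.034, 0.048, 0.179),
    "Performance expectancy -> trust": (0.598, 0.038, 0.522, 0.672),
    "Workload -> satisfaction": (-0.142, 0.041, -0.223, -0.061),
    "Workload -> trust": (-0.047, 0.028, -0.102, 0.007),
    "Satisfaction -> trust": (0.165, 0.037, 0.093, 0.237),
}

CRITICAL_VALUE = 1.96  # assumption: two-sided test at the 5% level


def assess_path(estimate, sd, ci_low, ci_high):
    """Return (approximate t, whether the path is significant)."""
    approx_t = estimate / sd
    ci_excludes_zero = ci_low > 0 or ci_high < 0
    significant = abs(approx_t) > CRITICAL_VALUE and ci_excludes_zero
    return approx_t, significant


assessment = {name: assess_path(*values) for name, values in PATHS.items()}

for name, (approx_t, significant) in assessment.items():
    label = "significant" if significant else "not significant"
    print(f"{name}: t~{approx_t:.2f}, {label}")

# Point estimate of the indirect effect of workload on trust via satisfaction.
indirect_effect = (
    PATHS["Workload -> satisfaction"][0] * PATHS["Satisfaction -> trust"][0]
)
print(f"Implied indirect effect (point estimate only): {indirect_effect:.4f}")
```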
Discussion
Main Findings
This study represents one of the initial attempts to investigate how human factors such as workload, performance expectancy, risk-benefit perception, and satisfaction influence trust in ChatGPT. Our results showed that performance expectancy, satisfaction, and risk-benefit perception significantly influenced trust in ChatGPT, with performance expectancy exerting the strongest association, highlighting its critical role in fostering trust. The direct path from workload to trust was not significant; instead, workload was related to trust through satisfaction, which acted as a mediator. A positive correlation was also observed between trust in ChatGPT and the risk-benefit perception. Our findings align with the May 23, 2023, efforts and initiatives of the Biden-Harris Administration to advance responsible AI research, development, and deployment. The Administration recognizes that managing the risks of AI is crucial and prioritizes protecting individuals’ rights and safety. One of the critical actions taken by the administration is the development of the artificial intelligence risk management framework (AI RMF). The AI RMF builds on the importance of trustworthiness in AI systems and is a framework for strengthening AI trustworthiness and promoting the trustworthy design, development, deployment, and use of AI systems, which underscores the need for our research. Our findings reveal the importance of performance expectancy, satisfaction, and risk-benefit perception in determining the user’s trust in AI systems. By addressing these factors, AI systems can be designed and developed to be more user-centric, aligning with the AI RMF’s emphasis on human-centricity and responsible AI.
Workload and Trust in ChatGPT
Moreover, we found that reducing user workload is vital for enhancing user satisfaction, which in turn improves trust. This finding aligns with the AI RMF’s focus on creating AI systems that are equitable and accountable and that mitigate inequitable outcomes. Additionally, our research emphasizes the need for future exploration of other factors impacting user trust in AI technologies. Such endeavors align with the AI RMF’s vision of managing AI risks comprehensively and holistically, considering technical and societal factors. Understanding these factors is crucial for fostering public trust and enhancing the overall trustworthiness of AI systems, as outlined in the AI RMF.
This study also extends and complements existing literature. The direction of our workload-to-trust estimate was negative, consistent with patterns observed in studies on flight simulators, dynamic multitasking environments, and cyberattacks, although the direct effect was not statistically significant in our data. Our findings align with the existing research indicating a negative correlation between workload and user satisfaction. We observed that as the perceived workload of using ChatGPT increased, user satisfaction with the technology decreased. This outcome echoes the consensus within the literature that a high workload can lead to user dissatisfaction, particularly if the technology requires too much effort or time. The literature reveals that perceived workload balance significantly influences job satisfaction in work organizations, and similar patterns are found in the well-being studies of nurses, where perceived workload negatively impacts satisfaction with work-life balance. While this study does not directly involve the workplace environment or work-life balance, the parallels between workload and satisfaction are evident. Furthermore, our research parallels the study suggesting that when providing timely service, AI applications can alleviate perceived workload and improve job satisfaction. ChatGPT, as an AI-powered chatbot, could potentially contribute to workload relief when it performs effectively and efficiently, thereby boosting user satisfaction.
Satisfaction and Trust in ChatGPT
Our findings corroborate existing literature suggesting a strong positive correlation between user satisfaction and trust in the technology or service provider. We found that the users who expressed higher satisfaction with ChatGPT were more likely to trust the system, strengthening the premise that satisfaction can predict trust in a technology or service provider. Similar to the study on digital transaction services, our research indicates that higher satisfaction levels with ChatGPT corresponded with higher trust in the AI system. This suggests that when users are satisfied with the performance and results provided by ChatGPT, they tend to trust the technology more. The research on mobile transaction apps mirrors our findings: satisfaction with ChatGPT use was a significant predictor of trust in the system. This showcases the importance of ensuring user satisfaction in fostering trust in innovative technologies like AI chatbots. The study on satisfaction with using digital assistants, where a positive relationship between trust and satisfaction was observed, further aligns with our study.
Performance Expectancy and Trust in ChatGPT
Our findings concerning the strong positive correlation between performance expectancy and trust in ChatGPT serve as an extension to prior literature. Similar findings have been reported in previous studies on wearables and mobile banking, where performance expectancy was positively correlated with trust. However, our results diverge from the observations of a recent study that did not find a significant impact of performance expectancy on trust in chatbots. Moreover, the observed mediating role of satisfaction in the relationship between workload and trust in ChatGPT is a notable contribution to the literature. While previous studies have demonstrated a positive correlation between workload reduction by chatbots and trust, as well as between trust and user satisfaction, the role of satisfaction as a mediator between workload and trust has not been explored. Finally, the positive correlation between the risk-benefit perception of using ChatGPT and trust aligns with the findings of previous studies. Similar studies on the intention to use chatbots for digital shopping and customer service have found that trust in chatbots impacts perceived risk and is affected by the risk involved in using chatbots. Our study adds to this body of research by confirming the same positive relationship within the context of ChatGPT.
Limitations
Despite the valuable insights provided by this study, limitations should be acknowledged. First, our research focused explicitly on ChatGPT and may not be generalizable to other AI-powered conversational agents or chatbot technologies. Different chatbot systems may have unique characteristics and user experiences that could influence the factors affecting trust. Second, this study relied on self-reported data from survey responses, which may be subject to response biases and limitations inherent to self-report measures. Participants’ perceptions and interpretations of the constructs under investigation could vary, leading to potential measurement errors. Third, this study was cross-sectional, capturing data at a specific point in time. Longitudinal studies that track users’ experiences and perceptions over time would provide a more comprehensive understanding of the dynamics between trust and the factors investigated. Finally, the sample of participants in this study consisted of individuals who actively use ChatGPT, which may introduce a self-selection bias. The perspectives and experiences of nonusers or individuals with limited exposure to AI-powered conversational agents may differ, and their insights could provide additional valuable perspectives.
Conclusions
This study examined the factors influencing trust in ChatGPT, an AI-powered conversational agent. Our analysis found that performance expectancy, satisfaction, and risk-benefit perception significantly influenced users’ trust in ChatGPT, while workload was related to trust through its effect on satisfaction rather than directly. These findings contribute to understanding trust dynamics in the context of AI-powered conversational agents and provide insights into the factors that can enhance user trust. By addressing the factors influencing trust, we contribute to the broader goal of fostering responsible AI practices that prioritize user-centric design and protect individuals’ rights and safety. Future research should consider longitudinal designs to capture the dynamics of trust over time. Additionally, incorporating perspectives from diverse user groups and examining the impact of contextual factors on trust would further enrich our understanding of trust in AI technologies.
Data Availability
The data sets generated and analyzed during this study are available from the corresponding author on reasonable request.
Authors' Contributions
AC, the lead researcher, was responsible for the study’s conceptualization, the survey’s development, figure illustration, data collection and analysis, and manuscript writing. HS, the student author, was responsible for manuscript writing and conducting the literature review. Both authors collaborated throughout the research process and approved the final version of the manuscript for submission.
Conflicts of Interest
None declared.
Abbreviations
AI: artificial intelligence
AI RMF: artificial intelligence risk management framework
PLS-SEM: partial least squares structural equation modeling
RMSEA: root mean square error of approximation
TAM: technology acceptance model
UTAUT: unified theory of acceptance and use of technology
Edited by A Kushniruk, E Borycki; submitted 11.12.23; peer-reviewed by P Radanliev, G Farid; comments to author 17.01.24; revised version received 25.03.24; accepted 07.04.24; published 27.05.24.
Copyright © Avishek Choudhury, Hamid Shamszare. Originally published in JMIR Human Factors (https://humanfactors.jmir.org), 27.05.2024.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Human Factors, is properly cited. The complete bibliographic information, a link to the original publication on https://humanfactors.jmir.org, as well as this copyright and license information must be included.