Introduction

Chatbots are becoming ever more pervasive in everyday life. Grand View Research (2021) estimates the global chatbot market at USD 430.0 million in 2020, with an expected annual growth rate of 24.9% until 2028. Using machine learning algorithms to chat with humans and learn from these interactions, chatbots can be found in a wide range of settings, including healthcare, retail, and travel and tourism. Given the increasing prevalence of chatbots in commercial and public contexts, it is important to ask to what extent users understand the context and limitations of AI-supported conversational interaction and are able to adapt their own interaction patterns to the AI environment.
Human-chatbot interaction represents a form of social interaction carried out online, also described as computer-mediated discourse (Herring, 2004). Like web chat, chatbots are a lean medium of conversation in that these interactions are deprived of the visual and auditory cues which are part of face-to-face interaction (Daft & Lengel, 1986). However, the lean properties of chatbot-mediated conversation go beyond those of ordinary web chat in that the contributions of one conversational partner are processed, interpreted and responded to by a computer rather than another human, leading to a number of challenges.
Firstly, understanding user intent can be challenging for bots because “people are inconsistent, and their lives are disorderly. Their situations, circumstances and aspirations change. They get tired and hungry, glad or sad […] And sometimes they have no clue what they really want” (Hantula et al., 2021). A bot has to respond to these changing circumstances and understand the user’s intent, despite the potential variations in which this intent is expressed. Secondly, the bot has to generate language that responds correctly to users’ intents, which in most dialogue systems is done through pre-compiled sentences or templates (Di Lascio et al., 2020). Finally, users need to support this process by providing repair when the bot does not understand their intent. As Collins (2018) points out, ‘repair’ is a fundamental feature by which humans deal with communication that is less than perfect. As previous studies have shown and this study will demonstrate, it is also fundamental to human interactions with conversational AI.
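To illustrate the kind of pre-compiled, template-based response generation referred to above, the following minimal sketch shows how a task-oriented bot might map a recognised intent to a canned response and fall back to a repair prompt when recognition fails. The intent labels, templates and confidence threshold are hypothetical and are not drawn from any system discussed in this paper.

```python
# Minimal sketch of template-based response generation (illustrative only).

RESPONSE_TEMPLATES = {
    "book_appointment": "Sure, I can book that for you. Which day suits you best?",
    "cancel_appointment": "Your appointment on {date} has been cancelled.",
    "fallback": "Sorry, I didn't quite understand that. Could you rephrase?",
}

def respond(intent: str, confidence: float, slots: dict) -> str:
    """Return a pre-compiled response for a recognised intent,
    or a fallback message that invites the user to repair."""
    if confidence < 0.5 or intent not in RESPONSE_TEMPLATES:
        return RESPONSE_TEMPLATES["fallback"]
    return RESPONSE_TEMPLATES[intent].format(**slots)

print(respond("cancel_appointment", 0.82, {"date": "Tuesday 3 May"}))
print(respond("unknown_intent", 0.21, {}))
```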
However, when interacting with bots, humans cannot necessarily rely on the same models of communication as in face-to-face social interaction. Research by Luger & Sellen (2016) on users’ expectations and experiences of conversational agents found that participants with better technical skills were better prepared to adopt new mental models of interaction than lower-skilled participants, whose expectations did not change and who were more likely to get frustrated by their interactions.
With these insights in the background, this paper takes a user-centric perspective to investigate repair, described here as users’ efforts to address intent interpretation issues in task-oriented chatbot interactions. These insights have important implications for how users can be supported to develop the communication skills needed for conversational AI and to understand the sociolinguistic environment of these interactions. These issues will be discussed in the conclusion to this paper.
Literature review
The term repair is originally derived from conversation analysis. Repair was first described by Schegloff et al. (1977) as a “self-righting mechanism for the organization of language use in social interaction” (p. 381), while Seedhouse (2005) defines repair as “the treatment of trouble occurring in interactive language use” (p. 168). In ordinary conversation, repair can be used by speakers to address problems in hearing, speaking and understanding. In conversation with a text-based chatbot, users perform repair to address issues with the bot’s interpretation of their intent.
In their typology of repair, Schegloff et al. (1977) distinguish between self-initiation of repair (repair initiated by the speaker who is the cause of the trouble source) and other-initiation of repair (repair initiated by another speaker). They also distinguish self-repair (repair completed by the speaker who is the cause of the trouble source) and other-repair (repair completed by another speaker), emphasising also that some repairs do not have a successful outcome at all. Speakers use a variety of means to other-initiate repair, such as signals of misunderstanding (e.g., Huh?, What?) and question words (who?, when?), which may be combined with a partial repeat of the trouble source.
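For transcript annotation, this typology can be operationalised as a simple coding scheme. The sketch below shows one possible encoding; the category names and the example annotation are illustrative and are not part of Schegloff et al.’s original notation.

```python
from enum import Enum

# One possible encoding of Schegloff et al.'s (1977) repair typology
# for coding chatbot transcripts (labels are illustrative).

class RepairInitiation(Enum):
    SELF = "initiated by the speaker of the trouble source"
    OTHER = "initiated by another speaker"

class RepairCompletion(Enum):
    SELF = "completed by the speaker of the trouble source"
    OTHER = "completed by another speaker"
    NONE = "no successful repair outcome"

# Example: the bot signals non-understanding (other-initiation) and the
# user rephrases their request (self-repair).
annotation = {
    "trouble_source": "user turn 4",
    "initiation": RepairInitiation.OTHER,
    "completion": RepairCompletion.SELF,
}
print(annotation["initiation"].name, annotation["completion"].name)
```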
Albert & de Ruiter (2018) argue that the notion of repair as introduced by conversation analysts such as Schegloff et al. (1977) constitutes a “minimal notion of shared understanding as progressivity” (p. 281) which consciously does not focus on context. However, they also argue that observing repair provides rich insights into the sources of the misunderstanding, which may include “contextual problems of propriety or transgression” (p. 303). Repair has been subject to a wide range of investigations in specific contexts of interpersonal communication, such as the classroom (Dippold, 2014; Montiegel, 2021) and workplace interaction (Oloff, 2018; Tsuchiya & Handford, 2014). In computer-mediated environments, repair has so far primarily been investigated in the context of web chat, focusing for example on gaming chat (Collister, 2011), German web chat (Schönfeldt & Golato, 2003), library web chat (Koshik & Okazawa, 2012) and Facebook chat (Meredith & Stokoe, 2014). These studies found that repair in online chat is organised differently from ordinary conversation due to differences in the sequential flow of messages. Moreover, users do not have access to the same set of resources to accomplish social interactions as in spoken conversation (e.g., prosody). Users do, however, compensate with other ways of creating meaning (such as the asterisk * as a repair morpheme), and general principles of repair from ordinary conversation (e.g., the preference for self-repair) still apply.
Repair has also been the subject of research in interactions between humans and embodied robots as well as chatbots. For example, Beneteau et al. (2019) investigated communicative breakdowns between Alexa and family users. They showed that the onus of providing ‘repair’ when communication broke down lay with users. Users deployed a range of strategies to perform repair, e.g., prosodic changes, over-articulation, semantic adjustments and modifications, increased volume, syntactic adjustments, and repetition.
Research on human interaction with text-based chatbots confirms that the burden of repair lies primarily with the user. Analysing transcripts of interactions between users and a task-oriented chatbot, Li et al. (2020) investigated the relationship between different types of non-progression and user repair types. They found that bot users were most likely to abandon the conversation after three instances of non-progress, which were caused by the bot’s misrecognition of user intents on the one hand and non-recognition on the other. Users drew on a range of strategies for dealing with non-progress, including abandoning the bot service, temporarily quitting the conversation, temporarily switching the subject, and various forms of message reformulation (self-repair), e.g., rephrasing, adding, repeating or removing words, or introducing new topics.
Ashktorab et al. (2019) investigated user preferences for the repair strategies used by a banking chatbot in an experimental setting, finding that users preferred the bot to initiate repair by providing options of potential user intents. Users also favoured assisted self-repair (e.g., the bot explaining which keywords contributed to its lack of understanding) over other strategies. However, users’ strategy preferences depended on other factors such as their social orientation towards chatbots, their utilitarian orientation, their experience with chatbots and technology, and the repair outcome.
Følstad & Taylor’s (2020) study centred on the bot’s strategies for initiating repair and asked whether a chatbot expressing uncertainty in interpretation and suggesting likely alternatives would affect chatbot dialogues at message, process and outcome level. They found that initiating repair in this manner substantially reduced responses that were not relevant to a customer request, as well as fallback responses offering escalation or explicitly expressing misunderstanding. The number of relevant responses, however, remained stable across both conditions.
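In implementation terms, this style of bot-initiated repair is often realised by checking the confidence of the intent classifier and, below a threshold, presenting the top-ranked candidate intents rather than committing to one. The sketch below illustrates the idea in outline; the thresholds, intent labels and wording are hypothetical and are not those used in the studies cited above.

```python
# Sketch of bot-initiated repair: express uncertainty and suggest likely
# alternatives when intent confidence is low (all values illustrative).

def initiate_repair(ranked_intents, high=0.8, low=0.4):
    """ranked_intents: list of (intent_label, confidence), best first."""
    top_intent, top_conf = ranked_intents[0]
    if top_conf >= high:
        return f"OK, I'll handle that: {top_intent}."
    if top_conf >= low:
        options = ", ".join(label for label, _ in ranked_intents[:3])
        return f"I'm not sure I understood. Did you mean one of these: {options}?"
    return "Sorry, I didn't understand that. Could you say it another way?"

print(initiate_repair([("reschedule appointment", 0.55),
                       ("cancel appointment", 0.30),
                       ("ask a question", 0.10)]))
```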
Whilst this literature review shows that there is already a small body of studies on repair in computer-mediated communication generally and in human-bot interaction more specifically, users’ strategies for performing repair and working themselves out of bot misunderstanding have not yet been sufficiently explored, in particular from a primarily qualitative perspective. Besides Li et al.’s (2020) study on repair types and non-progress, the only other qualitative evaluation of user strategies for overcoming problems in interaction with bots focuses on voice bot interaction. Myers et al. (2018) identified ten different user tactics, the most frequently used being hyperarticulation (speaking louder, slower or more clearly), adding more information, using a new utterance to express the same intent, and simplification.
This study complements and builds on these insights by investigating user repair strategies in interaction with a text-based chatbot. In doing so, it will describe the ‘technicalities’ of user repair and use these insights to draw conclusions about users’ understanding of the AI context and skills development for AI more generally.
Objectives
The objective of this paper was to track how users of a task-oriented chatbot use conversational repair to navigate episodes in which the bot misunderstands or fails to understand their intents. As this paper was exploratory, the research question was broad in focus, with the analysis revealing further questions which could be explored with a larger dataset gathered ‘in the wild’ rather than in a simulated setting.
Data
Asa, the bot
The data for this paper are drawn from a research project conducted jointly with the start-up company spryt.com. SPRYT have developed an intelligent patient scheduling system which allows patients to schedule medical appointments through WhatsApp via text-to-text interactions. Patients interact with a digital receptionist – the chatbot called ‘Asa’ – to schedule, reschedule or cancel appointments, respond to a medical screening questionnaire, or ask questions. At the time of data collection, Asa was at the stage of a ‘minimum viable product’: it was functional but had not yet been tested with real patients and had not yet engaged in algorithmic learning from real patients’ interactions.
Dataset
The analysis is based on 36 interactions between individual users and the appointment scheduling bot. These interactions took place in a simulated setting as part of user research of the system pre-deployment. Ten of the interactions were created during the first phase of the project. In this phase, user experience interviews were conducted during which users interacted with the bot and were asked to talk in detail about their perceptions of the bot’s speech turns and of the system as a whole. The remaining 26 interactions were created in phase two of the project. In this phase, users interacted with Asa to complete a booking at a minimum. In addition, users were instructed to complete other tasks, such as rescheduling, cancelling, or asking a question. Subsequent to their interactions, users reported their opinions about Asa through a questionnaire. For the purpose of this analysis, only the interactions themselves will be considered.
Participant recruitment and demographics
Participants were recruited through the researchers’ social media channels as well as the university’s experimental platform. As a result, the majority of participants in the interview phase were undergraduate and postgraduate university students, in addition to two professionals who took part in the research due to a professional interest in chatbot development. In the questionnaire stage, the largest group of participants (45%) were between 18 and 24 years old. Just over 70% of participants described themselves as White and as native speakers of English.
Data analysis and results
Analytical approach
Data analysis was exploratory and only loosely theory-guided at the start of the project. Whilst the researcher was aware of the possible relevance of repair for chatbot interactions due to her own previous work (Dippold et al., 2020) and her reading of the literature, the analysis did not focus on repair at the outset. However, after an initial reading of the conversational data and exploratory annotations in a qualitative analysis software programme (NVivo), repair emerged as a possible focus of the analysis.
Stages of analysis
The analysis took place in four successive stages. These stages were not pre-determined at the outset; rather, each step was guided by the previous one and added an additional layer of evidence. Each of these steps will be discussed in detail below, with examples from the data then allowing a more detailed exploration of the results.
Step 1: This step focused on the identification of all instances of user self-repair. In the majority of cases (65), self-repair occurred either after a turn in which the bot explicitly indicated that there was a problem with the user’s turn, or after an irrelevant bot response to a user turn (other-initiated self-repair). In much fewer cases (7), the bot provided no response at all, leading to self-initiated self-repair.
Step 2: The purpose of the second step was to identify the trouble sources leading to other-initiated user self-repair. This resulted in the identification of four different types of trouble sources (Table 1):