Introduction
Chatbots are becoming ever more pervasive in everyday life. Grand View
Research (2021) estimates the global chatbot market at USD 430.0 million
in 2020, with an annual growth rate of 24.9% until 2028. Using
machine learning algorithms to chat with humans and learn from these
interactions, chatbots can be found in a wide range of settings,
including healthcare, retail, and travel and tourism.
Human-chatbot interaction thus represents a form of social interaction
carried out online, also described as computer-mediated discourse
(Herring, 2004). Like web chat, chatbots are a ‘lean’ medium of
conversation in that these interactions are deprived of the visual and
auditory cues which are part of face-to-face interaction (Daft & Lengel,
1986). However, the ‘lean’ properties of chatbot-mediated conversation
go beyond those of ordinary web chat in that the contributions of one
conversational partner are processed, interpreted and responded to by a
computer rather than another human. These interpretations and responses
depend on the bot’s ability to understand natural language input,
to generate adequate responses and to initiate repair when there is
evidence that understanding is lacking. However, these three steps pose a
range of challenges. Understanding is challenging because “people are
inconsistent, and their lives are disorderly. Their situations,
circumstances and aspirations change. They get tired and hungry, glad or
sad […] And sometimes they have no clue what they really
want” (Hantula et al., 2021). A bot has to respond to these
changing circumstances and understand the user’s intent, despite the
potential variation in how this intent is expressed. Secondly, the
bot has to generate language that responds correctly to users’ intents,
which in most dialogue systems is done through pre-compiled sentences or
templates (Di Lascio et al., 2020). And finally, as Collins (2018)
points out, ‘repair’ is a fundamental feature by which humans deal with
communication that is less than perfect and is thus central to
interactions with AI-driven systems, too.
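To make these three steps more concrete, the sketch below is a purely hypothetical illustration, not a description of any system discussed in this paper: it shows how a minimal rule-based chatbot might combine keyword-based intent interpretation, pre-compiled response templates and a simple repair initiation when no intent is recognised. All intent names, keywords and wording are invented for illustration.

```python
from typing import Optional

# Hypothetical, minimal sketch of the three steps described above:
# (1) interpret user input, (2) generate a templated response,
# (3) initiate repair when understanding fails.

INTENT_KEYWORDS = {
    "book_appointment": ["book", "schedule"],
    "cancel_appointment": ["cancel"],
}

RESPONSE_TEMPLATES = {
    "book_appointment": "Sure, I can book that for you. Which day suits you?",
    "cancel_appointment": "OK, I will cancel your appointment.",
}

# A simple repair initiation, used when no intent is recognised.
REPAIR_PROMPT = ("Sorry, I didn't quite understand. "
                 "Would you like to book or cancel an appointment?")


def interpret(user_message: str) -> Optional[str]:
    """Step 1: map free-form user input to an intent via keyword matching."""
    text = user_message.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(word in text for word in keywords):
            return intent
    return None  # no intent recognised: evidence that understanding is lacking


def respond(user_message: str) -> str:
    """Steps 2 and 3: reply from a pre-compiled template, or initiate repair."""
    intent = interpret(user_message)
    if intent is None:
        return REPAIR_PROMPT
    return RESPONSE_TEMPLATES[intent]


if __name__ == "__main__":
    print(respond("I'd like to book an appointment"))  # templated response
    print(respond("erm, I'm not sure what I need"))    # repair initiation
```

In a sketch like this, any user wording that falls outside the keyword lists leaves the burden of repair with the user, which is precisely the situation investigated in this paper.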
However, when interacting with bots, humans cannot necessarily rely on
the same models of communication as in face-to-face social
interaction. As research by Luger & Sellen (2016) on users’
expectations and experience of conversational agents has found,
technically more skilled participants were better able to develop
new mental models of interaction than less skilled participants, whose
expectations did not change and who were more likely to get frustrated
expectations did not change and who were more likely to get frustrated
by their interactions. This raises the question as to whether users
transfer strategies from face-to-face modes of interaction into their
interactions with AI and whether traditional models of communicative
competence (e.g. Canale & Swain, 1982) need to be adapted to different
forms of conversational AI.
With these insights in the background, this paper takes a user-centric
perspective to investigate repair, described here as users’ efforts to
pre-empt or address intent interpretation issues in
task-oriented chatbot interactions. These insights have important
implications for how communicative competence for conversational AI
should be described. These issues will be discussed in the conclusion to
this paper.
Literature review
The term ‘repair’ is originally derived from conversation analysis.
Repair was first described by Schegloff et al. (1977) as a
“self-righting mechanism for the organization of language use in social
interaction” (p. 381), whereas Seedhouse (2005) defines repair as “the
treatment of trouble occurring in interactive language use” (p. 168).
In ordinary conversation, repair can be used by speakers to address
problems in hearing, speaking and understanding. In conversation with a
text-based chatbot, repair is often carried out to address issues with
interpreting user input. When the bot struggles with doing so, users
need to use repair in order to achieve the task they have set out to do.
In their typology of repair, Schegloff et al. (1977) distinguish between
self-initiation of repair (repair initiated by the speaker who is the
cause of the trouble source) and other-initiation of repair (repair
initiated by another speaker). They also distinguish self-repair
(repair completed by the speaker who is the cause of the trouble source)
and other-repair (repair completed by another speaker), emphasizing
also that some repairs do not have a successful outcome at all. Speakers
use a variety of means to other-initiate repair, such as signals of
misunderstanding (e.g. Huh?, What?) and question words (who?, when?), which
may be combined with a partial repeat of the trouble source.
Albert & de Ruiter (2018) argue that the notion of repair as introduced
by conversation analysts such as Schegloff et al. (1977) constitutes a
“minimal notion of shared understanding as progressivity” (p. 281)
which consciously does not focus on context. However, they also argue
that observing repair provides rich insights into the sources of the
misunderstanding, which may include “contextual problems of propriety
or transgression” (p. 303). This paper will focus on both: whilst it
will identify repair sequences in chatbot dialogue through the lack of
progressivity, it will also attempt to provide insights into users’
understanding of the sociolinguistic environment of these interactions
with the bot, including their ideologies and perceptions of language.
Repair has been subject to a wide range of investigations in specific
contexts of interpersonal communication, such as the classroom (Dippold,
2014; Montiegel, 2021) and in workplace interaction (Oloff, 2018;
Tsuchiya & Handford, 2014). In computer-mediated environments, repair
has so far primarily been investigated in the context of webchat,
focusing for example on the repair morpheme *- in gaming chat
(Collister, 2011), German web chat (Schönfeldt & Golato, 2003), library
web chat (Koshik & Okazawa, 2012) and Facebook chat (Meredith &
Stokoe, 2014). This research found that repair in online chat
is organised differently from ordinary conversation due to differences in
the sequential flow of messages. Moreover, users do not have access to
the same set of resources to accomplish social interactions as in spoken
conversation (e.g., prosody). They do, however, compensate with other ways
of creating meaning (such as *- as a repair morpheme), and general
principles of repair from ordinary conversation (e.g. the preference for
self-repair) still apply.
Repair has also been the subject of research in interactions between
humans and embodied robots as well as chatbots. For example, Beneteau et
al. (2019) investigated communicative breakdowns between Alexa and
family users. They showed that the onus of providing ‘repair’ when
communication broke down lay with users. Users deployed a range of
strategies to perform repair, e.g. prosodic changes, over-articulation,
semantic adjustments or modifications, increased volume, syntactic
adjustments and repetition.
Research on human interaction with text-based chatbots confirms that the
burden of repair lies primarily with the user. Analysing transcripts of
interactions between users and a task-oriented chatbot, Li et al. (2020)
investigated the relationship between different types of non-progression
and user repair types. They found that bot users were most likely to
abandon the conversation after three instances of non-progress, which
were caused by misrecognition of user intents on the one hand and
non-recognition on the other. Users drew on a range of strategies for
dealing with non-progress, including abandoning the bot service,
temporarily quitting the conversation, switching the subject, and
various forms of reformulating messages (self-repair), e.g. rephrasing,
adding, repeating or removing words, using the same words, or
introducing new topics.
Ashktorab et al. (2019) investigated user preferences for the repair
strategies used by a banking chatbot in an experimental setting, finding
that users preferred the bot to initiate repair by providing options of
potential user intents. Users also favoured assisted self-repair (e.g.,
explaining which keywords contributed to the bot’s lack of understanding)
over other strategies. However, users’ strategy preferences depended on
other factors such as their social orientation towards chatbots, their
utilitarian orientation, their experience with chatbots and technology
and the repair outcome.
Følstad & Taylor’s (2020) study centred on the bot’s strategies for
initiating repair and asked whether a chatbot expressing uncertainty in
interpretation and suggesting likely alternatives would affect chatbot
dialogues at message, process and outcome level. They found that
initiating repair in this manner substantially reduced false positive
answers (responses that are not relevant to a customer request) and
fallback responses (those offering escalation or explicitly expressing
misunderstanding), whereas the number of relevant responses remained
stable across both conditions.
Whilst this literature review shows that there is already a small body
of studies on repair in computer-mediated communication generally and in
human-bot interaction more specifically, users’ strategies for dealing
with repair and working themselves out of bot misunderstanding have not
yet been sufficiently explored, in particular from a primarily
qualitative perspective. Besides Li et al.’s (2020) study on repair
types and non-progress, the only other qualitative evaluation of user
strategies for overcoming problems in interaction with bots focuses on
voice bot interaction. Myers et al. (2018) identified ten different user
tactics, the most frequently used being hyperarticulation (speaking
louder, slower or more clearly), adding more information, using a new
utterance to express the same intent, and simplification.
This study complements and builds on these insights by investigating user
repair strategies in interaction with a text-based chatbot. In doing so,
it will not only describe the ‘technicalities’ of repair, but also draw
conclusions about users’ understanding of the AI-mediated environment. I
will use this to then discuss implications for skills development.
Objectives
The objective of this paper was to track how users of a task-oriented
chatbot use conversational repair to navigate episodes in which the bot
misunderstands or fails to understand their intents. As
this paper was exploratory, no more detailed research questions were
asked. However, the analysis has revealed possible further questions
which could be explored with a larger dataset gathered ‘in the wild’
rather than a simulated setting.
Data
Asa, the bot
The data for this paper are drawn from a research project conducted
jointly with the start-up company spryt.com. SPRYT have developed an
intelligent patient scheduling system which allows patients to schedule
medical appointments through WhatsApp via text-to-text interactions.
Patients interact with a digital receptionist – the chatbot – called
‘Asa’ to schedule, reschedule or cancel appointments, respond to a
medical screening questionnaire or ask questions. At the time of data
collection, Asa was a ‘minimum viable product’: it was functional but
had not yet been tested with real patients and had not yet engaged in
algorithmic learning from real patients’ interactions.
Dataset
The analysis is based on 36 interactions between users and the
appointment scheduling bot. These interactions took place in a simulated
setting as part of user research of the system pre-deployment. Ten of
the interactions were created during the first phase of the project. In
this phase, user experience interviews were conducted during which users
interacted with the bot and were asked to talk in detail about their
perceptions of the bot’s speech turns and of the system as a whole.
Twenty-six interactions were created in phase two of the project. In this phase,
users interacted with Asa to complete a booking at a minimum. In
addition, users were also instructed to complete other tasks, such as
rescheduling, cancelling, or asking a question. Subsequent to their
interactions, users reported their opinions about Asa through a
questionnaire. For the purpose of this analysis, only the interactions
themselves will be considered.
Participant recruitment and demographics
Participants were recruited through the researchers’ social media
channels as well as the university’s experimental platform. As a result,
the majority of participants in the interview phase were undergraduate
and postgraduate university students, in addition to two professionals
who took part in the research due to a professional interest in chatbot
development. In the questionnaire stage, the largest group of
participants (45%) were between 18 and 24 years old. There was also a
lack of diversity with respect to
other demographic factors, such as users’ language status and ethnicity.
Data analysis and results
Analytical approach
Data analysis was exploratory and only loosely theory-guided at the
start of the project. Whilst the researcher was aware of the possible
relevance of repair for chatbot interactions due to her own previous
work (Dippold et al., 2020) and her reading of the literature, the
analysis did not focus on repair at the outset. However, after an
initial reading of the conversational data and exploratory annotations
in a qualitative analysis software programme (NVivo), repair emerged as
a possible focus in the analysis.
Stages of analysis
The analysis took place in five successive stages. These stages were not
pre-determined at the outset; rather, each step was guided by the
previous one and added an additional layer of evidence. Each of these steps
will be discussed in detail below, with examples from the data then
allowing a more detailed exploration of the results.
Step 1: This step focused on the identification of all conversation
sequences in which there was a lack of progression. A sequence was
considered to have ended when the bot gave a relevant response. This
resulted in the identification of 75 repair sequences in total.
Step 2: In this step, all sequences were further coded into those in
which the trouble source was a user turn and those in which it was a bot
turn. They were then further annotated using Schegloff et al.’s (1977)
system of describing repair as self-initiated or other-initiated
self-repair, or self-initiated or other-initiated other-repair (see
Table 1):