Classification in textual conversations: A study of emotion prediction and derailment forecasting

Authors

AlTarawneh, Enas Khaled Ahm

Abstract

Emotion is fundamental to human communication, shaping not just the content but the very essence of our interactions with others. In the realm of Natural Language Processing (NLP), particularly for applications that bridge human-machine communication such as healthcare, education, and social networks, understanding and emulating emotional nuances becomes paramount. While it may be straightforward for humans to perceive and reason about the feelings of others in conversations, it is a challenge for machines, mainly because it requires modeling conversational context. Conversation models in the literature that incorporate context vary in the type of contextual information they use (e.g., temporal structure, speaker identification, commonsense knowledge). However, studies to date have not explicitly quantified the impact of the type(s) of information incorporated within the critical conversation classification tasks of future emotion prediction and emotional derailment forecasting, nor the impact of the model architectures and encoding schemes used for these tasks. These issues are addressed in this work.

This thesis approaches this problem by developing AI models that capture different design choices for these tasks. Critically, the models developed here are designed to capture three properties inherently connected to the emotion prediction problem in dialogues: sequence modeling, self-dependency modeling, and recency. These modeling dimensions are then incorporated into one of two deep neural network architectures, a sequence model and a graph convolutional network model. The former is designed to capture the sequence of utterances in a dialogue, while the latter captures both the sequence of utterances and the structure of multi-party dialogues. Through an empirical evaluation of these model architectures, data types, and data encoding choices, this work demonstrates (i) the importance of the self-dependency and recency modeling dimensions for the prediction tasks, (ii) the effectiveness of graph neural models in improving the predictions obtained by sequence-only models, (iii) the impact of fusing multi-source information about each utterance into utterance capsules, specifically emotion labels and commonsense knowledge, and (iv) that using a transformer-based forecaster for the conversation prediction task also improves performance. Optimal design choices within these structures provide near best-in-class performance for next-emotion prediction in conversations and best-in-class performance for conversation derailment prediction. This thesis also shows that simple fine-tuning of large language models is not an effective classification method for these tasks.
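
As a rough illustration of the graph-based design sketched above, the following is a minimal example (in PyTorch) of a dialogue graph convolution over utterance vectors fused with an emotion-label embedding. All names, dimensions, and the recency-window adjacency are assumptions for illustration, not the thesis's actual implementation.

# Minimal sketch (assumption, not the thesis's implementation): utterances are
# encoded as fixed-size vectors, fused with an emotion-label embedding, and
# passed through one graph-convolution step over the dialogue graph before a
# classifier predicts the emotion of the next utterance.
import torch
import torch.nn as nn


class DialogueGraphClassifier(nn.Module):
    def __init__(self, utt_dim=768, num_emotions=7, hidden_dim=256):
        super().__init__()
        # Fuse each utterance vector with an embedding of its observed emotion label.
        self.emotion_emb = nn.Embedding(num_emotions, 64)
        self.fuse = nn.Linear(utt_dim + 64, hidden_dim)
        # A single graph-convolution weight; neighbourhoods come from the adjacency matrix.
        self.gcn = nn.Linear(hidden_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_emotions)

    def forward(self, utt_vecs, emotion_ids, adj):
        # utt_vecs: (T, utt_dim) utterance encodings for one dialogue
        # emotion_ids: (T,) observed emotion label per past utterance
        # adj: (T, T) adjacency over utterances (e.g., links by recency or same speaker)
        x = torch.cat([utt_vecs, self.emotion_emb(emotion_ids)], dim=-1)
        x = torch.relu(self.fuse(x))
        # Row-normalise the adjacency so each node averages over its neighbours.
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        x = torch.relu(self.gcn((adj / deg) @ x))
        # Predict the emotion of the next (unseen) utterance from the most recent node.
        return self.classifier(x[-1])


# Toy usage: a 5-utterance dialogue with a recency-based graph.
T = 5
utt_vecs = torch.randn(T, 768)
emotion_ids = torch.randint(0, 7, (T,))
adj = torch.eye(T)
for i in range(1, T):
    adj[i, i - 1] = adj[i - 1, i] = 1.0  # link adjacent utterances (recency)
logits = DialogueGraphClassifier()(utt_vecs, emotion_ids, adj)
print(logits.shape)  # torch.Size([7])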

Evaluations are performed using standard conversational datasets and current state-of-the-art network models. Results from this work will help inform the structure of future datasets and the development of advanced sentiment analysis systems.

Keywords

Computer science, Artificial intelligence
