Speech Emotion Recognition in Conversations Using Graph Convolutional Networks
dc.contributor.advisor | Jenkin, Michael R.
dc.contributor.author | Chandola, Deeksha
dc.date.accessioned | 2024-03-18T17:49:42Z
dc.date.available | 2024-03-18T17:49:42Z
dc.date.issued | 2024-03-16
dc.date.updated | 2024-03-16T10:56:25Z
dc.degree.discipline | Electrical and Computer Engineering
dc.degree.level | Master's
dc.degree.name | MASc - Master of Applied Science
dc.description.abstract | Speech emotion recognition (SER) is the task of automatically recognizing emotions expressed in spoken language. Current approaches focus on analyzing isolated speech segments to identify a speaker's emotional state. In contrast, text-based emotion recognition methods increasingly incorporate conversational context and are moving towards emotion recognition in conversation (ERC). With the availability of multimodal datasets, ERC can be extended to non-text modalities as well. Building on these advances, in this thesis we propose SERC-GCN, a method for speech emotion recognition in conversation (SERC) that predicts a speaker's emotional state by incorporating conversational context, specifically speaker interactions and temporal dependencies between utterances. SERC-GCN is a two-stage method. In the first stage, emotional features of utterance-level speech signals are extracted using a graph-based neural network: each speech utterance is transformed into a cyclic graph, and these graphs are processed by a two-layer GCN architecture followed by a pooling layer to extract utterance-specific emotional features. In the second stage, these features are used to form conversation graphs on which a graph convolutional network is trained to perform SERC. We empirically evaluate the effectiveness of SERC-GCN on two benchmark datasets: IEMOCAP and MELD. Results show that SERC-GCN outperforms existing baseline approaches on these datasets.
dc.identifier.uri | https://hdl.handle.net/10315/41852
dc.language | en
dc.rights | Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject | Computer science
dc.subject | Computer engineering
dc.subject | Psychology
dc.subject.keywords | Speech emotion recognition in conversation
dc.subject.keywords | Human-computer interaction
dc.subject.keywords | Graph convolutional network
dc.subject.keywords | Emotion recognition in conversation (ERC)
dc.subject.keywords | Multimodal analysis
dc.title | Speech Emotion Recognition in Conversations Using Graph Convolutional Networks
dc.type | Electronic Thesis or Dissertation |
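
The abstract above describes a two-stage pipeline: per-utterance cyclic graphs processed by a two-layer GCN with pooling, followed by a conversation-level GCN over the resulting utterance features. As a rough illustration only, the sketch below shows how such a pipeline could be wired up with PyTorch Geometric; the class names, layer sizes, cyclic-graph construction, and conversation-graph edges are assumptions made for this example and are not taken from the thesis.

```python
# Illustrative sketch of the two-stage SERC-GCN idea described in the abstract.
# Not the thesis's actual implementation; all dimensions and graph-construction
# details below are assumptions.
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv, global_mean_pool


def cyclic_graph_edges(num_nodes: int) -> torch.Tensor:
    """Edge index for a simple cycle over an utterance's frames (assumed construction)."""
    src = torch.arange(num_nodes)
    dst = (src + 1) % num_nodes
    # Bidirectional edges so messages pass both ways around the cycle.
    return torch.stack([torch.cat([src, dst]), torch.cat([dst, src])])


class UtteranceEncoder(nn.Module):
    """Stage 1: two-layer GCN over a per-utterance cyclic graph, then pooling."""

    def __init__(self, in_dim: int, hidden_dim: int, out_dim: int):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, out_dim)

    def forward(self, x, edge_index, batch):
        h = torch.relu(self.conv1(x, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        # One pooled emotional feature vector per utterance graph.
        return global_mean_pool(h, batch)


class ConversationClassifier(nn.Module):
    """Stage 2: GCN over a conversation graph whose nodes are utterance features."""

    def __init__(self, feat_dim: int, hidden_dim: int, num_emotions: int):
        super().__init__()
        self.conv1 = GCNConv(feat_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, num_emotions)

    def forward(self, utter_feats, conv_edge_index):
        h = torch.relu(self.conv1(utter_feats, conv_edge_index))
        return self.out(h)  # per-utterance emotion logits
```

In this sketch, the conversation-graph edges (conv_edge_index) would encode speaker interactions and temporal dependencies between utterances; how those edges are actually defined in the thesis is not specified here.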
Files
Original bundle
- Name: Chandola_Deeksha_2024_Masters.pdf
- Size: 6.07 MB
- Format: Adobe Portable Document Format