Integrating Large Language Models into a Tri-Modal Architecture for Automated Depression Classification
Abstract
Major Depressive Disorder (MDD) is a pervasive mental health condition that affects approximately 300 million people worldwide. This work presents a novel BiLSTM-based, tri-modal, model-level fusion architecture for the binary classification of depression from clinical interview recordings. The proposed architecture incorporates Mel Frequency Cepstral Coefficients for audio, Facial Action Units for video, and a two-shot GPT-4 model for text. This is the first work to incorporate large language models into a multi-modal architecture for this task. It surpasses all baseline models and multiple state-of-the-art models on both the DAIC-WOZ AVEC 2016 Challenge cross-validation split and a Leave-One-Subject-Out cross-validation split. In Leave-One-Subject-Out testing, it achieves an accuracy of 91.01%, an F1-Score of 85.95%, a precision of 80%, and a recall of 92.86%.
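To make the model-level fusion design concrete, here is a minimal PyTorch sketch of one plausible reading of the architecture: one BiLSTM encoder per modality, whose pooled outputs are concatenated and passed to a binary classifier. All feature dimensions, layer sizes, and the pooling strategy are illustrative assumptions not given in the abstract, and the text branch stands in for GPT-4-derived embeddings.

```python
# Minimal sketch of a tri-modal, model-level fusion BiLSTM classifier.
# Dimensions and pooling are assumptions for illustration only.
import torch
import torch.nn as nn

class TriModalBiLSTM(nn.Module):
    def __init__(self, mfcc_dim=40, fau_dim=35, text_dim=768, hidden=128):
        super().__init__()
        # One BiLSTM encoder per modality: model-level fusion encodes each
        # modality separately, then combines the resulting representations.
        self.audio_enc = nn.LSTM(mfcc_dim, hidden, batch_first=True, bidirectional=True)
        self.video_enc = nn.LSTM(fau_dim, hidden, batch_first=True, bidirectional=True)
        self.text_enc = nn.LSTM(text_dim, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Sequential(
            nn.Linear(6 * hidden, hidden),  # 3 modalities x 2 directions x hidden
            nn.ReLU(),
            nn.Linear(hidden, 1),           # single logit: depressed / not depressed
        )

    @staticmethod
    def _pool(seq_out):
        # Mean-pool over time as a simple sequence summary (an assumption).
        return seq_out.mean(dim=1)

    def forward(self, mfcc, fau, text):
        a, _ = self.audio_enc(mfcc)   # (B, T_a, 2*hidden)
        v, _ = self.video_enc(fau)    # (B, T_v, 2*hidden)
        t, _ = self.text_enc(text)    # (B, T_t, 2*hidden)
        fused = torch.cat([self._pool(a), self._pool(v), self._pool(t)], dim=-1)
        return self.classifier(fused).squeeze(-1)

model = TriModalBiLSTM()
logit = model(torch.randn(2, 100, 40),   # MFCC frames
              torch.randn(2, 100, 35),   # Facial Action Unit frames
              torch.randn(2, 50, 768))   # per-utterance text embeddings
print(torch.sigmoid(logit))              # depression probability per sample
```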
Community
Let me know your thoughts and suggestions!
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Depression Detection and Analysis using Large Language Models on Textual and Audio-Visual Modalities (2024)
- A Multimodal Framework for the Assessment of the Schizophrenia Spectrum (2024)
- Self-Supervised Embeddings for Detecting Individual Symptoms of Depression (2024)
- Evaluating Large Language Models for Anxiety and Depression Classification using Counseling and Psychotherapy Transcripts (2024)
- We Care: Multimodal Depression Detection and Knowledge Infused Mental Health Therapeutic Response Generation (2024)
If you want recommendations for any paper on Hugging Face, check out this Space. You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: `@librarian-bot recommend`