arxiv:2402.00250

LRDif: Diffusion Models for Under-Display Camera Emotion Recognition

Published on Feb 1, 2024

Authors:

Abstract

This study introduces LRDif, a novel diffusion-based framework designed specifically for facial expression recognition (FER) within the context of under-display cameras (UDC). To address the inherent challenges posed by UDC image degradation, such as reduced sharpness and increased noise, LRDif employs a two-stage training strategy that integrates a condensed preliminary extraction network (FPEN) and an agile transformer network (UDCformer) to identify emotion labels from UDC images. By harnessing the robust distribution mapping capabilities of Diffusion Models (DMs) and the spatial dependency modeling strength of transformers, LRDif overcomes the obstacles of noise and distortion inherent in UDC environments. In comprehensive experiments on standard FER datasets, including RAF-DB, KDEF, and FERPlus, LRDif demonstrates state-of-the-art performance, underscoring its potential in advancing FER applications. This work not only addresses a significant gap in the literature by tackling the UDC challenge in FER but also sets a new benchmark for future research in the field.
