Facial action unit detection with convolutional neural networks
- Authors:
- Romero Vergara, Andrés Felipe
- Resource type:
- Publication date:
- 2017
- Institution:
- Universidad de los Andes
- Repository:
- Séneca: repositorio Uniandes
- Language:
- eng
- OAI Identifier:
- oai:repositorio.uniandes.edu.co:1992/13826
- Online access:
- http://hdl.handle.net/1992/13826
- Keywords:
- Neural networks (Computers) - Research
Image processing - Research
Pattern recognition systems - Research
Facial expression - Image processing - Research
Engineering
- Rights
- openAccess
- License
- http://creativecommons.org/licenses/by-nc-nd/4.0/
Summary: We propose a novel deep convolutional neural network architecture to study the problem of action unit detection. We leverage recent gains in large-scale object recognition by formulating the task of predicting the presence of a specific action unit in a still image as simple image-level binary classification. We first train a convolutional encoder on the problem of multi-view emotion recognition as a high-level representation of facial expressions. We show that our architecture generalizes across views, ethnicity, gender and age by merging and training jointly on three standard emotion recognition datasets: CK+, Bosphorus and RafD. Our system is the first fully multi-view emotion recognizer proposed in the literature. We then extend this shared learned representation with fully-connected layers trained to detect individual action units. Our approach is conceptually simpler and yet significantly more accurate than the best methods based on the dominant paradigm for the study of this problem, which relies on facial landmark detection as an intermediate task. We conduct experiments on the BP4D dataset, the largest and most challenging benchmark currently available for action unit detection, and report an absolute improvement of 16% over the previous state-of-the-art.
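Illustrative note: the summary frames action unit detection as one image-level binary decision per action unit, made on top of a representation shared by all units. The following is a minimal sketch of that idea only, not the thesis's implementation. It assumes the convolutional encoder has already produced a feature vector (replaced here by random numbers), and it stands in a single logistic unit for each stack of fully-connected layers. The names `ActionUnitHead`, `detect_action_units`, `N_FEATURES` and `AU_IDS` are hypothetical, as are the feature size and the choice of action units.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class ActionUnitHead:
    """Hypothetical binary classifier for one action unit.

    A single logistic unit is used as a simplified stand-in for the
    fully-connected layers mentioned in the summary.
    """

    def __init__(self, n_features, rng):
        # Small random weights; a real system would learn these.
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
        self.bias = 0.0

    def probability(self, features):
        """Estimated probability that the action unit is present."""
        z = sum(w * f for w, f in zip(self.weights, features)) + self.bias
        return sigmoid(z)


def detect_action_units(features, heads, threshold=0.5):
    """Run every head on the same shared features; one yes/no per unit."""
    return {au: head.probability(features) >= threshold
            for au, head in heads.items()}


rng = random.Random(0)
N_FEATURES = 8              # assumed size of the shared representation
AU_IDS = [1, 2, 4, 6, 12]   # illustrative subset of action unit numbers

# One independent head per action unit, all reading the same features.
heads = {au: ActionUnitHead(N_FEATURES, rng) for au in AU_IDS}

# Stand-in for the encoder output computed from a still image.
features = [rng.random() for _ in range(N_FEATURES)]

predictions = detect_action_units(features, heads)
for au, present in predictions.items():
    print(f"AU{au}: {'present' if present else 'absent'}")
```

Because every head reads the same feature vector, adding another action unit under this sketch means adding one more head, while the shared representation is left unchanged.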