Automated Facial Affect Recognition (AFAR)
Automated measurement of face and head dynamics, detection of facial action units and expression, and affect detection are crucial to multiple domains (e.g., health, education, entertainment). Commercial tools are available but costly and of unknown validity; open-source ones lack a user-friendly GUI for non-programmers. For both, evidence of domain transfer and options for retraining in new domains are typically lacking. Deep approaches have two key advantages: they typically outperform shallow ones for facial affect recognition, and their pre-trained models can be fine-tuned on new datasets to optimize performance. AFAR is an open-source, deep-learning based, user-friendly tool for automated facial affect recognition. It consists of a pipeline with four components: (i) face tracking, (ii) face registration, (iii) action unit (AU) detection, and (iv) visualization. In addition, a fine-tuning component allows interested users to fine-tune the pretrained AU detection models on their own datasets. AFAR has been used in comparative studies of action unit detectors [1], [2], to investigate cross-domain generalizability [3], to assess treatment response to deep brain stimulation (DBS) for treatment-resistant obsessive compulsive disorder [4], and to explore facial dynamics in young children [5] and in adults in treatment for depression [6], among other research.
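To make the pipeline structure concrete, here is a purely illustrative sketch of how four such stages could be chained over a video. It is not the AFAR API: OpenCV's Haar cascade stands in for the face tracker, a crop-and-resize for registration, and a dummy scorer for the deep AU detector.

```python
# Illustrative four-stage pipeline: tracking -> registration -> AU detection -> visualization.
# NOT the AFAR interface; all stages are simple stand-ins for the real models.
import cv2
import numpy as np

face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def track_face(frame):
    """Stage (i): return the largest detected face box, or None."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    boxes = face_detector.detectMultiScale(gray, 1.1, 5)
    return max(boxes, key=lambda b: b[2] * b[3]) if len(boxes) else None

def register_face(frame, box, size=200):
    """Stage (ii): crop and rescale the face to a canonical size."""
    x, y, w, h = box
    return cv2.resize(frame[y:y + h, x:x + w], (size, size))

def detect_aus(face):
    """Stage (iii): placeholder AU occurrence scores; a trained CNN would go here."""
    return np.zeros(12)  # e.g. 12 AUs, all scored 0 in this stub

def visualize(frame, box, aus):
    """Stage (iv): overlay the face box and a simple AU summary on the frame."""
    x, y, w, h = box
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(frame, f"AUs active: {int((aus > 0.5).sum())}", (x, y - 8),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    return frame
```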
The problem I aimed to solve was: can I make a script to identify AUs and emotions? I used the dlib library in Python to track facial features in the CK+ dataset. Following Tian's paper, the script calculated key distances such as the distance between the eyebrows, the distance between the lips, and the distance between the corners of the lips and the eyes. With OpenCV, I used Canny edge detection at the corners of the eyes and between the brows to find the deepening of lines and furrows. Lines and furrows are transient features: adults have more wrinkles than children, and on their own they are not a reliable indicator of AUs or emotions. However, for this dataset, adding the transient features increased accuracy.
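A minimal sketch of that feature extraction, assuming dlib's standard 68-point predictor file (shape_predictor_68_face_landmarks.dat) is available locally; the landmark indices, patch size, and Canny thresholds are my assumptions, not values from the original script.

```python
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmark_points(gray):
    """Return the 68 facial landmarks of the first detected face as a (68, 2) array."""
    face = detector(gray, 1)[0]
    shape = predictor(gray, face)
    return np.array([(p.x, p.y) for p in shape.parts()])

def geometric_features(pts):
    """Distances inspired by Tian et al.: brows, lips, and lip-corner-to-eye."""
    inner_brow_gap = np.linalg.norm(pts[21] - pts[22])    # between the eyebrows
    lip_gap = np.linalg.norm(pts[62] - pts[66])           # between the lips (mouth opening)
    lip_to_eye_left = np.linalg.norm(pts[48] - pts[36])   # left lip corner to left eye corner
    lip_to_eye_right = np.linalg.norm(pts[54] - pts[45])  # right lip corner to right eye corner
    return np.array([inner_brow_gap, lip_gap, lip_to_eye_left, lip_to_eye_right])

def furrow_density(gray, pts, half=15):
    """Transient feature: Canny edge density in the region between the brows."""
    cx, cy = np.mean(pts[[21, 22]], axis=0).astype(int)
    patch = gray[cy - half:cy + half, cx - half:cx + half]
    edges = cv2.Canny(patch, 50, 150)
    return edges.mean() / 255.0  # fraction of edge pixels

gray = cv2.imread("ck_plus_frame.png", cv2.IMREAD_GRAYSCALE)  # illustrative CK+ frame path
pts = landmark_points(gray)
features = np.concatenate([geometric_features(pts), [furrow_density(gray, pts)]])
```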
The neural network was written in TensorFlow and summarized in TensorBoard. I was amused by how long it took me to concatenate a matrix, but I was pleased by how well suited the module is for machine learning.
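For context, here is a minimal TensorFlow/Keras sketch of a small classifier over such feature vectors, with TensorBoard logging; the layer sizes, five-feature input, and seven CK+ emotion classes are assumptions rather than the original network.

```python
import tensorflow as tf

# Small feed-forward classifier: 5 input features -> 7 emotion classes (CK+ labels).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(5,)),
    tf.keras.layers.Dense(7, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# TensorBoard callback writes training scalars and the graph to ./logs,
# viewable with `tensorboard --logdir logs`.
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs")
# model.fit(train_features, train_labels, epochs=50,
#           validation_split=0.2, callbacks=[tensorboard_cb])
```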
Traditional pipeline:
RGBD (color image, depth map)
  -(ASM feature point extraction)-> feature points (internal, contour)
  -(morphable model, fit with depth map?)-> initial mesh
  -(Laplacian-based mesh deformation refinement)-> refined mesh
  -(deformation transfer algorithm)-> other expression meshes (refined)
  -(example-based facial rigging algorithm)-> linear blendshape model
  -(PCA)-> bilinear (identities and expressions) face model
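As a rough illustration of only the final PCA step, the sketch below compresses a stack of registered meshes (one per identity per expression) into a bilinear identity x expression model via mode-wise SVD; the tensor shapes, component counts, and random data are made-up placeholders.

```python
import numpy as np

n_id, n_exp, n_vert = 50, 20, 5000                 # identities, expressions, mesh vertices (assumed)
meshes = np.random.rand(n_id, n_exp, 3 * n_vert)   # stand-in for registered blendshape meshes

# Mode-1 (identity) PCA: SVD of the tensor unfolded along identities.
U_id, _, _ = np.linalg.svd(meshes.reshape(n_id, -1), full_matrices=False)
# Mode-2 (expression) PCA: SVD of the tensor unfolded along expressions.
U_exp, _, _ = np.linalg.svd(meshes.transpose(1, 0, 2).reshape(n_exp, -1),
                            full_matrices=False)

k_id, k_exp = 10, 5                                # retained components (assumed)
U_id, U_exp = U_id[:, :k_id], U_exp[:, :k_exp]

# Core tensor: project the mesh tensor onto both bases (Tucker-style decomposition).
core = np.einsum("iev,ia,eb->abv", meshes, U_id, U_exp)

# A new face is reconstructed from identity weights w_id and expression weights w_exp.
w_id, w_exp = np.zeros(k_id), np.zeros(k_exp)
w_id[0], w_exp[0] = 1.0, 1.0
face = np.einsum("abv,a,b->v", core, w_id, w_exp)
```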