This project aims to identify human emotions from camera footage. The work was done during a deep learning internship at ZummitAfrica, using the FERPlus dataset. Transfer learning with several neural network architectures was used for classification: VGG16, EfficientNetB0, ResNet50, and MobileNetV2. Results are shown in the table below.
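As a rough illustration of the transfer learning setup, here is a minimal Keras sketch with a MobileNetV2 backbone. It assumes FERPlus face crops resized to 224x224 RGB and the 8 FERPlus emotion classes; the dataset objects `train_ds` and `val_ds` are hypothetical placeholders, and the actual training pipeline may differ:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 8  # FERPlus: neutral, happiness, surprise, sadness, anger, disgust, fear, contempt

# Load an ImageNet-pretrained backbone without its classification head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False,
    weights="imagenet",
)
base.trainable = False  # freeze the convolutional backbone for transfer learning

# Attach a small classification head for the emotion classes.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-3),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

# Hypothetical usage with prepared tf.data pipelines:
# model.fit(train_ds, validation_data=val_ds, epochs=20)
```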
At the end, the system saves a time chart showing the emotions detected throughout the video, so that management can easily locate points of interest such as when the customer was annoyed or happy.
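A minimal sketch of how such a timeline could be produced with OpenCV and Matplotlib. Here `predict_fn` is a hypothetical stand-in for the trained classifier (it should return an emotion index per frame), and the sampling rate is an assumption:

```python
import cv2
import matplotlib.pyplot as plt

EMOTIONS = ["neutral", "happiness", "surprise", "sadness",
            "anger", "disgust", "fear", "contempt"]

def plot_emotion_timeline(video_path, predict_fn, sample_every=30):
    """Sample frames, classify each one, and save a step chart of emotions.

    predict_fn(frame) -> int is a placeholder for the trained model;
    it should return an index into EMOTIONS.
    """
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS is unreadable
    times, labels = [], []
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_every == 0:  # classify roughly one frame per second at 30 fps
            times.append(idx / fps)
            labels.append(predict_fn(frame))
        idx += 1
    cap.release()

    plt.step(times, labels, where="post")
    plt.yticks(range(len(EMOTIONS)), EMOTIONS)
    plt.xlabel("time (s)")
    plt.title("Emotions detected over the video")
    plt.tight_layout()
    plt.savefig("emotion_timeline.png")
```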
Other possible use cases of emotion detection include:
- locating interesting (funny or shocking) moments in a video that many people are watching, such as in a cinema.
Other results can be found here.
Architecture | Test Accuracy | Validation Accuracy | Epochs | Model |
---|---|---|---|---|
VGG16 | 80% | 82% | 20 | Emotion-Vgg16.tflite |
EfficientNetB0 | xx% | 76% | 50 | Emotion-EfficientNetB0.tflite |
ResNet50 | 76% | 77% | 20 | Emotion-Resnet50.tflite |
MobileNetV2 | 74% | 76% | 50 | Emotion-MobileNetV2.tflite |
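The exported `.tflite` files listed above can be run with the TensorFlow Lite interpreter. A hedged sketch follows; the random array is only a shape-matching placeholder, and in practice the input must be a face crop preprocessed the same way as during training:

```python
import numpy as np
import tensorflow as tf

# Load one of the exported models (file name from the table above).
interpreter = tf.lite.Interpreter(model_path="Emotion-MobileNetV2.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Placeholder input with the model's expected shape and dtype;
# replace with a preprocessed face crop in practice.
frame = np.random.rand(*inp["shape"]).astype(inp["dtype"])
interpreter.set_tensor(inp["index"], frame)
interpreter.invoke()

probs = interpreter.get_tensor(out["index"])[0]
print("predicted emotion index:", int(np.argmax(probs)))
```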
Human expressions of emotion are usually short-lived: someone may smile or look shocked for only a few seconds, but capturing these moments can be very helpful for decision making. However, building a more robust dataset to improve the model will be quite difficult. Hence, further study will involve combining several open-source datasets, such as EMOTIC and AffectNet, to improve the model.