Vocalisations of Disgust

In the Vocalisations of Disgust project, we classify audio and video segments into either 'moral' or 'pathogen' disgust types using computer vision. We use VideoMAE and XCLIP as feature extractors, and random forest and XGBoost as classifiers that take these features as input.

Data setup

We expect our data to be in a single metadata.csv file and a single folder containing all the video files. The metadata file should look like:

VideoID,Disgust category
v_ca_001_01,pathogen disgust
v_ca_001_02,pathogen disgust
v_ca_013_05,moral disgust
v_ca_013_06,moral disgust
v_ca_013_16,pathogen disgust

The folder containing the videos should hold files with a `.mp4` extension or another common video format.
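
Before running the steps below, it can help to check that every row in the metadata file has a matching video on disk. A minimal sketch of such a check (this pandas-based snippet is our illustration, not part of the repository):

from pathlib import Path

import pandas as pd

metadata = pd.read_csv("metadata.csv")
video_dir = Path("./Videos")

# Collect the stems of all video files, regardless of extension.
available = {p.stem for p in video_dir.iterdir() if p.is_file()}

# Report metadata rows that have no matching video file.
missing = metadata[~metadata["VideoID"].isin(available)]
print(f"{len(missing)} of {len(metadata)} videos missing:")
print(missing)

# Show the class balance between moral and pathogen disgust.
print(metadata["Disgust category"].value_counts())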

Installation

In the root of the folder, run:

pip install .

Inspect video durations (optional)

We can create a plot of the video durations by running:

python disgust/video_statistics.py ./Videos

A couple of plots will be saved in the videos folder (example output: a `durations_10_20` plot).
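
If you prefer to inspect a single file directly, its duration can be read with OpenCV. A minimal sketch (not the script's actual implementation):

import cv2

def video_duration(path: str) -> float:
    """Return the duration of a video in seconds."""
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    frames = cap.get(cv2.CAP_PROP_FRAME_COUNT)
    cap.release()
    return frames / fps if fps else 0.0

print(video_duration("./Videos/v_ca_001_01.mp4"))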

Defacing (optional)

To prevent our model from learning facial expressions and force it to learn from the context of the video instead, we deface all videos:

python disgust/deface_all_videos.py ./Videos ./Videos_defaced
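
Conceptually, defacing boils down to a loop over the input folder. Below is a sketch assuming the `deface` command-line tool is installed; the actual script may work differently:

import subprocess
from pathlib import Path

in_dir = Path("./Videos")
out_dir = Path("./Videos_defaced")
out_dir.mkdir(exist_ok=True)

for video in sorted(in_dir.glob("*.mp4")):
    target = out_dir / video.name
    if target.exists():
        continue  # already defaced, skip
    # `deface` blurs detected faces and writes the result to -o.
    subprocess.run(["deface", str(video), "-o", str(target)], check=True)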

Compute features

For every video, features need to be computed. Two feature extractors are supported: VideoMAE and XCLIP. Features need to be calculated only once, as they are stored in the same folder as the metadata file. Feature extraction can be halted and resumed at your convenience, because results are saved after every encoded video; when (re)starting, the script looks for existing features and skips those videos. Features for VideoMAE and XCLIP are processed and stored independently in separate files. To start extracting features, run:

python disgust/encode_videos.py metadata.csv ./Videos_defaced xclip

or

python disgust/encode_videos.py metadata.csv ./Videos_defaced videomae
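
For reference, extracting an XCLIP feature vector for one video looks roughly like the sketch below. The checkpoint name and the frame-sampling strategy are assumptions for illustration, not necessarily what encode_videos.py does:

import cv2
import numpy as np
import torch
from transformers import XCLIPModel, XCLIPProcessor

# Checkpoint name is an assumption; use whichever the project relies on.
processor = XCLIPProcessor.from_pretrained("microsoft/xclip-base-patch32")
model = XCLIPModel.from_pretrained("microsoft/xclip-base-patch32")

def sample_frames(path: str, num_frames: int = 8) -> list[np.ndarray]:
    """Sample evenly spaced RGB frames from a video."""
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, total - 1, num_frames, dtype=int)
    frames = []
    for i in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(i))
        ok, frame = cap.read()
        if ok:
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    cap.release()
    return frames

frames = sample_frames("./Videos_defaced/v_ca_001_01.mp4")
inputs = processor(videos=frames, return_tensors="pt")
with torch.no_grad():
    # One embedding per video, later used as classifier input.
    features = model.get_video_features(**inputs)
print(features.shape)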

Train and evaluate classifier

To train and evaluate a classifier on the extracted features run:

python disgust/train_classifier.py metadata.csv ./Videos_defaced <feature_extractor_type> <classifier_type>

Here `feature_extractor_type` is one of `['xclip', 'videomae', 'all']`, where `all` means both feature types are concatenated into a single feature vector. For `classifier_type`, choose one of `['xgboost', 'pcaxgboost', 'pcarf', 'rf']`. For example:

python disgust/train_classifier.py metadata.csv ./Videos_defaced videomae rf

Some output will be printed to the command line for quick inspection. The complete output, however, is saved in a performance folder created inside the folder that contains the metadata file.
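
Under the hood, this stage fits a standard classifier on the stored feature vectors. A minimal sketch of the rf case; the feature and label file names here are hypothetical stand-ins for whatever the encoding step saved:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical inputs: X holds one feature vector per video, y the labels
# ('moral disgust' / 'pathogen disgust') taken from metadata.csv.
X = np.load("videomae_features.npy")  # shape: (n_videos, feature_dim)
y = np.load("labels.npy", allow_pickle=True)

clf = RandomForestClassifier(n_estimators=500, random_state=42)
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"5-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

Cross-validation is a sensible default here because the dataset is small; the actual script may split the data differently.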
