Getting Started

This document provides basic tutorials for the usage of MMAction. For installation, please refer to For data deployment, please refer to

An example on UCF-101

We first give an example of testing and training action recognition models on UCF101.

Prepare data

First of all, please follow for data preparation.

Test a reference model

Reference models are stored in We download a reference model spatial stream BN-Inception at $MMACTION/modelzoo using:

wget -c -P ./modelzoo/

Then, together with the provided config files, we run the following command to test with multiple GPUs:

python tools/ configs/ucf101/ tsn_2d_rgb_bninception_seg3_f1s1_b32_g8-98160339.pth --gpus 8

Train a model with multiple GPUs

To reproduce the model, we provide training scripts as follows:

./tools/ configs/ucf101/ 8 --validate
  • --validate: performs evaluation every k (default=1) epochs during training, which helps diagnose the training process.

More examples

The procedure is not limited to action recognition in UCF101. To perform spatial-temporal detection on AVA, we can train a baseline model by running

./tools/ configs/ava/ 8 --validate

and evaluate a reference model by running

wget -c -P modelzoo/
python tools/ modelzoo/fast_rcnn_ava2.1_nl_r50_c4_1x_f32s2_kin-e2495b48.pth --out ava_fast_rcnn_nl_r50_multiscale.pkl --gpus 8 --eval bbox

To perform temporal action detection on THUMOS14, we can train a baseline model by running

./tools/ configs/thumos14/ 8

and evaluate a reference model by running

wget -c -P modelzoo/
python tools/ configs/thumos14/ modelzoo/ssn_thumos14_rgb_bn_inception_tag-dac9ddb0.pth --gpus 8 --out ssn_thumos14_rgb_bn_inception.pkl --eval thumos14

More Abstract Usage

Inference with pretrained models

We provide testing scripts to evaluate a whole dataset.

Test a dataset

python tools/test_${ARCH}.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [other task-specific arguments]


  • ${ARCH} could be
    • "recognizer" for action recognition (TSN, I3D, ...)
    • "localizer" for temporal action detection/localization (SSN)
    • "detector" for spatial-temporal action detection (a re-implemented Fast-RCNN baseline)
  • ${CONFIG_FILE} is the config file stored in $MMACTION/configs.
  • ${CHECKPOINT_FILE} is the checkpoint file. Please refer to for more details.
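The template above can be sketched as a small shell helper that fills in ${ARCH} and the optional --out argument. This is only an illustration: the function name build_test_cmd and the file names my_config.py, my_checkpoint.pth and result.pkl are hypothetical placeholders, not files shipped with MMAction. The helper prints the command instead of running it, so it is safe to try outside the repository.

```shell
# Hypothetical helper (not part of MMAction): prints the test command
# for a given task type without executing it.
build_test_cmd() {
    arch="$1"; config="$2"; checkpoint="$3"; result="${4:-}"

    # Only the three task types listed above are accepted.
    case "$arch" in
        recognizer|localizer|detector) ;;
        *) echo "unknown ARCH: $arch" >&2; return 1 ;;
    esac

    cmd="python tools/test_${arch}.py ${config} ${checkpoint}"

    # --out is optional, so append it only when a result file is given.
    if [ -n "$result" ]; then
        cmd="${cmd} --out ${result}"
    fi
    echo "$cmd"
}

# Placeholder file names, for illustration only.
build_test_cmd recognizer my_config.py my_checkpoint.pth result.pkl
```

Task-specific arguments such as --gpus or --eval, as used in the examples earlier, would be appended to the printed command by hand.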

Train a model

MMAction implements both distributed and non-distributed training, powered by the same engine as mmdetection.

Train with multiple GPUs (Recommended)

Training with multiple GPUs follows the rules below:

./tools/dist_train_${ARCH}.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
  • ${ARCH} could be
    • "recognizer" for action recognition (TSN, I3D, ...)
    • "localizer" for temporal action detection/localization (SSN)
    • "detector" for spatial-temporal action detection (a re-implemented Fast-RCNN baseline)
  • ${CONFIG_FILE} is the config file stored in $MMACTION/configs.
  • ${GPU_NUM} is the number of GPUs (default: 8). If you use a number other than 8, please scale the learning rate in the config file linearly with the number of GPUs.
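As a rough illustration of the linear learning-rate adjustment mentioned for ${GPU_NUM}, the sketch below computes the scaled value with awk. The helper name scaled_lr and the base learning rate of 0.01 are assumptions made for the example; the actual base value is whatever the chosen config file specifies for 8 GPUs.

```shell
# Hypothetical helper (not part of MMAction): scale a learning rate
# linearly with the number of GPUs.
#   $1 = base learning rate from the config (tuned for the reference GPU count)
#   $2 = number of GPUs actually used
#   $3 = reference GPU count (defaults to 8, the default ${GPU_NUM} above)
scaled_lr() {
    awk -v base="$1" -v gpus="$2" -v ref="${3:-8}" \
        'BEGIN { printf "%g\n", base * gpus / ref }'
}

# Assumed base learning rate of 0.01 for 8 GPUs (illustrative value only).
scaled_lr 0.01 4     # half the GPUs, so half the learning rate
scaled_lr 0.01 16    # twice the GPUs, so twice the learning rate
```

The printed value is what would be written into the config file before launching the training script with the matching ${GPU_NUM}.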