
DTF-AT

Introduction

(Figure: Illustration of AST.)

PyTorch implementation of DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification.

Setting Up

Clone or download this repository and set it as the working directory, then create a virtual environment and install the dependencies:

cd DTFAT/ 
conda env create -f dtfat.yml
conda activate dtfat

Data Preparation: AudioSet

Since AudioSet is downloaded directly from YouTube, videos get deleted and the available dataset shrinks over time. You therefore need to prepare the following files for the AudioSet copy available to you.

Prepare the data files as described in the AST repository.
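As a hedged sketch of what such a data file looks like: AST-style recipes typically use a JSON manifest whose entries pair a wav path with comma-separated machine-ID labels. The paths and label IDs below are illustrative placeholders, not real dataset entries; check the AST repository for the exact format it expects.

```python
import json

# Build an AST-style data manifest: each entry points to a wav file
# and its label machine IDs (comma-separated for multi-label clips).
# All paths and label IDs here are illustrative placeholders.
samples = [
    {"wav": "/data/audioset/eval/Y0001.wav", "labels": "/m/09x0r"},
    {"wav": "/data/audioset/eval/Y0002.wav", "labels": "/m/04rlf,/m/09x0r"},
]

manifest = {"data": samples}
with open("audioset_eval_data.json", "w") as f:
    json.dump(manifest, f, indent=2)

# Sanity check: the manifest round-trips through JSON.
with open("audioset_eval_data.json") as f:
    loaded = json.load(f)
print(len(loaded["data"]))  # → 2
```

A companion label CSV mapping machine IDs to class indices is usually required as well; again, follow the AST instructions for its exact columns.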

Validation

We provide the best model. Please download the model weights and place them in DTFAT/pretrained_models/best_model/model.

You can validate model performance on your AudioSet evaluation data as follows:

cd DTFAT/egs/audioset
bash eval_run.sh

This script creates a log file named with a Unix timestamp in the same directory (e.g., 1692289183.log). The mAP is reported at the end of the log file.
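If you want to extract the final mAP programmatically, a minimal sketch follows. The `mAP: <number>` line format is an assumption for illustration; adjust the regular expression to match what eval_run.sh actually prints.

```python
import re

def last_map(log_text: str):
    """Return the last mAP value found in a log, or None if absent.

    Assumes lines like "mAP: 0.4712" (hypothetical format; adapt the
    regex to the real output of the evaluation script).
    """
    matches = re.findall(r"mAP[:=\s]+([0-9]*\.?[0-9]+)", log_text)
    return float(matches[-1]) if matches else None

# Example on a fabricated two-line log:
sample_log = "epoch 1 mAP: 0.41\nepoch 2 mAP: 0.4712\n"
print(last_map(sample_log))  # → 0.4712
```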

Citing

We use the AST repository for model training and timm (bundled with this repo; do not install timm separately) for the model implementation and ImageNet-1K pretrained weights. Please cite:

@inproceedings{gong21b_interspeech,
  author={Yuan Gong and Yu-An Chung and James Glass},
  title={{AST: Audio Spectrogram Transformer}},
  year={2021},
  booktitle={Proc. Interspeech 2021},
  pages={571--575},
  doi={10.21437/Interspeech.2021-698}
}
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/rwightman/pytorch-image-models}}
}
