NumberOfSpeakerEstimation

The goal of this project is to estimate the number of speakers in an audio fragment. The first step is to replicate the following study: CountNet: Estimating the Number of Concurrent Speakers Using Supervised Learning. On top of that, two experiments are run with this model:

Investigating the effect of the number of unique speakers in the dataset on this model
TODO: Briefly explain Chihab's experiment

Structure of this repo

This repo contains the following folders:

In model, the code for the general model can be found.
In pretrained models, some pretrained models of this project can be found. More specifically:
- model-best-baseline.h5 is the baseline model, trained completely on LibriSpeech-360 Clean.
- model-best-{250, 750, 750}.h5 are the models used for investigating the effect of the number of unique speakers in the dataset.
- TODO: Briefly summarize pretrained models for Chihab's experiment
In src, all Python source files can be found.

Besides these folders, there are the following notebooks:

Creating Dataset.ipynb demonstrates the full pipeline of creating the dataset used for training the baseline model.
Experimental Datasets Unique Speakers.ipynb demonstrates all code used for testing the effect of the number of unique speakers in the dataset on the performance of the model.
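
The idea behind the unique-speaker experiment can be pictured with a minimal sketch. It assumes, hypothetically, that the dataset is indexed as a mapping from speaker ID to a list of utterance files. The helper names restrict_speaker_pool and sample_mixture are illustrative and do not come from the notebooks, which may build the datasets differently.

```python
import random


def restrict_speaker_pool(utterances_by_speaker, n_unique, seed=0):
    """Keep only n_unique randomly chosen speakers (hypothetical helper)."""
    rng = random.Random(seed)
    # Sort the speaker IDs first so the selection is reproducible for a given seed.
    chosen = rng.sample(sorted(utterances_by_speaker), n_unique)
    return {spk: utterances_by_speaker[spk] for spk in chosen}


def sample_mixture(pool, k, rng):
    """Pick k distinct speakers and one utterance from each; the label is k."""
    speakers = rng.sample(sorted(pool), k)
    files = [rng.choice(pool[spk]) for spk in speakers]
    return files, k
```

Restricting the pool first and sampling mixtures afterwards changes only the number of unique speakers, so datasets built from differently sized pools can otherwise be generated the same way.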

How do I use this repo?

Running the notebooks is straightforward. To train a model, make sure the correct path to the data is set in model_trainer.py. After this, run the command ./run_model.sh train. This starts a run on Weights & Biases. After training, download the trained model from that run.

To test a trained model, make sure that the correct paths to both the test set and the pretrained model are set in model_test.py. Once these are set correctly, run the command: ./run_model.sh.
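
Since a wrong path is the most likely cause of a failed test run, a small pre-flight check can help. The sketch below is not part of this repo: check_test_setup and load_pretrained are hypothetical helpers, and load_pretrained assumes the .h5 files are Keras HDF5 models without custom layers, which this README does not state.

```python
from pathlib import Path


def check_test_setup(test_set_dir, model_path):
    """Return a list of problems with the configured paths (empty if none)."""
    problems = []
    if not Path(test_set_dir).is_dir():
        problems.append(f"test set directory not found: {test_set_dir}")
    model_file = Path(model_path)
    if not model_file.is_file():
        problems.append(f"model file not found: {model_path}")
    elif model_file.suffix != ".h5":
        problems.append(f"expected a .h5 model file, got: {model_file.name}")
    return problems


def load_pretrained(model_path):
    """Load the model, ASSUMING it is a Keras HDF5 file without custom layers."""
    from tensorflow import keras  # imported lazily; only needed when loading

    return keras.models.load_model(model_path)
```

Calling check_test_setup with the two paths set in model_test.py before starting the run reports every path problem at once, rather than failing on the first one partway through.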

