eginhard/monotonic_alignment_search

Monotonic Alignment Search (MAS)

Implementation of MAS from Glow-TTS for easy reuse in other projects.

Installation

pip install monotonic-alignment-search

Wheels are provided for Linux, macOS, and Windows.

Usage

MAS finds the most probable monotonic alignment between a text sequence of length t_x and a speech sequence of length t_y.

from monotonic_alignment_search import maximum_path

# value (torch.Tensor): [batch_size, t_x, t_y]
# mask  (torch.Tensor): [batch_size, t_x, t_y]
path = maximum_path(value, mask, implementation="cython")
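The snippet above only states the shapes of value and mask. As a rough sketch, the inputs could be prepared from per-example sequence lengths as follows. It assumes the Glow-TTS convention that mask is 1 at valid (text, speech) positions and 0 at padding; this README does not spell that out. The helpers sequence_mask and alignment_mask are hypothetical names and not part of this package.

```python
import torch


def sequence_mask(lengths: torch.Tensor, max_len: int) -> torch.Tensor:
    """Return a [batch_size, max_len] mask that is 1.0 inside each sequence."""
    positions = torch.arange(max_len, device=lengths.device)
    return (positions.unsqueeze(0) < lengths.unsqueeze(1)).float()


def alignment_mask(
    x_lengths: torch.Tensor, y_lengths: torch.Tensor, t_x: int, t_y: int
) -> torch.Tensor:
    """Combine text and speech masks into a [batch_size, t_x, t_y] mask.

    Assumption: 1.0 marks valid positions, 0.0 marks padding.
    """
    x_mask = sequence_mask(x_lengths, t_x)  # [batch_size, t_x]
    y_mask = sequence_mask(y_lengths, t_y)  # [batch_size, t_y]
    return x_mask.unsqueeze(2) * y_mask.unsqueeze(1)


# Two examples padded to t_x = 3 text tokens and t_y = 5 speech frames.
x_lengths = torch.tensor([3, 2])
y_lengths = torch.tensor([5, 4])
mask = alignment_mask(x_lengths, y_lengths, t_x=3, t_y=5)

# Placeholder scores; in practice these would be model log-likelihoods.
value = torch.randn(2, 3, 5)
```

Under these assumptions, value and mask have the shapes expected by the maximum_path call shown above.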

The implementation argument selects one of the following implementations:

  • cython (default): Cython-optimised
  • numpy: pure NumPy
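For intuition about what these implementations compute, the dynamic program behind MAS can be sketched in pure Python for a single, unbatched example. This is an illustrative reimplementation, not the code shipped in this package: the function name mas_single is hypothetical, the input is a plain nested list instead of a tensor, and it assumes t_x <= t_y so that every text token receives at least one speech frame.

```python
from typing import List

NEG_INF = float("-inf")


def mas_single(log_probs: List[List[float]]) -> List[List[int]]:
    """Return a 0/1 alignment path of shape [t_x][t_y].

    log_probs[x][y] is the score of aligning text token x with frame y.
    The path starts at (0, 0), ends at (t_x - 1, t_y - 1), and at each
    frame either stays on the same token or advances to the next one.
    """
    t_x, t_y = len(log_probs), len(log_probs[0])
    if t_x > t_y:
        raise ValueError("need t_x <= t_y so every token gets a frame")

    # q[x][y]: best cumulative score of a monotonic path ending at (x, y).
    q = [[NEG_INF] * t_y for _ in range(t_x)]
    q[0][0] = log_probs[0][0]
    for y in range(1, t_y):
        for x in range(min(t_x, y + 1)):  # token x is unreachable if x > y
            stay = q[x][y - 1]
            advance = q[x - 1][y - 1] if x > 0 else NEG_INF
            q[x][y] = max(stay, advance) + log_probs[x][y]

    # Backtrack from the last token and last frame.
    path = [[0] * t_y for _ in range(t_x)]
    x = t_x - 1
    for y in range(t_y - 1, -1, -1):
        path[x][y] = 1
        # Step back one token if forced (x == y) or if advancing scored higher.
        if x > 0 and (x == y or q[x][y - 1] < q[x - 1][y - 1]):
            x -= 1
    return path


example_path = mas_single([[-1.0, -1.0, -5.0],
                           [-5.0, -2.0, -1.0]])
print(example_path)  # [[1, 1, 0], [0, 0, 1]]
```

In the example, the first token covers the first two frames and the second token covers the last frame, which gives the highest total score (-3.0) among all monotonic paths.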

References

This implementation is taken from the original Glow-TTS repository. Consider citing the Glow-TTS paper when using this project:

@inproceedings{kim2020_glowtts,
    title={Glow-{TTS}: A Generative Flow for Text-to-Speech via Monotonic Alignment Search},
    author={Jaehyeon Kim and Sungwon Kim and Jungil Kong and Sungroh Yoon},
    booktitle={Proceedings of Neur{IPS}},
    year={2020},
}