This repository accompanies our EMNLP 2021 paper "Finnish Dialect Identification: The Effect of Audio and Text". We present a text-based model and a combined text and audio model for automatically detecting Finnish dialects.
Proudly presented by Rootroo Ltd
The data consists of transcriptions and audio files for several Finnish dialects.
Here you can see the results of our models
If you need NLP solutions for smaller languages like Finnish, we have your back! Rootroo offers consulting on a variety of NLP tasks. We have a strong academic background in state-of-the-art AI solutions for every NLP need. Just contact us, we won't bite.
Everything has been released on Zenodo. Check out the Zenodo repository.
If you use the data, code or models, please cite our paper:
Hämäläinen, Mika; Alnajjar, Khalid; Partanen, Niko & Rueter, Jack (Accepted). Finnish Dialect Identification: The Effect of Audio and Text. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP).