ggml-base.en.bin vs ggml-base.bin, what's the difference? #1460

Answered by bjnortier
chen-rn asked this question in Q&A

Interesting. If you look at the WER tables in the paper [1] (below):

  • small.en beats small on 10 datasets and loses on 3 for greedy.
  • small.en beats small on 11 datasets and loses on 2 for beam.
  • medium beats medium.en on 9 datasets and loses on 4 for greedy.
  • medium beats medium.en on 9 datasets and loses on 5 for beam.


Greedy

          multi   .en   same
Small       3     10     1
Medium      9      4     1


Beam

          multi   .en   same
Small       2     11     1
Medium      9      5     0

So for general use it's probably better to pick medium over medium.en. If you're transcribing telephone conversations (with beam), it might be better to use medium.en, because it performs better on CallHome and Switchboard. I gues…
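
For anyone who wants to redo this kind of tally, here is a minimal sketch of the counting, assuming you have copied the per-dataset WER values for the two models into dictionaries yourself. The dataset names and numbers below are placeholders made up for illustration, not values from the paper, and `tally` is a hypothetical helper name.

```python
from collections import Counter


def tally(wer_multi, wer_en):
    """Count the datasets on which each model has the lower WER (lower is better)."""
    counts = Counter({"multi": 0, ".en": 0, "same": 0})
    for dataset, multi in wer_multi.items():
        en = wer_en[dataset]
        if multi < en:
            counts["multi"] += 1
        elif en < multi:
            counts[".en"] += 1
        else:
            counts["same"] += 1
    return dict(counts)


# Placeholder WER values in percent. These are NOT taken from the paper.
wer_multi = {"dataset_a": 5.0, "dataset_b": 12.5, "dataset_c": 8.0}
wer_en = {"dataset_a": 4.5, "dataset_b": 13.0, "dataset_c": 8.0}

result = tally(wer_multi, wer_en)
print(result)
```

Note that a win count like this ignores how large each WER gap is, so a model can win on more datasets while still being clearly worse on the one you care about.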

Answer selected by bobqianic
This discussion was converted from issue #1405 on November 09, 2023 00:21.