Home
-
[es] Úsalo [fixed in commit b38f047]
Some verb forms are not split correctly when they start with an uppercase letter.
Úsalo con precaución.
expected: 'Usa', 'lo', 'con', 'precaución', '.'
got: 'Úsalo', 'con', 'precaución', '.'
The same happens with two clitics [fixed in commit 2e28d5d]:
Ábreselo antes.
expected: 'Abre', 'se', 'lo', 'antes', '.'
got: 'Ábreselo', 'antes', '.'
-
[gl] Este [fixed in commit e09dd4f]
The determiner este is incorrectly split at the beginning of the sentence.
Este xeito.
expected: 'Este', 'xeito', '.'
got: 'Es', 'te', 'xeito', '.'
Value of variable $excep
-
[es] Abráselo
Some imperative forms of abrasar are incorrectly lemmatized as abrir when combined with some clitics.
Abráselo inmediatamente.
expected: 'Abrase', 'lo'
got: 'Abra', 'se', 'lo'
-
[es] Enseñarlo [fixed in commit 27c9f8e]
Verb forms that contain non-ASCII characters (e.g. "ñ") and have clitics attached are not split.
Quiero enseñarlo.
expected: 'enseñar', 'lo'
got: 'enseñarlo'
-
[gl] Split entities [fixed in commit 706a7bd]
Some entities are split even in unambiguous positions (middle of the sentence).
O concerto foi na Casa das Crechas
expected: 'Casa', 'de', 'as', 'Crechas'
got: 'Casa', 'de', 'as', 'Cre', 'che', 'as'
Other examples: Follas Vellas, Ponte Caldelas, Alfama, Área Central, Rías Baixas, Torrente Ballester, Apóstolo, Orella
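One way to avoid this kind of over-splitting is to skip clitic splitting for capitalized tokens that are not sentence-initial. The sketch below only illustrates that heuristic. The function name `may_split_clitics` and the punctuation set are hypothetical, and it does not describe how commit 706a7bd actually fixed the issue.

```python
# Hypothetical set of tokens that end a sentence or open a new one
SENTENCE_BOUNDARIES = {".", "!", "?", "¿", "¡"}

def may_split_clitics(tokens, i):
    """Return True if tokens[i] is a plausible candidate for clitic splitting."""
    token = tokens[i]
    sentence_initial = i == 0 or tokens[i - 1] in SENTENCE_BOUNDARIES
    # A capitalized token in mid-sentence is assumed to be (part of) an entity,
    # so it is never handed to the clitic splitter.
    if token[:1].isupper() and not sentence_initial:
        return False
    return True

tokens = ["O", "concerto", "foi", "na", "Casa", "de", "as", "Crechas"]
print([t for i, t in enumerate(tokens) if not may_split_clitics(tokens, i)])
```

Sentence-initial capitalized tokens remain ambiguous under this heuristic (compare Úsalo and Este above), so they would still need a lexicon or exception lookup.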
-
[es] Correos [fixed in commit 7a53f3a]
The entity Correos is ambiguous with the imperative form correos (corred + os).
Tengo que ir a Correos inmediatamente.
expected: 'Correos'
got: 'Corred', 'os'
-
[es] Reírse [fixed in commit 23fab4a]
Verbs with accented infinitives are not split when combined with one clitic.
No quiere reírse de él.
expected: 'reír', 'se'
got: 'reírse'
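Failures like enseñarlo and reírse are typical of patterns that only accept ASCII letters or unaccented infinitive endings. The following is a minimal sketch of a Unicode-aware pattern for infinitive + one clitic. The clitic list, the pattern, and `split_infinitive` are all hypothetical and are not taken from the project's code.

```python
import re

# Hypothetical, deliberately short list of Spanish enclitic pronouns
CLITICS = ("se", "lo", "la", "le", "los", "las", "les", "me", "te", "nos", "os")

# [^\W\d_] matches any Unicode letter, so "ñ" and "í" are accepted,
# unlike an ASCII-only class such as [a-z]. The accented ending "ír"
# is listed explicitly next to "ar", "er" and "ir".
INFINITIVE_CLITIC = re.compile(
    r"^(?P<verb>[^\W\d_]+(?:ar|er|ir|ír))(?P<clitic>"
    + "|".join(sorted(CLITICS, key=len, reverse=True))
    + r")$",
    re.IGNORECASE,
)

def split_infinitive(token):
    """Split 'infinitive + one clitic'; return [token] unchanged otherwise."""
    match = INFINITIVE_CLITIC.match(token)
    if match is None:
        return [token]
    return [match.group("verb"), match.group("clitic")]

print(split_infinitive("enseñarlo"))
print(split_infinitive("reírse"))
```

The pattern does not check that the stem is a real verb, so a name such as Carlos would also be split. A real splitter would need a lexicon lookup on top of it.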
-
[gl] a ría [fixed in commit ac8d345]
A ría de Vigo.
expected: "ría ría NCFS000"
got: "ría rir VMII3S0"
-
[pt] Mos
This entity is incorrectly split and tagged as PP+PP, even in unambiguous positions (e.g. capitalized and in the middle of the sentence).
(pt) Estive em Mos no verão.
expected: "Mos mos NP00000"
got: "Mos me+os PP+PP"
-
[es] Castilla-La Mancha [fixed in commit cb6508c]
This is a multi-token entity, but it is treated as two independent entities.
Yo soy de Castilla-La Mancha.
expected: "Castilla-La_Mancha NP00G00"
got: "Castilla-La castilla-la NP00V00
Mancha mancha NP00G00"
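
The expected output joins the entity's tokens with an underscore. This can be sketched as a longest-match lookup against a list of known multi-token entities. The gazetteer `MULTIWORD_ENTITIES` and the function `merge_entities` are hypothetical, and this is not necessarily how commit cb6508c fixed the issue.

```python
# Hypothetical gazetteer of multi-token entities, stored as token tuples
MULTIWORD_ENTITIES = {("Castilla-La", "Mancha")}
MAX_LEN = max(len(entity) for entity in MULTIWORD_ENTITIES)

def merge_entities(tokens):
    """Join known multi-token entities into a single underscore-joined token."""
    merged, i = [], 0
    while i < len(tokens):
        # Try the longest candidate span first, down to two tokens
        for n in range(min(MAX_LEN, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in MULTIWORD_ENTITIES:
                merged.append("_".join(tokens[i:i + n]))
                i += n
                break
        else:
            # No entity starts at this position: keep the token as it is
            merged.append(tokens[i])
            i += 1
    return merged

print(merge_entities(["Yo", "soy", "de", "Castilla-La", "Mancha", "."]))
```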