[pt-PT] Improved antipattern in rule ID:POSSESSIVE_WITHOUT_ARTICLE #9376

Merged 6 commits on Sep 25, 2023
Changes from 5 commits
@@ -1398,6 +1398,12 @@ Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.

<rulegroup id='GENERAL_GENDER_AGREEMENT_ERRORS' name="Concordância: Geral">
<url>https://pt.wikibooks.org/wiki/Portugu%C3%AAs/Concord%C3%A2ncia/Concord%C3%A2ncia_nominal</url>
<antipattern>
Collaborator commented:

This antipattern technically allows for:

o que você acha dos outros regras?
quero dar mais consistência aos outros regras

Both are incorrect, so I don't think we should introduce this.


The key word here is the verb. If you have a verb that governs a certain preposition (such as 'impor', in your example), then we should allow these constructions. I'd suggest adding an extra token accounting for this verb.

<token postag='SPS00:DA0MP0' postag_regexp='no'/>
<token>outros</token>
<token postag='AQ..P.+|NC.P.+' postag_regexp='yes'/>
<example>Eles preferem impor aos outros regras.</example>
</antipattern>
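One way to read the reviewer's suggestion above is to anchor the antipattern on the governing verb. The following is only a sketch: the leading token and its `inflected='yes'` lemma match are assumptions, and `impor` is the only verb taken from the PR's own example; any further verbs would need to be chosen and tested by the maintainers.

```xml
<antipattern>
    <!-- Assumed extra token: a verb that governs the preposition 'a', matched by lemma -->
    <token inflected='yes'>impor</token>
    <token postag='SPS00:DA0MP0' postag_regexp='no'/>
    <token>outros</token>
    <token postag='AQ..P.+|NC.P.+' postag_regexp='yes'/>
    <example>Eles preferem impor aos outros regras.</example>
</antipattern>
```

With the verb required, sentences such as "o que você acha dos outros regras?" would no longer match the antipattern, since "acha" is not a form of `impor`.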
<antipattern> <!-- picky rule covered by 'AS_MILHARES_DE_PESSOAS' -->
<token postag='(SPS00:)?[DP][ADIPR].FP.+' postag_regexp='yes'/>
<token regexp="yes">&numeros_masculinos;</token>
@@ -8426,6 +8432,15 @@ Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
<example>as bagagens não convencionais agora desaparecerão</example>
</antipattern>

<antipattern>
Collaborator commented:

The key blocker of agreement here is the verb in the first-person singular, not the whole construction. I'd rethink this.

<token postag='(SPS00:)?[DP][ADIPRT]..P.+' postag_regexp='yes'/>
<token postag='AQ..P.+|NC.P.+' postag_regexp='yes'/>
<token min='1' max='2' postag='RG' postag_regexp='no'/>
<token postag='VMIS.+' postag_regexp='yes'/>
<token postag='VMN0000' postag_regexp='no'/>
<example>As unhas nunca mais pude arranjar.</example>
</antipattern>
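A possible narrowing along the lines of the comment above, offered as an untested sketch: key the antipattern on a first-person singular verb instead of any preterite followed by an infinitive. The `V...1S.+` tag pattern is an assumption about the tagset layout (person and number in the fifth and sixth positions, as in `VMIS1S0`); it is not taken from the PR.

```xml
<antipattern>
    <token postag='(SPS00:)?[DP][ADIPRT]..P.+' postag_regexp='yes'/>
    <token postag='AQ..P.+|NC.P.+' postag_regexp='yes'/>
    <token min='1' max='2' postag='RG' postag_regexp='no'/>
    <!-- Assumed tag pattern: any verb in the first-person singular -->
    <token postag='V...1S.+' postag_regexp='yes'/>
    <example>As unhas nunca mais pude arranjar.</example>
</antipattern>
```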

<rule> <!-- #1: eu + verb -->
<antipattern>
<token regexp='yes' case_sensitive='yes'>e|ou</token>
@@ -1423,16 +1423,27 @@
</antipattern>
<!-- MARCOAGPINTO 2022-08-03 (Checked/Enhanced) (25-JUL-2022+) *END* -->

<!-- MARCOAGPINTO 2022-11-25 (Checked/Enhanced) (25-JUL-2022+) *START* -->
<antipattern>
<token postag='SENT_START|_PUNCT' postag_regexp='yes'/>
<token postag='SENT_START|_PUNCT|I|SPS00:DA.+' postag_regexp='yes'>
<exception regexp='yes'>como|já|mas|também</exception> <!-- Used specific words because POSes remove valid entries -->
</token>
<token postag='DP1.+' postag_regexp='yes'/>
<token regexp='yes'>car[ao]s?|estimad[ao]s?|excelentes?|excelentíssim[ao]s?|querid[ao]s?</token>
<token regexp='yes'>adoráv(el|eis)|amantes?|amig[ao]s?|car[ao]s?|clientes?|colaborador(es)?|colegas?|companheir[ao]s?|empregad[ao]s?|estimad[ao]s?|excelentes?|excelentíssim[ao]s?|filh[ao]s?|funcionári[ao]s?|inimig[ao]s?|irmã(s|o|os)?|mães?|médic[ao]s?|pais?|prim[ao]s?|professor(es)?|querid[ao]s?</token>
Collaborator commented:

[nitpick 🥜] To avoid this kind of hand-written inflection regex, why not use a plain list of words with a POS tag filter for nouns/adjectives and inflected="yes"?

<token min='0' max='2' postag='NC.+|AQ.+|NP.+|UNKNOWN' postag_regexp='yes'>
<exception postag_regexp='yes' postag='V.+|RG'/>
</token>
<token postag='_QUOT|_PUNCT|CC|SENT_END' postag_regexp='yes'>
<exception postag_regexp='yes' postag='V.+|RG'/>
<exception scope='next' postag_regexp='yes' postag='[DP]P.+'/>
</token>
<example>Meus caros músicos, se achar que é necessário uma música ungida para tocar, componha uma.</example>
<example>E eles dizem a ela com amor: “Minha querida criança”.</example>
<example>Obrigado por tudo, meu caro Joshua!</example>
<example>Olá, meus amigos e amigas de longa data.</example>
<example>Não iria antes ligar pro meu pai.</example>
<example>Valeu minha querida amiga, beijinhos!!!</example>
<example>Então meu caro Kotscho, somente Deus sabe a hora e quanto tempo de vida ainda teremos.</example>
</antipattern>
<!-- MARCOAGPINTO 2022-11-25 (Checked/Enhanced) (25-JUL-2022+) *END* -->
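The reviewer's nitpick above could look roughly like the token below, which would replace the long inflection regex. This is a sketch only: it assumes the tagger dictionary lemmatizes feminine and plural forms to the masculine singular (for example `amigas` to `amigo`), which would need checking for entries such as `irmã` and `excelentíssimo`.

```xml
<!-- Lemma list with inflected='yes'; the POS filter keeps only noun/adjective readings -->
<token inflected='yes' regexp='yes' postag='NC.+|AQ.+' postag_regexp='yes'>adorável|amante|amigo|caro|cliente|colaborador|colega|companheiro|empregado|estimado|excelente|excelentíssimo|filho|funcionário|inimigo|irmão|mãe|médico|pai|primo|professor|querido</token>
```

Listing lemmas instead of spelling out each ending by hand removes one source of typos in the alternation, at the cost of depending on how the dictionary assigns lemmas.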

<antipattern>
<unify>
@@ -1699,6 +1710,20 @@
-->
<url>https://ciberduvidas.iscte-iul.pt/consultorio/perguntas/a-colocacao-dos-pronomes-atonos/11366</url>
<short>Erro de colocação pronominal</short>
<antipattern>
<token postag='V.+' postag_regexp='yes'>
<exception postag_regexp='yes' postag='VMP00.+|VMIS.+|RG|VM[NG]0000|NC.+|AQ.+'/>
</token>
<token regexp='yes'>e|ou</token>
<token>se</token>
<token postag='VMI[MPS]3.+|VMM03.+' postag_regexp='yes'>
<exception scope='next' postag_regexp='yes' postag='RM|NC.+|AQ.+|SPS00:DA.+|_PUNCT_COMMA|SENT_END'/>
</token>
<example>Os polícias escolhem os alvos, e se avançam ou se recuam para eles.</example>
<example>Poucas são as pessoas que pensam e se preocupam com esses animais que estão esperando a morte nos matadouros.</example>
<example>Algum tempo depois ele me reconhece e se precipita em minha direção, com gestos amistosos.</example>
<example>Ethan e seu melhor amigo, Benny Weir (Atticus Dean Mitchell), a seguem e se deparam com Sarah comendo um rato.</example>
</antipattern>

<!-- MARCOAGPINTO 2023-05-06 (Checked/Enhanced) (2-MAR-2023+) *START* -->
<antipattern>
@@ -8733,6 +8733,8 @@ USA
<example correction="devida">Não vos posso dar a atenção <marker>que merecem</marker>.</example>
</rule>
<!-- PARA TER ACONTECIDO para acontecer -->


<rulegroup id='PARA-POR_TER_PARTICIPIO-PASSADO' name='[Simplificar] Por + V.Ter + V.Part.Pass. → V.Ter baseado no V.Part.Pass.' type="style" tone_tags="academic">
<!--IDEA shorten_it-->
<!-- Created by Marco A.G.Pinto with Ricardo Joseh Lima suggestions, Portuguese rule 2022-09-01 + 2023-02-26 (Checked/Enhanced) (25-JUL-2022+) -->
@@ -8741,6 +8743,18 @@ USA
Isso por eu ter estado a ler o artigo. → Isso por eu estar a ler o artigo.
Isso por eu ter estado a ler o artigo. → Isso por estar a ler o artigo.
-->

<antipattern>
<token>por</token>
<token min='0' max='1' postag='RN' postag_regexp='no'/>
<token>ter</token>
<token postag='VMP00P.+' postag_regexp='yes'/>
<token min='0' max='1'>de</token>
<token postag='NC.+|AQ.+|NP.+' postag_regexp='yes'/>
<example>Posso ir diretamente a exame por ter carradas de ECTS.</example>
<example>A faixa diferencia-se dos demais temas de Carnaval dos anos anteriores por não ter batidas aceleradas ou um refrão repetitivo, tendo uma levada de black music "mid-time".</example>
</antipattern>

<rule>
<pattern>
<token regexp='no'>por</token>