[pt] Improve compounding rules #9213
- extract all colour names into a separate TXT file;
- add new Java rule for the colour compounding.
ce66d73 to e0394cc
@susanaboatto have you had a moment to review this? I'm afraid this PR might go stale.
@@ -38791,6 +38791,71 @@ Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
<category id='COMPOUNDING' name="Palavras Compostas" type="misspelling">
<rule id="SEM_ABRIGO" name="Composição de sem-abrigo">
<pattern>
<token postag_regexp="yes" postag="[ADP].+M[SP].+|Z.+|SENT_START"></token>
I'm a bit wary of including adjectives here; I feel like some false positives will happen, hopefully few enough to tackle with antipatterns. For example: "Eu com casa e o bonito sem abrigo" ("Me with a house and the handsome one without shelter").
Also with DP* (possessives): "O meu em casa e o seu sem abrigo" ("Mine at home and yours without shelter").
Maybe adjectives and possessives should be dealt with separately.
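To make the concern concrete, here is a minimal Python sketch of which POS tags the first token's `postag` regex would accept. It assumes the regex is applied as a full match against each tag, and the sample tags (`AQ0MS0`, `DP1MSS`, `DA0MS0`, `NCMS000`, `AQ0FS0`) are illustrative guesses at the tagset rather than values taken from this PR; `matches_first_token` is a hypothetical helper, not part of LanguageTool.

```python
import re

# Regex copied verbatim from the postag attribute in the diff hunk.
POSTAG_PATTERN = re.compile(r"[ADP].+M[SP].+|Z.+|SENT_START")

# ASSUMPTION: these tags are illustrative examples, not taken from the PR.
SAMPLE_TAGS = {
    "AQ0MS0": "adjective, masculine singular (e.g. 'bonito')",
    "DP1MSS": "possessive determiner, masculine singular (e.g. 'meu')",
    "DA0MS0": "article, masculine singular (e.g. 'o')",
    "AQ0FS0": "adjective, feminine singular (e.g. 'bonita')",
    "NCMS000": "common noun, masculine singular (e.g. 'abrigo')",
    "SENT_START": "sentence-start marker",
}


def matches_first_token(tag: str) -> bool:
    """Return True if the whole tag matches the rule's postag regex.

    ASSUMPTION: full-match semantics, hence re.fullmatch.
    """
    return POSTAG_PATTERN.fullmatch(tag) is not None


if __name__ == "__main__":
    for tag, description in SAMPLE_TAGS.items():
        print(f"{tag:10} {matches_first_token(tag)!s:5} {description}")
```

Under these assumptions, the masculine adjective and possessive tags are accepted alongside the article, which is what allows "o bonito sem abrigo" and "o seu sem abrigo" to trigger the rule; the noun and the feminine adjective tags are rejected.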
Hm, you're not wrong, I'll remove adjectives from here and see what happens.
Blocked by #9216 ⚠️