Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a grammar rule about verbs followed by a name in french #9876

Merged
merged 7 commits into from
Dec 18, 2023

Conversation

Sharcoux
Copy link
Contributor

@Sharcoux Sharcoux commented Dec 10, 2023

This rule adds support for the grammar rule detailed here: https://fr.wiktionary.org/w/index.php?title=Cat%C3%A9gorie:Verbes_en_fran%C3%A7ais_suivis_d%E2%80%99un_nom_commun_sans_d%C3%A9terminant&pageuntil=faire+pression#mw-pages. This rule should detect sentences like "J'ai fée des pâtes" and the likes, where a noun follows a verb.

This rule could probably be improved a bit with the help of someone with more knowledge. We could provide as suggestions the list of the nouns available after each verb.

I'm still trying to figure out how to test my changes.

@Sharcoux
Copy link
Contributor Author

I need help to finish this one. I don't understand how I can generate the suggestions based on the grammar of the current match.

For instance, if I have j'ai pris voiture. I'd like to suggest j'ai pris la voiture. But to know that the correct article is la and not le, I need to know the number of the matched noun. How can I do that?

@GillouLT
Copy link
Collaborator

GillouLT commented Dec 12, 2023

We don't really have a straightforward solution for it, you can look into the WordWithDeterminerFilter, but a determiner would be required in the pattern. Although, WordWithDeterminerFilter can be changed or a new rule filter can also be created for determiner additions.

The following line would also work for some cases:
<match no="2" postag="N ([mfe]) (s|p|sp)" postag_regexp="yes" postag_replace="D $1 $2">le</match>

@Sharcoux
Copy link
Contributor Author

I went to generating suggestions like "un camion|une camion|des camion", knowing that 2 of them would be wrong, but another rule would solve that afterwards. Tell me if another approach would have been better.

@Sharcoux Sharcoux force-pushed the add-rule-verbe-plus-nom branch 4 times, most recently from 2d82d95 to b360256 Compare December 13, 2023 00:59
@Sharcoux
Copy link
Contributor Author

The rule seems ready. We should test it on a corpus. There might be cases I forgot, but if we keep adding them, this rule should detect a lot of mistakes that are currently being ignored.

Copy link
Collaborator

@LucieSteib LucieSteib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Sharcoux, thanks for your PR! Please find here a review (in code conventional comments).

thought: All the rules here aim at changing the noun after the verb. But your original idea was to change the verb when confused with a homophone or near homophone, correct (fait/fée)? Maybe a less broad approach in that direction would be better than creating rules that will probably result in more false positives than true positives.

I'm still trying to figure out how to test my changes.
Please use the RuleEditor in ExpertMode to have at least a first check on the rules (you'll have to do that one by one since the tool can't handle rule groups). That would also maybe help with having a first idea of the corpus-based real situations we are dealing with.

<rule id="VERBE_PLUS_NOM_INCORRECT" name="verbe non listé suivi d'un nom incorrect" default="temp_off">
<pattern>
<token postag="V.*" postag_regexp="yes" inflected="yes">
<exception inflected="yes">attendre</exception>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chore(blocking): please ensure all the lists are in the readable format

Copy link
Collaborator

@LucieSteib LucieSteib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, LGTM.
@Sharcoux is there a reason why you force-push your commits? It’s unlikely you need to change the history from one commit to the very next? That might compromise the merging into master.
@GillouLT you can also have a look and request changes or give your approval.
Anyway we'll see some matches in the next day diff (some more exceptions might really need to be added or the rules changed to narrow the scope).

@Sharcoux
Copy link
Contributor Author

@Sharcoux is there a reason why you force-push your commits? It’s unlikely you need to change the history from one commit to the very next? That might compromise the merging into master.

I force push on my own repo. This should not have any impact on your end. The reason is just because, to be honest, I still haven't set up a dev environment for Java, and I'm using the CI to do the validity check. When I realize that I made a typo, I just amend the commit.

Copy link
Collaborator

@GillouLT GillouLT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also think we're good to go :)

@LucieSteib LucieSteib merged commit 88a3924 into languagetool-org:master Dec 18, 2023
2 checks passed
@Sharcoux
Copy link
Contributor Author

I don't know the rest of the process, but if some new problems are met, don't hesitate to list them here, I'll fix the MR to add the required exception.

@LucieSteib
Copy link
Collaborator

LucieSteib commented Dec 18, 2023

Hi Sharcoux,

You will see the diff for your rules tomorrow in https://internal1.languagetool.org/regression-tests/via-http/2023-12-17/ (with the new date :) )so you can make changes.

@Sharcoux
Copy link
Contributor Author

Discussion about this rule is also here: #9875

Some fixes are needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants