[pt-PT] Improved rule ID:CONFUSÃO_CAIXA_EMBALAGEM and cleaned other rule #10932

Merged: 2 commits, Oct 13, 2024

Conversation

marcoagpinto (Member) commented Oct 13, 2024

Improved the rule with more verbs, but it still has zero hits.

Summary by CodeRabbit

  • New Features

    • Introduced new rules to enhance formal language use in Portuguese, including suggestions for replacing informal phrases.
    • Added a rule to suggest replacing "ciclo vicioso" with "círculo vicioso" for clarity.
    • Expanded rules to discourage the use of gerunds and avoid redundancy in phrases.
    • New suggestions for more formal alternatives to common informal phrases.
  • Improvements

    • Updated existing rules for better specificity and clarity in language processing, including refined exceptions for certain parts of speech.

coderabbitai bot (Contributor) commented Oct 13, 2024

Walkthrough

The pull request introduces comprehensive modifications to the style.xml file within the Portuguese language modules of LanguageTool. Key changes include updates to existing rules, the addition of new rules, and enhancements aimed at improving language specificity and clarity. Notable adjustments involve refining token patterns, altering activation statuses, and suggesting more formal language alternatives.

Changes

File Path: languagetool-language-modules/pt/src/main/resources/org/languagetool/rules/pt/pt-PT/style.xml

Change Summary:

  • Added rule CICLO_VICIOSO to suggest replacing "ciclo vicioso" with "círculo vicioso".
  • Updated rule CONFUSÃO_CAIXA_EMBALAGEM with default="temp_off" and refined token pattern.
  • Updated rule PRAZER_GOSTO for correct exception application.
  • Expanded AVOID_GERUND rule group with new patterns.
  • Added rule EM_DETALHE_DETALHADAMENTE_PT_PT to simplify "em detalhe".
  • Updated FORMAL_SPEECH_PT_PT to include more patterns discouraging informal language.

Suggested reviewers

  • p-goulart
  • susanaboatto

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
languagetool-language-modules/pt/src/main/resources/org/languagetool/rules/pt/pt-PT/style.xml (3)

Line range hint 89-102: Improved rule coverage and efficiency.

The changes to the CONFUSÃO_CAIXA_EMBALAGEM rule are well-implemented. The expanded verb list using a regular expression improves the rule's coverage and maintainability. Setting the rule to "temp_off" is a good precaution while testing these changes.

Consider adding a comment explaining why the rule is temporarily disabled and when it should be re-enabled.


Line range hint 123-133: Well-implemented new rule for common language misconception.

The addition of the CICLO_VICIOSO rule is a valuable improvement. It addresses a common language error and provides a helpful suggestion. The inclusion of a reference URL is excellent for providing context.

Consider adding an example that demonstrates a correct usage of "círculo vicioso" to further clarify the rule's application.
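
A minimal sketch of what such a pair of examples could look like in LanguageTool's rule grammar. The rule id suffix, the name, the message wording, and both example sentences are assumptions made for illustration; the rule actually merged in this PR may differ, and its reference URL is left out here because it is not quoted in this conversation.

```xml
<!-- Sketch only: id suffix, name, message and example sentences are assumptions,
     not the text merged in this PR. -->
<rule id="CICLO_VICIOSO_SKETCH" name="ciclo vicioso → círculo vicioso">
  <pattern>
    <token>ciclo</token>
    <token>vicioso</token>
  </pattern>
  <message>Substitua por <suggestion>círculo vicioso</suggestion>.</message>
  <!-- Incorrect usage: the marked span is expected to trigger the rule. -->
  <example correction="círculo vicioso">A empresa entrou num <marker>ciclo vicioso</marker> de dívidas.</example>
  <!-- Correct usage: no marker and no correction, so the rule is expected to stay silent. -->
  <example>A empresa entrou num círculo vicioso de dívidas.</example>
</rule>
```

An example without a marker acts as a negative test case, so it documents the accepted form and also guards against the rule firing on it.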


Line range hint 136-271: Improved accuracy of gerund detection with new antipatterns.

The changes to the AVOID_GERUND rule group are well-implemented. The addition of new antipatterns and refinements to existing rules should significantly reduce false positives while maintaining the rule's effectiveness in identifying unnecessary gerund usage.

Consider adding comments to explain the purpose of each antipattern, which would improve maintainability and make it easier for other contributors to understand and extend the rules in the future.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 20d222e and 39486d3.

📒 Files selected for processing (1)
  • languagetool-language-modules/pt/src/main/resources/org/languagetool/rules/pt/pt-PT/style.xml (2 hunks)
🧰 Additional context used
🔇 Additional comments (1)
languagetool-language-modules/pt/src/main/resources/org/languagetool/rules/pt/pt-PT/style.xml (1)

Line range hint 105-120: Refined rule to prevent false positives.

The addition of the exception <exception scope='previous' postag_regexp='yes' postag='V.+'/> inside the token of the PRAZER_GOSTO rule is a good improvement. This change should help prevent false positives by excluding cases where the previous word is a verb.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
languagetool-language-modules/pt/src/main/resources/org/languagetool/rules/pt/pt-PT/style.xml (3)

Line range hint 104-120: Good refinement of the 'PRAZER_GOSTO' rule.

The addition of the exception <exception scope='previous' postag_regexp='yes' postag='V.+'/> inside the token helps prevent false positives when "prazer" is preceded by verb forms. This should improve the rule's accuracy.

To enhance clarity, consider adding a comment explaining the purpose of this exception, such as:

<!-- Exclude cases where 'prazer' is preceded by a verb, e.g., "ter prazer em" -->

This will help other developers understand the reasoning behind the exception.
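
For orientation, a hedged fragment showing where the exception and the suggested comment would sit. It assumes the exception is attached to a token matching "prazer", as the comment above describes; the rest of the PRAZER_GOSTO rule (other tokens, message, examples) is not quoted in this conversation and is omitted rather than guessed.

```xml
<!-- Fragment only: the surrounding rule, its message and its other tokens are omitted. -->
<pattern>
  <!-- Exclude cases where 'prazer' is preceded by a verb, e.g., "ter prazer em" -->
  <token>prazer<exception scope="previous" postag_regexp="yes" postag="V.+"/></token>
</pattern>
```

With scope="previous", the exception is checked against the token before the one it is nested in, so the token is skipped when the preceding word carries a POS tag matching V.+.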


Line range hint 123-134: Excellent addition of the 'CICLO_VICIOSO' rule!

This new rule addresses a common language misconception in Portuguese, helping users improve their writing by suggesting the correct phrase "círculo vicioso" instead of "ciclo vicioso".

To further enhance this rule, consider adding a short message explaining why "círculo vicioso" is preferred over "ciclo vicioso". This could be done by modifying the <message> tag:

<message>Substitua por <suggestion>círculo vicioso</suggestion>. "Círculo vicioso" é a expressão correta para descrever uma situação problemática que se perpetua.</message>

This additional explanation will help users understand the reason for the correction, potentially improving their language skills.


Line range hint 137-255: Valuable refinements to the 'AVOID_GERUND' rule group.

The addition of multiple antipatterns significantly improves the rule's accuracy by excluding valid uses of the gerund. This should reduce false positives and make the rule more reliable.

To enhance maintainability and readability, consider grouping related antipatterns and adding comments to explain the purpose of each group. For example:

<!-- Antipatterns for specific verb combinations -->
<antipattern>
  <!-- existing antipattern -->
</antipattern>

<!-- Antipatterns for gerunds with pronouns -->
<antipattern>
  <!-- existing antipattern -->
</antipattern>

<!-- Add more grouped antipatterns with explanatory comments -->

This organization will make it easier for future contributors to understand and modify the rule as needed.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 39486d3 and 7d5bf4c.

📒 Files selected for processing (1)
  • languagetool-language-modules/pt/src/main/resources/org/languagetool/rules/pt/pt-PT/style.xml (2 hunks)
🧰 Additional context used
🔇 Additional comments (1)
languagetool-language-modules/pt/src/main/resources/org/languagetool/rules/pt/pt-PT/style.xml (1)

Line range hint 89-101: Excellent improvements to the 'CONFUSÃO_CAIXA_EMBALAGEM' rule!

The changes enhance the rule's functionality and accuracy:

  1. Adding default="temp_off" allows for easier testing and gradual deployment.
  2. Using a regular expression for verb matching increases the rule's coverage.
  3. The new suggestion with postag_replace should provide more accurate corrections.

These modifications should make the rule more effective in identifying and correcting confusion between "caixa" and "embalagem" in pharmaceutical contexts.
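
The three points above could be combined roughly as in the sketch below. Everything specific in it is a placeholder: the rule id, the verb list, the tokens after "caixa", the message text, and the example sentence are assumptions for illustration, not the content of the merged rule.

```xml
<!-- Sketch only: id, verbs, trailing tokens, message and example are placeholders. -->
<!-- Hypothetical note: kept at temp_off while the rule still has zero hits; re-enable after testing. -->
<rule id="CAIXA_EMBALAGEM_SKETCH" name="caixa → embalagem (sketch)" default="temp_off">
  <pattern>
    <!-- One regexp token covers several verbs; inflected="yes" matches on the lemma. -->
    <token regexp="yes" inflected="yes">abrir|comprar|tomar</token>
    <!-- Optional determiner between the verb and the noun. -->
    <token postag="D.+" postag_regexp="yes" min="0"/>
    <marker>
      <token inflected="yes">caixa</token>
    </marker>
    <token>de</token>
    <token regexp="yes">comprimidos|cápsulas</token>
  </pattern>
  <!-- postag_replace reuses the POS tag of token 3, so the suggestion should keep its number. -->
  <message>Queria dizer <suggestion><match no="3" postag="(NC.+)" postag_regexp="yes" postag_replace="$1">embalagem</match></suggestion>?</message>
  <example correction="embalagem">Comprei uma <marker>caixa</marker> de comprimidos.</example>
</rule>
```

The match element asks the synthesizer for a form of "embalagem" carrying the same tag as the matched noun, which is why a plural "caixas" would be expected to yield "embalagens" instead of a fixed singular suggestion.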
