Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[$250] Single backtick code block is rendered within triple backtick code block when there is emoji #49276

Open
6 tasks done
IuliiaHerets opened this issue Sep 16, 2024 · 21 comments
Assignees
Labels
Bug Something is broken. Auto assigns a BugZero manager. Daily KSv2 External Added to denote the issue can be worked on by a contributor Help Wanted Apply this label when an issue is open to proposals by contributors

Comments

@IuliiaHerets
Copy link

IuliiaHerets commented Sep 16, 2024

If you haven’t already, check out our contributing guidelines for onboarding and email [email protected] to request to join our Slack channel!


Version Number: 9.0.35-0
Reproducible in staging?: Y
Reproducible in production?: Y
Email or phone of affected tester (no customers): [email protected]
Issue reported by: Applause Internal Team

Action Performed:

  1. Go to staging.new.expensify.com
  2. Go to report.
  3. Send this message
`diamond`
  1. Note that there is only one code block.
  2. Send the same message with emoji behind the text.
`diamond`💍

Expected Result:

The single backtick code block will not be rendered within the triple backtick code block when there is an emoji.

Actual Result:

The single backtick code block is rendered within the triple backtick code block when there is an emoji.

Workaround:

Unknown

Platforms:

  • Android: Native
  • Android: mWeb Chrome
  • iOS: Native
  • iOS: mWeb Safari
  • MacOS: Chrome / Safari
  • MacOS: Desktop

Screenshots/Videos

Bug6604455_1726412999725.20240915_230535.mp4

View all open jobs on GitHub

Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~021836063922739129987
  • Upwork Job ID: 1836063922739129987
  • Last Price Increase: 2024-09-17
Issue OwnerCurrent Issue Owner: @Ollyws
@IuliiaHerets IuliiaHerets added Daily KSv2 Bug Something is broken. Auto assigns a BugZero manager. labels Sep 16, 2024
Copy link

melvin-bot bot commented Sep 16, 2024

Triggered auto assignment to @trjExpensify (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details. Please add this bug to a GH project, as outlined in the SO.

@IuliiaHerets
Copy link
Author

We think that this bug might be related to Live Markdown

@IuliiaHerets
Copy link
Author

@trjExpensify FYI I haven't added the External label as I wasn't 100% sure about this issue. Please take a look and add the label if you agree it's a bug and can be handled by external contributors

@trjExpensify
Copy link
Contributor

I can repro it, dropping to $125 as it's not really that valuable a problem to solve.

image

@trjExpensify trjExpensify added the External Added to denote the issue can be worked on by a contributor label Sep 17, 2024
@melvin-bot melvin-bot bot changed the title Single backtick code block is rendered within triple backtick code block when there is emoji [$250] Single backtick code block is rendered within triple backtick code block when there is emoji Sep 17, 2024
Copy link

melvin-bot bot commented Sep 17, 2024

Job added to Upwork: https://www.upwork.com/jobs/~021836063922739129987

@melvin-bot melvin-bot bot added the Help Wanted Apply this label when an issue is open to proposals by contributors label Sep 17, 2024
Copy link

melvin-bot bot commented Sep 17, 2024

Triggered auto assignment to Contributor-plus team member for initial proposal review - @Ollyws (External)

@Irbis007
Copy link

Hello, there is a solution to your problem. You can find emojis using UTF-8 encoding and replace them with an empty string using the replace method.

Copy link

melvin-bot bot commented Sep 17, 2024

📣 @Irbis007! 📣
Hey, it seems we don’t have your contributor details yet! You'll only have to do this once, and this is how we'll hire you on Upwork.
Please follow these steps:

  1. Make sure you've read and understood the contributing guidelines.
  2. Get the email address used to login to your Expensify account. If you don't already have an Expensify account, create one here. If you have multiple accounts (e.g. one for testing), please use your main account email.
  3. Get the link to your Upwork profile. It's necessary because we only pay via Upwork. You can access it by logging in, and then clicking on your name. It'll look like this. If you don't already have an account, sign up for one here.
  4. Copy the format below and paste it in a comment on this issue. Replace the placeholder text with your actual details.
    Screen Shot 2022-11-16 at 4 42 54 PM
    Format:
Contributor details
Your Expensify account email: <REPLACE EMAIL HERE>
Upwork Profile Link: <REPLACE LINK HERE>

@Irbis007
Copy link

Irbis007 commented Sep 17, 2024

Contributor details
Your Expensify account email: [email protected]
Upwork Profile Link: https://www.upwork.com/freelancers/~0113f16cb1bc8ffdab

Hello, there is a solution to your problem. You can find emojis using UTF-8 encoding and replace them with an empty string using the replace method.

Copy link

melvin-bot bot commented Sep 17, 2024

✅ Contributor details stored successfully. Thank you for contributing to Expensify!

@abhinaybathina
Copy link

abhinaybathina commented Sep 17, 2024

Edited by proposal-police: This proposal was edited at 2024-09-22 06:03:09 UTC.

Proposal

Please re-state the problem that we are trying to solve in this issue.

The single backtick code block will not be rendered within the triple backtick code block when there is an emoji

What is the root cause of that problem?

The problem is in the ExpensiMark file of expensify-common package, the regex in formatting rule of inlineCodeBlock is slightly incorrect. The previous regex failed due to overly greedy matching, inconsistent handling of nested backticks (single inside triple), and not accounting for emojis properly.

What changes do you think we should make in order to solve the problem?

We need to update the rule with the updated regex and replacement

  name: 'inlineCodeBlock',
  regex: /(\B|_|)&#x60;((?:&#x60;)*(?!&#x60;)(.*?)&#x60;)(\B|_|)(?!&#x60;|[^<]*<\/pre>|[^<]*<\/video>)(?=\s*<emoji>.*<\/emoji>|$)/gm,
  replacement: (_extras, _match, g1, g2, g3, g4) => {
      const g2Value = g2.trim() === '' ? g2.replaceAll(' ', '&nbsp;') : g3;
      return `${g1}<code>${g2Value}</code>${g4}`;
  },

What alternative solutions did you explore? (Optional)

Working solution

Screen.Recording.2024-09-22.at.11.22.30.AM.1.mov

@Ollyws
Copy link
Contributor

Ollyws commented Sep 19, 2024

@abhinaybathina Thanks for the proposal but won't your solution disable markdown emojis completely?

@abhinaybathina
Copy link

Hi @Ollyws, no it won't disable. The rule is just to format the emoji to wrap within tags. But because of this rule, it's
separates the text and emojis which is causing this issue.

@Ollyws
Copy link
Contributor

Ollyws commented Sep 19, 2024

That rule was intentionally added in Expensify/expensify-common#669, I don't think it's a good idea to disable it as it will cause the original issue to re-appear.

@abhinaybathina
Copy link

I couldn't see any changes related to this rule in Expensify/expensify-common#669. Please correct me if I am wrong

@Ollyws
Copy link
Contributor

Ollyws commented Sep 19, 2024

in Expensify/expensify-common@7d6659c, I don't know why it doesn't appear in 'Files Changed'...strange.

@abhinaybathina
Copy link

Thanks for sharing, I'll check and let you know

@ra-md
Copy link
Contributor

ra-md commented Sep 21, 2024

Proposal

Please re-state the problem that we are trying to solve in this issue.

Single backtick code block is rendered within triple backtick code block when followed by emoji

What is the root cause of that problem?

Currently, the regex for inline code blocks rendered within triple backtick code blocks in ExpensifyMark incorrectly matches cases where a single backtick code block is followed by an emoji, as in:

```
`diamond` 💍
```

This is wrongly converted to: <pre><code>diamond</code><emoji>💍</emoji></pre>

It should be converted into: <pre>&#x60;diamond&#x60;<emoji>💍</emoji></pre>

The issue occurs because the inline code block is followed by the "<" symbol from the <emoji> tag, which the regex fails to handle properly.

What changes do you think we should make in order to solve the problem?

Modify this rule to prevent matching inline code blocks when they are followed by a "<" symbol, such as an emoji tag.

should be updated to:

/(\B|_|)&#x60;((?:&#x60;)*(?!&#x60;).*?[\S| |\u00A0]+?.*?(?<!&#x60;)(?:&#x60;)*)&#x60;(\B|_|)(?!&#x60;|(?!<pre>)[^<]*(?:<(?!pre>)[^<]*)*<\/pre>|[^<]*<\/video>)/gm

This would replace the |[^<]*<\/pre>| part in the negative lookahead with |(?!<pre>)[^<]*(?:<(?!pre>)[^<]*)*<\/pre>|.

What alternative solutions did you explore? (Optional)

Result

Screenshot_20240921_133142

@Ollyws
Copy link
Contributor

Ollyws commented Sep 21, 2024

@ra-md Thanks for the proposal, but with your regex the $4 and $6 placeholders are being added to the string:

Screen.Recording.2024-09-21.at.14.33.49.mov

@ra-md
Copy link
Contributor

ra-md commented Sep 21, 2024

My proposal is based on the inlineCodeBlock rule from Expensify-common v2.0.90 (latest version). The current version used by Expensify is v2.0.84, which has a slight difference in its inlineCodeBlock rule.

v2.0.84

{
    name: 'inlineCodeBlock',
    // Use the url escaped version of a backtick (`) symbol. Mobile platforms do not support lookbehinds,
    // so capture the first and third group and place them in the replacement.
    // but we should not replace backtick symbols if they include <pre> tags between them.
    regex: /(\B|_|)&#x60;((?:&#x60;)*)(?!&#x60;)(.*?\S+?.*?)(?<!&#x60;)((?:&#x60;)*)(&#x60;)(\B|_|)(?!&#x60;|[^<]*<\/pre>|[^<]*<\/video>)/gm,
    replacement: '$1<code>$2$3$4</code>$6',
},

v2.0.90

{
    name: 'inlineCodeBlock',

    // Use the url escaped version of a backtick (`) symbol. Mobile platforms do not support lookbehinds,
    // so capture the first and third group and place them in the replacement.
    // but we should not replace backtick symbols if they include <pre> tags between them.
    // At least one non-whitespace character or a specific whitespace character (" " and "\u00A0")
    // must be present inside the backticks.
    regex: /(\B|_|)&#x60;((?:&#x60;)*(?!&#x60;).*?[\S| |\u00A0]+?.*?(?<!&#x60;)(?:&#x60;)*)&#x60;(\B|_|)(?!&#x60;|[^<]*<\/pre>|[^<]*<\/video>)/gm,
    replacement: (_extras, _match, g1, g2, g3) => {
        const g2Value = g2.trim() === '' ? g2.replaceAll(' ', '&nbsp;') : g2;
        return `${g1}<code>${g2Value}</code>${g3}`;
    },
},

Please modify the replacement function to match the one in v2.0.90.

@abhinaybathina
Copy link

@Ollyws I've updated my proposal here #49276 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something is broken. Auto assigns a BugZero manager. Daily KSv2 External Added to denote the issue can be worked on by a contributor Help Wanted Apply this label when an issue is open to proposals by contributors
Projects
Status: No status
Development

No branches or pull requests

6 participants