Common characters being overridden in Chinese Simplified and Chinese Traditional word lists #410

Open
anupsv opened this issue Jul 2, 2024 · 8 comments

@anupsv

anupsv commented Jul 2, 2024

While going through the code, I found that the code here causes a problem but hides it. The two word lists share common characters, and when the map is created in the referenced code, a key collision silently overwrites the existing entry with the one from the file being read. The issue becomes visible as soon as the code checks for key collisions.
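
To make the collision concrete, here is a minimal Python sketch. It is not the project's actual code: the word lists are tiny stand-ins for illustration, and `build_word_map` and `find_collisions` are hypothetical names. It shows how a plain dict assignment silently replaces an earlier entry, and how a simple check surfaces the shared characters.

```python
# Hypothetical, abbreviated word lists used only for illustration.
WORD_LISTS = {
    "chinese_simplified": ["的", "一", "是", "在", "这"],
    "chinese_traditional": ["的", "一", "是", "在", "這"],
}


def build_word_map(word_lists):
    """Map each word to a single language; later lists silently win."""
    word_map = {}
    for language, words in word_lists.items():
        for word in words:
            word_map[word] = language  # overwrites any earlier entry
    return word_map


def find_collisions(word_lists):
    """Return the set of words that appear in more than one list."""
    seen = {}
    collisions = set()
    for language, words in word_lists.items():
        for word in words:
            if word in seen and seen[word] != language:
                collisions.add(word)
            seen.setdefault(word, language)
    return collisions


word_map = build_word_map(WORD_LISTS)
# The simplified entry for this character has been replaced without warning.
print(word_map["的"])
print(sorted(find_collisions(WORD_LISTS)))
```

With a check like `find_collisions` in place, the build step could raise or log instead of overwriting.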

@yorickdowne
Contributor

@anupsv Can I invite you to discuss this with us at eth-educators#119? If you have suggestions for a solution, or good ways to test the issue, they would be very welcome.

@anupsv
Author

anupsv commented Aug 30, 2024

Hi @yorickdowne, sure, happy to join and provide suggestions for the fix. I also have a fix for the issue.

@yorickdowne
Contributor

yorickdowne commented Sep 2, 2024

@anupsv Oh amazing. What I’m looking for is a sample mnemonic that triggers the issue in current code, and a description of what should happen and what does happen instead. With that I can write a test case.

And then with the fix in, that test will succeed.

@anupsv
Author

anupsv commented Sep 2, 2024

No problem. I can produce the required test case and the fix for it.

@yorickdowne
Contributor

@anupsv We've written some code that allows two languages per symbol/word. We are still missing a test case. Please give us a mnemonic that caused a failure before, and describe the failure you saw.

Thank you!
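
One way to let a word belong to more than one language is to map each word to a set of languages rather than to a single one. The sketch below is only an assumption about the general shape of such a structure, not the code referred to above; the word lists are abbreviated stand-ins and `build_language_index` is a hypothetical name. It also does not cap the number of languages at two.

```python
from collections import defaultdict

# Hypothetical, abbreviated word lists used only for illustration.
WORD_LISTS = {
    "chinese_simplified": ["的", "一", "是", "在", "这"],
    "chinese_traditional": ["的", "一", "是", "在", "這"],
}


def build_language_index(word_lists):
    """Map each word to the set of all languages whose list contains it."""
    index = defaultdict(set)
    for language, words in word_lists.items():
        for word in words:
            index[word].add(language)  # accumulate instead of overwriting
    return dict(index)


index = build_language_index(WORD_LISTS)
print(sorted(index["的"]))  # shared character: both languages are kept
print(sorted(index["這"]))  # character present in only one list
```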

@anupsv
Author

anupsv commented Sep 11, 2024

@yorickdowne This is one of the vectors that failed:
的 的 的 的 的 的 的 的 的 的 的 在

@anupsv
Author

anupsv commented Sep 11, 2024

@yorickdowne Regarding the fix: I was thinking that, with more and more common characters, we'd need to look at the entire mnemonic if needed, but I hadn't restricted it to 2 languages. I was trying to solve for the general case, but if you think limiting to 2 languages is good enough for now, sounds good!

@yorickdowne
Contributor

yorickdowne commented Oct 4, 2024

I don’t know that we’re limiting to 2; I would need to look at the code again. At any rate, we now prompt users for their desired language if the mnemonic is valid in 2 or more.

Thank you for the report!
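
The prompting behaviour described above could look roughly like the following Python sketch. It is an illustration under assumptions, not the project's implementation: the word lists are abbreviated stand-ins, `candidate_languages` and `choose_language` are hypothetical names, and only word-list membership is checked here (a real validity check would also verify the mnemonic checksum).

```python
# Hypothetical, abbreviated word lists used only for illustration.
WORD_LISTS = {
    "chinese_simplified": ["的", "一", "是", "在", "这"],
    "chinese_traditional": ["的", "一", "是", "在", "這"],
}


def candidate_languages(mnemonic, word_lists):
    """Return every language whose list contains all words of the mnemonic."""
    words = mnemonic.split()
    candidates = []
    for language, vocabulary in word_lists.items():
        known = set(vocabulary)
        if words and all(word in known for word in words):
            candidates.append(language)
    return sorted(candidates)


def choose_language(candidates, ask=input):
    """Pick the only candidate, or ask the user when several match."""
    if not candidates:
        raise ValueError("mnemonic does not match any known word list")
    if len(candidates) == 1:
        return candidates[0]
    answer = ask(f"Mnemonic matches {candidates}. Which language? ")
    if answer not in candidates:
        raise ValueError(f"unknown choice: {answer!r}")
    return answer


# The vector reported in this thread matches both abbreviated lists.
mnemonic = "的 的 的 的 的 的 的 的 的 的 的 在"
print(candidate_languages(mnemonic, WORD_LISTS))
```

Calling `choose_language(candidate_languages(mnemonic, WORD_LISTS))` would then prompt interactively, because more than one language matches. The `ask` parameter exists so a test can supply the answer without user input.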
