About encode /r/n #1655

Open
H0LIXI opened this issue Oct 24, 2024 · 2 comments

Comments

H0LIXI commented Oct 24, 2024

Problem
In EPP version 7, reading a cell and then saving the workbook without modifying the cell's content re-encodes every newline character.

Using Microsoft Excel, create a workbook, enter cell content containing a newline, and save it. Viewing sharedStrings.xml in a hex editor shows the newline stored as 0D 0A, which is \r\n.
After saving with EPP version 7, 0D 0A is re-encoded as 5F 78 30 30 30 44 5F 0A, which is _x000D_\n.
After saving with EPP version 6, 0D 0A is rewritten as 0A, which is \n. (This might be because version 6 reads through XmlNodeList, which converts \r\n to \n, while version 7 uses XmlReader and can read the original \r\n.)
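The three cases above can be told apart without a hex editor. The following is a minimal Python sketch, not part of EPP; the function names are hypothetical, and it assumes the shared strings live in the usual xl/sharedStrings.xml part of the .xlsx archive.

```python
import zipfile


def classify_newlines(data: bytes) -> dict:
    """Count how line breaks are stored in the raw bytes of a workbook part."""
    escaped = data.count(b"_x000D_\n")   # CR written as the _x000D_ escape, then LF
    crlf = data.count(b"\r\n")           # literal CR LF (0D 0A)
    bare_lf = data.count(b"\n") - crlf - escaped  # LF with neither form of CR before it
    return {"crlf": crlf, "escaped_cr_lf": escaped, "bare_lf": bare_lf}


def count_newline_encodings(xlsx_path: str, member: str = "xl/sharedStrings.xml") -> dict:
    """Open an .xlsx (a zip archive) and classify the line breaks in one part."""
    with zipfile.ZipFile(xlsx_path) as zf:
        return classify_newlines(zf.read(member))
```

Running count_newline_encodings on the same workbook saved by Excel, by version 6, and by version 7 should show which of the three encodings each one writes.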

I'm not quite sure why \r needs to be re-encoded as _x000D_. Perhaps there's some historical reason for this. The encoding happens in ConvertUtil.ExcelEncodeString.
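One possible reason, offered as an assumption rather than something confirmed in this thread: XML parsers normalize a literal \r\n to \n on read, so a carriage return only survives a round trip if it is escaped in some way. A minimal Python sketch with the standard library's xml.etree.ElementTree shows the normalization. It uses an XML character reference for comparison, which is a different mechanism from the _x000D_ escape.

```python
import xml.etree.ElementTree as ET


def read_text(xml_bytes: bytes) -> str:
    """Parse a single element and return its text as the parser delivers it."""
    return ET.fromstring(xml_bytes).text


# Literal CR LF in the file: the parser applies XML end-of-line normalization.
literal = read_text(b"<t>line1\r\nline2</t>")

# CR written as a character reference: it is not normalized away.
char_ref = read_text(b"<t>line1&#13;\nline2</t>")

print(repr(literal))
print(repr(char_ref))
```

This matches the version 6 observation above, where \r\n read through an XML document API came back as \n.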

Desired Solution
I would like the original \r\n to be retained. If every newline in the file changes, comparison tools such as Beyond Compare 5 report a large number of differences, which is confusing.
Alternatively, could version 7 follow the version 6 behavior and replace \r\n with \n when saving? (Beyond Compare treats \r\n and \n as the same.)
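Until the saved output changes, one workaround is to normalize the line breaks on both sides before comparing. This is a minimal Python sketch with a hypothetical helper name; it assumes the only differences of interest are the three encodings described in this issue.

```python
def normalize_line_breaks(text: str) -> str:
    """Map both '_x000D_' + LF and CR LF to a bare LF so diffs ignore the encoding."""
    text = text.replace("_x000D_\n", "\n")  # escaped CR followed by LF
    text = text.replace("\r\n", "\n")       # literal CR LF
    return text


excel_saved = "first line\r\nsecond line"
v7_saved = "first line_x000D_\nsecond line"
v6_saved = "first line\nsecond line"

same = (
    normalize_line_breaks(excel_saved)
    == normalize_line_breaks(v7_saved)
    == normalize_line_breaks(v6_saved)
)
```

Here same is True, so the three saved forms compare as equal after normalization.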

@JanKallman
Contributor

I'll have a look at the encoding again. The _x000D_ (for CR) encoding is used for characters below 0x20, but should probably not be used for CRLF.
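The rule described here can be sketched as follows. This is an illustration in Python, not the code in ConvertUtil.ExcelEncodeString or in the referenced commit. The function name and the preserve_crlf flag are hypothetical, and leaving tab and LF unescaped is an assumption.

```python
def escape_control_chars(text: str, preserve_crlf: bool = True) -> str:
    """Escape characters below 0x20 as _xHHHH_, optionally leaving CR LF pairs alone."""
    out = []
    for i, ch in enumerate(text):
        code = ord(ch)
        if code >= 0x20 or ch in "\t\n":
            out.append(ch)  # ordinary character, tab, or LF: written as-is (assumption)
        elif ch == "\r" and preserve_crlf and text[i + 1 : i + 2] == "\n":
            out.append(ch)  # CR that is part of a CR LF pair: left unescaped
        else:
            out.append(f"_x{code:04X}_")  # any other control character, including a lone CR
    return "".join(out)
```

With preserve_crlf=False the function reproduces the version 7 output reported above, where \r\n becomes _x000D_\n. With the default it keeps \r\n intact and still escapes a lone \r.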

JanKallman added a commit that referenced this issue Oct 28, 2024