Failure extracting base64 encoded image attached to email in HTML CSS #1323

Open
JAF84 opened this issue Jul 31, 2024 · 2 comments

@JAF84

JAF84 commented Jul 31, 2024

ClamAV fails to extract a base64-encoded image that is embedded in the HTML CSS of an email.

The attached mail.zip contains two files:

  • part.html: the image is not extracted because of a newline in the src string, here:
    src="data:image/png;
    base64,iVBORw0KGgoA
    
  • part2.html: the image is extracted correctly. The newline was removed, so the src looks like this:
    src="data:image/png;base64,iVBOR
    

The only difference is the newline character before "base64".
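
The difference can be reproduced outside ClamAV with a short Python sketch. The pattern `STRICT_DATA_URI` and the helper `has_strict_header` are hypothetical names chosen for illustration; this is not ClamAV's parser, only an assumed model of a matcher that does not tolerate whitespace in the data URI header. The payload is the truncated prefix quoted above, not the full image.

```python
import re

# Hypothetical strict pattern (an assumption, not ClamAV code): no whitespace
# is allowed between the media type, ";base64" and the comma.
STRICT_DATA_URI = re.compile(r"data:image/[a-z0-9.+-]+;base64,", re.IGNORECASE)

# src value as quoted from part.html: a newline follows the semicolon.
broken_src = "data:image/png;\nbase64,iVBORw0KGgoA"

# src value as in part2.html: the same string with the newline removed.
working_src = broken_src.replace("\n", "")


def has_strict_header(src: str) -> bool:
    """Return True if src starts with a whitespace-free base64 data URI header."""
    return STRICT_DATA_URI.match(src) is not None
```

With this matcher, `has_strict_header(working_src)` is true and `has_strict_header(broken_src)` is false, which mirrors the behaviour reported for the two files.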

br Johannes

@micahsnyder
Contributor

Hi Johannes,

I reviewed the contents of mail.zip and see two HTML files:

  • mail/part.html
  • mail/part1.html

It looks to me like you're reporting an issue with ClamAV failing to extract a PNG file that is embedded in the HTML CSS as a base64-encoded image. We added support for extracting that in ClamAV 1.1.

part.html is the one that fails to extract the image, while part2.html correctly extracts it. The difference is that part.html has some whitespace (a new line, which is normalized into a single space) in the mime arguments.

The diff of the two files shows it clearly (note that part2 is on the left):
[image: diff of part2.html (left) and part.html (right)]

The clamscan --debug output also shows where this fails, because the mime argument has that space in it:
[image: clamscan --debug output]

We'll need to add some logic in there to strip any whitespace in the mime args. I think that'll fix it.
The code in question is right here in mbox.c:
[image: code excerpt from mbox.c]
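
As a rough illustration of the whitespace-stripping idea, here is a minimal Python sketch. It is not the mbox.c code referenced above and not ClamAV's implementation; `split_data_uri` is a hypothetical helper, and the sketch assumes that removing all whitespace from the part of the URI before the first comma is an acceptable normalization.

```python
import base64
from typing import Tuple


def split_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a base64 data URI into (media_type, decoded_bytes).

    Hypothetical helper: whitespace in the header (everything between
    "data:" and the first comma) is removed before the parameters are read.
    """
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separator")
    # Drop spaces, tabs and newlines: "image/png;\nbase64" -> "image/png;base64"
    header = "".join(header.split())
    media_type, *params = header.split(";")
    if "base64" not in (p.lower() for p in params):
        raise ValueError("data URI is not base64 encoded")
    # Line-wrapped payloads are common in mail, so drop whitespace there too.
    payload = "".join(payload.split())
    return media_type, base64.b64decode(payload)
```

Called on the header form from part.html, for example `split_data_uri("data:image/png;\nbase64,iVBORw0KGgoA")`, the sketch yields the media type `image/png` and bytes that begin with the PNG signature. The same holds when the newline has already been normalized into a single space.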

I don't have time to work on this right now as I'm fighting other fires. Going to mark this as a bug for now.

micahsnyder changed the title from "little problem with "data URI scheme" => content is not been checked" to "Failure extracting base64 encoded image attached to email in HTML CSS" on Jul 31, 2024
@JAF84
Author

JAF84 commented Aug 1, 2024

Hello Micah,

Thank you, yes, this is exactly the issue.

part.html was the original (bad) mail file.
part2.html was my modification, just to demonstrate how it should work.

br johannes
