ERROR: UID 11544 has defects preventing it from being processed! #165

Closed
poidl opened this issue Oct 20, 2023 · 2 comments

poidl commented Oct 20, 2023

General information

  • system/distribution (with version): Arch Linux
  • offlineimap version (offlineimap -V): offlineimap v8.0.0, imaplib2 v3.06, Python v3.11.5 19 Sep 2023
  • Python version: Python v3.11.5

Logs, error

Oct 20 09:47:34 myhostname offlineimap[3647]: Copy message UID 11544 (4224/5482) Remote:INBOX -> Local:INBOX
Oct 20 09:47:34 myhostname offlineimap[3647]: UID 11544 has defects: [StartBoundaryNotFoundDefect(), MultipartInvariantViolationDefect()]
Oct 20 09:47:34 myhostname offlineimap[3647]: ERROR: UID 11544 (<[email protected]>) has defects preventing it from being processed!
Oct 20 09:47:34 myhostname offlineimap[3647]:   UnicodeEncodeError: 'ascii' codec can't encode characters in position 102-104: ordinal not in range(128)

Steps to reproduce the error

  • Construct email with header
Content-Type: multipart/alternative; boundary="yikVu2khOYniio2Jx"

but without any start or end boundary line in the body. For example, take any text/plain email and change its 'Content-Type' header to the one above.

  • Then put any non-ASCII character (for example, a multi-byte UTF-8 character) into the email body.
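The two steps above can be sketched with Python's standard `email` package alone, without offlineimap3. This is a minimal reproduction sketch, not offlineimap3's code: the addresses, the helper names (`build_raw_message`, `defect_names`, `can_serialize`) and the use of `policy.default` are assumptions made for illustration. It shows that the parser reports the two defects from the log, and it checks whether re-serializing the message with `as_bytes()` raises `UnicodeEncodeError`.

```python
from email import policy
from email.parser import BytesParser

BOUNDARY = "yikVu2khOYniio2Jx"  # boundary value taken from the report


def build_raw_message(body: str) -> bytes:
    """Build a message that claims to be multipart but has no boundary lines.

    The addresses and subject are placeholders (assumptions).
    """
    headers = (
        "From: sender@example.com\r\n"
        "To: recipient@example.com\r\n"
        "Subject: reproduction sketch\r\n"
        f'Content-Type: multipart/alternative; boundary="{BOUNDARY}"\r\n'
        "\r\n"
    )
    return headers.encode("ascii") + body.encode("utf-8") + b"\r\n"


def defect_names(raw: bytes) -> list[str]:
    """Parse the raw bytes and return the class names of the recorded defects."""
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    return [type(defect).__name__ for defect in msg.defects]


def can_serialize(raw: bytes) -> bool:
    """Return False if turning the parsed message back into bytes fails."""
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    try:
        msg.as_bytes()
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    for body in ("plain ascii body", "caf\u00e9 \u20ac"):
        raw = build_raw_message(body)
        print(repr(body), defect_names(raw), can_serialize(raw))
```

The defects are the same for both bodies, so the defect list alone does not distinguish the failing case. Whether `as_bytes()` fails for the non-ASCII body may depend on the Python version, so the sketch reports the outcome instead of assuming it.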

Questions

I'm a beginner trying to archive my email, and one specific type of email from a mailing list caused the above error logs. I tried to find one of the troublesome emails following this guide:

https://www.offlineimap.org/server/imap/error/2016/01/27/error-no-such-number.html

After identifying the email, I found it in KMail. I opened the raw message in KMail and noticed the following headers:

X-Virus-Scanned: amavisd-new at redacted.example.com
X-Amavis-Alert: BAD HEADER SECTION, MIME error: error: unexpected end of preamble

Then I opened

https://github.com/OfflineIMAP/offlineimap3/blob/master/offlineimap/folder/IMAP.py

changed the following section

        if len(ndata1.defects) > 0:
            # We don't automatically apply fixes as to attempt to preserve the original message
            self.ui.warn("UID {} has defects: {}".format(uids, ndata1.defects))
            if any(isinstance(defect, NoBoundaryInMultipartDefect) for defect in ndata1.defects):
                # (Hopefully) Rare defect from a broken client where multipart boundary is
                # not properly quoted.  Attempt to solve by fixing the boundary and parsing
                self.ui.warn(" ... applying multipart boundary fix.")
                ndata1 = self.parser['8bit-RFC'].parsebytes(self._quote_boundary_fix(data[0][1]))

to

        if len(ndata1.defects) > 0:
            # We don't automatically apply fixes as to attempt to preserve the original message
            self.ui.warn("UID {} has defects: {}".format(uids, ndata1.defects))
            if any(isinstance(defect, NoBoundaryInMultipartDefect) for defect in ndata1.defects):
                # (Hopefully) Rare defect from a broken client where multipart boundary is
                # not properly quoted.  Attempt to solve by fixing the boundary and parsing
                self.ui.warn(" ... applying multipart boundary fix.")
                ndata1 = self.parser['8bit-RFC'].parsebytes(self._quote_boundary_fix(data[0][1]))
            if myid.split('@')[1] == 'sender.example.com>':
                if any(isinstance(defect, MultipartInvariantViolationDefect) for defect in ndata1.defects):
                    ndata1.replace_header("content-type", 'plain/text')

which resulted in offlineimap3 downloading the file. My lines may very well be nonsense; I only started reading about email today. The point of the modification was to download the actual message to see what's going on.
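The idea behind that modification can also be tried outside offlineimap3. The following is a minimal sketch under stated assumptions, not a proposed patch: `parse` and `demote_broken_multipart` are hypothetical helpers, `policy.default` is an assumed parser policy, and the sender check from the snippet above is left out. It relabels a message as `text/plain` (the registered spelling of the type written as 'plain/text' above) only when the parser recorded a `MultipartInvariantViolationDefect` and the message contains no parts.

```python
from email import policy
from email.errors import MultipartInvariantViolationDefect
from email.message import Message
from email.parser import BytesParser


def parse(raw: bytes) -> Message:
    """Parse raw message bytes; policy.default is an assumption of this sketch."""
    return BytesParser(policy=policy.default).parsebytes(raw)


def demote_broken_multipart(msg: Message) -> bool:
    """Relabel a 'multipart' message that contains no parts as text/plain.

    Hypothetical helper, not part of offlineimap3. Returns True if the
    Content-Type header was replaced.
    """
    broken = any(
        isinstance(defect, MultipartInvariantViolationDefect)
        for defect in msg.defects
    )
    if broken and not msg.is_multipart():
        # The body was stored as a plain string payload, so treat it as text.
        msg.replace_header("Content-Type", "text/plain")
        return True
    return False
```

Note that this changes the stored message, which works against the intent stated in the code comment above of preserving the original. For an archive it may be preferable to keep a copy of the raw bytes alongside the relabelled message.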

Then I thought: offlineimap3 should indeed throw an error here, if the email isn't constructed correctly.

But then I had some doubts, and now I'm not sure what to think anymore:

  • First, note that the error logged above only appears if there is a non-ASCII character in the (intended) body. If all characters are ASCII, the file is downloaded and stored fine. If I understand correctly, that's because even if the "body" is interpreted as metadata/headers, no exception is thrown as long as it can be parsed as ASCII. But isn't the email still corrupt? For example, I tried to search for it with notmuch, specifically using the body: search term, and it could not find anything. Perhaps notmuch parses the email again, separately from offlineimap3, and also interprets the (intended) body as metadata. When I tried searching with KMail, again filtering on the body, it worked.
  • Second, for archiving purposes, I'd like to download pretty much everything, even poorly constructed emails.

The forum software that sent the email is phpBB, but only specific types of emails (of type "watch forum") have the weird content type; the others are fine, with

Content-Type: text/plain; charset="us-ascii"

Edit 15:17 UTC, Friday, 20 October 2023: I notified the forum administrator, who said they are checking whether a bug has been filed against phpBB for this problem.

Edit 15:56:48 UTC Friday, 20 October 2023
Related:
#160
#107

Member

thekix commented Nov 22, 2023

Hi,

After applying #107, is this problem solved?

Best regards,
kix

Author

poidl commented Nov 22, 2023

Sorry, I can't test it any longer. Feel free to close; I'll reopen the issue if the bug appears again.

Thanks!

thekix closed this as completed Nov 22, 2023