Oct 20 09:47:34 myhostname offlineimap[3647]: Copy message UID 11544 (4224/5482) Remote:INBOX -> Local:INBOX
Oct 20 09:47:34 myhostname offlineimap[3647]: UID 11544 has defects: [StartBoundaryNotFoundDefect(), MultipartInvariantViolationDefect()]
Oct 20 09:47:34 myhostname offlineimap[3647]: ERROR: UID 11544 (<[email protected]>) has defects preventing it from being processed!
Oct 20 09:47:34 myhostname offlineimap[3647]: UnicodeEncodeError: 'ascii' codec can't encode characters in position 102-104: ordinal not in range(128)
but no start or stop boundary. For example, take any text/plain email and change the 'Content-Type' to the above.
Then put any non-ASCII UTF-8 character into the email body.
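The reproduction steps can be sketched with Python's standard `email` package alone, without offlineimap3. This is only an illustration: the addresses, subject, multipart subtype, and boundary string are assumptions (the original header is not shown here), and the sketch demonstrates the parser defects from the log, not the later UnicodeEncodeError.

```python
import email

# Hypothetical message: a multipart Content-Type that declares a boundary,
# followed by a plain body that never uses that boundary.
RAW = (
    b"From: forum@sender.example.com\r\n"
    b"To: me@example.org\r\n"
    b"Subject: watch forum notification\r\n"
    b'Content-Type: multipart/alternative; boundary="assumed-boundary"\r\n'
    b"\r\n"
    + "Gr\u00fc\u00dfe aus dem Forum\r\n".encode("utf-8")  # non-ASCII body
)

msg = email.message_from_bytes(RAW)

# Collect the class names of the defects the parser recorded.
defect_names = [type(d).__name__ for d in msg.defects]
print(defect_names)
print("is_multipart:", msg.is_multipart())
```

If the standard parser behaves here as it does inside offlineimap3, the printed names should include the two defects from the log above.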
Questions
I'm a beginner trying to archive my email, and one specific type of email from a mailing list caused the error logs above. I tried to find one of the troublesome emails following this guide:
    if len(ndata1.defects) > 0:
        # We don't automatically apply fixes as to attempt to preserve the original message
        self.ui.warn("UID {} has defects: {}".format(uids, ndata1.defects))
        if any(isinstance(defect, NoBoundaryInMultipartDefect) for defect in ndata1.defects):
            # (Hopefully) Rare defect from a broken client where multipart boundary is
            # not properly quoted. Attempt to solve by fixing the boundary and parsing
            self.ui.warn(" ... applying multipart boundary fix.")
            ndata1 = self.parser['8bit-RFC'].parsebytes(self._quote_boundary_fix(data[0][1]))
to
    if len(ndata1.defects) > 0:
        # We don't automatically apply fixes as to attempt to preserve the original message
        self.ui.warn("UID {} has defects: {}".format(uids, ndata1.defects))
        if any(isinstance(defect, NoBoundaryInMultipartDefect) for defect in ndata1.defects):
            # (Hopefully) Rare defect from a broken client where multipart boundary is
            # not properly quoted. Attempt to solve by fixing the boundary and parsing
            self.ui.warn(" ... applying multipart boundary fix.")
            ndata1 = self.parser['8bit-RFC'].parsebytes(self._quote_boundary_fix(data[0][1]))
        if myid.split('@')[1] == 'sender.example.com>':
            if any(isinstance(defect, MultipartInvariantViolationDefect) for defect in ndata1.defects):
                ndata1.replace_header("content-type", 'plain/text')
which resulted in offlineimap3 downloading the file. My lines may very well be nonsense; I only started reading about email today. The point of the modification was to download the actual message to see what's going on.
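The same idea can be tried outside offlineimap3 with the standard library. The sketch below is not offlineimap3 code: `relabel_if_boundaryless` is a hypothetical helper, the sample message is made up, and it uses `text/plain` (the registered MIME type) where the modification above wrote 'plain/text'. Rewriting the header also means the stored copy is no longer byte-for-byte what the server sent.

```python
import email
from email import errors

# Hypothetical sample: multipart header with a boundary that never appears.
SAMPLE = (
    b"From: forum@sender.example.com\r\n"
    b"Subject: watch forum notification\r\n"
    b'Content-Type: multipart/alternative; boundary="assumed-boundary"\r\n'
    b"\r\n"
    b"Plain body without any boundary lines\r\n"
)


def relabel_if_boundaryless(message, fallback="text/plain; charset=utf-8"):
    """Hypothetical helper: if the parser flagged the message as a multipart
    without parts, replace its Content-Type with a plain-text fallback.
    Returns True when the header was rewritten."""
    broken = any(
        isinstance(d, errors.MultipartInvariantViolationDefect)
        for d in message.defects
    )
    if broken and "Content-Type" in message:
        message.replace_header("Content-Type", fallback)
    return broken


parsed = email.message_from_bytes(SAMPLE)
changed = relabel_if_boundaryless(parsed)
print(changed, parsed.get_content_type())
```

Checking for the defect itself, rather than for a sender domain, keeps the workaround limited to messages the parser actually flagged.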
Then I thought: offlineimap3 should indeed throw an error here, if the email isn't constructed correctly.
But then I had some doubts, and now I'm not sure what to think anymore:
First, note that the error logged above only appears if there is a non-ASCII character in the (intended) body. If all characters are ASCII, the file is downloaded and stored fine. If I understand correctly, this is because even if the "body" is interpreted as metadata/headers, no exception is thrown as long as it can be parsed as ASCII. But isn't the email still corrupt? For example, I tried to search for it with notmuch, specifically using the body: search term, and it could not find anything. Perhaps notmuch parses the email again, separately from offlineimap3, and also interprets the (intended) body as metadata. When I tried searching with KMail, again filtering on the body, it worked.
Second, for archiving purposes, I'd like to download pretty much everything, even poorly constructed emails.
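On the first point, a small check with Python's standard `email` parser suggests that an ASCII-only body is flagged with the same defects, so the message is still malformed even though nothing fails. This is a sketch under assumptions: the headers and boundary are made up, and it says nothing about how notmuch or KMail parse the same file, since they use their own parsers.

```python
import email

# Hypothetical headers; only the body differs between test messages.
HEADERS = (
    b"From: forum@sender.example.com\r\n"
    b"Subject: watch forum notification\r\n"
    b'Content-Type: multipart/alternative; boundary="assumed-boundary"\r\n'
    b"\r\n"
)


def parse_boundaryless(body: bytes):
    """Parse the hypothetical headers followed by the given body bytes."""
    return email.message_from_bytes(HEADERS + body)


ascii_msg = parse_boundaryless(b"Hello from the forum\r\n")
ascii_defects = [type(d).__name__ for d in ascii_msg.defects]

# The defects are recorded even though every byte is ASCII.
print(ascii_defects)
# The header names are unchanged and the body is kept as a plain string
# payload, not as a list of MIME parts.
print(ascii_msg.keys())
print(repr(ascii_msg.get_payload()))
```

In this parser the intended body does not end up among the headers; it is stored as an undivided string payload, which a tool that expects MIME parts may or may not index as body text.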
The forum software that sent the email is phpBB, but only specific types of emails (of type "watch forum") have the weird content type; the others are fine with
Content-Type: text/plain; charset="us-ascii"
Edit 15:17 UTC, Friday, 20 October 2023: I notified the forum administrator, and they said they are checking whether a bug has been filed in phpBB regarding this problem.
Edit 15:56:48 UTC Friday, 20 October 2023
Related: #160 #107
General information
offlineimap -V: offlineimap v8.0.0, imaplib2 v3.06, Python v3.11.5 19 Sep 2023
Logs, error
Steps to reproduce the error
To find one of the troublesome emails, I followed this guide:
https://www.offlineimap.org/server/imap/error/2016/01/27/error-no-such-number.html
After identifying the email, I found it in KMail. I opened the raw message there and noticed the following headers:
Then I opened
https://github.com/OfflineIMAP/offlineimap3/blob/master/offlineimap/folder/IMAP.py
and changed the section that handles message defects, as shown in the two listings above.