Enhance encoding guessing rate and decoding email body parts #268

Open
pi-infected opened this issue Oct 9, 2024 · 0 comments

Hi,

I am processing a large volume of email, and I've noticed that the encoding detection and decoding of email body parts sometimes fails.

I've improved this part by monkey patching flanker, which has reduced the rate of encoding problems.

I think you may be interested in updating this part with my code 👍

# Monkey patching
import charset_normalizer
from flanker.mime.message import utils
from flanker.mime.message import errors

def _guess_and_convert_with(value, detector=charset_normalizer):
  """
  Try to guess the encoding of the passed value with the provided detector
  and decode it.

  The detector is the charset_normalizer module (or any object exposing
  a compatible from_bytes() function).
  """
  result = detector.from_bytes(value).best()

  # best() returns None when no plausible encoding was found
  if result is None:
    raise errors.DecodingError("Failed to guess encoding")

  try:
    # str() on a charset_normalizer match returns the decoded text
    return str(result)

  except (UnicodeError, LookupError) as e:
    raise errors.DecodingError(str(e))

def _guess_and_convert(value):
  """
  Try to guess the encoding of the passed value and decode it.

  Uses charset_normalizer to guess the encoding.
  """
  return _guess_and_convert_with(value, detector=charset_normalizer)

utils._guess_and_convert_with = _guess_and_convert_with
utils._guess_and_convert      = _guess_and_convert
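
To put a number on the "encoding problem rate", here is a minimal sketch of how the failure rate of two decoders could be compared on the same sample of raw body parts. It uses only the standard library. `strict_utf8`, `utf8_then_latin1` and `failure_rate` are hypothetical names made up for illustration and are not part of flanker or charset_normalizer. In a real comparison, `decode` would be `utils._guess_and_convert` before and after the patch, and `errors.DecodingError` would be passed as the exception type to count.

```python
from typing import Callable, Iterable, Tuple, Type


def strict_utf8(raw: bytes) -> str:
    """Hypothetical baseline decoder: UTF-8 only, raises on failure."""
    return raw.decode("utf-8")


def utf8_then_latin1(raw: bytes) -> str:
    """Hypothetical candidate decoder: UTF-8 first, Latin-1 as a fallback."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this never raises
        return raw.decode("latin-1")


def failure_rate(
    decode: Callable[[bytes], str],
    samples: Iterable[bytes],
    failure_types: Tuple[Type[BaseException], ...] = (UnicodeError, LookupError),
) -> float:
    """Return the fraction of samples for which decode() raises a failure."""
    samples = list(samples)
    if not samples:
        return 0.0
    failures = 0
    for raw in samples:
        try:
            decode(raw)
        except failure_types:
            failures += 1
    return failures / len(samples)


# Tiny illustrative sample; a real run would use raw body parts from a mailbox
samples = [
    "héllo".encode("utf-8"),
    "héllo".encode("latin-1"),
    b"plain ascii",
    "日本語".encode("shift_jis"),
]

baseline = failure_rate(strict_utf8, samples)
candidate = failure_rate(utf8_then_latin1, samples)
print(f"baseline: {baseline:.2f}, candidate: {candidate:.2f}")
```

Note that this only counts decoders that raise. A decoder can succeed and still return the wrong text (the Latin-1 fallback above turns the Shift JIS sample into mojibake), so a fair comparison would also spot-check the decoded output.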