ericlingit/eml_parser

 
 


eml_parser is a Python module that parses EML files and returns information found in the e-mail, along with information computed from it.

Extracted and generated information includes, but is not limited to:

  • attachments (hashes, names)
  • from, to, cc
  • path of received servers
  • subject
  • list of URLs parsed from the text content of the mail (including HTML body/attachments)
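
To make the "attachments (hashes, names)" item concrete, here is a minimal sketch that uses only the Python standard library (`email` and `hashlib`), not eml_parser itself. The helper name `attachment_hashes`, the addresses, and the file name are hypothetical, and the choice of SHA-256 is an assumption; eml_parser's own output may be structured differently.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def attachment_hashes(raw_email: bytes) -> dict:
    """Map each attachment file name to the SHA-256 hex digest of its decoded payload.

    Illustrative helper only; this is not part of eml_parser.
    """
    msg = message_from_bytes(raw_email, policy=policy.default)
    result = {}
    for part in msg.iter_attachments():
        payload = part.get_payload(decode=True) or b""
        result[part.get_filename()] = hashlib.sha256(payload).hexdigest()
    return result


# Build a small message in memory so the sketch runs without a sample file.
# All addresses and names below are placeholders.
demo = EmailMessage()
demo["From"] = "sender@example.com"
demo["To"] = "recipient@example.com"
demo["Subject"] = "Sample EML"
demo.set_content("Hello")
demo.add_attachment(
    b"hello world",
    maintype="application",
    subtype="octet-stream",
    filename="note.bin",
)

hashes = attachment_hashes(demo.as_bytes())
print(hashes)
```

The digest is taken over the decoded payload, so the transfer encoding (base64 here) does not affect the result.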

Please feel free to send me your comments / pull requests.

Install the latest version using pip:

pip install eml_parser[file-magic]

Note: If you cannot or do not want to use file-magic (e.g. because you are using python-magic), install without the extra:

pip install eml_parser

Note for macOS users:

Make sure libmagic is installed; otherwise eml_parser will not work.
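
One way to check the environment before installing is sketched below. It only reports whether a libmagic shared library and a Python module named `magic` can be located; the function name `magic_status` is hypothetical, and a positive result does not guarantee that eml_parser will work with what it finds.

```python
import ctypes.util
import importlib.util


def magic_status() -> dict:
    """Report whether libmagic and a Python 'magic' module can be located.

    Illustrative check only; this is not part of eml_parser.
    """
    return {
        # find_library returns None when no matching shared library is found.
        "libmagic_found": ctypes.util.find_library("magic") is not None,
        # find_spec returns None when no top-level module of that name is importable.
        "magic_module_found": importlib.util.find_spec("magic") is not None,
    }


status = magic_status()
print(status)
```

Both file-magic and python-magic install a module named `magic`, so this check cannot tell which of the two is present.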

Warning:

This release is compatible with Python 3 only. The last release compatible with
Python 2 is v1.2; if you require Python 2 support, please download that version.
You are strongly encouraged to use Python 3, however, as it brings many parsing improvements
and much better RFC support.

Usage example:

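import datetime
import json

import eml_parser


def json_serial(obj):
    """Serialize datetime objects, which json.dumps cannot handle natively."""
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    # Fail loudly instead of silently emitting null for unsupported types.
    raise TypeError('Type {} is not JSON serializable'.format(type(obj).__name__))


# Read the e-mail as raw bytes.
with open('sample.eml', 'rb') as fhdl:
    raw_email = fhdl.read()

parsed_eml = eml_parser.eml_parser.decode_email_b(raw_email)

print(json.dumps(parsed_eml, default=json_serial))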

For a minimal EML file, this produces output similar to the following:

{
  "body": [
    {
      "content_header": {
        "content-language": [
          "en-US"
        ]
      },
      "hash": "6c9f343bdb040e764843325fc5673b0f43a021bac9064075d285190d6509222d"
    }
  ],
  "header": {
    "received_src": null,
    "from": "[email protected]",
    "to": [
      "[email protected]"
    ],
    "subject": "Sample EML",
    "received_foremail": [
      "[email protected]"
    ],
    "date": "2013-04-26T11:15:47+00:00",
    "header": {
      "content-language": [
        "en-US"
      ],
      "received": [
        "from localhost\tby mta.example.com (Postfix) with ESMTPS id 6388F684168\tfor <[email protected]>; Fri, 26 Apr 2013 13:15:55 +0200"
      ],
      "to": [
        "[email protected]"
      ],
      "subject": [
        "Sample EML"
      ],
      "date": [
        "Fri, 26 Apr 2013 11:15:47 +0000"
      ],
      "message-id": [
        "<[email protected]>"
      ],
      "from": [
        "John Doe <[email protected]>"
      ]
    },
    "received_domain": [
      "mta.example.com"
    ],
    "received": [
      {
        "with": "esmtps id 6388f684168",
        "for": [
          "[email protected]"
        ],
        "by": [
          "mta.example.com"
        ],
        "date": "2013-04-26T13:15:55+02:00",
        "src": "from localhost by mta.example.com (postfix) with esmtps id 6388f684168 for <[email protected]>; fri, 26 apr 2013 13:15:55 +0200"
      }
    ]
  }
}
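
The result is a plain nested dictionary, so individual fields can be read with ordinary key lookups. The sketch below works on a hand-written dictionary that mirrors part of the sample output above, so it runs without eml_parser installed; the helper name `summarize` and the keys of its return value are hypothetical. Note that in the real result the date values are `datetime` objects until they are serialized.

```python
# A trimmed copy of the sample output above, written by hand for illustration.
parsed_eml = {
    "body": [
        {"hash": "6c9f343bdb040e764843325fc5673b0f43a021bac9064075d285190d6509222d"}
    ],
    "header": {
        "subject": "Sample EML",
        "date": "2013-04-26T11:15:47+00:00",
        "received_domain": ["mta.example.com"],
        "received": [
            {"by": ["mta.example.com"], "date": "2013-04-26T13:15:55+02:00"}
        ],
    },
}


def summarize(parsed: dict) -> dict:
    """Collect a few commonly used fields, tolerating missing keys."""
    header = parsed.get("header", {})
    return {
        "subject": header.get("subject"),
        "date": header.get("date"),
        "relay_domains": header.get("received_domain", []),
        "hops": len(header.get("received", [])),
        "body_hashes": [part.get("hash") for part in parsed.get("body", [])],
    }


summary = summarize(parsed_eml)
print(summary)
```

Using `.get()` with defaults keeps the lookup from raising `KeyError` when a message lacks a field such as `received`.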
