Email header analysis can be an important part of a forensic investigation. Although there are now many online tools that will do it for you, being able to read and understand an email header yourself is a valuable skill.
For starters, let’s define some basic email terminology:
Mail User Agent (MUA) - client application running on your computer that is used to send and receive email.
Mail Transfer Agent (MTA) - accepts messages from a sender and routes them along to their destinations.
Sender Policy Framework (SPF) - defines a mechanism by which an organization can specify the server(s) that are allowed to send email on behalf of its domain. If an email fails an SPF check, that can be an indicator of spam.
Domain Keys Identified Mail (DKIM) - provides a cryptographic method of verifying whether a received email actually originated from the sending domain. DKIM uses DNS text records in the following manner: the owner of a domain generates a public-private key pair and publishes the public key in such a record. Mail servers sending mail on behalf of the domain hash the message body of a given message and then encrypt the hash with the corresponding private key, which only those servers have access to. This creates a signature. The recipient can then obtain the public key via DNS and decrypt the signature. The recipient calculates the same hash, and if it matches the value within the decrypted signature, confidence is high that the message was unaltered in transit and originated from the sending domain.
The three most important types of data that appear in email headers are:
1. RFC defined header fields
2. X-header fields: experimental headers, added for spam filtering, authentication results, tracking and more
3. IPv4 or IPv6 addresses
Here, I will go through the most common fields, found in almost every email, and how those fields should be interpreted.
Message headers usually have this format:
Tip #1: An email header is analyzed from the bottom to the top
The Received field at the top was added by the most recent server, the one closest to the destination on the whole “email path”, and the Received field at the very bottom is the one closest to the source of the email. The number of Received headers varies with the number of MTAs the email traversed on its path from the source to the destination.
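To make the bottom-to-top reading concrete, here is a minimal sketch using Python's standard `email` module. The sample message, hostnames and IP addresses are made up for illustration, and `received_hops` is a hypothetical helper name, not part of any library.

```python
from email import message_from_string

# Hypothetical sample message; hostnames and addresses are invented.
RAW_MESSAGE = """\
Received: from mx.example.net (mx.example.net [203.0.113.7])
\tby inbound.example.org with ESMTP; Tue, 02 Jan 2024 10:00:05 +0000
Received: from client.example.com (client.example.com [198.51.100.23])
\tby mx.example.net with ESMTP; Tue, 02 Jan 2024 10:00:01 +0000
From: alice@example.com
To: bob@example.org
Subject: Hello

Body text.
"""


def received_hops(raw_message):
    """Return the Received headers ordered from source to destination."""
    msg = message_from_string(raw_message)
    hops = msg.get_all("Received", [])
    # Collapse folded (multi-line) header values into single lines.
    cleaned = [" ".join(hop.split()) for hop in hops]
    # Each server prepends its header, so the bottom one is the oldest hop.
    return list(reversed(cleaned))


for number, hop in enumerate(received_hops(RAW_MESSAGE), start=1):
    print(f"hop {number}: {hop}")
```

The first hop printed is the one closest to the sender, and the last is the one closest to the recipient.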
Tip #2: Look at the domain of the sender (the from clause in Received). If the IP address belongs to the address space allocated to that domain, that is one reassurance that the message is legitimate.
For example, if the domain in the Received field closest to the destination is apple.com, with an IP address starting with 17.#.#.#, we can check whether that address falls within the NetRange allocated to Apple.
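Once a range is known, the membership test itself is easy to script. The sketch below uses Python's standard `ipaddress` module; the sample addresses are hypothetical, and 17.0.0.0/8 is only assumed here as Apple's block based on the 17.x prefix mentioned above, so the actual NetRange should be confirmed before relying on it.

```python
import ipaddress

# Assumption: 17.0.0.0/8 is the block allocated to Apple; confirm with whois.
ASSUMED_APPLE_RANGE = ipaddress.ip_network("17.0.0.0/8")


def ip_in_range(ip_text, network):
    """Return True if the address falls inside the given network."""
    return ipaddress.ip_address(ip_text) in network


# Hypothetical addresses, as they might appear in a Received header.
print(ip_in_range("17.58.6.41", ASSUMED_APPLE_RANGE))   # inside the range
print(ip_in_range("203.0.113.7", ASSUMED_APPLE_RANGE))  # outside the range
```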
That can be checked with whois from the terminal by typing this command:
And the output should be something like this:
Breaking down the DKIM-Signature field, it consists of the following components:
v= the version of the DKIM signature
a= the algorithm used to generate the signature; rsa-sha256 is the common value
c= the canonicalization algorithm, which indicates how much modification of the email is tolerated. It is not uncommon for mail servers to make small modifications to a message while in transit, and this field indicates whether such a message is still acceptable. The value relaxed means that minor changes are acceptable and will not invalidate the signature; the value simple means that such changes are not acceptable. Usually this field has a format like c=relaxed/relaxed, where the first value is for the header and the second one is for the body.
d= the domain owned by the sender
s= the selector; together with d= it is used to locate the public key via a DNS text record
t= a timestamp, which should always match (or be in close proximity to) the timestamp in the adjacent Received field
bh= the body hash, computed with the hashing algorithm in use and then encoded in base64
b= the DKIM signature itself, calculated over the header fields specified in the h= tag as well as the DKIM-Signature header field itself, which in turn contains the body hash
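The tag list above can be pulled apart programmatically. This is a rough sketch, not a full DKIM parser: the signature value is a made-up placeholder (the bh= and b= values are not real hashes), and `parse_dkim_tags` and `key_record_name` are hypothetical helper names. It also shows how the selector and domain combine into the DNS name where the public key is published.

```python
def parse_dkim_tags(header_value):
    """Split a DKIM-Signature value into a {tag: value} dictionary."""
    tags = {}
    for part in header_value.split(";"):
        part = part.strip()
        if not part:
            continue
        # Split on the first "=" only, so base64 padding in values survives.
        name, _, value = part.partition("=")
        # Simplification: drop all whitespace, since values may be folded.
        tags[name.strip()] = "".join(value.split())
    return tags


def key_record_name(tags):
    """DNS name of the TXT record holding the key: <s>._domainkey.<d>."""
    return f"{tags['s']}._domainkey.{tags['d']}"


# Hypothetical header value; bh= and b= are placeholders, not real data.
SAMPLE_SIGNATURE = (
    "v=1; a=rsa-sha256; c=relaxed/relaxed; d=example.com; s=selector1; "
    "t=1704189600; h=from:to:subject:date; "
    "bh=PLACEHOLDERBODYHASH=; b=PLACEHOLDER\r\n SIGNATURE="
)

tags = parse_dkim_tags(SAMPLE_SIGNATURE)
header_canon, _, body_canon = tags["c"].partition("/")
print("algorithm:", tags["a"])
print("header canonicalization:", header_canon)
print("body canonicalization:", body_canon)
print("public key record:", key_record_name(tags))
```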
The Message-ID is generated by the first MTA the message traverses, referred to as the MSA (Mail Submission Agent). A message identifier pertains to exactly one version of a particular message, and each subsequent revision of a message receives a new identifier. Finding a repeated Message-ID within the same email system could therefore be an indication of forgery.
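Checking a set of messages for repeated identifiers can be sketched in a few lines. The messages below are invented samples, and `duplicate_message_ids` is a hypothetical helper; a repeated identifier is only a hint that deserves a closer look, not proof of forgery.

```python
from collections import Counter
from email import message_from_string


def duplicate_message_ids(raw_messages):
    """Return the Message-ID values that occur more than once."""
    counts = Counter()
    for raw in raw_messages:
        message_id = message_from_string(raw).get("Message-ID")
        if message_id:
            counts[message_id.strip()] += 1
    return sorted(mid for mid, seen in counts.items() if seen > 1)


# Hypothetical mailbox contents.
SAMPLE_MESSAGES = [
    "Message-ID: <abc123@mail.example.com>\nSubject: Invoice\n\nFirst copy.\n",
    "Message-ID: <def456@mail.example.com>\nSubject: Hello\n\nUnrelated.\n",
    "Message-ID: <abc123@mail.example.com>\nSubject: Invoice\n\nAltered copy.\n",
]

print(duplicate_message_ids(SAMPLE_MESSAGES))
```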
This is also known as the bounce address and is NOT the same as the From address. Usually it will match the return path; if it does not, that may be a warning sign.
X-headers are experimental headers that can be used for administrative purposes. The most common one is probably X-Originating-IP, which is used to store the IP address of the sender. These headers are populated by mail services such as Office365, Google, etc.
Domain-based Message Authentication, Reporting and Conformance (DMARC) builds on both SPF and DKIM. It can be configured so that a specific action is taken based on the results of the SPF and DKIM checks.
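As a rough illustration of how a policy turns check results into an action, the sketch below reads the p= tag of a DMARC record and applies it only when neither SPF nor DKIM produced an aligned pass. It is deliberately simplified: the record is a made-up example, `dmarc_disposition` is a hypothetical helper, and other tags such as pct= and sp= are ignored.

```python
def dmarc_disposition(record, spf_aligned_pass, dkim_aligned_pass):
    """Return "pass", or the policy action (none, quarantine, reject)."""
    tags = {}
    for part in record.split(";"):
        name, _, value = part.strip().partition("=")
        if name:
            tags[name.strip().lower()] = value.strip()
    # DMARC is satisfied when either check passes with an aligned domain.
    if spf_aligned_pass or dkim_aligned_pass:
        return "pass"
    # Otherwise the domain owner's published policy applies.
    return tags.get("p", "none").lower()


# Hypothetical DNS TXT record, published at _dmarc.example.com.
SAMPLE_RECORD = "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com"

print(dmarc_disposition(SAMPLE_RECORD, spf_aligned_pass=False, dkim_aligned_pass=True))
print(dmarc_disposition(SAMPLE_RECORD, spf_aligned_pass=False, dkim_aligned_pass=False))
```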
Something to always keep in mind is that email is perhaps the most dangerous entry point and a target of constant attack. The best defense is usually common sense, but, of course, it doesn’t hurt to get a little technical and be able to prove that a threat really exists.