Postmark Algorithm Details

Ian Boyd

Is Microsoft going to release the algorithm used to generate the
x-cr-hashedpuzzle
and
x-cr-puzzleid

header entries for the new Postmark feature, so the entire world can benefit
from this idea? Or will you be keeping it a secret, so that only Outlook
users benefit?


Outlook 2007 adds to (some) outgoing e-mails the two header entries listed
above. These are computationally expensive to generate, and are used as a
deterrent for spammers who are trying to send millions of spams at one time.
The receiving Outlook 2007 calculates the same "puzzle" to see if it
matches. If it's a valid postmark, it's a safe bet the e-mail isn't a spam.

Outlook 2007 only adds the postmark to outgoing messages that it thinks
might get classified as spam on the receiving end.
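
The postmark algorithm itself isn't published, so the following is not it. It is only a generic hashcash-style proof-of-work sketch in Python, to illustrate the "expensive to generate, cheap to check" idea. The function names, the use of SHA-1 with a decimal nonce, and the difficulty parameter are all assumptions made for illustration.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(message: bytes, difficulty: int) -> int:
    """Sender side: search for a nonce (expensive, grows with difficulty)."""
    for nonce in count():
        digest = hashlib.sha1(message + str(nonce).encode("ascii")).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(message: bytes, nonce: int, difficulty: int) -> bool:
    """Receiver side: a single hash is enough to check the work."""
    digest = hashlib.sha1(message + str(nonce).encode("ascii")).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    # Hypothetical message material: recipient, date and subject glued together.
    msg = b"user@example.com;Mon, 29 May 2006 04:14:05 GMT;new tutorial"
    nonce = solve(msg, 12)
    print(nonce, verify(msg, nonce, 12))
```

Each extra bit of difficulty roughly doubles the sender's work, while the receiver's check stays at one hash.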



From http://office.microsoft.com/en-us/outlook/HA100625921033.aspx

<quote>
Sending e-mail
Before messages leave your Outbox, Office Outlook 2007 stamps each
message with an e-mail postmark. The postmark incorporates unique
characteristics of the message, including the list of recipients and the
time when the message was sent. As a result, the postmark is valid only for
that e-mail message. It takes some extra computer processing time to
construct the postmark. As a result, it takes a little longer for messages
to leave your Outbox. This is the computational cost incurred with using
Outlook E-mail Postmarking.

Receiving e-mail
When a recipient e-mail application that supports Outlook E-mail
Postmarking receives a postmarked message, it will recognize the postmark.
The postmark indicates to the recipient e-mail application that the message
is not likely to be spam and is taken into account when the message is
evaluated by the e-mail application's spam filter.

Why wouldn't spammers use this feature for their own benefit? Good question.
Here's why: Spammers rely on being able to send thousands of spam messages
per hour. To generate a postmark for each message and continue to send these
at the same rate as when they were sent without postmarks, spammers would
need to spend a significant amount of money to acquire more computers.
Therefore, spammers are less likely to send messages that are postmarked.
</quote>


Some of us want to incorporate this technique into our own e-mail client
programs - but it's only going to work if we all use the same algorithm. And
since Microsoft has taken the lead on this, it's their job to make their
algorithm open to the world.


keywords:
Outlook 2007 postmark x-cr-hashedpuzzle x-cr-puzzleid
 
Milly Staples [MVP - Outlook]

Note - this is not Microsoft and the posters here do not work for Microsoft.

As for your question, I tend to doubt that Microsoft will publish anything that the spammers can read to use to evade the spam filters. You are free to contact Microsoft directly to inquire about licensing this technology but I tend to doubt that you will be successful or monetarily inclined to do so. It can get very expensive.


--
Milly Staples [MVP - Outlook]

Post all replies to the group to keep the discussion intact. All
unsolicited mail sent to my personal account will be deleted without
reading.

 
Roady [MVP]

"Or will you be keeping it a secret, so that only Outlook users benefit?"
Which "you" are you talking to? Microsoft doesn't (actively) live here.
Contact Microsoft directly if you want a statement on this or check on how
this technique is patented.

--
Robert Sparnaaij [MVP-Outlook]
Coauthor, Configuring Microsoft Outlook 2003


-----
 
Ian Boyd

The postmark headers are of the form:

x-cr-hashedpuzzle: [b01] [b02] [b03] [b04] [b05] [b06] [b07] [b08]
[b09] [b10] [b11] [b12] [b13] [b14] [b15] [b16];n;[recipients];
Sosha1_v1;7;[PuzzleID];[sender];[datetime];[subject]
x-cr-puzzleid: [PuzzleID]

[b01]-[b16]
Sixteen individually base64-encoded byte values. I've seen each one range
between 2 and 4 bytes long.

n
The number of recipients in the next field

[recipients]
A base64-encoded Unicode string of the e-mail recipients. The recipients
are separated by semicolons.

"Sosha1_v1"
Undoubtedly the algorithm name.

"7"
No idea.

[PuzzleID]
A copy of the PuzzleID given in the x-cr-puzzleid header field

[sender]
A base64-encoded Unicode string of the sender's e-mail address

[datetime]
Date and time the e-mail was sent (saved?)
e.g. Mon, 29 May 2006 04:14:05 GMT

[subject]
Base64 encoded version of the e-mail's subject line



So, taking an example from an HP presentation on Outlook 2007:
(http://www.mreach-art.com/HP-Exchange Academy/Final C05-Outlook-V4 0-1.pdf)

x-cr-hashedpuzzle: ASam AmlY A3ly BhAS Bx+p CoEw Cv9o Cxb6
D9Qv EhJ1 FWiW Fymo Hro4 HuWC ITRj Izc3;
1;YgB5AC0AZQB4AHAAcgBlAHMAcwBpAG8AbgAtAHcAZQBiA
C0AZABlAHMAaQBnAG4AZQByAEAAZwBvAG8A
ZwBsAGUAZwByAG8AdQBwAHMALgBjAG8AbQA=;
Sosha1_v1;7;{5064D5BE-8988-439C-A862-F2A7DED6F06F};
YwBkAHcAaQBzAGUAQAB3AGkAcwBlAHIAdwBhAHkAcwAu
AGMAbwBtAA==;Mon, 29 May 2006 04:14:05 GMT;
bgBlAHcAIAB0AHUAdABvAHIAaQBhAGwA
x-cr-puzzleid: {5064D5BE-8988-439C-A862-F2A7DED6F06F}

we extract:

16 words
=============
ASam 01 26 a6
AmlY 02 69 58
A3ly 03 79 72
BhAS 06 10 12
Bx+p 07 1f a9
CoEw 0a 81 30
Cv9o 0a ff 68
Cxb6 0b 16 fa
D9Qv 0f d4 2f
EhJ1 12 12 75
FWiW 15 68 96
Fymo 17 29 a8
Hro4 1e ba 38
HuWC 1e e5 82
ITRj 21 34 63
Izc3 23 37 37

n
===
1 recipient

Recipients
=========
(e-mail address removed)

Algorithm
========
Sosha1_v1;7;

PuzzleID
=========
{5064D5BE-8988-439C-A862-F2A7DED6F06F};

Sender
=======
(e-mail address removed)

DateTime
=========
Mon, 29 May 2006 04:14:05 GMT

Subject
======
new tutorial
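
The extraction above can be reproduced with a few lines of Python. This is only a sketch based on the field layout observed so far: the function and key names are made up, the meaning of the "7" field is still unknown, and the UTF-16LE decoding is inferred from the example rather than from any documentation.

```python
import base64

# The example header value from above, with the line folding removed.
EXAMPLE = (
    "ASam AmlY A3ly BhAS Bx+p CoEw Cv9o Cxb6 "
    "D9Qv EhJ1 FWiW Fymo Hro4 HuWC ITRj Izc3;"
    "1;YgB5AC0AZQB4AHAAcgBlAHMAcwBpAG8AbgAtAHcAZQBiA"
    "C0AZABlAHMAaQBnAG4AZQByAEAAZwBvAG8A"
    "ZwBsAGUAZwByAG8AdQBwAHMALgBjAG8AbQA=;"
    "Sosha1_v1;7;{5064D5BE-8988-439C-A862-F2A7DED6F06F};"
    "YwBkAHcAaQBzAGUAQAB3AGkAcwBlAHIAdwBhAHkAcwAu"
    "AGMAbwBtAA==;Mon, 29 May 2006 04:14:05 GMT;"
    "bgBlAHcAIAB0AHUAdABvAHIAaQBhAGwA"
)


def decode_text(field: str) -> str:
    """Decode a base64 field holding UTF-16LE text (assumed encoding)."""
    compact = "".join(field.split())  # drop whitespace left by header folding
    return base64.b64decode(compact).decode("utf-16-le")


def parse_hashed_puzzle(value: str) -> dict:
    """Split an x-cr-hashedpuzzle value into the nine observed fields."""
    parts = value.split(";")
    if len(parts) != 9:
        raise ValueError(f"expected 9 fields, got {len(parts)}")
    words, n, recipients, algorithm, unknown, puzzle_id, sender, sent, subject = parts
    return {
        "words": [base64.b64decode(w) for w in words.split()],
        "recipient_count": int(n),
        "recipients": decode_text(recipients).split(";"),
        "algorithm": algorithm.strip(),
        "unknown": unknown.strip(),  # the "7" field, meaning not known
        "puzzle_id": puzzle_id.strip(),
        "sender": decode_text(sender),
        "datetime": sent.strip(),
        "subject": decode_text(subject),
    }


if __name__ == "__main__":
    fields = parse_hashed_puzzle(EXAMPLE)
    print(len(fields["words"]), fields["algorithm"], fields["subject"])
```

Splitting on ";" is safe here only because the recipients, sender and subject are base64-encoded, so any semicolons inside them never appear literally.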



It's odd that each of [b01]-[b16] is an individually encoded sequence of 2,
3, or 4 bytes.

Just to prove that it's not one long Base64 sequence:
COKu 08 e2 ae
Derl 0d ea e5
IMx1 20 cc 75
XzD5 5f 30 f9
ZfAL 65 f0 0b
b+18 6f ed 7c
qQ3H a9 0d c7
ADT4oA== 00 34 f8 a0
ADVjgw== 00 35 63 83
AEt4TA== 00 4b 78 4c
AEuWrg== 00 4b 96 ae
AJdQeg== 00 97 50 7a
AJgFGA== 00 98 05 18
AK0vxA== 00 ad 2f c4
AMFomw== 00 c1 68 9b
ANULeA== 00 d5 0b 78

and
cYU= 71 85
pto= a6 da
ANj5 00 d8 f9
APqd 00 fa 9d
AW5h 01 6e 61
Awh4 03 08 78
B9wj 07 dc 23
DRzK 0d 1c ca
DYc8 0d 87 3c
D1VN 0f 55 4d
EEn9 10 49 fd
EYNW 11 83 56
FEa4 14 46 b8
HItp 1c 8b 69
Hhi0 1e 18 b4
IL33 20 bd f7
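
The hex columns above are easy to regenerate. A minimal Python sketch (the helper name is made up) that decodes each word on its own and shows the 2-, 3- and 4-byte cases:

```python
import base64


def decode_word(word: str) -> str:
    """Decode one base64 word and return its bytes as spaced hex."""
    return " ".join(f"{b:02x}" for b in base64.b64decode(word))


if __name__ == "__main__":
    # One word of each length seen so far: 2, 3 and 4 bytes.
    for word in ("cYU=", "ANj5", "ADT4oA=="):
        print(word, decode_word(word))
```

The "=" padding at the end of individual words is what gives it away: padding can only appear at the end of a base64 stream, so each word has to be its own stream.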


Why didn't you hash the e-mail addresses before base64 encoding them?
Spammers can now google for "puzzleid" and get hundreds of working e-mail
addresses. People aren't going to like that you're leaking out personal
information. It would have made much more sense to SHA1 them first. That way
it's computationally more expensive, and the e-mail addresses are not
unknowingly leaked out.
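
To make the suggestion concrete, here is a rough Python sketch of what "SHA1 them first" could look like. This is my proposal, not what Outlook does; the function name, the lower-casing and the UTF-16LE encoding are assumptions, and the address is a placeholder.

```python
import base64
import hashlib


def hashed_address(address: str) -> str:
    """Return base64(SHA-1(address)) instead of the readable address."""
    digest = hashlib.sha1(address.lower().encode("utf-16-le")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Placeholder address, not taken from any real header.
    print(hashed_address("user@example.com"))
```

The receiving client already knows its own address, so it could compute the same value and compare, while someone harvesting headers would only see a 28-character digest.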

I still have no idea what [b01]-[b16] are. It's telling that there are always
16 of them, but odd that they're variable length. It seems that messages
with more recipients have the longer b64 words (the ones generating 8 base64
letters had 11 recipients).
 
Milly Staples [MVP - Outlook]

As said before, this is NOT Microsoft - so why are you publishing your findings here making it ALL THE EASIER for spammers to figure this out? Or are you just trying to PO some folks?

--
Milly Staples [MVP - Outlook]

Post all replies to the group to keep the discussion intact. All
unsolicited mail sent to my personal account will be deleted without
reading.

 
Ian Boyd

| As said before, this is NOT Microsoft - so why are you publishing
| your findings here making it ALL THE EASIER for spammers
| to figure this out? Or are you just trying to PO some folks?

Your follow-ups came while I was investigating and composing my own
follow-up. I simply hadn't seen anyone say before that "this is NOT
Microsoft".

Why am I posting here? Because Google will crawl it, and I hope I
can help others by sharing what I've learned.
 
Ian Boyd

| Contact Microsoft directly if you want a statement on this or check on how
| this technique is patented.

I really, truly, and honestly have no idea how to contact the appropriate
person at Microsoft.
 
