I have hacked up my local posterity bzr branch to process messages to extract contacts (and mailing lists). These contacts result in synthetic tags (to, from, and cc) applied to each message. My changes also include maintaining per-time interval (day, ISO week, month, year, ever) sparse counts for each tag.
Visophyte (bzr trunk) has been augmented to create bar-graph style sparklines (as coined/created/etc. by Tufte). Visterity, my happy-go-visualizy posterity plugin (also in the visophyte repository), consumes the new posterity contact statistics and produces what you see above.
If you’re not sure what you’re seeing, each bar is a week. The grey-colored bars ‘above’ the invisible line are messages from that contact (to anyone). The red-colored bars ‘below’ the line are messages to that contact (from anyone), while the lighter-red-colored-bars below the line are messages cc’ed or bcc’ed to that contact (from anyone).
It is important to reiterate that, at this time, these from/to/cc relations have nothing to do with the person whose email repository it is (mine, in this case). If messages are red, that doesn’t mean I sent them the message; it just means someone sent them a message and it somehow passed through my account. Of course, when viewing a list of contacts, I’m only really going to care about that person’s interactions with me. So that is a must-have and up on the near-term to-do list. The current set of tags are really most useful in attempting to visualize messages sent through a mailing list or other broadcast medium, which is also of great interest to me.
I should probably also note that when writing the list-handling logic here, I forgot to have the code generate an implied ‘to’ when the person replied only to the list but the author of the previous message could be presumed to be the intended recipient of the message. Which explains why so many of the people in the example image up there do not end up receiving many messages. I would have fixed this, but re-processing my full downloaded gmail corpus of something like 68,000 messages takes a while.
Also, the blurred out guy in the example doesn’t need to be blurred out. I ended up eliding a useless column that I had decided to blur for no clear reason, but in my sleep-desiring state I also screwed up and blurred a column that really didn’t need to be blurred. (All the people shown have posted to public mailing lists, so their e-mail addresses are already out their, and their names aren’t exactly going to get them more spam. And OCR would be required anyways, as the sparklines are images rather than inline-SVG for caching reasons.)
UPDATE: Uh, as I quickly re-read the post and looked at the sparkline, I realized I completely flipped (both color-and above/below) the sparkline from what I had originally intended. Thankfully my original intent didn’t make all that much sense either, at least color-wise, so I think I can sleep easy without correcting that. Better color suggestions,etc. are appreciated.