<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>visophyte: shiny? shiny. &#187; Email</title>
	<atom:link href="http://www.visophyte.org/blog/category/visualizing/email/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.visophyte.org/blog</link>
	<description>Andrew Sutherland writes things but (almost) always includes pictures to look at.</description>
	<lastBuildDate>Sun, 05 Feb 2012 05:25:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>So&#8217;s your facet: Faceted global search for Mozilla Thunderbird</title>
		<link>http://www.visophyte.org/blog/2009/09/03/sos-your-facet-faceted-global-search-for-mozilla-thunderbird/</link>
		<comments>http://www.visophyte.org/blog/2009/09/03/sos-your-facet-faceted-global-search-for-mozilla-thunderbird/#comments</comments>
		<pubDate>Fri, 04 Sep 2009 06:08:50 +0000</pubDate>
		<dc:creator>Andrew Sutherland</dc:creator>
				<category><![CDATA[Email]]></category>
		<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[Thunderbird]]></category>
		<category><![CDATA[facets]]></category>
		<category><![CDATA[gloda]]></category>
		<category><![CDATA[simile]]></category>
		<category><![CDATA[timeline]]></category>

		<guid isPermaLink="false">http://www.visophyte.org/blog/?p=374</guid>
		<description><![CDATA[Following in the footsteps of the MIT SIMILE project&#8217;s Exhibit tool (originally authored by David Huynh) and Thunderbird Seek extension (again by David Huynh), we are hoping to land faceted global search for Thunderbird 3.0 (a la gloda) in beta 4. I think it&#8217;s important to point out how ridiculously awesome the Seek extension is.  [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.visophyte.org/blog/wp-content/uploads/2009/09/faceting-gloda-hover-davida-1.png"><img class="alignnone size-thumbnail wp-image-376" title="faceting-gloda-hover-davida-1" src="http://www.visophyte.org/blog/wp-content/uploads/2009/09/faceting-gloda-hover-davida-1-600x551.png" alt="faceting-gloda-hover-davida-1" width="600" height="551" /></a></p>
<p>Following in the footsteps of the <a href="http://web.mit.edu/">MIT</a> <a href="http://simile.mit.edu/">SIMILE</a> project&#8217;s <a href="http://www.simile-widgets.org/exhibit/">Exhibit</a> tool (originally authored by <a href="http://davidhuynh.net/">David Huynh</a>) and <a href="http://code.google.com/p/simile-seek/">Thunderbird Seek extension</a> (again by <a href="http://davidhuynh.net/">David Huynh</a>), we are hoping to land faceted global search for Thunderbird 3.0 (a la gloda) in beta 4.</p>
<p>I think it&#8217;s important to point out how ridiculously awesome the Seek extension is.  It is the only example of faceted browsing or search in an e-mail client that I am aware of.  (Note: I have to assume there are some research e-mail clients out there with faceting, but I haven&#8217;t seen them.)  Given the data model available to extensions in Thunderbird 2.0 and the idiosyncratic architecture of the UI code in 2.0, it&#8217;s not only a feature marvel but also a technical marvel.</p>
<p>Unfortunately, there was only so much Seek could do before it hit a wall given the limitations it had to work with.  Thunderbird 2.0&#8217;s per-folder indices are just that, per-folder.  They also require (fast) O(n) search on any attribute other than their unique key.  Although Seek populated an in-memory index for each folder, it was faced with having to implement its own global indexer and persistent database.</p>
<p>Gloda is now at a point where a global database should no longer be the limiting factor for extensions, or the core Thunderbird experience&#8230;</p>
<p><a href="http://www.visophyte.org/blog/wp-content/uploads/2009/09/faceting-gloda-action-tag-hover-bienvenu-1.png"><img class="alignnone size-thumbnail wp-image-377" title="faceting-gloda-action-tag-hover-bienvenu-1" src="http://www.visophyte.org/blog/wp-content/uploads/2009/09/faceting-gloda-action-tag-hover-bienvenu-1-600x551.png" alt="faceting-gloda-action-tag-hover-bienvenu-1" width="600" height="551" /></a></p>
<p>The screenshots are of a fulltext search for &#8220;gloda&#8221; in my message store.  The first screenshot is without any facets applied and with me hovering over one of David Ascher&#8217;s e-mail addresses.  The second is after having selected the &#8220;!action&#8221; tag and hovering over one of David Bienvenu&#8217;s e-mail addresses.  Gloda has a concept of contact aggregation of identities, but owing to a want of UI for this in the address book right now, it doesn&#8217;t happen.  We do not yet coalesce (approximately) duplicate messages, which explains any apparent duplicates you see.</p>
<p>The current state of things is a result of development effort by <a href="http://ascher.ca/blog/">David Ascher</a> and me, with design input from <a href="http://clarkbw.net/blog/">Bryan Clark</a> and <a href="http://www.andreasn.se/blog/">Andreas Nilsson</a> (with hopefully much more to come soon <img src='http://www.visophyte.org/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />).  Although we aren&#8217;t using much code from our previous <a href="http://www.visophyte.org/blog/tag/exptoolbar/">exptoolbar</a> efforts, a lot of the thinking is based on the work David, Bryan, and I did on that.  Many thanks to <a href="http://mesquilla.com/">Kent James</a>, <a href="http://sid0.blogspot.com/">Siddharth Agarwal</a>, and <a href="http://blog.davidbienvenu.org/">David Bienvenu</a> for their recent and ongoing improvements to the gloda (and mailnews) back-end which help make this hopefully compelling UI feature actually usable through efficient and comprehensive indexing that does not make you want to throw your computer through a window.</p>
<p>If you use <a href="http://s3.mozillamessaging.com/build/try-server/2009-09-03_16:12-asutherland@asutherland.org-facet-v5/asutherland@asutherland.org-facet-v5-mail-try-linux.tar.bz2">Linux</a> or <a href="http://s3.mozillamessaging.com/build/try-server/2009-09-03_16:12-asutherland@asutherland.org-facet-v5/asutherland@asutherland.org-facet-v5-mail-try-mac.dmg">OS X</a>, I just linked you to try server builds.  The Windows try server was sadly on fire and so couldn&#8217;t attend the build party.  The bug tracking the enhancement is bug <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=474711">474711</a> and has repository info if you want to spin your own build.  New try server builds will also be noted there.  Please keep in mind that this is an in-progress development effort; it is not finished, and there are bugs.  Accordingly, please direct any feedback/discussion to the dev-apps-thunderbird <a href="https://lists.mozilla.org/listinfo/dev-apps-thunderbird">list</a> / <a href="news://news.mozilla.org/mozilla.dev.apps.thunderbird">newsgroup</a> rather than the bug.  Please beware that increases in awesomeness require that your gloda database be automatically blown away if you try the new version.  And first you have to <a href="https://wiki.mozilla.org/Thunderbird:Using_Gloda">turn gloda on</a> if you have not already.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.visophyte.org/blog/2009/09/03/sos-your-facet-faceted-global-search-for-mozilla-thunderbird/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>thunderbird, gloda, exptoolbar, protovis, paninaro, oh oh oh</title>
		<link>http://www.visophyte.org/blog/2009/04/01/thunderbird-gloda-exptoolbar-protovis-paninaro-oh-oh-oh/</link>
		<comments>http://www.visophyte.org/blog/2009/04/01/thunderbird-gloda-exptoolbar-protovis-paninaro-oh-oh-oh/#comments</comments>
		<pubDate>Wed, 01 Apr 2009 17:52:47 +0000</pubDate>
		<dc:creator>Andrew Sutherland</dc:creator>
				<category><![CDATA[Email]]></category>
		<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[Thunderbird]]></category>
		<category><![CDATA[Visualizing]]></category>
		<category><![CDATA[exptoolbar]]></category>
		<category><![CDATA[gloda]]></category>
		<category><![CDATA[protovis]]></category>

		<guid isPermaLink="false">http://www.visophyte.org/blog/?p=226</guid>
		<description><![CDATA[Thunderbird.  With the global database, gloda.  Using the exptoolbar extension.  Using the protovis javascript visualization library.  For reals!  Not a prank!  Just grab the most recent XPI or grab the repo.  And be using a nightly (beta 2 might work?) What you are looking at: The exptoolbar search results page, augmented with a visualization. Each [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.visophyte.org/blog/wp-content/uploads/2009/04/exptoolbar-protovis-gloda-256.png"><img class="alignnone size-thumbnail wp-image-274" title="exptoolbar-protovis-gloda-256" src="http://www.visophyte.org/blog/wp-content/uploads/2009/04/exptoolbar-protovis-gloda-256-514x600.png" alt="exptoolbar-protovis-gloda-256" width="514" height="600" /></a></p>
<p>Thunderbird.  With the global database, gloda.  Using the exptoolbar extension.  Using the <a href="http://vis.stanford.edu/protovis/">protovis javascript visualization library</a>.  For reals!  Not a prank!  Just <a href="http://clicky.visophyte.org/momo/xpis/exptoolbar/">grab the most recent XPI</a> or <a href="http://hg.mozilla.org/users/bugmail_asutherland.org/exptoolbar/">grab the repo</a>.  And be using a nightly (beta 2 might work?)</p>
<p>What you are looking at:</p>
<ul>
<li>The exptoolbar search results page, augmented with a visualization.</li>
<li>Each conversation with search results gets its own wedge.
<ul>
<li>Wedges can be distinguished because of the alternating background colors.</li>
<li>Conversations that you sent a message to will have a red shading to them.  The examples may be somewhat misleading because the account where a lot of my sent mail ends up is not part of the profile used to create the screenshots.</li>
</ul>
</li>
<li>Each message is placed in its conversation wedge&#8230;
<ul>
<li>The radius is based on the &#8216;age&#8217; of the message using a log-ish scale.  Interpolation is actually linear at each level (one day, one week, one month, three months, one year, five years, &#8216;forever&#8217;).</li>
<li>The angular placement within the wedge is based on the author of the message.  Across all wedges the placement is the same.  This helps &#8216;bursty&#8217; parts of conversations (which are extremely likely) be made more obvious, while also helping to provide some understanding of conversation dynamics.</li>
</ul>
</li>
<li>Message shapes are determined by whether the message is starred (diamond), sent by a &#8216;popular&#8217; contact (circle), or an unpopular one (cross).  The use of popularity is a temporary measure because current gloda in trunk does not cache address-book lookups, and they are expensive.  Once the new gloda search code lands with those changes, we can rely on the existence of an address book entry.  (Starring a contact using the new message reader adds them to your address book.)</li>
<li>Message opacity is determined by whether the message is a &#8216;hit&#8217; or not.  All messages in a conversation are eventually retrieved, though initially we only have the hits.</li>
<li>Message color is determined by applied tags (using the closest tango color for the first tag), or whether the message is starred (closest tango color to yellow, where I think I had removed the yellow tango colors for some unknown reason, so we get green I guess).  It&#8217;s grey if the message has no tag or star.</li>
<li>The subject of the conversation is displayed in the wedge.</li>
</ul>
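<p>The log-ish radius scale described in the list above can be sketched roughly as follows.  This is a minimal Python sketch rather than the extension&#8217;s actual JavaScript: the breakpoints come from the list, but the day counts chosen for a month and a year, the decision to clamp anything in the &#8216;forever&#8217; band to the outer rim, and the name <code>age_to_radius</code> are all assumptions made for illustration.</p>

```python
from bisect import bisect_left

# Upper edge of each finite level, in days (month and year lengths are assumed).
LEVEL_EDGES_DAYS = [1, 7, 30, 90, 365, 5 * 365]
# One extra band is reserved for 'forever' (anything older than the last edge).
BAND_COUNT = len(LEVEL_EDGES_DAYS) + 1


def age_to_radius(age_days, max_radius=1.0):
    """Map a message age in days to a radius, linear within each level."""
    age_days = max(age_days, 0.0)
    # Index of the first level whose upper edge is at or beyond this age.
    band = bisect_left(LEVEL_EDGES_DAYS, age_days)
    if band == len(LEVEL_EDGES_DAYS):
        # Older than five years: there is no upper edge to interpolate
        # toward, so this sketch simply clamps to the outer rim.
        return max_radius
    lower = LEVEL_EDGES_DAYS[band - 1] if band > 0 else 0.0
    upper = LEVEL_EDGES_DAYS[band]
    fraction = (age_days - lower) / (upper - lower)
    # Every level gets an equally thick ring, which is what makes it log-ish.
    return max_radius * (band + fraction) / BAND_COUNT
```

<p>With these assumed breakpoints, a four-day-old message lands halfway through the second of the seven rings, while a message a decade old sits on the rim.</p>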
<p><a href="http://www.visophyte.org/blog/wp-content/uploads/2009/04/exptoolbar-protovis-seek-thunderbird-256.png"><img class="alignnone size-thumbnail wp-image-275" title="exptoolbar-protovis-seek-thunderbird-256" src="http://www.visophyte.org/blog/wp-content/uploads/2009/04/exptoolbar-protovis-seek-thunderbird-256-600x594.png" alt="exptoolbar-protovis-seek-thunderbird-256" width="600" height="594" /></a></p>
<p>Things that are happy:</p>
<ul>
<li><a href="http://vis.stanford.edu/protovis/">protovis</a> is delightful, even at this early stage of its development.</li>
<li>The radar-styled visualization looks fairly neat, and basically proves at least feature parity with my <a href="http://www.visophyte.org/blog/2007/11/02/radial-radar-email-vis-with-care-factors/">visophyte-based radar visualization</a>.</li>
</ul>
<p>Things that are sad (aka caveats):</p>
<ul>
<li>It would probably be better if the visualization were not radar-inspired.  Besides the perceptual reasons, the subjects are harder to read than they would be in an equivalent linear-styled visualization.</li>
<li>The visualization is not interactive.  protovis officially has no interaction support yet, but if you look in the (only available minified?) source, it&#8217;s almost there.  It might be entirely there, but it didn&#8217;t work for me immediately after a quick reading of the (indented) source.</li>
<li>There is some low-probability failure that occurs while the visualization updates as gloda backfills the message collections.  If it happens on the last update, you can end up with a half-built visualization.  Re-running the search will generally resolve the issue.</li>
<li>The visualization does a pretty solid job of taking up all the screen real estate and has no way to be disabled, so you have to scroll past it every time.</li>
</ul>
<p>Future work:</p>
<ul>
<li>Interactivity.</li>
<li>Perhaps showing the gravatars for the people involved in a conversation at the outer rim of the wedge, positioning them based on the author positioning we determined.</li>
<li>Perhaps lose the radar motif.</li>
<li>Your thoughts / patches!</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.visophyte.org/blog/2009/04/01/thunderbird-gloda-exptoolbar-protovis-paninaro-oh-oh-oh/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Thunderbird and gloda go to meme-town</title>
		<link>http://www.visophyte.org/blog/2008/12/12/thunderbird-and-gloda-go-to-meme-town/</link>
		<comments>http://www.visophyte.org/blog/2008/12/12/thunderbird-and-gloda-go-to-meme-town/#comments</comments>
		<pubDate>Fri, 12 Dec 2008 20:30:01 +0000</pubDate>
		<dc:creator>Andrew Sutherland</dc:creator>
				<category><![CDATA[clicky]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[Shiny]]></category>
		<category><![CDATA[Thunderbird]]></category>
		<category><![CDATA[Visualizing]]></category>
		<category><![CDATA[exptoolbar]]></category>
		<category><![CDATA[gloda]]></category>
		<category><![CDATA[visophyte-js]]></category>
		<category><![CDATA[word cloud]]></category>

		<guid isPermaLink="false">http://www.visophyte.org/blog/?p=153</guid>
		<description><![CDATA[Sure, a word cloud of your blog posts is cool&#8230; but what if you could take any search of your e-mail, and turn that into a word cloud?  And then, if you click on one of those words, your search constraints would be revised to use the word you clicked on (and you&#8217;d get a [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.visophyte.org/blog/wp-content/uploads/2008/12/gloda-word-cloud-gloda-1.png"><img class="alignnone size-medium wp-image-154" title="A word cloud magicked up by gloda and visophyte-js" src="http://www.visophyte.org/blog/wp-content/uploads/2008/12/gloda-word-cloud-gloda-1-300x283.png" alt="" width="300" height="283" /></a></p>
<p>Sure, a word cloud of your blog posts is cool&#8230; but what if you could take any search of your e-mail, and turn that into a word cloud?  And then, if you click on one of those words, your search constraints would be revised to use the word you clicked on (and you&#8217;d get a useful search result, not another word cloud)?  And what if that layout algorithm were not as good as wordle?  The future is now, people!  (At least if you install like 5 extra extensions out of mercurial.)</p>
<p>The screenshot above is from Thunderbird trunk with a hacked exptoolbar extension (generalized, committed changes happening soon), visophyte-js, and the new glodacloud extension.  It is a proof-of-easy-gloda-extensions as suggested by David Ascher.</p>
<p>The layout algorithm is what we in the business of making up terminology call a recursive sub-optimal tic-tac-toe subdivision thinger.  We under-use a neat (and somewhat slow) hack to find the bounds of the words through use of canvas.mozPathText and canvas.isPointInPath to sample a grid to know where the text is and isn&#8217;t.  It&#8217;s under-used because all we use it for right now is to find the actual height above the baseline that the text stretches to (because metrics only gives us the width).  We are lazy and don&#8217;t check below the baseline at all, and totally squander our chance to be cool and put small words in the gaps in larger words.  But given the amount of time spent, I&#8217;m very happy.</p>
<p>Oh, and of course it uses JS and Canvas.</p>
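<p>For the curious, the grid-sampling trick described above looks roughly like this.  It is a Python sketch of the idea only, not the glodacloud code: <code>is_inside</code> is a hypothetical stand-in for the canvas point-in-path test, and <code>height_above_baseline</code> and <code>toy_glyph</code> are names invented for the example.</p>

```python
def height_above_baseline(is_inside, width, max_height, step=1.0):
    """Return the highest sampled y (measured up from the baseline) that hits the shape.

    is_inside(x, y) is a hypothetical stand-in for a point-in-path test.
    """
    rows = int(max_height / step)
    cols = int(width / step)
    # Scan from the top row down so the first hit is the tallest extent.
    for row in range(rows, -1, -1):
        y = row * step
        if any(is_inside(col * step, y) for col in range(cols + 1)):
            return y
    return 0.0


def toy_glyph(x, y):
    """Toy shape standing in for rendered text: a box plus a thin ascender."""
    in_body = 0 <= x <= 10 and 0 <= y <= 6
    in_ascender = 4 <= x <= 5 and 0 <= y <= 9
    return in_body or in_ascender


print(height_above_baseline(toy_glyph, width=10, max_height=12))  # prints 9.0
```

<p>A finer <code>step</code> gives a tighter bound at the cost of more point tests, which is presumably why the real thing is somewhat slow.</p>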
]]></content:encoded>
			<wfw:commentRss>http://www.visophyte.org/blog/2008/12/12/thunderbird-and-gloda-go-to-meme-town/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>I&#8217;ll be wanting that latte machine now&#8230;</title>
		<link>http://www.visophyte.org/blog/2008/11/29/ill-be-wanting-that-latte-machine-now/</link>
		<comments>http://www.visophyte.org/blog/2008/11/29/ill-be-wanting-that-latte-machine-now/#comments</comments>
		<pubDate>Sat, 29 Nov 2008 08:46:12 +0000</pubDate>
		<dc:creator>Andrew Sutherland</dc:creator>
				<category><![CDATA[Email]]></category>
		<category><![CDATA[Shiny]]></category>
		<category><![CDATA[Thunderbird]]></category>
		<category><![CDATA[Visualizing]]></category>
		<category><![CDATA[exptoolbar]]></category>
		<category><![CDATA[gloda]]></category>
		<category><![CDATA[visophyte-js]]></category>

		<guid isPermaLink="false">http://www.visophyte.org/blog/?p=147</guid>
		<description><![CDATA[in context credits where credits due: thread arcs a la the nice people at the IBM CUE group the search view prototype is implemented by David Ascher.  the positioning of the visualization is on me as a quick hack, though. the search view prototype is designed by Bryan Clark, and he has even better stuff [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone size-full wp-image-148" title="exptoolbar-visophyte-js-thread-arcs-just-arc" src="http://www.visophyte.org/blog/wp-content/uploads/2008/11/exptoolbar-visophyte-js-thread-arcs-just-arc.png" alt="" width="155" height="75" /></p>
<p>in context</p>
<p><a href="http://www.visophyte.org/blog/wp-content/uploads/2008/11/exptoolbar-visophyte-js-thread-arcs.png"><img class="alignnone size-medium wp-image-150" title="exptoolbar-visophyte-js-thread-arcs" src="http://www.visophyte.org/blog/wp-content/uploads/2008/11/exptoolbar-visophyte-js-thread-arcs-300x124.png" alt="" width="300" height="124" /></a></p>
<p>credit where credit is due:</p>
<ul>
<li><a href="http://www.research.ibm.com/remail/threadarcs.html">thread arcs</a> a la the nice people at the <a href="http://domino.research.ibm.com/cambridge/research.nsf/pages/cue.html?Open">IBM CUE group</a></li>
<li>the search view prototype is implemented by <a href="http://ascher.ca/blog/">David Ascher</a>.  the positioning of the visualization is on me as a quick hack, though.</li>
<li>the search view prototype is designed by <a href="http://clarkbw.net/blog/">Bryan Clark</a>, and he has even better stuff on the way</li>
</ul>
<p>The actual implementation is a first step of adapting knowledge from my python &#8220;visophyte&#8221; library to a JS implementation using canvas.  I am trying a more batch-oriented style of processing this time that uses explicit attributes for value-passing between logic blocks.  This is in comparison to the python implementation which is more functional in nature.  We&#8217;ll see how it turns out.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.visophyte.org/blog/2008/11/29/ill-be-wanting-that-latte-machine-now/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>gloda&#8217;s first (primitive) visualization</title>
		<link>http://www.visophyte.org/blog/2008/07/08/glodas-first-primitive-visualization/</link>
		<comments>http://www.visophyte.org/blog/2008/07/08/glodas-first-primitive-visualization/#comments</comments>
		<pubDate>Wed, 09 Jul 2008 01:14:49 +0000</pubDate>
		<dc:creator>Andrew Sutherland</dc:creator>
				<category><![CDATA[Email]]></category>
		<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[Thunderbird]]></category>
		<category><![CDATA[Visualizing]]></category>
		<category><![CDATA[expmess]]></category>
		<category><![CDATA[gloda]]></category>

		<guid isPermaLink="false">http://www.visophyte.org/blog/?p=110</guid>
		<description><![CDATA[A primitive visualization augments the gloda &#8220;other messages by author&#8221; listing by showing the messages sent by the author over time.  Messages are stacked by day.  The currently selected message is in darkest blue and also very wide.  Other messages from the same thread/conversation are in lighter blue and less wide.  Messages not in the [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone size-full wp-image-111" title="gloda-m2pre-prim-timeline-vis" src="http://www.visophyte.org/blog/wp-content/uploads/2008/07/gloda-m2pre-prim-timeline-vis.png" alt="Author activity over time, current thread in blue, selected message in darkest blue." width="312" height="248" /></p>
<p>A primitive visualization augments the gloda &#8220;other messages by author&#8221; listing by showing the messages sent by the author over time.  Messages are stacked by day.  The currently selected message is in darkest blue and also very wide.  Other messages from the same thread/conversation are in lighter blue and less wide.  Messages not in the conversation are light grey and rather narrow.</p>
<p>It&#8217;s not clickable, it lacks any form of scale or any feedback at all, and there are scaling issues.  (If anyone wants to save me the effort of figuring out how to get the canvas to maintain a 1:1 pixel mapping to the actual display and still &#8216;flex&#8217; by adding/losing pixels, please do drop me a message or leave a comment.)  These will all change, but not yet.</p>
<p>I&#8217;ve pushed the changes to the mercurial repos and updated the stable tag, but I&#8217;m not publishing updated xpi&#8217;s, so you&#8217;ll need to roll your own if you care.  (The DB schema has not changed and so does not need to be blown away.)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.visophyte.org/blog/2008/07/08/glodas-first-primitive-visualization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Adding stews (hackish destructive accumulation/reduction) to CouchDB</title>
		<link>http://www.visophyte.org/blog/2007/12/22/adding-stews-hackish-destructive-accumulationreduction-to-couchdb/</link>
		<comments>http://www.visophyte.org/blog/2007/12/22/adding-stews-hackish-destructive-accumulationreduction-to-couchdb/#comments</comments>
		<pubDate>Sat, 22 Dec 2007 08:30:37 +0000</pubDate>
		<dc:creator>Andrew Sutherland</dc:creator>
				<category><![CDATA[CouchDB]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Software]]></category>

		<guid isPermaLink="false">http://www.visophyte.org/blog/2007/12/22/adding-stews-hackish-destructive-accumulationreduction-to-couchdb/</guid>
		<description><![CDATA[As all misguidedly-lazy programmers are wont to do, I decided that it would be easier to &#8216;enhance&#8217; CouchDB to meet my needs rather than to rewrite visotank to use SQLAlchemy. Also, I wanted to understand what CouchDB was doing under the hood with views and try my hand at some Erlang. CouchDB as currently implemented [...]]]></description>
			<content:encoded><![CDATA[<p>As all misguidedly-lazy programmers are wont to do, I decided that it would be easier to &#8216;enhance&#8217; <a href="http://couchdb.com/">CouchDB</a> to meet my needs rather than to rewrite visotank to use SQLAlchemy.  Also, I wanted to understand what CouchDB was doing under the hood with views and try my hand at some Erlang.</p>
<p align="center"> <img src="http://www.visophyte.org/blog/wp-content/uploads/2007/12/big-spiral.png" alt="This Has Nothing To Do With Anything" /></p>
<p>CouchDB as currently implemented maintains a lot of information for each mapped document.  There is a B-tree associated with each View Group whose keys are Document Ids and whose Values are a list of {View Id, Actual-Key-You-Mapped-In-That-View} tuples for every key mapped from that document for every view in the view group.  Next, each View has a B-tree associated with it whose keys are {Actual-Key-You-Mapped, Document Id} tuples and whose values are the Actual-Value-You-Mapped.</p>
<p>This is all well and good, but is a poor fit for one of my key use-cases: reducing e-mail message traffic to date-binned summary statistics so I can render graphics.  If I want the weekly-messages-sent count for a given &#8216;author&#8217;, <em>map(message.author, blah)</em> will allow me to filter only to messages sent by that author, but no matter what <em>blah</em> is, I will still get one per message.</p>
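<p>To make that bookkeeping concrete, here is a rough Python sketch of the two indexes and of why filtering by author still returns one row per message.  Plain dictionaries stand in for the B-trees, and the names <code>group_index</code>, <code>view_indexes</code>, and <code>by_author</code>, along with the document ids, are hypothetical; none of this is CouchDB&#8217;s actual Erlang code.</p>

```python
from collections import defaultdict

# Hypothetical mapped rows: (view id, document id, mapped key, mapped value).
mapped = [
    ("by_author", "msg1", "tom", 1),
    ("by_author", "msg2", "tom", 1),
    ("by_author", "msg3", "kim", 1),
]

# Per-view-group index: document id -> [(view id, mapped key), ...]
group_index = defaultdict(list)
# Per-view index: (mapped key, document id) -> mapped value
view_indexes = defaultdict(dict)

for view_id, doc_id, key, value in mapped:
    group_index[doc_id].append((view_id, key))
    view_indexes[view_id][(key, doc_id)] = value

# Filtering to one author still yields one row per message, not a total.
tom_rows = sorted(
    (entry, value)
    for entry, value in view_indexes["by_author"].items()
    if entry[0] == "tom"
)
print(len(tom_rows))  # 2
```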
<p>Long blog post short, I have implemented a hackish first-pass reduce/accumulate solution to my problem.  The idea is that &#8216;stews&#8217; allow you to aggregate mapped data that shares the same key.  I&#8217;m a little fuzzy on exactly what the definition of &#8216;reduce&#8217; is in the map/reduce papers (it&#8217;s been a while, if ever), so we&#8217;ll call this &#8216;accumulate&#8217; (in the SICP/Scheme sense). It is a hack because:</p>
<ul>
<li>It does not unify views and &#8216;stews&#8217;.  Whereas views are defined under &#8216;_design&#8217; and accessed via &#8216;_view&#8217;, stews are defined under &#8216;_pot&#8217; and accessed via &#8216;_stew&#8217;.</li>
<li>Values can only be integers right now, and it&#8217;s assumed you want to add them.  (No custom JavaScript logic!)</li>
<li>I have not yet dealt with modified/removed documents.  Which is to say that if you modify or remove a stew-mapped document, your accumulated values will climb ever-skyward.</li>
<li>It is in no way, shape, or form intended to be anything other than a learning experiment.  (It is my hope that <a href="http://damienkatz.net/">Damien Katz</a> magically solves my problems <a href="http://damienkatz.net/2007/12/couchdb_roundup.html">in the next release</a>.  Having said that, I&#8217;m not opposed to trying to actually implement a more solid feature along these lines; coding in Erlang is wicked awesome. (sounds better with a fake accent))</li>
</ul>
<p>It just so happens that these constraints are perfectly in line with visotank&#8217;s needs.  Using stews and otherwise limiting my use of views, CouchDB is less ridiculous in its view-update times and the fully-populated (view/stew-wise) from-scratch &#8216;messages&#8217; database tops out at 77M rather than 1.2G.</p>
<p align="center"> <img src="http://www.visophyte.org/blog/wp-content/uploads/2007/12/little-spiral.png" alt="This also has nothing to do with anything" /></p>
<p>Anyways, if anyone is interested in the code (or the comments I added to the existing couch_view_group.erl logic), my bzr branch for <a href="http://couchdb.com/">CouchDB</a> is at:  <a href="http://www.visophyte.org/rev_control/bzr/couchdb/visbrero-couchdb/">http://www.visophyte.org/rev_control/bzr/couchdb/visbrero-couchdb/</a> .  My bzr branch for <a href="http://code.google.com/p/couchdb-python/">couchdb-python</a>, adding a simple unit test for stews is at: <a href="http://www.visophyte.org/rev_control/bzr/couchdb-python/visbrero/">http://www.visophyte.org/rev_control/bzr/couchdb-python/visbrero/</a> .</p>
<p><strong>Update</strong>!  The bzr repository is powerful messed up, so a better choice might be my changes in patch form:  <a href="http://www.visophyte.org/rev_control/patches/couchdb/visbrero-couchdb-stews-1.patch">http://www.visophyte.org/rev_control/patches/couchdb/visbrero-couchdb-stews-1.patch</a></p>
<p><strong>Update 2</strong>! The bzr repository accessible at <a href="http://clicky.visophyte.org/rev_control/bzr/couchdb/visbrero-couchdb/">http://clicky.visophyte.org/rev_control/bzr/couchdb/visbrero-couchdb/</a> works and there&#8217;s a checkout with working copy (that you can browse) at <a href="http://clicky.visophyte.org/rev_control/bzr-checkouts/couchdb/visbrero-couchdb/">http://clicky.visophyte.org/rev_control/bzr-checkouts/couchdb/visbrero-couchdb/</a> .   Note that these locations are not guaranteed to be valid for all time, but will be good for at least a month or two.</p>
<p>I fear my (sleepy) explanation may not be sufficient, so the unit test I added to couchdb-python may speak better to this end:</p>
<p><code>self.db['tom1'] = {'author': 'tom', 'subject': 'cheese'}<br />
self.db['tom2'] = {'author': 'tom', 'subject': 'cats'}<br />
self.db['tom3'] = {'author': 'tom', 'subject': 'mice'}<br />
self.db['bob1'] = {'author': 'bob', 'subject': 'hats'}<br />
self.db['jon1'] = {'author': 'jon', 'subject': 'hats'}<br />
self.db['kim1'] = {'author': 'kim', 'subject': 'cats'}<br />
self.db['kim2'] = {'author': 'kim', 'subject': 'cows'}<br />
self.db['_pot/test'] = {'views': {<br />
'authors': 'function(doc) { map(doc.author, 1) }',<br />
'subjects': 'function(doc) { map(doc.subject, 1) }'<br />
}}<br />
authors = dict([(row.key, row.value) for row in self.db.view('_stew/test/authors')])<br />
self.assertEqual(authors['tom'], 3)<br />
self.assertEqual(authors['bob'], 1)<br />
self.assertEqual(authors['jon'], 1)<br />
self.assertEqual(authors['kim'], 2)<br />
subjects = dict([(row.key, row.value) for row in self.db.view('_stew/test/subjects')])<br />
self.assertEqual(subjects['cheese'], 1)<br />
self.assertEqual(subjects['cats'], 2)<br />
self.assertEqual(subjects['mice'], 1)<br />
</code></p>
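<p>The accumulation that the test above exercises can be mimicked in plain Python.  This is only a sketch of the semantics (integer values, summed per key), not the Erlang implementation; the function name <code>stew</code> and the idea of passing the map step as a Python callable are assumptions for illustration.</p>

```python
from collections import Counter

# The same documents the unit test stores.
docs = {
    "tom1": {"author": "tom", "subject": "cheese"},
    "tom2": {"author": "tom", "subject": "cats"},
    "tom3": {"author": "tom", "subject": "mice"},
    "bob1": {"author": "bob", "subject": "hats"},
    "jon1": {"author": "jon", "subject": "hats"},
    "kim1": {"author": "kim", "subject": "cats"},
    "kim2": {"author": "kim", "subject": "cows"},
}


def stew(docs, map_fn):
    """Sum the integer values emitted under each key across all documents."""
    totals = Counter()
    for doc in docs.values():
        for key, value in map_fn(doc):
            totals[key] += value
    return dict(totals)


authors = stew(docs, lambda doc: [(doc["author"], 1)])
subjects = stew(docs, lambda doc: [(doc["subject"], 1)])
print(authors["tom"], subjects["cats"])  # 3 2
```

<p>Because nothing is ever subtracted, running a modified document through again would add its values a second time, which is the ever-skyward climb mentioned in the caveats.</p>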
<p>Uh, the spiral visualizations have nothing to do with the post.  They are new insofar as I have never posted them before, but they are in fact rather quite old.  They have a new aspect in that they now work with the cairo renderer, having relied upon &#8216;special&#8217; (horrible) custom renderers in the old agg backend.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.visophyte.org/blog/2007/12/22/adding-stews-hackish-destructive-accumulationreduction-to-couchdb/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>SVG in visotank</title>
		<link>http://www.visophyte.org/blog/2007/12/04/svg-in-visotank/</link>
		<comments>http://www.visophyte.org/blog/2007/12/04/svg-in-visotank/#comments</comments>
		<pubDate>Tue, 04 Dec 2007 09:58:16 +0000</pubDate>
		<dc:creator>Andrew Sutherland</dc:creator>
				<category><![CDATA[clicky]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Visualizing]]></category>
		<category><![CDATA[CouchDB]]></category>
		<category><![CDATA[SVG]]></category>
		<category><![CDATA[visophyte]]></category>
		<category><![CDATA[visotank]]></category>

		<guid isPermaLink="false">http://www.visophyte.org/blog/2007/12/04/svg-in-visotank/</guid>
		<description><![CDATA[visotank now has AJAX-loaded SVG graphics. The hooks are there to actually do something when you click on stuff, but it doesn&#8217;t do anything. The visualization is ripped from my visterity plugin for posterity; it&#8217;s not supposed to be new or exciting. The fact that the SVG is loaded via AJAX is new (visterity didn&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.visophyte.org/blog/wp-content/uploads/2007/12/visotank-conversation-timeline-snippet.png" alt="visotank-conversation-timeline-snippet.png" /></p>
<p>visotank now has AJAX-loaded SVG graphics.  The hooks are there to actually do something when you click on stuff, but it doesn&#8217;t do anything.  The visualization is ripped from my visterity plugin for posterity; it&#8217;s not supposed to be new or exciting.  The fact that the SVG is loaded via AJAX is new (visterity didn&#8217;t have that) and exciting.  Pretty much everything else is simply legwork relating to using application/xhtml+xml instead of text/html and the ramifications of that, especially with AJAX.</p>
<p>I&#8217;ve updated what is running at <a href="http://clicky.visophyte.org:8080/">http://clicky.visophyte.org:8080/</a>, but it looks like my VPS has some issues, so I wouldn&#8217;t be surprised if things hang or are very slow when not yet cached.  (Specifically, I think it has very serious IO issues, but the absurd amount of memory available keeps that problem from impacting things too much.)  (Normal slowness like its refusal to pipeline requests and there being hundreds of images is not a VM problem.)  Also, the SVG stuff is unlikely to work on anything but Firefox 2; at least 3.0a8 gets angry for me on gutsy.</p>
<p>To see the SVG graphs, the steps are: 1) select at least one contact in the contacts list, 2) click on the &#8216;conversations&#8217; tab in the bottom half and select a conversation (you can only select one), and 3) click on the &#8216;conversation&#8217; tab in the bottom half.  I should note that you might want to wait for all of the sparkbars to load before proceeding to the next step&#8230;</p>
<p><a href="http://www.visophyte.org/blog/wp-content/uploads/2007/12/visotank-screenshot-conversation-timeline.png" title="visotank-screenshot-conversation-timeline.png"><img src="http://www.visophyte.org/blog/wp-content/uploads/2007/12/visotank-screenshot-conversation-timeline.thumbnail.png" alt="visotank-screenshot-conversation-timeline.png" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.visophyte.org/blog/2007/12/04/svg-in-visotank/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>more (clicky!) mailing-list visualization a la visotank, couchdb</title>
		<link>http://www.visophyte.org/blog/2007/12/02/more-clicky-mailing-list-visualization-a-la-visotank-couchdb/</link>
		<comments>http://www.visophyte.org/blog/2007/12/02/more-clicky-mailing-list-visualization-a-la-visotank-couchdb/#comments</comments>
		<pubDate>Mon, 03 Dec 2007 04:45:34 +0000</pubDate>
		<dc:creator>Andrew Sutherland</dc:creator>
				<category><![CDATA[clicky]]></category>
		<category><![CDATA[CouchDB]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Visualizing]]></category>
		<category><![CDATA[mailman]]></category>
		<category><![CDATA[sparkline]]></category>
		<category><![CDATA[visophyte]]></category>
		<category><![CDATA[visotank]]></category>

		<guid isPermaLink="false">http://www.visophyte.org/blog/2007/12/02/more-clicky-mailing-list-visualization-a-la-visotank-couchdb/</guid>
		<description><![CDATA[Visotank now allows you to select some authors of interest from a sortable list of contacts, and then show the conversations they were involved in. You get the previously shown sparkbars for the author&#8217;s activity. You also get sparkbars showing the conversation activity, with each author assigned a color and consistent stacking position in that [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.visophyte.org/blog/wp-content/uploads/2007/12/visotank-shot-1.png" title="visotank-shot-1.png"><img src="http://www.visophyte.org/blog/wp-content/uploads/2007/12/visotank-shot-1.thumbnail.png" alt="visotank-shot-1.png" /></a></p>
<p>Visotank now allows you to select some authors of interest from a sortable list of contacts, and then show the conversations they were involved in.  You get the previously shown sparkbars for the author&#8217;s activity.  You also get sparkbars showing the conversation activity, with each author assigned a color and consistent stacking position in that sparkbar.  Click on the screenshots for zoomed versions to see what I mean.</p>
<p>You can click on things yourself at <a href="http://clicky.visophyte.org:8080/">http://clicky.visophyte.org:8080/</a>.  Please only go there if you&#8217;re okay with restarting your Firefox session (especially true if Firebug is on).  All tables/images are the real thing and not fetched on demand&#8230; which results in Firefox having to pull down a lot of images.  Click on some rows in the contacts table to select them.  Then, in the lower tab group, click on the &#8220;conversations&#8221; tab.  This will then fetch all the conversations those selected users were involved in.  The system will truncate selections of more than 10 users, so don&#8217;t go crazy.  The tabs are re-fetched on switch, so if you change your contact selections, click away to &#8220;HowTo&#8221; in the lower tab group, then back to &#8220;Conversations&#8221;.  The &#8220;Conversation&#8221; tab does nothing and is a big lie.  Great UI, I know.</p>
<p><a href="http://www.visophyte.org/blog/wp-content/uploads/2007/12/visotank-shot-2.png" title="visotank-shot-2.png"><img src="http://www.visophyte.org/blog/wp-content/uploads/2007/12/visotank-shot-2.thumbnail.png" alt="visotank-shot-2.png" /></a></p>
<p>I think you will find that sparkbar visualizations of the conversation traffic with a weekly granularity are rather useless.  I think a reasonable solution would be a &#8216;zoomed&#8217; sparkbar with an indication  of the actual uniform timeline scale included.  Since the images currently show about 2 years of data, a thread that happened 1 year ago would be centered in the middle of the image, but with its actual horizontal scale being inconsistent with that position.  Future work, as always.</p>
<p>I have used Pylons&#8217; Beaker caching layer to attempt to make things reasonably responsive.  While CouchDB view updates are sadly quite lengthy (many, many minutes when dealing with 16k messages; python-dev from Jan 2006 through Nov 2007), that is thankfully a one-off sort of thing.  (The data-set is immutable once imported and I don&#8217;t change schemas that often.)  The main performance hit is that I can only issue one range of keys to query in a request, so if I am trying to snipe a subset of non-consecutive information, I have to issue multiple requests.  (I don&#8217;t believe POSTed views can operate against views in the database&#8230;)</p>
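<p>A rough sketch of one way to keep the request count down when each request can only cover a single key range: collapse the wanted keys into runs of consecutive values and issue one range query per run.  This assumes integer keys; the function name <code>collapse_to_ranges</code> and the ids are made up for illustration and are not what visotank actually does.</p>

```python
def collapse_to_ranges(keys):
    """Group integer keys into inclusive (start, end) runs of consecutive values."""
    ranges = []
    for key in sorted(set(keys)):
        if ranges and key == ranges[-1][1] + 1:
            # The key continues the current run, so extend its end.
            ranges[-1] = (ranges[-1][0], key)
        else:
            # Gap (or first key): start a new run.
            ranges.append((key, key))
    return ranges


# Hypothetical ids of the rows we want; each resulting run is one range request.
wanted = [3, 4, 5, 9, 12, 13]
print(collapse_to_ranges(wanted))
```

<p>Six scattered keys become three range requests here.  How much this helps depends entirely on how clustered the wanted keys are.</p>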
<p>Regrettably, I think my conclusion about CouchDB is that it (or something like it) will be truly fantastic in the future, but it is not going to get there soon enough for anything that hopes to be &#8216;productized&#8217; anytime soon.  The next thing I want to look at is using a triple-store to model some of the email data schema; my efforts from the visterity hacking suggest it could be quite useful.  Of course, even if triple stores work out, I suspect a more traditional SQL database will still be required for some things.  Combined with a thin custom aggregation and caching layer, that could work out well.</p>
<p>Note: I should emphasize that my CouchDB schema could be more optimized, but part of the experiment is/was to see if the views saved me from having to jump through clever hoops.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.visophyte.org/blog/2007/12/02/more-clicky-mailing-list-visualization-a-la-visotank-couchdb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>first steps to interactive fun using CouchDB</title>
		<link>http://www.visophyte.org/blog/2007/11/12/first-steps-to-interactive-fun-using-couchdb/</link>
		<comments>http://www.visophyte.org/blog/2007/11/12/first-steps-to-interactive-fun-using-couchdb/#comments</comments>
		<pubDate>Mon, 12 Nov 2007 08:46:20 +0000</pubDate>
		<dc:creator>Andrew Sutherland</dc:creator>
				<category><![CDATA[CouchDB]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Visualizing]]></category>
		<category><![CDATA[mailman]]></category>
		<category><![CDATA[sparkline]]></category>
		<category><![CDATA[visophyte]]></category>

		<guid isPermaLink="false">http://www.visophyte.org/blog/2007/11/12/first-steps-to-interactive-fun-using-couchdb/</guid>
		<description><![CDATA[First, let me say that Pylons with its Paste magic is delightful; lots of nice round edges helped me get something simple up and running in no time, and using genshi to boot. The new tool, visotank, is ingesting the python-dev mailman archives (as previously visualized) and putting them into CouchDB. The near-term goal is [...]]]></description>
			<content:encoded><![CDATA[<p> <img src="http://www.visophyte.org/blog/wp-content/uploads/2007/11/visotank-first-python-dev-sparkbars1.png" alt="visotank-first-python-dev-sparkbars1.png" /></p>
<p>First, let me say that <a href="http://pylonshq.com/">Pylons</a> with its <a href="http://pythonpaste.org/">Paste</a> magic is delightful; lots of nice round edges helped me get something simple up and running in no time, and using <a href="http://genshi.edgewall.org/">genshi</a> to boot.</p>
<p>The new tool, visotank, is ingesting the python-dev mailman archives (as <a href="/blog/2007/09/16/python-dev-mailman-archive-thread-arc-visualizations/">previously </a><a href="/blog/2007/10/01/cairo-bakes-pretty-pies/">visualized</a>) and putting them into <a href="http://couchdb.com/">CouchDB</a>.  The near-term goal is to allow for interactive exploration/visualization of the archives.  The current result, as pictured, is simply sparkline barcharts of people&#8217;s posting history.  Left-to-right, present-to-past, weekly, one (vertical) pixel per message, truncating at the image height (12 pixels).</p>
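<p>For the curious, the bar heights can be sketched roughly as follows.  This is a minimal illustration of the rules just described (weekly buckets, newest week leftmost, one pixel per message, truncated at 12 pixels), not the actual visotank code; the function name, its parameters, and the sample data are assumptions.</p>

```python
from collections import Counter
from datetime import datetime, timedelta

BAR_HEIGHT = 12  # image height in pixels, per the post
WEEK = timedelta(weeks=1)


def sparkbar_heights(timestamps, now, num_weeks):
    """Return one bar height per week, newest week first (leftmost)."""
    counts = Counter()
    for ts in timestamps:
        weeks_ago = (now - ts) // WEEK  # 0 means the most recent week
        if weeks_ago in range(num_weeks):
            counts[weeks_ago] += 1
    # One vertical pixel per message, truncated at the image height.
    return [min(counts[w], BAR_HEIGHT) for w in range(num_weeks)]


# Hypothetical posting history for one contact.
now = datetime(2007, 11, 12)
posts = [now - timedelta(days=d) for d in (0, 1, 2, 8, 30)]
posts += [now - timedelta(days=15)] * 20  # a very chatty week
print(sparkbar_heights(posts, now, 5))
```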
<p>Although the input processing thus far is specific to mailing list archives, the couchdb schema in use is for generic e-mail traffic.  The messages are even coerced into rfc2822 format for &#8216;raw&#8217; storage.</p>
<p>The ability to use &#8216;map&#8217; multiple times in couchdb views to spread information is delightful.  What I really would like is more <em>reduce</em> functionality or, more specifically, just <em>accumulate</em>.  The sparkbars get their data from statistics with keys [contact id, timestamp of time period] and value <em>1</em>, one per message.  I would love for couchdb to provide a way to aggregate all those values with identical keys into a single row with the sum as the value.  I&#8217;ll look into this and the view implementation before writing any more on the subject, but if someone out there already knows a way to do this, please let me know.</p>
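<p>Until something like that exists on the database side, the accumulation can be done on the client.  The following is only a sketch of that workaround in plain Python, not a CouchDB feature; the row layout (tuples standing in for the [contact id, timestamp] keys) and the sample values are assumptions.</p>

```python
from itertools import groupby


def accumulate(rows):
    """Collapse (key, value) rows with identical keys into one row per key, summing values."""
    # groupby only merges adjacent rows, so make sure equal keys are adjacent.
    ordered = sorted(rows, key=lambda row: row[0])
    return [
        (key, sum(value for _, value in group))
        for key, group in groupby(ordered, key=lambda row: row[0])
    ]


# Hypothetical view output: key is (contact id, start of time period), value is 1 per message.
rows = [
    (("tom", 1194220800), 1),
    (("bob", 1194220800), 1),
    (("tom", 1194220800), 1),
    (("tom", 1194825600), 1),
]
print(accumulate(rows))
```

<p>The obvious downside is that every single row still has to cross the wire before it gets summed, which is exactly the cost a server-side accumulate would avoid.</p>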
<p><img src="http://www.visophyte.org/blog/wp-content/uploads/2007/11/visotank-first-python-dev-sparkbars2.png" alt="visotank-first-python-dev-sparkbars2.png" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.visophyte.org/blog/2007/11/12/first-steps-to-interactive-fun-using-couchdb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>radial (radar) email vis, with care factors</title>
		<link>http://www.visophyte.org/blog/2007/11/02/radial-radar-email-vis-with-care-factors/</link>
		<comments>http://www.visophyte.org/blog/2007/11/02/radial-radar-email-vis-with-care-factors/#comments</comments>
		<pubDate>Fri, 02 Nov 2007 20:59:18 +0000</pubDate>
		<dc:creator>Andrew Sutherland</dc:creator>
				<category><![CDATA[Email]]></category>
		<category><![CDATA[Posterity]]></category>
		<category><![CDATA[Visualizing]]></category>
		<category><![CDATA[radar]]></category>
		<category><![CDATA[visophyte]]></category>

		<guid isPermaLink="false">http://www.visophyte.org/blog/2007/11/02/radial-radar-email-vis-with-care-factors/</guid>
		<description><![CDATA[It&#8217;s a radial e-mail visualization intended to be the basis for a &#8220;situational awareness&#8221; overview of your e-mail. I&#8217;ve added the beginnings of a &#8216;care factor&#8217;* (&#8220;do I care about this person/message?&#8221;) concept to messages and contacts, which is used to assist in focusing your attention only to messages/people you care about. Right now, the [...]]]></description>
			<content:encoded><![CDATA[<p> <img src="http://www.visophyte.org/blog/wp-content/uploads/2007/11/radial-care-factor-vis.png" alt="radial-care-factor-vis.png" /></p>
<p>It&#8217;s a radial e-mail visualization intended to be the basis for a &#8220;situational awareness&#8221; overview of your e-mail.  I&#8217;ve added the beginnings of a &#8216;care factor&#8217;* (&#8220;do I care about this person/message?&#8221;) concept to messages and contacts, which is used to assist in focusing your attention only on messages/people you care about.  Right now, the care factor is simply whether you have ever sent the contact/author of a message an e-mail directly (to = 1.0), indirectly (cc = 0.5), or not at all (nada/ninguno=0.0).  That can obviously be expanded upon in many directions; involvement of people you care about in message threads (with that person), intensity of your communication with that person, explicit interest-levels via tags, social network propagation (Google&#8217;s OpenSocial) without the person previously having existed in your e-mail corpus, etc.</p>
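<p>The current rule is simple enough to sketch in a few lines.  The three values come straight from the description above; the function name and the dict layout for sent messages are assumptions made for this illustration.</p>

```python
CARE_TO = 1.0    # you have sent the contact a message directly
CARE_CC = 0.5    # you have only ever copied the contact
CARE_NONE = 0.0  # you have never sent the contact anything


def care_factor(contact, sent_messages):
    """Best care factor for `contact` across the messages you have sent."""
    best = CARE_NONE
    for message in sent_messages:
        if contact in message.get("to", ()):
            return CARE_TO  # cannot do better than a direct message
        if contact in message.get("cc", ()):
            best = CARE_CC
    return best


# Hypothetical sent mail.
sent = [
    {"to": ["alice@example.com"], "cc": ["bob@example.com"]},
    {"to": ["carol@example.com"], "cc": []},
]
print(care_factor("alice@example.com", sent))
print(care_factor("bob@example.com", sent))
print(care_factor("dave@example.com", sent))
```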
<p>Some more details about the visualization:</p>
<ul>
<li>Things close to the center happened more recently.  Things further away happened in the past.  This seems like the most reasonable &#8216;radar&#8217; metaphor for e-mail.  If we were dealing with to-do items with due dates, then it would make sense that they are moving inward.  However, the reality of e-mail is that if you don&#8217;t deal with messages soon, they &#8216;fall off your radar&#8217;.  My first thought to fuse the two would be to have messages associated with to-do tasks stick out quite obviously, latch once they hit the &#8216;edge&#8217;, and generally grow more ominous and threatening as time goes by.  Of course, it&#8217;s probably not helpful to make people&#8217;s to-do lists seem like something they can&#8217;t escape&#8230;
<ul>
<li>The central grey circle is a void to ensure that angle is still meaningful even when the time is at a minimum; otherwise things would stack up and be generally extra confusing.</li>
</ul>
</li>
<li>The angle is mapped to a single author/contact.  This is currently random, but my intent is to allow clustering of contacts and quasi-persistent angular locations.  So messages from your family might tend to come from the North, your friends the East, mailing lists the West, and ads from the South.  (Let&#8217;s assume you get no spam.)  Actual geographic relationships would be a neat trick, but practically foolish.</li>
<li>Messages with a low care-factor are made more subtle by having reduced opacities.  I forgot to make the edges linking messages to their parent more subtle&#8230;</li>
<li>Contacts with a high care-factor get their (anonymized) name in a strong color and their slice of the pie highlighted with a color.  Contacts with a low care-factor have their names displayed more subtly and just get a grey hue for their outer-ring marker/label.  The intent with the slice coloring is mainly to be intensity based with only one or two hues in use; I think using more colors will quickly overwhelm the display.</li>
<li>Time markers are in use, but may not be obvious.  The blue ring labeled &#8216;30&#8217; (along the x-axis) indicates that&#8217;s October 30th.  The inner white ring is November 1st, but I&#8217;m not clear on why it wasn&#8217;t labeled as such (aka bug).  The time marker logic needs to be refactored to provide more usable single &#8220;ruler&#8221; labeling (the timeline use currently is biased towards 2 rulers, which is where the month and year went).  See the test program output below for a better example of time display, although the month/year are still AWOL in another ruler.</li>
</ul>
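<p>The mapping described in the list above can be sketched roughly like this: age maps to radius (with the central void as the minimum), the contact maps to an angle, and the care factor maps to opacity.  The constants, the linear scaling, and the function names are all assumptions for illustration; the real code may well differ.</p>

```python
import math

VOID_RADIUS = 20.0    # central grey circle, so angle still means something at age zero
OUTER_RADIUS = 200.0  # where the oldest displayed messages end up


def polar_position(age_days, max_age_days, contact_angle):
    """Map a message to (x, y); newer messages sit closer to the center."""
    fraction = min(max(age_days / max_age_days, 0.0), 1.0)
    radius = VOID_RADIUS + fraction * (OUTER_RADIUS - VOID_RADIUS)
    return (radius * math.cos(contact_angle), radius * math.sin(contact_angle))


def message_opacity(care, floor=0.25):
    """Low care factors fade toward `floor`; a care factor of 1.0 is fully opaque."""
    return floor + (1.0 - floor) * care


# A brand new message from the contact at angle 0 sits on the edge of the void.
print(polar_position(0, 30, 0.0))
# A message from a contact you have only copied is drawn partially transparent.
print(message_opacity(0.5))
```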
<p><img src="http://www.visophyte.org/blog/wp-content/uploads/2007/11/radial-blah-blah-blah.png" alt="radial-blah-blah-blah.png" /></p>
<p>And there&#8217;s the test program.  Note that edges connect a message to its parent, and currently always flow clock-wise for time.  So the innermost red message is the parent of the inner-most green message.  I&#8217;m a bit conflicted about this; the consistency is nice, but the relationship would probably be more obvious if we took the shortest path.  Also, since e-mail reply relationships are causal, it&#8217;s not like there&#8217;s any doubt which message was a reply to the other.</p>
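<p>The two edge-routing options boil down to a small difference in angle arithmetic.  A minimal sketch, assuming angles are in radians and increase clockwise (as they do on a y-down canvas); the function names are hypothetical.</p>

```python
import math

TAU = 2 * math.pi


def clockwise_sweep(parent_angle, child_angle):
    """Angle covered when always travelling clockwise from parent to child, in [0, 2*pi)."""
    return (child_angle - parent_angle) % TAU


def shortest_sweep(parent_angle, child_angle):
    """Signed sweep in [-pi, pi); negative means counter-clockwise is the shorter way."""
    return (child_angle - parent_angle + math.pi) % TAU - math.pi


# Parent at 0, child three quarters of the way around the circle.
parent, child = 0.0, 3 * math.pi / 2
print(clockwise_sweep(parent, child))  # the long way around
print(shortest_sweep(parent, child))   # a quarter turn the other way
```

<p>The consistent clockwise rule can cover up to nearly a full circle, while the shortest path never covers more than half of one, at the cost of edges curving in both directions.</p>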
<p>* I say &#8216;care factor&#8217; because I did this work on a red-eye flight where my tiredness overwhelmed my natural defense against puns, and since Halloween was recent, and there was that tv show called &#8216;scare factor&#8217;, etc. etc.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.visophyte.org/blog/2007/11/02/radial-radar-email-vis-with-care-factors/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

