{"id":485,"date":"2010-04-06T02:44:54","date_gmt":"2010-04-06T10:44:54","guid":{"rendered":"http:\/\/www.visophyte.org\/blog\/?p=485"},"modified":"2016-04-20T01:22:11","modified_gmt":"2016-04-20T05:22:11","slug":"performance-annotated-sqlite-explaination-visualizations-using-systemtap","status":"publish","type":"post","link":"https:\/\/www.visophyte.org\/blog\/2010\/04\/06\/performance-annotated-sqlite-explaination-visualizations-using-systemtap\/","title":{"rendered":"performance annotated SQLite EXPLAINation visualizations using systemtap"},"content":{"rendered":"<p><a href=\"http:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2010\/04\/gloda-fulltext-search-before.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignleft size-thumbnail wp-image-488\" title=\"gloda-fulltext-search-before\" src=\"http:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2010\/04\/gloda-fulltext-search-before-250x600.png\" alt=\"\" width=\"250\" height=\"600\" srcset=\"https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2010\/04\/gloda-fulltext-search-before-250x600.png 250w, https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2010\/04\/gloda-fulltext-search-before-125x300.png 125w, https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2010\/04\/gloda-fulltext-search-before-427x1024.png 427w, https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2010\/04\/gloda-fulltext-search-before.png 1224w\" sizes=\"auto, (max-width: 250px) 100vw, 250px\" \/><\/a>For the Thunderbird 3.1 release cycle we are not just fixing UX problems but also resolving various performance issues.\u00a0 Building on <a href=\"http:\/\/www.visophyte.org\/blog\/2009\/02\/04\/visualization-of-control-flowdata-flow-analysis-of-sqlite-explain-ations\/\">my previous work<\/a> on a visualization of SQLite opcode control flow graphs using graphviz I give you&#8230; the same thing!\u00a0 But updated to more recent versions of SQLite and integrating performance information retrieved through the use of <a href=\"http:\/\/sourceware.org\/systemtap\/\">systemtap<\/a> with utrace.<\/p>\n<p>In this case we are using systemtap to extract the number of times each opcode is executed and the number of btree pages that are requested during the course of executing the opcode.\u00a0 (We do not differentiate between cache hits and misses because we are looking at big-O right now.)\u00a0 Because virtual tables (like those used by FTS3) result in nested SQLite queries and we do not care about analyzing the queries used by FTS3, we ignore nested calls and attribute all btree page accesses to the top-level opcode under execution.<\/p>\n<p>Because the utility of a tool is just as dependent on ease of use as its feature set, I&#8217;ve cleaned things up and made it much easier to get information out of Thunderbird\/gloda with this patch which should land soon and provides the following:<\/p>\n<ul>\n<li>The gloda datastore will write a JSON file with the EXPLAINed results of all its SQL queries to the path found in the preference <em>mailnews.database.global.datastore.explainToPath<\/em>.\u00a0 This preference is observed so that setting it during runtime will cause it to create the file and begin explaining all subequently created queries.\u00a0 Clearing\/changing the preference closes out the current file and potentially opens a new one.<\/li>\n<li>Gloda unit tests will automatically set the preference to the value of the environment variable <em>GLODA_DATASTORE_EXPLAIN_TO_PATH<\/em> if set.<\/li>\n<li>A schema dump is no longer required for meta-data because we just assume that you are using a SQLite DEBUG build that tells us everything we want to know about in the &#8216;comment&#8217; column.<\/li>\n<li>grokexplain.py now uses optparse and has more internal documentation and such.<\/li>\n<\/ul>\n<p>So what do the pretty pictures show?<\/p>\n<p><a href=\"http:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2010\/04\/gloda-fulltext-search-after.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-thumbnail wp-image-489\" title=\"gloda-fulltext-search-after\" src=\"http:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2010\/04\/gloda-fulltext-search-after-220x600.png\" alt=\"\" width=\"220\" height=\"600\" srcset=\"https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2010\/04\/gloda-fulltext-search-after-220x600.png 220w, https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2010\/04\/gloda-fulltext-search-after-110x300.png 110w, https:\/\/www.visophyte.org\/blog\/wp-content\/uploads\/2010\/04\/gloda-fulltext-search-after-376x1024.png 376w\" sizes=\"auto, (max-width: 220px) 100vw, 220px\" \/><\/a><\/p>\n<ul>\n<li>Before: A gloda fulltext query search retrieves all of the results data before applying the LIMIT.\u00a0 This results in a lot more data transplanted into temporary results tables than we will end up using; wasted bookkeeping.\u00a0 Additionally, we incur the row lookup costs for both our <em>messages<\/em> data storage table and our fulltext <em>messagesText<\/em> table for all hits, even the ones we will not return.\u00a0 (Noting that there was no escaping hitting both tables since we were using offsets() and it hits the fulltext table as part of its operation.)<\/li>\n<li>After: We perform an initial query phase where we minimize wasted bookkeeping by only retrieving and using the bare minimum required to compute the LIMITed list of document id&#8217;s.\u00a0 Additionally, by use of the FTS3 matchinfo() function instead of the offsets() function we are able to avoid performing row lookup on the <em>messagesText<\/em> table for results that will not be returned to the user.\u00a0 Use of the matchinfo() function requires a custom ranking function which allows us to be somewhat more clever about boosting results based on fulltext matches too.<\/li>\n<li>The poor man&#8217;s bar graphs in the pictures are expressing a hand-rolled logarithmic scale for the number of invocations (left bar) and number of btree pages accessed (right bar).\u00a0 On the right side of each line the actual numbers are also presented in the same order.\u00a0 The goal is not to convey good\/bad so much as to draw the eye to hot spots.<\/li>\n<\/ul>\n<p>Notes for people who want more to read:<\/p>\n<ul>\n<li>SQLite has built-in infrastructure to track the number of times an opcode is executed as well as the wall-clock time used; you do not have to use systemtap.\u00a0 It&#8217;s a conditional compilation kind of thing, just -DVDBE_PROFILE and every statement you execute gets its performance data appended to vdbe_profile.out when you reset the statement.\u00a0 It doesn&#8217;t do the btree pages consulted trick, but it&#8217;s obviously within its power with some code changes.<\/li>\n<li>Our use of systemtap is primarily a question of discretionary reporting control, the ability to integrate the analysis across other abstraction layers, and various build concerns.\u00a0 The JSON output is not a driving concern, just something nice we are able to do for ourselves since we are formatting the output.<\/li>\n<li>The tool has really crossed over into the super-win-payoff category with this fix investigation.\u00a0 (Having said that, I probably could have skipped some of the data-flow stuff last time around.\u00a0 But these projects are both for personal interest and amusement as well as practical benefit, so what are you gonna do?\u00a0 Also, that effort could pay off a bit more by propagating comments along register uses so that the LIMIT counter register r8 and the compute-once r7 in the after diagram would not require human thought.)<\/li>\n<\/ul>\n<p>References:<\/p>\n<ul>\n<li>The <a href=\"http:\/\/hg.mozilla.org\/users\/bugmail_asutherland.org\/grokexplain\/\">grokexplain repo<\/a>.\u00a0 Used like so: <em>python grokexplain.py &#8211;vdbe-stats=\/tmp\/glodaNewSearchPerf.json \/tmp\/glodaNewSearchExplained.json -o \/tmp\/glodasearch-newcheck<\/em><\/li>\n<li>The <a href=\"http:\/\/hg.mozilla.org\/users\/bugmail_asutherland.org\/tb-test-help\/file\/f6bf91d1a962\/systemtap\/sqlite-perf.stp\">systemtap script in its repo<\/a>.\u00a0 Used like so: <em>sudo stap -DMAXSTRINGLEN=1024 sqlite-perf.stp \/path\/to\/thunderbird-objdir\/mozilla\/dist\/lib\/libsqlite3.so &gt; \/tmp\/glodaNewSearchPerf.json<\/em><\/li>\n<li><a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=550648\">The bug<\/a> with the gloda explain logic patch and the improved SQL query logic.\u00a0 I also used the tooling to fix another (now-fixed) bug, but that problem turned out to be less interesting.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>For the Thunderbird 3.1 release cycle we are not just fixing UX problems but also resolving various performance issues.\u00a0 Building on my previous work on a visualization of SQLite opcode control flow graphs using graphviz I give you&#8230; the same &hellip; <a href=\"https:\/\/www.visophyte.org\/blog\/2010\/04\/06\/performance-annotated-sqlite-explaination-visualizations-using-systemtap\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[7,3,8,6,4],"tags":[25,83,72,24,76],"class_list":["post-485","post","type-post","status-publish","format-standard","hentry","category-debugging","category-mozilla","category-program-execution","category-thunderbird","category-visualizing","tag-graphviz","tag-grokexplain","tag-performance","tag-sqlite","tag-systemtap"],"_links":{"self":[{"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/posts\/485","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/comments?post=485"}],"version-history":[{"count":10,"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/posts\/485\/revisions"}],"predecessor-version":[{"id":929,"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/posts\/485\/revisions\/929"}],"wp:attachment":[{"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/media?parent=485"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/categories?post=485"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.visophyte.org\/blog\/wp-json\/wp\/v2\/tags?post=485"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}