----- js_GC data pages: 96731 total vm pages: 234849
   ###   KiB
 11376   839  BF_format                          log4moz.j:487
 15906   725  mime_emitter_writeBody             jsmimeemitter.j:381
   148   576  MimeMessage_coerceBodyToPlaintext  mimemsg.j:341
  4460   492  gloda_ds_queryFromQuery            datastore.j:2988
  9655   387  <MYSTERIOUS>                       <BEYOND THE VEIL>:0
   911   237  gloda_content_quoted               connotent.j:267
  1723   144  gloda_indexMessage                 index_msg.j:2597
   187   141  gloda_content_content              connotent.j:290
   170   108  gloda_content_getContent           connotent.j:166
  1917    73  gloda_identity_toString            datamodel.j:785
  1775    69  Log4Moz_enumerateProps             log4moz.j:220

value |-------------------------------------------------- count
    0 |@                                                     366
    1 |@@@@                                                 1371
    2 |@@@@@                                                1584
    4 |@@@@@@@@@@@@@@@@@@@@@@@@                             6863
    8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@           11784
   16 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@       12794
   32 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@   14016
   64 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                       8662
  128 |@@@@@@@@@@@@@@                                       4093
  256 |@@@@@                                                1567
  512 |@                                                     309
 1024 |                                                      105
 2048 |                                                       58
 4096 |                                                       28
 8192 |                                                       17
16384 |                                                       12
32768 |                                                        0
65536 |                                                        0
That is the output of the script. It tells you what JS code has been creating strings since the last GC and the overall distribution of the marginal costs of those strings, leaving out puny code that allocates less than 64k’s worth. Marginal costs because SpiderMonkey has support for dependent strings: “a += b” will actually result in the storage of a being realloc’d to hold the concatenation (the new string a’), with the original a becoming a dependent string that points into the storage of a’. I forgot to check for the js_ConcatStrings case at first and, let me tell you, the stats were not pretty. Poor mime_emitter_writeBody was accused of allocating 11 gigabytes of strings in one intra-GC cycle!
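If you have never seen SystemTap script before, the reporting half of something like this looks roughly like the sketch below. To be clear, this is not the actual script: the library path, the js_NewString/js_GC probe points and their parameter names are assumptions about the SpiderMonkey of that era, and js_funcname() is just a stub standing in for the JSContext-walking (and the js_ConcatStrings special-casing) that makes the real thing useful.

```
# Minimal sketch only -- not the real script.  Assumes DWARF debug info for
# libmozjs and that js_NewString(cx, chars, length) is a usable probe point.
global bytes, sizes, nstrings

function js_funcname:string (cx:long)
{
  # Placeholder: the real script walks the JSContext's stack frames here to
  # recover the JS function name, script name, and line number.
  return sprintf("cx %p", cx)
}

probe process("/usr/lib/thunderbird/libmozjs.so").function("js_NewString")
{
  nbytes = $length * 2                # jschars are 16 bits each
  sizes <<< nbytes                    # feeds the log2 histogram
  bytes[js_funcname($cx)] += nbytes   # per-JS-function running total
  nstrings++
}

probe process("/usr/lib/thunderbird/libmozjs.so").function("js_GC")
{
  if (!nstrings) next                 # nothing allocated since the last GC
  printf("----- js_GC\n")
  foreach (site in bytes- limit 15)   # biggest allocators first
    printf("%8d KiB  %s\n", bytes[site] / 1024, site)
  print(@hist_log(sizes))
  delete bytes
  delete sizes
  nstrings = 0
}
```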
What is interesting about this script is that it is actually pulling all of the function name/script name/line number information directly out of a JSContext object. What is exciting is that it was pretty easy to do thanks to systemtap’s understanding of DWARF debug symbols. This is in stark contrast to my efforts using VProbes, where there was no such support and I had to write a gdb Python script to begin to approach the same thing. Also, because systemtap’s guru mode allows you to bail into real C code when you need to, I was able to write a straightforward and efficient function that truncates the jschar (16-bit Unicode) strings that make up the function names into ASCII. It also allowed for a quick/easy/fast basename() implementation.
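Those two helpers live in embedded guru-mode C, but just to illustrate the idea, slower script-level approximations might look like the following; the names are made up, and user_ushort() is doing the cross-process reads that the C versions do more cheaply.

```
# Script-level approximation of the guru-mode helpers; names are illustrative.
function jschars_to_ascii:string (addr:long, len:long)
{
  s = ""
  n = len > 63 ? 63 : len             # stay well under MAXSTRINGLEN
  for (i = 0; i < n; i++) {
    c = user_ushort(addr + 2 * i)     # read one 16-bit jschar from the process
    if (c < 0x20 || c > 0x7e)
      c = 0x3f                        # non-ASCII / unprintable becomes '?'
    s = s . sprintf("%c", c)
  }
  return s
}

function basename:string (path:string)
{
  i = strlen(path)
  while (i > 0 && substr(path, i - 1, 1) != "/")
    i--
  return substr(path, i, strlen(path) - i)
}
```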
It is worth noting that SpiderMonkey does have built-in (conditionally compiled) static probe points originally created for DTrace, and they do provide more generic object lifetime hooks. Although they were created for DTrace, systemtap is also able to use them. The limitation with those probes is that once any probe attached to a hook is active, a non-trivial cost is incurred, because filtering can only be performed inside the probe and you may receive orders of magnitude more calls than you actually care about. The cost is worst for probes that provide function names, since that requires a call to js_GetStringBytes every time (although there is some caching). Keep in mind that I’m not saying the systemtap userspace probes are free, but since they can be placed at arbitrary points, you can have probes that fire only in the cases you actually care about, only incur data-processing costs when desired, and are disturbingly easy to implement since you can just reach into the data structures you care about.
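For contrast, here is roughly what attaching to one of those static probes from systemtap looks like. This assumes a build with the DTrace probes compiled in, and the marker name and argument order (filename, classname, funcname) are from memory, so treat them as assumptions; the point is just that the handler fires for every JS function call in the process, and the filtering happens after the cost has already been paid.

```
# Sketch only: assumes an --enable-dtrace SpiderMonkey build; the marker and
# its argument layout may differ on your build.
probe process("/usr/lib/thunderbird/libmozjs.so").mark("function__entry")
{
  fname = user_string2($arg3, "<anon>")  # funcname -- js_GetStringBytes already ran
  if (fname != "gloda_indexMessage")
    next                                 # filtered too late; the cost is sunk
  printf("%s entered\n", fname)
}
```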
Since everyone loves an implementation fight: DTrace, great and magical though it is, is basically incapable of pulling off the JS stack-walking trick that systemtap makes possible. Unless DTrace has grown these features since I last checked, it lacks the DWARF magic to do such things easily, and the language intentionally avoids the control-flow constructs that would allow it to do anything quite so complex. And you can entirely forget about arbitrary C code for the Unicode string truncation…
This post is brought to you by efforts to decrease the memory footprint of Thunderbird when gloda’s indexer is active. Because Thunderbird loves you. But it loves your memory more. For now. That’s what we’re trying to fix.
Nifty! What high-level conclusions were you able to draw from this data, if any?
Does log4moz always generate that many pages, or is that only in the non-default case of having gloda logging preffed on?
Conclusions are on the motivating bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=557001#c10
That is the non-default case of gloda logging at maximum. Since we go out of our way to avoid making calls to log4moz when debugging is not enabled, I would expect much less log4moz traffic in that case.
I should note that the page counts are the overall page counts for the process at the time of the GC (before it runs) and not deltas or anything like that.
That’s superb. Congrats!