understanding where unit tests go wrong with object diffs

In our last chroniquery adventure, we used roc‘s chronicle-recorder to trace the execution of an xpcshell unit test and chroniquery (repo info) to use that trace to help us understand what was happening in some particularly confusing C-ish code. Actually, we’re doing that this time too, but the bag of tricks now contains additional tricks.

The exciting things happening in the screenshot above, in order of increasing awesomeness:

The argument list is exposing out-parameters (via a heuristic though, it doesn’t know the XPCOM details.) ex: “*folderInfo: 0x0” used to be “folderInfo: some pointer to hold an out-parameter that leaves us none the wiser.”
Chronifer now sees through ‘this’-adjusting trampolines/wrappers. Previously, function calls might be hidden because the stack pointer change that chronicle uses to determine function calls jumped to the trampoline whose bytecode was not associated with the actual function being called, and so would potentially be assumed to be boring. The trampoline would modify rdi then perform a trivial jump (it could have just fallen through) into the function proper, which the existing trampoline detector would not notice. (That detector only would notice an instant jump of more than 128 bytes; intended for dynamic linking indirection purposes.)
Pretty printers allow us to translate nsresult error codes to strings, display the strings referenced by nsString/nsCString instead of the raw structure (mData/mLength/mFlags), pierce nsCOMPtr’s, etc.
Pretty printers allow us to detect exceptional values and show them in red! For example, that NS_ERROR_INVALID_POINTER is red because the nsresult pretty printer’s is_exceptional method returned True because the error bit was high.
The configuration method that lets us say what directories/files/classes/functions are interesting/boring has been augmented to let us specify arguments and the sub-fields of arguments to dump. For example, in the above screenshot, we have added “dump_this=True” to the section labeled “[class@nsMsgLocalMailFolder]”. As a result, any time a method is called on nsMsgLocalMailFolder, we dump the contents of its ‘this’. If we only wanted a few fields from “this”, we could have used “dump_this=mPath,mIsServer,mName,mURI” instead. You can see the end of a full object dump at the top of the screenshot.
The object dump output is diffed against the result of previous calls. This is why we have that text in yellow. The first entire GetDBFolderInfoAndDB call you see on the screen is a succesful call (no red!) diffed against a previous successful call. Mainly we see the name-derived things changing and things we expect to be distinct (file path, database). Then we have an error case, with a few things we probably would not have noticed if we were doing this by hand. I was somewhat surprised that mHaveParsedURI, the reference count, and m_numOfflineMsgLines showed up. The others are not entirely shocking, but it’s useful to have them all there. Without the diff, we would be looking at 65 lines of attribute values, and that is with us already benefitting from the pretty printers.

Exciting possibilities opened by enhancements here:

The pretty printers and internal cleanups/refactoring make it feasible to express complicated data-based constraints simply. For example, with a tractable amount of work I could opt to only see calls on the folder named “gabba3” using the constraint “mURI == ‘mailbox://nobody@Local%20Folders/gabba0′”. This could be made particularly efficient if we make an assumption that the attribute is invariant once observed and then allow us to key off the pointer/memory region.
Code-sharing of pretty printers between my chroniquery code and my archer gdb plugins.
Now that archer-gdb has python inferior control, potential use of archer-gdb as an alternate back-end to chroniquery.
It would be neat to be able to expose chroniquery’s functionality via a web interface so that things could be expanded/drilled down without having to re-run things. This would also help offset the non-trivial startup costs to chronicle-recorder’s own chronicle-query. (Although arguably the startup time is nothing compared to the inefficiency of all the work chronicle-query does for a call and throws away when the next call pretty much wants that same data.)

visophyte: shiny? shiny.

Andrew Sutherland writes things but (almost) always includes pictures to look at.

understanding where unit tests go wrong with object diffs

1 thought on “understanding where unit tests go wrong with object diffs”