Valgrind, you saved my day.
While writing code to extract the GPU microcode out of the kernel from flash (on the 360, in order to not require the GPU library to ship with any copyrighted files), I came across a very ugly bug: my code worked perfectly until I changed the stack layout by inserting a variable, or changed gcc’s optimization setting. Suddenly, libmspack refused to unpack my data. I carefully audited my code, but couldn’t find any oddity or abuse. It didn’t seem like a stack overflow. The data passed to libmspack was OK; it just didn’t work anymore.
Half an hour later, totally puzzled, I fired up valgrind, and it didn’t yield anything spectacular (like an overwrite), at least at first sight. On second look, I saw:
==14873== Conditional jump or move depends on uninitialised value(s)
==14873== at 0x804CAC7: lzxd_decompress (lzxd.c:545)
==14873== by 0x80490DC: main (in ./tool)
Now, the line in question was
if (lzx->length && (lzx->length - lzx->offset) < (off_t)frame_size) {
and I was pretty sure that I had passed a valid value to lzxd_init (where lzx->length comes from). frame_size and lzx->offset also had fixed values, so they couldn’t be the problem. I double-checked that the values passed weren’t uninitialized, but they were all not only initialized but also perfectly valid.
With some debug statements, I made sure that lzx->length was the offending variable, and then I finally found it: sizeof(lzx->length) was different in the library than in my source code. lzx->length is an off_t, which was 64 bits wide in the library but only 32 bits wide in my code, and thus the upper 32 bits were undefined. Yes, the traditional off_t LFS pitfall… and I walked right into it. A “#define _FILE_OFFSET_BITS 64” at the top of my source, before any includes, fixed it.
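For anyone who wants to guard against the same trap, here is a minimal sketch, not taken from my tool or from libmspack. It assumes a C11 compiler and a glibc-style system where _FILE_OFFSET_BITS is honoured. The function callee_view and its parameter stale are hypothetical names I made up for illustration: they imitate a caller that stores only 32 bits into a slot which the callee then reads as 64 bits.

```c
/* Must appear before the first system header to have any effect. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/* Build-time guard: refuse to compile if off_t is not 64 bits wide here. */
_Static_assert(sizeof(off_t) == 8, "off_t must be 64-bit to match the library");

/*
 * Hypothetical illustration (not libmspack code): a caller built with a
 * 32-bit off_t writes only four bytes of a length, while a callee built
 * with a 64-bit off_t reads eight. 'stale' stands in for whatever bytes
 * happened to be lying around in the rest of the slot.
 */
int64_t callee_view(int32_t length, uint32_t stale)
{
    unsigned char slot[8];
    int64_t seen;

    assert(sizeof slot == sizeof seen);

    /* Pre-fill the whole slot with leftover bytes. */
    memcpy(slot, &stale, sizeof stale);
    memcpy(slot + 4, &stale, sizeof stale);

    /* The 32-bit caller overwrites only the first four bytes. */
    memcpy(slot, &length, sizeof length);

    /* The 64-bit callee reads all eight bytes. */
    memcpy(&seen, slot, sizeof seen);
    return seen;
}
```

Calling callee_view(1000, 0xDEADBEEF) does not give back 1000, because the leftover bytes become part of the value. That is the same kind of garbage length the check in lzxd_decompress tripped over. The _Static_assert line is the part worth copying: it turns a silent mismatch into a compile error, at least on the side of the code you control.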