spider91's wwise_ima_adpcm gave me the final clue, that Wwise uses one less sample per block than standard MS IMA:
MS IMA has the weird feature that every block, which is fairly small, has in its header the first sample in full 16-bit PCM form, which allows for very accurate seeking. The rest of the block is 4 byte chunks with 8 4-bit samples each, so the sample count of a block is always 8x+1.
In Wwise IMA, the final decoded sample is thrown away, and in fact that last nibble is always 0. Thus, sample count is 8x. I assume they did this because it lets them keep some buffer aligned to nice round numbers.
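To make the arithmetic concrete, here's a rough sketch for the mono case. It assumes the usual 4-byte MS IMA block header (16-bit sample, step index, reserved byte). The helper name and the 36-byte block size are just picked for illustration, not taken from any real tool.

```python
HEADER_BYTES = 4  # assumed mono block header: int16 sample, step index, reserved


def samples_per_block(block_bytes, wwise=False):
    """Samples decoded from one mono block (hypothetical helper)."""
    data_bytes = block_bytes - HEADER_BYTES
    if data_bytes < 0 or data_bytes % 4:
        raise ValueError("expected a 4-byte header plus whole 4-byte chunks")
    chunks = data_bytes // 4   # this is the x in 8x+1
    if wwise:
        return 8 * chunks      # final decoded sample is thrown away
    return 8 * chunks + 1      # header sample plus 8 samples per chunk


print(samples_per_block(36))              # 65
print(samples_per_block(36, wwise=True))  # 64
```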
The upshot is that it is impossible to simply repackage Wwise IMA ADPCM as standard Microsoft IMA ADPCM; there are these extra samples hanging out which throw the whole file off. There is actually a space in the extra format data for "samples per block", but I don't think many decoders care about that (sox rejects anything without that extra +1 sample-per-block, libavformat seemingly ignores it).
Anyway, I'm glad to finally be able to put this behind me and retire the old ima_rejiggers, which never worked properly.
Here's ima_rejigger5, which does the decoding to PCM and supports both RIFF and RIFX wems, so it should be usable with The Wonderful 101. It isn't super-necessary but I wanted to get some closure here.
edit:
I'm realizing belatedly that my whole digression into reinterleaving stuff with the rejiggering tools was a huge wrong turn. This IMA turns out to have more in common with the stuff we see on consoles all the time than with the weird +1 thing standardized in RIFF. Sorry for causing and prolonging the confusion. I hope things are correct now; they certainly sound good.
edit2:
One final thought: I may have been a little too harsh on myself. This is definitely an unusual format, somewhere between MS IMA and normal IMA. Normally when you see IMA/DVI with block headers that have a sample in them, that sample is just history, used only for seeking; you have to take that history and compute a new sample from the first nibble of ADPCM data. It was a good idea of MS's to not waste that sample and actually use it as part of the stream, so it is less redundant, but in so doing they wound up with really oddly sized blocks. I think AK came along and realized the best of both worlds, as they've managed to do in such creative ways with Vorbis.
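For anyone who wants the three layouts side by side, here's a minimal decoding sketch for a single mono block. It is not ima_rejigger5, just my reading of the description above, and it assumes a little-endian 4-byte header (sample, step index, reserved) with the low nibble decoded first. The mode names "history", "ms" and "wwise" are made up for the example.

```python
import struct

# Standard IMA ADPCM step size and index adjustment tables.
STEP_TABLE = [
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767,
]
INDEX_TABLE = [-1, -1, -1, -1, 2, 4, 6, 8] * 2


def decode_nibble(nibble, predictor, index):
    """Expand one 4-bit code into the next 16-bit sample."""
    step = STEP_TABLE[index]
    diff = step >> 3
    if nibble & 1:
        diff += step >> 2
    if nibble & 2:
        diff += step >> 1
    if nibble & 4:
        diff += step
    predictor += -diff if nibble & 8 else diff
    predictor = max(-32768, min(32767, predictor))
    index = max(0, min(88, index + INDEX_TABLE[nibble]))
    return predictor, index


def decode_block(block, mode):
    """Decode one mono block; mode is 'history', 'ms' or 'wwise' (hypothetical names)."""
    predictor, index, _reserved = struct.unpack_from("<hBB", block, 0)
    index = min(index, 88)
    # 'history': the header sample only seeds the decoder and is not output.
    # 'ms' and 'wwise': the header sample is the first output sample.
    out = [] if mode == "history" else [predictor]
    for byte in block[4:]:
        for nibble in (byte & 0x0F, byte >> 4):  # assumed: low nibble first
            predictor, index = decode_nibble(nibble, predictor, index)
            out.append(predictor)
    if mode == "wwise":
        out.pop()  # the sample decoded from the last nibble is discarded
    return out
```

With a 36-byte block this gives 64 samples for "history", 65 for "ms" and 64 for "wwise", and the "wwise" output is the "ms" output minus its last sample.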