I have implemented a feature in quickbms 0.3.8 that I guess could be useful to some advanced users.
as many of you know, the main problem when handling a compressed file is figuring out which algorithm was used to compress it.
in some cases it's simple: zlib data usually starts with 0x78, and for the deflate algorithm in general it's enough to use a tool like offzip to recognize it. in other cases it's less easy but still possible: lzma data has a 5-byte header with some zeroes in it, lzss is used often and leaves the plain-text data visible enough, and xmemdecompress seems to be used in many x360 games. but usually it's hard or impossible to figure the algorithm out on the fly.
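as a rough illustration of the "simple" cases above, here is a minimal Python sketch (not part of quickbms) that checks the first bytes of a block. the function names are made up for this example, the zlib test follows the RFC 1950 header rules, and the lzma test is only a heuristic that assumes a power-of-two dictionary size:

```python
import zlib


def looks_like_zlib(data: bytes) -> bool:
    """Return True if the first two bytes form a valid zlib (RFC 1950) header."""
    if len(data) < 2:
        return False
    cmf, flg = data[0], data[1]
    # low nibble 8 = deflate, high nibble = window size (at most 7),
    # and the 16-bit header value must be a multiple of 31
    return (cmf & 0x0F) == 8 and (cmf >> 4) <= 7 and ((cmf << 8) | flg) % 31 == 0


def looks_like_lzma(data: bytes) -> bool:
    """Heuristic check of the 5-byte lzma properties header (an assumption,
    not a guarantee: it only accepts power-of-two dictionary sizes)."""
    if len(data) < 5:
        return False
    props = data[0]
    dict_size = int.from_bytes(data[1:5], "little")
    # props packs lc/lp/pb and must be below 9 * 5 * 5 = 225
    is_power_of_two = dict_size != 0 and (dict_size & (dict_size - 1)) == 0
    return props < 225 and is_power_of_two


if __name__ == "__main__":
    print(looks_like_zlib(zlib.compress(b"hello")))
    print(looks_like_lzma(bytes.fromhex("5d00008000")))
```

a positive result only means the header is plausible, so the data still has to be decompressed to be sure.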
so recently, after an obsession (really!) with searching for various algorithms to implement in quickbms, I thought it wouldn't be bad to run this vast collection of algorithms in sequence to find out if and which algorithm was used, or at least to get an idea of which of them gives a result closest to the original uncompressed file.
the following is a simple example of quickbms script for doing this "scan job":
http://aluigi.org/papers/bms/comtype_scan.bms
its job is very simple: it reads the input file and tries to decompress it with each of the available algorithms. each output is placed in a file named with a sequential number, which is the same number used internally by quickbms to index the algorithm.
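the idea of the scan can be sketched in Python as follows. this is not the bms script and not how quickbms is implemented: the `ALGORITHMS` table is a tiny stand-in built from the Python standard library, and the `scan` function is a hypothetical name used only for this example:

```python
import bz2
import lzma
import zlib
from pathlib import Path

# stand-in table: quickbms has its own, much larger list of algorithms
ALGORITHMS = [
    ("zlib", zlib.decompress),
    ("deflate", lambda d: zlib.decompress(d, -15)),  # raw deflate, no header
    ("bzip2", bz2.decompress),
    ("lzma", lzma.decompress),
]


def scan(data: bytes, out_dir: str) -> dict:
    """Try every decoder on the raw block and write each non-empty result
    to a file named after the index of the algorithm."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    results = {}
    for index, (name, decode) in enumerate(ALGORITHMS):
        try:
            plain = decode(data)
        except Exception:
            # this decoder rejected the data, go on with the next one
            continue
        if plain:
            (out / str(index)).write_bytes(plain)
            results[index] = name
    return results


if __name__ == "__main__":
    print(scan(zlib.compress(b"some example text " * 20), "scan_output"))
```

the output files are then checked by hand, exactly like the numbered files produced by the real script.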
the number associated with each algorithm is visible in the source code of quickbms.c, where I have marked them in steps of 5 like:
...
COMP_LZO1,
COMP_LZO1A,
COMP_LZO1B, // scan 5
COMP_LZO1C,
COMP_LZO1F,
COMP_LZO1X,
COMP_LZO1Y,
COMP_LZO1Z, // scan 10
COMP_LZO2A,
...
so it's clear that the input file must be the direct raw compressed data block without headers and other useless stuff, or in any case it must be in the format supported by quickbms (this applies only to those algorithms that don't have a standard format, like quad, balz, paq6, ppmd and so on).
unfortunately, although I have tried to fix some of them a bit, the implemented algorithms come from third parties and are not written to handle invalid data. the consequence is that a file compressed with zlib will crash quickbms when the script reaches the XMemDecompress algorithm, and the same will happen with other algorithms that are not well written.
so IN ANY CASE the hand of the user (you) is needed to remove the crashing algorithms from the bms script.
that means that in the case of the previous example we need to substitute
with
to skip the algorithm that causes the crash and continue with the others, and then do the same (uncommenting or adding the lines) for any other algorithm that crashes.
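the "skip the crashed ones by hand" workflow can be sketched in Python like this. it is only an illustration and not the syntax of the bms script: `scan_with_skips` and the `skip` set are hypothetical names, and a Python exception is much gentler than a real crash of a native decoder, which is exactly why the indexes must be listed manually:

```python
def scan_with_skips(data, decoders, skip):
    """Run every decoder except the ones whose index is listed in skip.

    decoders is a list of callables taking and returning bytes,
    skip is a set of indexes added by hand after a crash was seen.
    """
    outputs = {}
    for index, decode in enumerate(decoders):
        if index in skip:
            # this algorithm crashed on a previous run, so it is not called
            continue
        try:
            outputs[index] = decode(data)
        except Exception:
            # a clean failure: the data is simply not in this format
            continue
    return outputs
```

after each crash the index of the offending algorithm is added to `skip` and the scan is launched again.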
and now a practical example:
a couple of days ago chrrox had a doubt about the algorithm used in that Rumble Rose game, so I dumped the compressed data from the files he provided (you can see it some bytes after the "BPE" signature) and I launched
quickbms -o comtype_scan.bms data.dat z:\
and I noticed that the generated file named "55" (the bpe algorithm) had perfectly comprehensible data in it.
ok, then I noticed that only the first 0x400 compressed bytes gave this perfect result, but that scan of a few seconds was enough to guess the correct algorithm without having the game and without spending time checking each algorithm manually.
well, I hope it helps, and let me know if you have any doubts or ideas.
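when the scan produces many output files, a quick way to decide which ones to open first is to sort them by how "readable" they are. the following Python sketch is an assumption of mine and not something quickbms does: it ranks the outputs by their fraction of printable ASCII bytes, which only helps when the original data contains text, and the function names are invented for the example:

```python
def printable_ratio(data: bytes) -> float:
    """Fraction of bytes that are printable ASCII, tab, LF or CR."""
    if not data:
        return 0.0
    printable = sum(1 for b in data if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(data)


def rank_outputs(outputs: dict) -> list:
    """Return the output names sorted from the most to the least text-like.

    outputs maps an output name (for example the algorithm number) to the
    decompressed bytes.
    """
    return sorted(outputs, key=lambda name: printable_ratio(outputs[name]), reverse=True)
```

binary formats will score low even when the algorithm is the right one, so the ranking is only a hint for where to look first.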
oh and as I have said before this stuff is intended only for advanced users!