Page 1 of 1
[PS2] Saint Seiya Sanctuary/Hades compression with LZ77
Posted: Fri Feb 10, 2012 5:54 pm
by gustavofarias
In both games there are lots of images and textures compressed using some variation of LZ77 algorithm. There is a plugin that comes with Noesis that can decompress some of those files. But it can open only the first image when the file has multiple images, and can't open some files which have a different structure than expected.
My goal is to develop a [Java] tool to decompress the files Noesis can't. I would appreciate if someone could explain me in more detail the code below, that was extracted from Noesis plugin and seem to be a good starting point for me.
Code: Select all
typedef struct gmiCmpHdr_s
{
BYTE id[4]; //"CMPS"
int cmpType;
int decompSize;
int resv;
} gmiCmpHdr_t;
//decompress data
static BYTE *Model_GMI_Decompress(BYTE *src, int &len, noeRAPI_t *rapi)
{
gmiCmpHdr_t *cmpHdr = (gmiCmpHdr_t *)src;
int srcLen = len;
int dstLen = cmpHdr->decompSize;
BYTE *dst = (BYTE *)rapi->Noesis_PooledAlloc(dstLen);
int srcPtr = sizeof(gmiCmpHdr_t);
int dstPtr = 0;
while (srcPtr < srcLen && dstPtr < dstLen)
{
BYTE ctrl = src[srcPtr++];
for (int i = 0; i < 8 && srcPtr < srcLen && dstPtr < dstLen; i++)
{
if (ctrl & (1<<i))
{ //literal
dst[dstPtr++] = src[srcPtr++];
}
else
{ //ofs+len
WORD ol = *(WORD *)(src+srcPtr);
srcPtr += sizeof(WORD);
int len = 3 + ((ol>>8) & 15);
int relOfs = (ol & 255) | ((ol>>12) << 8);
int ofs = dstPtr - ((dstPtr-18-relOfs) & 4095);
for (int j = 0; j < len && dstPtr < dstLen; j++)
{
if (ofs+j < 0 || ofs+j >= dstPtr)
{
dst[dstPtr++] = 0;
}
else
{
dst[dstPtr++] = dst[ofs+j];
}
}
}
}
}
assert(srcPtr <= srcLen);
assert(dstPtr <= dstLen);
len = dstLen;
return dst;
}
Re: [PS2] Saint Seiya Sanctuary/Hades compression with LZ77
Posted: Fri Feb 10, 2012 7:36 pm
by Insanius
You need to understand how structs work for this to make any sense.
First, look in the code for gmiCmpHdr_t and noeRAPI_t. They're like classes, except they usually only define variables. A struct is like a window, which is placed over a part of memory and tells you what kinds of data are there. In java you would have to read the data a few bytes at a time and interpret it manually, then use getters and setters to build your object. Using a struct, you direct it at a part of memory and all of the variables will be defined with whatever data is there.
Secondly, -> is used to access a class member. It's the same as using . in java.
Lastly, the code uses pointer math to scan through all the data. In java you would have to use a stream to read in bytes one by one, or at least use a ByteBuffer. I would definitely recommend studying how pointers work in C in order for all of this code to make sense.
Re: [PS2] Saint Seiya Sanctuary/Hades compression with LZ77
Posted: Sat Feb 11, 2012 2:10 am
by MrAdults
You're mistaken about Noesis not handling the files with multiple images. It loads them all. You just have to select textures in the data viewer to see them.
Beyond that... I think the bitwise operations (which is where most of the complexity is) should translate right over to Java.
Re: [PS2] Saint Seiya Sanctuary/Hades compression with LZ77
Posted: Wed Feb 15, 2012 2:55 am
by gustavofarias
It's extremely simple, about as standard-fare as you can get with LZ77-based algorithms (8-bit flag with 16-bit offset+length pairs for non-literals)
What is the "flag" in LZ77? How long is a literal and the Window? How a NULL offset+length is represented? How do I identify if the next byte is an offset+length pair or a literal?
Could you use the attachment below to show me an example of the
flag,
offset+length pair and
literal?
Re: [PS2] Saint Seiya Sanctuary/Hades compression with LZ77
Posted: Sun Feb 26, 2012 12:20 am
by gustavofarias
I did my homework and studied the algorithm. Please correct me if I'm wrong. The 8 bit flag tells which of the next 8 bytes/words are literals and which are off+length pairs.
I still have some very low level questions:
Why do you add
3 to the 4-bit length?
Code: Select all
int relOfs = (ol & 255) | ((ol>>12) << 8);
How did you figure that you should take the first 4 bits and the last 8 bits?
Code: Select all
int ofs = dstPtr - ((dstPtr-18-relOfs) & 4095);
Why all this calculation to get the offset? Where does this
18 come from?
Thanks for all the help to date.
Re: [PS2] Saint Seiya Sanctuary/Hades compression with LZ77
Posted: Mon Feb 27, 2012 3:40 pm
by gustavofarias
A question about endianess:
Code: Select all
for (int i = 0; i < 8 && srcPtr < srcLen && dstPtr < dstLen; i++)
{
if (ctrl & (1<<i))
In this code the bits of
ctrl are processed from right to left.
In the above code, how are the two bytes of the WORD organized? Big or Little Endian? And the bits in each byte?
Random question:
Is this the same as
srcPtr += 2;?
Re: [PS2] Saint Seiya Sanctuary/Hades compression with LZ77
Posted: Tue Feb 28, 2012 1:00 pm
by gustavofarias
Here is the first version of the translation of the Noesis decompression code to a Java code. It doesn't work yet. Maybe you can help me find the bugs.
Code: Select all
private static void decompress(File compressedFile) throws Exception {
// create a stream to read the input
DataInputStream is = new DataInputStream(new FileInputStream(compressedFile));
// skip the first 8 bytes
is.skipBytes(8); // C M P S 01 00 00 00
// read the next int which is the decompressed size
int dstLen = Integer.reverseBytes(is.readInt()); // The 4 bytes are read in the reverse order
// skip 4 more bytes
is.skipBytes(4); // 00 00 00 00
// for the sake of simplicity reads the entire file to a byte array, ignoring the first 16 bytes that have already been read.
int srcLen = (int)compressedFile.length() - 16;
System.out.println("Source length: " + srcLen);
byte[] src = new byte[srcLen];
is.read(src);
// creates a byte array to store the decompressed bytes
byte[] dst = new byte[dstLen];
System.out.println("output length: " + dstLen);
int srcPtr = 0; // by skipping 16 bytes I can start from 0
int dstPtr = 0;
while (srcPtr < srcLen && dstPtr < dstLen) {
byte ctrl = src[srcPtr++];
for (int i = 0; i < 8 && srcPtr < srcLen && dstPtr < dstLen; i++) {
if ( (ctrl & (1 << i)) != 0 ) {
dst[dstPtr++] = src[srcPtr++];
}
else {
// The line below is different from the original. I'm just reading two bytes from te input
short ol = (short)(src[srcPtr] << 8 | src[srcPtr + 1]); // should I reverse the byte order?
// I've read 2 bytes, so I mode the pointer 2 bytes ahead
srcPtr += 2;
int len = 3 + ((ol >> 8) & 15); // why +3?
int relOfs = (ol & 255) | ((ol >> 12) << 8);
int ofs = dstPtr - ((dstPtr - 18 - relOfs) & 4095); // can't get this 18 and the rest
for (int j = 0; j < len && dstPtr < dstLen; j++) {
if (ofs + j < 0 || ofs + j >= dstPtr) {
dst[dstPtr++] = 0;
}
else {
dst[dstPtr++] = dst[ofs + j];
}
}
}
}
}
// check consistency
if (srcPtr > srcLen || dstPtr > dstLen) {
throw new Exception("Out of bounds.");
}
// write the decompressed bytes to a file
OutputStream os = new FileOutputStream( new File(compressedFile.getParent(), compressedFile.getName() + "_DECMPS") );
os.write( dst );
os.flush();
}