I was focusing mainly on the PCM file, which does appear to contain some audio: it played when I opened it in an audio program that supports raw PCM.
I understand what you are saying about the semi-archive nature of the file - I agree with you there. Like Mr Mouse, I had noticed some of the header and linking information contained in these files, but I didn't go as far as spotting the RIFF headers. I will look into this a little further and see what I come up with.
No, I did not assume that at all
As I said, it's not the header size, it's the relative offset of the actual file (the offset of the dds file header), the "DDS". That's how DDS files start. So it points to the start of the file and then comes the size of the file.
Yeah sorry, I realise that you didn't call it the header size - that was just my representation of the information you gave (when someone says that you need to skip a few bytes, I normally look at it from the other direction: there is something there which is unknown at the moment).
But anyway, I was making the assumption about incorrect labelling of the fields because, for the file I was analysing, the 'relative offset of the actual file' was not pointing to the DDS header - rather it was pointing further through the file (maybe another 100 or so bytes past the DDS header).
I will take a look at this again tomorrow - I didn't mean to accuse or abuse people
You should see Diablo files. They are password encrypted and the password changes per archive. (Good thing Diablo only has one.) Formatting the password a certain way creates a nice MD5 hash that rotates continuously through the file. AND THEN you can start extracting, only, like this file, it doesn't have any filenames in the header. The filenames are stored in the Diablo executables, but the index is in the archive file. The way they search is to run the filename through 3 different hash code generators. The first hash tells them where in the index the file might be. Then you have to verify that the 2 OTHER hashes in the index match the 2 others you just generated. If they match, you have found the index location, which contains the offset into the file and the size. Then you can go and extract it.
"By nature men are alike. Through practice they have become far apart." Confucius (Analect 17:2)
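To make the three-hash lookup described above a bit more concrete, here is a rough Python sketch. It is not the real Diablo code: the hash function (a seeded CRC32), the linear probing on collisions, and the names `_hash`, `build_index` and `find_file` are all assumptions made up for illustration. Only the overall idea (one hash picks the index slot, two more verify the entry, and the entry holds offset and size) comes from the post.

```python
import zlib

def _hash(name, seed):
    # Stand-in hash, NOT the real one: CRC32 of a seed byte plus the lowercased name.
    return zlib.crc32(bytes([seed]) + name.lower().encode("ascii"))

def build_index(files, slots):
    """Build a toy index from (name, offset, size) tuples. No filenames are stored."""
    if len(files) > slots:
        raise ValueError("more files than index slots")
    index = [None] * slots
    for name, offset, size in files:
        pos = _hash(name, 0) % slots          # first hash: where the entry should go
        while index[pos] is not None:         # assumed collision rule: linear probing
            pos = (pos + 1) % slots
        index[pos] = {
            "check_a": _hash(name, 1),        # second hash, used for verification
            "check_b": _hash(name, 2),        # third hash, used for verification
            "offset": offset,
            "size": size,
        }
    return index

def find_file(index, name):
    """Return (offset, size) for name, or None if the index has no matching entry."""
    slots = len(index)
    start = _hash(name, 0) % slots
    check_a, check_b = _hash(name, 1), _hash(name, 2)
    for step in range(slots):
        entry = index[(start + step) % slots]
        if entry is None:
            return None                       # empty slot: the file is not present
        if entry["check_a"] == check_a and entry["check_b"] == check_b:
            return entry["offset"], entry["size"]
    return None
```

For example, after `index = build_index([("a.wav", 10, 20)], 8)`, calling `find_file(index, "a.wav")` gives back the stored offset and size, while an unknown name gives `None`.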
Note that all pointers should be incremented by 0x1f4 (the size of the archive header).
At 128 : offset of tail
At 132 : Size of tail
At 404 : number of SOUND entries (RIFF and additionals)
At 408 : size of the sound data
At 420 : number of TGA entries
At 424 : size of TGA data
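As a minimal sketch of how the header fields listed above could be read in Python: the field names, the helper names `read_header` and `to_file_offset`, and the choice of little-endian unsigned 32-bit integers are my assumptions, not something the notes confirm. The offsets and the 0x1f4 adjustment are taken from the list.

```python
import struct

ARCHIVE_HEADER_SIZE = 0x1F4  # per the note above: add this to every stored pointer

# Offsets come from the list above; the field names are invented for this sketch.
HEADER_FIELDS = {
    "tail_offset": 128,
    "tail_size": 132,
    "sound_count": 404,
    "sound_data_size": 408,
    "tga_count": 420,
    "tga_data_size": 424,
}

def read_header(data):
    """Read each known field as a little-endian uint32 (endianness is an assumption)."""
    return {name: struct.unpack_from("<I", data, offset)[0]
            for name, offset in HEADER_FIELDS.items()}

def to_file_offset(pointer):
    """Turn a pointer stored in the archive into an absolute file offset."""
    return pointer + ARCHIVE_HEADER_SIZE
```

With the archive loaded as `data = open(path, "rb").read()`, something like `to_file_offset(read_header(data)["tail_offset"])` should then land on the start of the TAIL, assuming the tail offset is one of the pointers that needs the adjustment.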
Filetypes in PCS PCM
Found at the TAIL:
Hex values Type Additional variables
0204 0080 : Info This is the last type in the TAIL, and actually
    a pointer list to the stuff right before the TAIL:
    - Pointer to :
        - Number of TGA files
        - Pointer to a pointer to a list of offsets of
          the TGA files
        - This list is supplemented with two additional
          values:
            - Number of large chunks (unknown)
            - Pointer to a list of offsets of these
              chunks (size = offset 2nd - offset 1st)
              The last file ends where the first of the
              AMF files begins (type 2100 0080)
    - Pointers to each entry in the same list of offsets
      of the TGA files! (so: Number-of-TGA-files times!)
    - Pointer to a pointer to the list of offsets of the large
      chunks!
    - Pointers to each entry in the list of offsets of the
      large chunks
    It seems to me that this part is about TGA files
    and possibly binaries? Other graphic type?
0E01 0080 : Unknown This is followed by one 4-byte variable:
It points to some script part
2400 0080 : Unknown This covers the large chunks as mentioned above.
    Three 4-byte values:
    - Pointer to the start of the large chunk
    - Pointer to a pointer in the large chunk
      The pointer it points to points to the first list in the chunk.
      The list has entries of 32 bytes in length, and you can
      get the number of entries from the variable right before
      that pointer.
    - Pointer to a pointer in the large chunk
      Likewise, the pointer it points to points to the
      second list in the large chunk, most likely a list of
      2-byte values (integers). Again, you can get the number of
      entries from the variable right before this pointer.
1301 0080 : WAV One 4-byte value (pointer to the file).
    Note that this points to the RIFF id-tag;
    the 4-byte value right after the RIFF tag is the size of
    the file (minus the header of the file, 8 bytes),
    so you can simply calculate the whole size.
    Note that there are two types of RIFF:
    WAVE and AMPC.
    The names of the files are written in the headers:
    after "bbname" in WAVEs comes the size of the string that follows
    after "name" in AMPC comes the size of the string that follows
1801 0080 : Unknown one 4-byte value that points to a string (or a header?)
1501 0080 : Unknown sometimes three 4-byte values, sometimes two 4-byte values
0100 0080 : TGA Two 4-byte values:
    - Pointer to file header
    - Pointer to tail of file header
      This tail has two 4-byte values of interest:
        - Pointer to actual file
        - Size of actual file
1701 0080 : Table This is a list of pointers to
    A. Table entries of variable size, with the first value
       a pointer to a string in a string table
    B. Table entries of variable size, with the first pointer
       a pointer to an entry in yet another table (of 88-byte entry-size)
    C. Table entries to yet another table
2100 0080 : AMF Three 4-byte values:
    - Pointer to header of AMF file
    - Pointer to data part of AMF file
    - Pointer to header part of AMF file (has pointers to other areas!)
    Note: Many are followed by the next entry type after these
    values, but there are a lot of 2100 0080 files that have more
    variables, and are not "broken" by a 0400 0080 entry. However,
    ALL 2100 0080 types are followed by 19 4-byte values before the
    next one. All these 19 values are pointers to the insides of
    the AMF file, which in turn may be pointers to other areas in the
    AMF file. I'm guessing these are sequences of animation.
    Strikingly, whenever a 2100 0080 tail entry is interrupted by a
    0400 0080, the next value (ultimately the first of the 0400 0080)
    points to the same area in the AMF file as the 4th value of
    an uninterrupted 2100 0080 would point to!
0400 0080 : AMF-support type, read above
A note : After all the 0100 0080 types (TGAs) have been listed comes a list of pointers to
.anim files. I haven't quite understood yet how this list is made up, in numbers. The
number of pointers to sections in the .anim files differs per anim file.
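For the 1301 0080 (WAV) entries above, the "calculate the whole size" step can be sketched in Python as follows. The helper names `riff_total_size` and `riff_form_type` are made up for this sketch. The size field being little-endian and the form type (presumably where WAVE or AMPC appears) sitting in bytes 8 to 11 follow the general RIFF layout; I am assuming the files in this archive stick to it.

```python
import struct

def riff_total_size(data, offset):
    """Full size of the RIFF file at `offset`: the stored size plus the 8-byte header."""
    if data[offset:offset + 4] != b"RIFF":
        raise ValueError("no RIFF tag at offset %d" % offset)
    # The 4 bytes after the tag hold the size of everything that follows them.
    (payload_size,) = struct.unpack_from("<I", data, offset + 4)
    return payload_size + 8

def riff_form_type(data, offset):
    """Return the 4-byte form type that follows the size field, e.g. b"WAVE"."""
    return data[offset + 8:offset + 12]
```

Here `offset` is meant to be the pointer from the tail entry after adding the archive header size, so `data[offset:offset + riff_total_size(data, offset)]` would be the whole embedded file.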