Tekken Hybrid

gjinka · Post by **gjinka** » Fri Jan 06, 2012 6:53 pm

EDIT: This thread has transformed from Tekken 3 to Tekken Hybrid, which has the Tekken 3 models, just hi-res, but still the same design.

So skip all these pointless posts and start reading from the 4th post counting from the bottom of this page.

---------------------------------------
ORIGINAL POST:

Hi guys, so basically I just read the Definitive guide to exploring file formats, and watched Rheini's flash tutorial. I have some programming knowledge, but yeah I'm brand spanking new to file format exploring.

I'm going to explore a file format in this topic and hopefully get some help when I get stuck. I don't want someone to do the work for me though, I want to gain some experience myself, because there doesn't seem to be any other tutorial out there so this is the only way to learn more. But any hints and tips are welcome, as well as any help when I ask for it (I feel like I will a lot).

For my first ever format I chose Tekken 3's *.BIN ISO format, because that was my favorite game when I was a kid.

Please tell me if it's a bad idea. From what I know each PSX game has its own ISO (BIN) format. If I'm wrong let me know and I'll try another format.

Here is a 4 MB slice from the beginning of the file, if anyone will want to help out:
http://www.2shared.com/file/WxY0_vLF/tekken3-part.html

Again, note that I'm completely new, so I'm terrible at this. Instead of telling how terrible I am please help me improve. Thanks.

-------------------------------------------------------

EDIT: OK, I feel like I'm getting somewhere, looks like this structure repeats throughout the file.
That's probably not padding, but file data including padding, though it was all zeros in the first few "entries".

gjinka · Post by **gjinka** » Sat Jan 07, 2012 8:44 am

OK, this is as far as I could go for my first try:

Code: Select all

loop:

    byte {12}   header
    
    uint {4}    unknown
    
    uint {4}    size of data following, usually power of two
    
    uint {4}    size of data following, repeated
    
    byte {size} sometimes nulls, but not always
    
    byte {280}  useful data which i don't get

Need tips now.
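
For anyone following along, the guessed layout above could be walked with something like the sketch below. Everything here is an assumption: the little-endian byte order, the `parse_entries` name, and the layout itself, which is still only a guess. The sample bytes are synthetic, not taken from the game.

```python
import struct

def parse_entries(data):
    """Walk records using the guessed layout; stop at the first short or inconsistent one."""
    entries = []
    pos = 0
    while pos + 24 <= len(data):
        header = data[pos:pos + 12]                       # byte {12} header
        # uint unknown, uint size, uint size repeated (little-endian assumed)
        unknown, size, size_repeat = struct.unpack_from('<III', data, pos + 12)
        if size != size_repeat:
            break                                         # guess does not hold here
        end = pos + 24 + size + 280
        if end > len(data):
            break                                         # truncated record
        payload = data[pos + 24:pos + 24 + size]          # byte {size}
        trailer = data[pos + 24 + size:end]               # byte {280}
        entries.append((header, unknown, payload, trailer))
        pos = end
    return entries

# Synthetic record for illustration only
sample = bytes(12) + struct.pack('<III', 7, 16, 16) + bytes(16) + bytes(280)
print(len(parse_entries(sample * 2)))
```

If the two size fields ever disagree, the loop stops, which is a cheap way to find out where the guess breaks down.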

huckleberrypie · Post by **huckleberrypie** » Sat Jan 07, 2012 10:01 am

If you're talking about the .bin files that you load on an emu, they're already documented, and all you have to do is load them with a virtual disk drive utility.

gjinka · Post by **gjinka** » Sat Jan 07, 2012 11:02 am

d'oh.

Nice start I guess. Well then I'll try the files inside it.

"they're already documented"

where?
I created a virtual drive with Daemon Tools, but I can't read most of the files. I get "Invalid MS-DOS function" when trying to copy most files. Maybe PSX ISO/BIN is not the same as your standard one. I haven't seen any PSX-specific ISO tool though, I've only seen BIN extractors for specific games, which is why I thought each game uses its own format.

finale00 · Post by **finale00** » Sat Jan 07, 2012 10:31 pm

The disk might be protected such that normal image readers can't load it up.
I don't remember what people use to dump the data though.

howfie · Post by **howfie** » Sat Jan 07, 2012 11:20 pm

How about trying to rip models from Tekken Hybrid? I would help out on that one. Most people around here don't look at games from older systems. Hybrid should be a good challenge.

gjinka · Post by **gjinka** » Sat Jan 07, 2012 11:38 pm

I thought of older games because I assumed newer ones are more complex, more likely to contain custom stuff, like encryption, etc. And this is my first try, so...

howfie · Post by **howfie** » Sat Jan 07, 2012 11:56 pm

oh no, that's not the case at all! some modern games are extremely easy! some games encrypt their data, some use funky custom compression schemes, and some use funky formats, but that is only a mild percentage of all games. for your first time, if you have a jailbroken ps3 or a regular xbox360, rip one of your games to pc. look for obvious vertex and index buffers. if you can't find them, try another game. if you can, give it a go.

finale00 · Post by **finale00** » Sun Jan 08, 2012 1:40 am

I imagine older games, with their rather restrictive space constraints, would employ all sorts of crazy techniques to try to squeeze as much data as they can into it.

gjinka · Post by **gjinka** » Sun Jan 08, 2012 8:06 am

Well you have to think both ways: over the last 10 years more complex compression and encryption algorithms were probably created. And also I think as game budgets rise the developers become more likely to reinvent the wheel by creating custom encryption/compression, or add extra protection.
But the main difficulty I see is that more video/audio encodings exist now and more things exist in 3D files, like all sorts of normal maps, other maps like glow maps, "binormals", shader materials and other stuff which I might not even know about.

Of course if you don't mind working with a newbie and answering my newbie questions, I'd love to work with you guys.

PS. Still interested in what can be used to extract PSX ISOs.

EDIT: OK, I checked out Tekken Hybrid and I think it would make more sense as the models and scene backgrounds from TTT1 are just highpoly versions of the T3 ones, audio is probably the same too (but higher quality).

I wonder if the two games included use the same formats, or if TTT1 just got its source code ported to the PS3 SDK and the file formats are the same.
Anyway, should we work on the PS3 or X360 version?

howfie · Post by **howfie** » Mon Jan 09, 2012 1:32 am

Games are projects that need to be developed on time and within budget. That is software engineering 101. Most games don't implement hardcore encryption or hardcore compression schemes because it costs extra money and extra time to hire someone who specializes in these fields. Unless, of course, they get lucky and hire a nerdy game programmer who knows everything. I think obfuscating a file format is just as effective as using encryption in protecting game assets.

Ps3. Tekken hybrid might be hard. U ready lol? What language u want to use?

howfie · Post by **howfie** » Mon Jan 09, 2012 2:54 am

Starting with the remake of Tekken Tag Tournament HD.

Step #1: Identify Model Files

The character models appear to be the CHR files in the USRDIR\data\asset\dai3\mdl directory. How do I know? Well, the filenames are also character names and I think I found obvious vertex and index buffers in the CHR files (see pics of Nina Williams' first file). Also, after the index buffer, there appears to be a list of headerless textures. With a little experience, you will be able to recognize these patterns just by looking at the hexadecimal as well. There also appears to be other important data in the KMD and NGI files, but for now, let's just focus on the CHR files and later on we'll try to figure out what is in the KMD and NGI files.

Obvious vertex buffer.

Obvious index buffer.

Obvious headerless texture.

Step #2: See If Rich Has Already Done It

Before starting to code anything, we should take a look to see if Rich has already done it. In general, I try not to step on something somebody else is working on or has already done. Tekken is an AAA game and I tend to leave these alone since other guys usually pick them up before I can even save up enough money to buy the game! I tried loading the CHR files in Noesis and it was a no go, so I think we're good here.

Step #3: Identify Vertex Buffer and Vertex Format(s)

We are going to hard-code a model rip using Nina Williams, which can be found in the character model directory as nina1u.chr. Then, when we know it's possible to rip vertices and faces, we'll worry about how to get to the vertex and index data in a more general way. It's stupid to waste a week or more trying to figure out how to get to the data only to find out that the vertex data is obfuscated or encrypted or whatever.

Let's find the vertex data now. Open nina1u.chr in HxD and start scrolling down. We see nothing but junk until about offset 0x14FC. Look at the picture below and you will see the pattern. The pattern is 0x40 bytes long. However, this is not the vertex data... it's something else because well... I tried it and it wasn't that interesting.

So keep scrolling down 0x40 bytes at a time until you no longer see those repetitious 0xFFFF7F3F and 0x0000803F values at the end of the pattern. You will come to offset 0x2F3C. However, this time, the pattern is not 0x40 bytes, it's 0x20 bytes instead (see picture below).
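
If you'd rather not eyeball the stride in a hex editor, a tiny dump helper that prints one row per candidate stride makes the repetition line up in columns. This is just a sketch; `hexdump_rows` is a made-up name and the demo bytes are synthetic, so point it at your own file data with start 0x2F3C and stride 0x20 to try it for real.

```python
def hexdump_rows(data, start, stride, rows):
    """Return up to `rows` lines of hex, each covering `stride` bytes from `start`."""
    lines = []
    for r in range(rows):
        off = start + r * stride
        chunk = data[off:off + stride]
        if not chunk:
            break                      # ran off the end of the data
        hex_bytes = ' '.join('%02X' % b for b in chunk)
        lines.append('%08X  %s' % (off, hex_bytes))
    return lines

# Synthetic bytes for illustration; with the real file you would use
# data = open('nina1u.chr', 'rb').read() and hexdump_rows(data, 0x2F3C, 0x20, 8)
demo = bytes(range(64))
for line in hexdump_rows(demo, 0x20, 0x10, 2):
    print(line)
```

When the stride is right, columns such as a constant trailing field stack up vertically; when it's wrong, they drift diagonally.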

Now, I'll tell you why I think 0x2F3C is the position of the vertex buffer... scroll all the way up to the top back to the beginning of the file. What do you see in the picture below? That's right, that's 0x2F3C in little endian and that is the position of our vertex buffer. We now officially have a way to get to our vertex buffer by reading an offset from the beginning of the file.

Now we have to determine how much data is in the vertex buffer. Go back to offset 0x2F3C and keep scrolling down until you see an obvious break in the pattern. This occurs at offset 0x2A03C.

So scroll back up to the top and look around... find this special number 0x2A03C anywhere? Yep yep yep, right after the position of the vertex buffer! This offset not only marks the end of the vertex buffer, but it also marks the beginning of the index buffer, which we'll talk about later.
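
The "scroll to the top and look for the number" trick can also be done in code. A minimal sketch, assuming the values are stored as little-endian 32-bit integers near the start of the file; `find_u32` and the 0x100-byte search limit are my own choices, and the header below is synthetic.

```python
import struct

def find_u32(data, value, limit=0x100):
    """Offsets within the first `limit` bytes where `value` is stored as a little-endian uint32."""
    needle = struct.pack('<I', value)
    hits = []
    pos = data.find(needle, 0, limit)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1, limit)
    return hits

# Synthetic header mimicking the two offsets described above
header = bytes(0x14) + struct.pack('<II', 0x2F3C, 0x2A03C) + bytes(8)
print(find_u32(header, 0x2F3C))    # [20]
print(find_u32(header, 0x2A03C))   # [24]
```

A hit at 20 (0x14) followed by one at 24 (0x18) is exactly the "offset right after the other offset" situation described above.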

It is now time to write some python code. We are going to open the file, read the two offsets, starting at position 0x14, and then try to read the vertex data located between those two offsets. Since (0x2A03C - 0x2F3C)/0x20 = 0x1388, we are going to have 5,000 total vertices. When there is a pattern in the vertex data, such as the 0x20 byte pattern in this one, the vertex data is probably interleaved. Vertex data can be interleaved or non-interleaved, but usually, it is interleaved with positions, normals and uvs all mixed together.
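
The vertex count arithmetic is worth wrapping in a check, since a stride that doesn't divide the buffer size evenly is a strong hint the stride guess is wrong. A small sketch (`vertex_count` is a hypothetical helper, not part of the script that follows):

```python
def vertex_count(vbuffer_offset, ibuffer_offset, stride):
    """Number of whole vertices between two offsets; ValueError if the stride does not fit."""
    size = ibuffer_offset - vbuffer_offset
    count, remainder = divmod(size, stride)
    if size <= 0 or remainder != 0:
        raise ValueError('stride 0x%X does not evenly divide 0x%X bytes' % (stride, size))
    return count

print(vertex_count(0x2F3C, 0x2A03C, 0x20))  # 5000 (0x1388)
```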

Run the following code. It opens the file, moves to the vertex buffer, and then tries to read the vertex data in groups of 0x20 bytes. It initially assumes that all data in the vertex buffer are 4-byte floats. Assuming all 4-byte floats is usually a good place to start. However, looking at the output of the code, you will see that the floats are waaaaaayyyyyy off! 2.945423288093901e+17? No way!

Code: Select all

import struct

#
# open file 
#
ifile = open('nina1u.chr', 'rb');

#
# read vertex buffer and index buffer offsets
#
ifile.seek(0x14);
vbuffer_offset = struct.unpack('<I', ifile.read(4))[0];
ibuffer_offset = struct.unpack('<I', ifile.read(4))[0];
print('vbuffer_offset = ', vbuffer_offset);
print('ibuffer_offset = ', ibuffer_offset);

#
# calculate vertex buffer properties
#
vbuffer_size = ibuffer_offset - vbuffer_offset;
vertices = int(vbuffer_size/0x20);
print('size of vertex buffer in bytes = ', vbuffer_size);
print('number of vertices = ', vertices);
print('');

#
# move file pointer to vertex buffer
#
ifile.seek(vbuffer_offset);

#
# try reading vertices in groups of 0x20 (32) bytes
#
for i in range(vertices):
    print(struct.unpack('f', ifile.read(4))[0]); #  4 bytes total
    print(struct.unpack('f', ifile.read(4))[0]); #  8 bytes total
    print(struct.unpack('f', ifile.read(4))[0]); # 12 bytes total
    print(struct.unpack('f', ifile.read(4))[0]); # 16 bytes total
    print(struct.unpack('f', ifile.read(4))[0]); # 20 bytes total
    print(struct.unpack('f', ifile.read(4))[0]); # 24 bytes total
    print(struct.unpack('f', ifile.read(4))[0]); # 28 bytes total
    print(struct.unpack('f', ifile.read(4))[0]); # 32 bytes total
    print('');

ifile.close()

Code: Select all

C:\Python32>python tekken.py
vbuffer_offset =  12092
ibuffer_offset =  172092
size of vertex buffer in bytes =  160000
number of vertices =  5000

2.945423288093901e+17
2.986167027476185e-41
0.00010737865522969514
2.0891958804618698e-41
1.5222401117398476e-09
1.401298464324817e-45
2.152394441202919e-41
0.0

2.6888498445706854e+17
3.0179765026163585e-41
5.48811221960932e-05
2.103489124797983e-41
2.5642337142528504e-09
1.401298464324817e-45
2.152394441202919e-41
0.0

2.5557927885327565e+17
2.98812884532624e-41
4.00543212890625e-05
2.1092344485017147e-41
2.971773938043043e-09
1.401298464324817e-45
2.152394441202919e-41
0.0

1.4344176297364685e+17
3.035632863266851e-41
0.0001317995775025338
2.0777052330544063e-41
1.0722761345505205e-08
1.7455932931140317e-39
3.513446247805328e-14
1.5209693531781564e-41

So in other words, this game doesn't use 4-byte floats to store vertex data. It must be using something else. If 4-byte floats fail, the next thing to try is 2-byte floats, which are known as half-floats. Python 3.2's struct module doesn't support half-floats (the 'e' format code only arrived in Python 3.6), so you will have to code your own algorithm. Half-floats are stored as 2-byte unsigned integers and are then converted to 4-byte floats using a special algorithm. So let's write a function in python that reads an unsigned short (a half-float) and converts it into a normal float. We will call this function, read_half_float, in our little read loop instead and see what we get. Remember that because we are reading only 2 bytes at a time now, we have to read 16 half-floats per vertex. When you run the program below, you will see very nice numbers now! Look at the first three numbers per group; those are our vertex positions! After a 0.0, the next three numbers are obviously the per-vertex normal, and after another 0.0 the next two numbers are the UV. This is too easy!

Code: Select all

import struct
import math
import array

def read_half_float(ifile):

    # read unsigned short
    value = struct.unpack('H', ifile.read(2))[0]; # read unsigned short

    # parse unsigned short into floating-point components
    temp = array.array('H', [0, 0, 0])
    temp[0] = (value & 0x8000);       # sign
    temp[1] = (value & 0x7C00) >> 10; # exponent
    temp[2] = (value & 0x03FF);       # mantissa

    # set sign constant
    sgn = 0.0;
    if temp[0] == 0:
       sgn = +1.0
    else:
       sgn = -1.0

    # if exponent is 0
    if temp[1] == 0:
       if temp[2] == 0:
          return 0.0
       else:
          return sgn*(math.pow(2.0, -14.0)*(temp[2]/1024.0));

    # if exponent is less than 31 (normal number)
    if temp[1] < 31:
       return sgn*math.pow(2.0, temp[1] - 15.0)*(1.0 + (temp[2]/1024.0));

    # exponent is 31 with zero mantissa: signed infinity
    if temp[2] == 0:
       return sgn*float('inf');

    # exponent is 31 with nonzero mantissa: not a number
    return float('nan');

#
# open file 
#
ifile = open('nina1u.chr', 'rb');

#
# read vertex buffer and index buffer offsets
#
ifile.seek(0x14);
vbuffer_offset = struct.unpack('<I', ifile.read(4))[0];
ibuffer_offset = struct.unpack('<I', ifile.read(4))[0];
print('vbuffer_offset = ', vbuffer_offset);
print('ibuffer_offset = ', ibuffer_offset);

#
# calculate vertex buffer properties
#
vbuffer_size = ibuffer_offset - vbuffer_offset;
vertices = int(vbuffer_size/0x20);
print('size of vertex buffer in bytes = ', vbuffer_size);
print('number of vertices = ', vertices);
print('');

#
# move file pointer to vertex buffer
#
ifile.seek(vbuffer_offset);

#
# try reading vertices in groups of 0x20 (32) bytes
#
for i in range(vertices):
    print(read_half_float(ifile)); # 2 bytes total
    print(read_half_float(ifile)); # 4 bytes total
    print(read_half_float(ifile)); # 6 bytes total
    print(read_half_float(ifile)); # 8 bytes total
    print(read_half_float(ifile)); # 10 bytes total
    print(read_half_float(ifile)); # 12 bytes total
    print(read_half_float(ifile)); # 14 bytes total
    print(read_half_float(ifile)); # 16 bytes total
    print(read_half_float(ifile)); # 18 bytes total
    print(read_half_float(ifile)); # 20 bytes total
    print(read_half_float(ifile)); # 22 bytes total
    print(read_half_float(ifile)); # 24 bytes total
    print(read_half_float(ifile)); # 26 bytes total
    print(read_half_float(ifile)); # 28 bytes total
    print(read_half_float(ifile)); # 30 bytes total
    print(read_half_float(ifile)); # 32 bytes total
    print('');

#
# close file
#
ifile.close()

Code: Select all

C:\Python32>python tekken.py
vbuffer_offset =  12092
ibuffer_offset =  172092
size of vertex buffer in bytes =  160000
number of vertices =  5000

-22.34375
288.5
57.9375
0.0
0.139892578125
0.60986328125
0.77978515625
0.0
0.44140625
0.1505126953125
5.960464477539063e-08
0.0
1.0
0.0
0.0
0.0

-42.9375
283.5
66.0625
0.0
0.1298828125
0.5498046875
0.82958984375
0.0
0.40478515625
0.162109375
5.960464477539063e-08
0.0
1.0
0.0
0.0
0.0

0.0
280.75
58.375
0.0
0.0
0.51953125
0.849609375
0.0
0.5
0.16552734375
5.960464477539063e-08
0.0
1.0
0.0
0.0
0.0
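
For anyone on Python 3.6 or later, the standard `struct` module can decode half-floats directly with the `'e'` format code, so the hand-written conversion isn't strictly needed there. A minimal sketch, assuming little-endian data (the script above uses native byte order, which is little-endian on a typical PC); `read_half_floats` is just an illustrative name.

```python
import struct

def read_half_floats(buf, offset, count):
    """Decode `count` little-endian IEEE 754 half-floats starting at `offset`."""
    return list(struct.unpack_from('<%de' % count, buf, offset))

# Four hand-packed half-floats: 0x3C00, 0xC000, 0x0000, 0x3800
sample = struct.pack('<4H', 0x3C00, 0xC000, 0x0000, 0x3800)
print(read_half_floats(sample, 0, 4))  # [1.0, -2.0, 0.0, 0.5]
```

Reading a whole 0x20-byte vertex then becomes a single call with count 16. The 5.960464477539063e-08 values in the output above are the smallest subnormal half-float, bit pattern 0x0001.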

Now let's write some code to write all this data to an OBJ file. Look at Wikipedia for the OBJ file format; it is really easy.

Code: Select all

import struct
import math
import array

def read_half_float(ifile):

    # read unsigned short
    value = struct.unpack('H', ifile.read(2))[0]; # read unsigned short

    # parse unsigned short into floating-point components
    temp = array.array('H', [0, 0, 0])
    temp[0] = (value & 0x8000);       # sign
    temp[1] = (value & 0x7C00) >> 10; # exponent
    temp[2] = (value & 0x03FF);       # mantissa

    # set sign constant
    sgn = 0.0;
    if temp[0] == 0:
       sgn = +1.0
    else:
       sgn = -1.0

    # if exponent is 0
    if temp[1] == 0:
       if temp[2] == 0:
          return 0.0
       else:
          return sgn*(math.pow(2.0, -14.0)*(temp[2]/1024.0));

    # if exponent is less than 31 (normal number)
    if temp[1] < 31:
       return sgn*math.pow(2.0, temp[1] - 15.0)*(1.0 + (temp[2]/1024.0));

    # exponent is 31 with zero mantissa: signed infinity
    if temp[2] == 0:
       return sgn*float('inf');

    # exponent is 31 with nonzero mantissa: not a number
    return float('nan');

#
# open input file 
#
ifile = open('nina1u.chr', 'rb');

#
# open output file
#
ofile = open('nina1u.obj', 'wt');

#
# save OBJ header
#
ofile.write('o nina1u.obj\n');
ofile.write('\n');

#
# read vertex buffer and index buffer offsets
#
ifile.seek(0x14);
vbuffer_offset = struct.unpack('<I', ifile.read(4))[0];
ibuffer_offset = struct.unpack('<I', ifile.read(4))[0];
print('vbuffer_offset = ', vbuffer_offset);
print('ibuffer_offset = ', ibuffer_offset);

#
# calculate vertex buffer properties
#
vbuffer_size = ibuffer_offset - vbuffer_offset;
vertices = int(vbuffer_size/0x20);
print('size of vertex buffer in bytes = ', vbuffer_size);
print('number of vertices = ', vertices);
print('');

#
# move file pointer to vertex buffer
#
ifile.seek(vbuffer_offset);

#
# try reading vertices in groups of 0x20 (32) bytes
#
for i in range(vertices):
    # read half floats
    vx = read_half_float(ifile); #  2 bytes total
    vy = read_half_float(ifile); #  4 bytes total
    vz = read_half_float(ifile); #  6 bytes total
    u1 = read_half_float(ifile); #  8 bytes total
    nx = read_half_float(ifile); # 10 bytes total
    ny = read_half_float(ifile); # 12 bytes total
    nz = read_half_float(ifile); # 14 bytes total
    u2 = read_half_float(ifile); # 16 bytes total
    tu = read_half_float(ifile); # 18 bytes total
    tv = read_half_float(ifile); # 20 bytes total
    u3 = read_half_float(ifile); # 22 bytes total
    u4 = read_half_float(ifile); # 24 bytes total
    u5 = read_half_float(ifile); # 26 bytes total
    u6 = read_half_float(ifile); # 28 bytes total
    u7 = read_half_float(ifile); # 30 bytes total
    u8 = read_half_float(ifile); # 32 bytes total

    # save vertex
    strlist = []
    strlist.append('v ');
    strlist.append(str(vx));
    strlist.append(' ');
    strlist.append(str(vy));
    strlist.append(' ');
    strlist.append(str(vz));
    strlist.append('\n');
    ofile.write(''.join(strlist));

    # save normal
    strlist = []
    strlist.append('vn ');
    strlist.append(str(nx));
    strlist.append(' ');
    strlist.append(str(ny));
    strlist.append(' ');
    strlist.append(str(nz));
    strlist.append('\n');
    ofile.write(''.join(strlist));

    # save UV
    strlist = []
    strlist.append('vt ');
    strlist.append(str(tu));
    strlist.append(' ');
    strlist.append(str(tv));
    strlist.append('\n');
    ofile.write(''.join(strlist));

    # extra newline
    ofile.write('\n');

#
# close files
#
ifile.close()
ofile.close()
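
As a side note, all the `strlist.append` calls can be collapsed with string formatting. A sketch of the same three OBJ lines per vertex; `obj_vertex_lines` is a hypothetical helper and the sample numbers are the first vertex from the output above.

```python
def obj_vertex_lines(position, normal, uv):
    """Format one vertex as OBJ 'v', 'vn' and 'vt' lines."""
    return (
        'v {} {} {}\n'.format(*position) +
        'vn {} {} {}\n'.format(*normal) +
        'vt {} {}\n'.format(*uv)
    )

text = obj_vertex_lines(
    (-22.34375, 288.5, 57.9375),
    (0.139892578125, 0.60986328125, 0.77978515625),
    (0.44140625, 0.1505126953125),
)
print(text, end='')
```

In the read loop this would replace the three `strlist` sections with one `ofile.write(obj_vertex_lines((vx, vy, vz), (nx, ny, nz), (tu, tv)))`.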

And now what do we got? We got BOOBIES!

Onto the next step!

Step #4: Reading the Index Buffer

Go back up to the top of the file to where we read those two offsets. The first offset we read was the start of the vertex data, 0x2F3C. The next offset, 0x02A03C, was the end of vertex data and the beginning of the index buffer data. The next offset after that (see image below), 0x030A38, is the end of the index buffer data and the beginning of the texture data.

Since each index is only two bytes, an unsigned short, we can compute the number of indices by subtracting the last two offsets and dividing by 2. Replace the section of code that we had above with this one, which adds reading the texture offset.

Code: Select all

#
# read vertex buffer and index buffer offsets
#
ifile.seek(0x14);
vbuffer_offset = struct.unpack('<I', ifile.read(4))[0];
ibuffer_offset = struct.unpack('<I', ifile.read(4))[0];
tbuffer_offset = struct.unpack('<I', ifile.read(4))[0]; # texture data offset
print('vbuffer_offset = ', vbuffer_offset);
print('ibuffer_offset = ', ibuffer_offset);
print('tbuffer_offset = ', tbuffer_offset); # texture data offset

Then let's compute size of the index buffer and the number of indices to read.

Code: Select all

#
# calculate index buffer properties
#
ibuffer_size = tbuffer_offset - ibuffer_offset;
indices = int(ibuffer_size/0x2);
print('size of index buffer in bytes = ', ibuffer_size);
print('number of indices = ', indices);
print('');

You will get the following output now:

Code: Select all

C:\Python32>python tekken.py
vbuffer_offset =  12092
ibuffer_offset =  172092
tbuffer_offset =  199224

size of vertex buffer in bytes =  160000
number of vertices =  5000

size of index buffer in bytes =  27132
number of indices =  13566

Since the number of indices is divisible by 3, 13566/3 = 4522, I'm guessing that this index buffer uses triangle lists. Also, looking at the index buffer data in the hex editor, there are no 0xFFFFs or anything like that which usually indicate triangle strips with strip-cut indices. So I don't think this index buffer uses triangle strips. So let's try triangles first. In our python code, after you have read all the vertices, we are automatically at the beginning of the index data since the data is packed. All we have to do is loop through the number of triangles, which is (number of indices/3), reading three points at a time and saving them to the OBJ file. Here's how to do this:

Code: Select all

import struct
import math
import array

def read_half_float(ifile):

    # read unsigned short
    value = struct.unpack('H', ifile.read(2))[0]; # read unsigned short

    # parse unsigned short into floating-point components
    temp = array.array('H', [0, 0, 0])
    temp[0] = (value & 0x8000);       # sign
    temp[1] = (value & 0x7C00) >> 10; # exponent
    temp[2] = (value & 0x03FF);       # mantissa

    # set sign constant
    sgn = 0.0;
    if temp[0] == 0:
       sgn = +1.0
    else:
       sgn = -1.0

    # if exponent is 0
    if temp[1] == 0:
       if temp[2] == 0:
          return 0.0
       else:
          return sgn*(math.pow(2.0, -14.0)*(temp[2]/1024.0));

    # if exponent is less than 31 (normal number)
    if temp[1] < 31:
       return sgn*math.pow(2.0, temp[1] - 15.0)*(1.0 + (temp[2]/1024.0));

    # exponent is 31 with zero mantissa: signed infinity
    if temp[2] == 0:
       return sgn*float('inf');

    # exponent is 31 with nonzero mantissa: not a number
    return float('nan');

#
# open input file 
#
ifile = open('nina1u.chr', 'rb');

#
# open output file
#
ofile = open('nina1u.obj', 'wt');

#
# save OBJ header
#
ofile.write('o nina1u.obj\n');
ofile.write('\n');

#
# read vertex buffer and index buffer offsets
#
ifile.seek(0x14);
vbuffer_offset = struct.unpack('<I', ifile.read(4))[0];
ibuffer_offset = struct.unpack('<I', ifile.read(4))[0];
tbuffer_offset = struct.unpack('<I', ifile.read(4))[0]; # texture data offset
print('vbuffer_offset = ', vbuffer_offset);
print('ibuffer_offset = ', ibuffer_offset);
print('tbuffer_offset = ', tbuffer_offset); # texture data offset
print('');

#
# calculate vertex buffer properties
#
vbuffer_size = ibuffer_offset - vbuffer_offset;
vertices = int(vbuffer_size/0x20);
print('size of vertex buffer in bytes = ', vbuffer_size);
print('number of vertices = ', vertices);
print('');

#
# calculate index buffer properties
#
ibuffer_size = tbuffer_offset - ibuffer_offset;
indices = int(ibuffer_size/0x2);
triangles = int(indices/3);
print('size of index buffer in bytes = ', ibuffer_size);
print('number of indices = ', indices);
print('number of triangles = ', triangles);
print('');

#
# move file pointer to vertex buffer
#
ifile.seek(vbuffer_offset);

#
# try reading vertices in groups of 0x20 (32) bytes
#
for i in range(vertices):
    # read half floats
    vx = read_half_float(ifile); #  2 bytes total
    vy = read_half_float(ifile); #  4 bytes total
    vz = read_half_float(ifile); #  6 bytes total
    u1 = read_half_float(ifile); #  8 bytes total
    nx = read_half_float(ifile); # 10 bytes total
    ny = read_half_float(ifile); # 12 bytes total
    nz = read_half_float(ifile); # 14 bytes total
    u2 = read_half_float(ifile); # 16 bytes total
    tu = read_half_float(ifile); # 18 bytes total
    tv = read_half_float(ifile); # 20 bytes total
    u3 = read_half_float(ifile); # 22 bytes total
    u4 = read_half_float(ifile); # 24 bytes total
    u5 = read_half_float(ifile); # 26 bytes total
    u6 = read_half_float(ifile); # 28 bytes total
    u7 = read_half_float(ifile); # 30 bytes total
    u8 = read_half_float(ifile); # 32 bytes total

    # save vertex
    strlist = []
    strlist.append('v ');
    strlist.append(str(vx));
    strlist.append(' ');
    strlist.append(str(vy));
    strlist.append(' ');
    strlist.append(str(vz));
    strlist.append('\n');
    ofile.write(''.join(strlist));

    # save normal
    strlist = []
    strlist.append('vn ');
    strlist.append(str(nx));
    strlist.append(' ');
    strlist.append(str(ny));
    strlist.append(' ');
    strlist.append(str(nz));
    strlist.append('\n');
    ofile.write(''.join(strlist));

    # save UV
    strlist = []
    strlist.append('vt ');
    strlist.append(str(tu));
    strlist.append(' ');
    strlist.append(str(tv));
    strlist.append('\n');
    ofile.write(''.join(strlist));

    # extra newline
    ofile.write('\n');

#
# read triangles (2 bytes per index)
#
for i in range(triangles):
    # read three points of a triangle
    a = struct.unpack('<H', ifile.read(2))[0];
    b = struct.unpack('<H', ifile.read(2))[0];
    c = struct.unpack('<H', ifile.read(2))[0];

    # OBJ uses 1-based indices
    a = a + 1;
    b = b + 1;
    c = c + 1;

    # save points to OBJ file
    strlist = []
    strlist.append('f ');
    strlist.append(str(a));
    strlist.append('/');
    strlist.append(str(a));
    strlist.append('/');
    strlist.append(str(a));
    strlist.append(' ');
    strlist.append(str(b));
    strlist.append('/');
    strlist.append(str(b));
    strlist.append('/');
    strlist.append(str(b));
    strlist.append(' ');
    strlist.append(str(c));
    strlist.append('/');
    strlist.append(str(c));
    strlist.append('/');
    strlist.append(str(c));
    strlist.append(' ');
    strlist.append('\n');
    ofile.write(''.join(strlist));    

#
# close files
#
ifile.close()
ofile.close()
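
The strip-cut check described earlier (no 0xFFFF markers in the index buffer) can also be done in code rather than by eye. A rough sketch assuming little-endian 16-bit indices; `index_stats` is a made-up helper, and the out-of-range count is an extra sanity check of mine, not something from the steps above.

```python
import struct

def index_stats(raw, vertex_count, cut=0xFFFF):
    """Summarise a buffer of little-endian 16-bit indices."""
    count = len(raw) // 2
    indices = struct.unpack_from('<%dH' % count, raw, 0)
    return {
        'count': count,
        'divisible_by_3': count % 3 == 0,
        # strip-cut markers usually mean triangle strips with restarts
        'cut_markers': sum(1 for i in indices if i == cut),
        # indices pointing past the vertex buffer mean something is misread
        'out_of_range': sum(1 for i in indices if i != cut and i >= vertex_count),
    }

# Synthetic index data for illustration
sample = struct.pack('<6H', 0, 1, 2, 2, 1, 3)
print(index_stats(sample, vertex_count=4))
```

Keep in mind that none of these numbers prove anything: a count divisible by 3 with no cut markers is compatible with both triangle lists and a single long strip.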

Run the python code and open the OBJ file. What do you see? LOL this!

The reason why it's messed up is because I guessed wrong about the triangle lists. When you see only about half the expected polygons, or a bunch of holes in the model, it probably uses triangle strips instead. So instead of the number of triangles being the number of indices divided by three, the number of triangles is going to be the number of indices minus two. The algorithm for reading triangle strips is also a little different. Here is the code that tries triangle strips instead:
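
Boiled down to a function, the strip rule looks something like the sketch below: every new index forms a triangle with the previous two, and every other triangle is flipped so the winding stays consistent. The name `strip_to_triangles` is hypothetical, and skipping degenerate triangles is my own addition, not something the full listing does.

```python
def strip_to_triangles(indices):
    """Unroll one triangle strip into a list of (a, b, c) index triples."""
    triangles = []
    for i in range(len(indices) - 2):
        a, b, c = indices[i], indices[i + 1], indices[i + 2]
        if a == b or b == c or a == c:
            continue                      # degenerate triangle, often used to stitch strips
        if i % 2 == 0:
            triangles.append((a, b, c))
        else:
            triangles.append((a, c, b))   # flip to keep the winding consistent
    return triangles

print(strip_to_triangles([0, 1, 2, 3, 4]))
# [(0, 1, 2), (1, 3, 2), (2, 3, 4)]
```

With no degenerates, N indices give N - 2 triangles, which matches the count used in the script.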

Code: Select all

import struct
import math
import array

def read_half_float(ifile):

    # read unsigned short
    value = struct.unpack('H', ifile.read(2))[0]; # read unsigned short

    # parse unsigned short into floating-point components
    temp = array.array('H', [0, 0, 0])
    temp[0] = (value & 0x8000);       # sign
    temp[1] = (value & 0x7C00) >> 10; # exponent
    temp[2] = (value & 0x03FF);       # mantissa

    # set sign constant
    sgn = 0.0;
    if temp[0] == 0:
       sgn = +1.0
    else:
       sgn = -1.0

    # if exponent is 0
    if temp[1] == 0:
       if temp[2] == 0:
          return 0.0
       else:
          return sgn*(math.pow(2.0, -14.0)*(temp[2]/1024.0));

    # if exponent is less than 31 (normal number)
    if temp[1] < 31:
       return sgn*math.pow(2.0, temp[1] - 15.0)*(1.0 + (temp[2]/1024.0));

    # exponent is 31 with zero mantissa: signed infinity
    if temp[2] == 0:
       return sgn*float('inf');

    # exponent is 31 with nonzero mantissa: not a number
    return float('nan');

#
# open input file 
#
ifile = open('nina1u.chr', 'rb');

#
# open output file
#
ofile = open('nina1u.obj', 'wt');

#
# save OBJ header
#
ofile.write('o nina1u.obj\n');
ofile.write('\n');

#
# read vertex buffer and index buffer offsets
#
ifile.seek(0x14);
vbuffer_offset = struct.unpack('<I', ifile.read(4))[0];
ibuffer_offset = struct.unpack('<I', ifile.read(4))[0];
tbuffer_offset = struct.unpack('<I', ifile.read(4))[0]; # texture data offset
print('vbuffer_offset = ', vbuffer_offset);
print('ibuffer_offset = ', ibuffer_offset);
print('tbuffer_offset = ', tbuffer_offset); # texture data offset
print('');

#
# calculate vertex buffer properties
#
vbuffer_size = ibuffer_offset - vbuffer_offset;
vertices = int(vbuffer_size/0x20);
print('size of vertex buffer in bytes = ', vbuffer_size);
print('number of vertices = ', vertices);
print('');

#
# calculate index buffer properties
#
ibuffer_size = tbuffer_offset - ibuffer_offset;
indices = int(ibuffer_size/0x2);
print('size of index buffer in bytes = ', ibuffer_size);
print('number of indices = ', indices);
print('');

#
# calculate number of triangles
#
triangles = indices - 2;
print('number of triangles = ', triangles);
print('');

#
# move file pointer to vertex buffer
#
ifile.seek(vbuffer_offset);

#
# try reading vertices in groups of 0x20 (32) bytes
#
for i in range(vertices):
    # read half floats
    vx = read_half_float(ifile); #  2 bytes total
    vy = read_half_float(ifile); #  4 bytes total
    vz = read_half_float(ifile); #  6 bytes total
    u1 = read_half_float(ifile); #  8 bytes total
    nx = read_half_float(ifile); # 10 bytes total
    ny = read_half_float(ifile); # 12 bytes total
    nz = read_half_float(ifile); # 14 bytes total
    u2 = read_half_float(ifile); # 16 bytes total
    tu = read_half_float(ifile); # 18 bytes total
    tv = read_half_float(ifile); # 20 bytes total
    u3 = read_half_float(ifile); # 22 bytes total
    u4 = read_half_float(ifile); # 24 bytes total
    u5 = read_half_float(ifile); # 26 bytes total
    u6 = read_half_float(ifile); # 28 bytes total
    u7 = read_half_float(ifile); # 30 bytes total
    u8 = read_half_float(ifile); # 32 bytes total

    # save vertex
    strlist = []
    strlist.append('v ');
    strlist.append(str(vx));
    strlist.append(' ');
    strlist.append(str(vy));
    strlist.append(' ');
    strlist.append(str(vz));
    strlist.append('\n');
    ofile.write(''.join(strlist));

    # save normal
    strlist = []
    strlist.append('vn ');
    strlist.append(str(nx));
    strlist.append(' ');
    strlist.append(str(ny));
    strlist.append(' ');
    strlist.append(str(nz));
    strlist.append('\n');
    ofile.write(''.join(strlist));

    # save UV
    strlist = []
    strlist.append('vt ');
    strlist.append(str(tu));
    strlist.append(' ');
    strlist.append(str(tv));
    strlist.append('\n');
    ofile.write(''.join(strlist));

    # extra newline
    ofile.write('\n');

#
# read first triangle in triangle strip
#
a = struct.unpack('<H', ifile.read(2))[0] + 1;
b = struct.unpack('<H', ifile.read(2))[0] + 1;
c = struct.unpack('<H', ifile.read(2))[0] + 1;

# save points to OBJ file
strlist = []
strlist.append('f ');
strlist.append(str(a));
strlist.append('/');
strlist.append(str(a));
strlist.append('/');
strlist.append(str(a));
strlist.append(' ');
strlist.append(str(b));
strlist.append('/');
strlist.append(str(b));
strlist.append('/');
strlist.append(str(b));
strlist.append(' ');
strlist.append(str(c));
strlist.append('/');
strlist.append(str(c));
strlist.append('/');
strlist.append(str(c));
strlist.append(' ');
strlist.append('\n');
ofile.write(''.join(strlist));

#
# read triangles (2 bytes per index)
#
for i in range(1, triangles):
    # read one more point
    a = b;
    b = c;
    c = struct.unpack('<H', ifile.read(2))[0] + 1;

    # save points to OBJ file
    if i % 2 == 0:
       strlist = []
       strlist.append('f ');
       strlist.append(str(a));
       strlist.append('/');
       strlist.append(str(a));
       strlist.append('/');
       strlist.append(str(a));
       strlist.append(' ');
       strlist.append(str(b));
       strlist.append('/');
       strlist.append(str(b));
       strlist.append('/');
       strlist.append(str(b));
       strlist.append(' ');
       strlist.append(str(c));
       strlist.append('/');
       strlist.append(str(c));
       strlist.append('/');
       strlist.append(str(c));
       strlist.append(' ');
       strlist.append('\n');
       ofile.write(''.join(strlist));
    else:
       strlist = []
       strlist.append('f ');
       strlist.append(str(a));
       strlist.append('/');
       strlist.append(str(a));
       strlist.append('/');
       strlist.append(str(a));
       strlist.append(' ');
       strlist.append(str(c));
       strlist.append('/');
       strlist.append(str(c));
       strlist.append('/');
       strlist.append(str(c));
       strlist.append(' ');
       strlist.append(str(b));
       strlist.append('/');
       strlist.append(str(b));
       strlist.append('/');
       strlist.append(str(b));
       strlist.append(' ');
       strlist.append('\n');
       ofile.write(''.join(strlist));
#
# close files
#
ifile.close()
ofile.close()

And of course, even though it looks better, it still doesn't look right! This is most likely because the model is made up of multiple triangle strips rather than a single one. The next step is to figure out how the model is divided up into surfaces. That information is hidden somewhere in the data above the vertex buffer, and now we have to go hunt for it.
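
To see why treating everything as one long strip gives garbage, here is a rough sketch of the same strip-to-triangle logic pulled out into a helper. The name strip_to_triangles and the index values are made up for illustration; nothing here comes from the file itself. When two separate strips are fed in as one, you get extra triangles bridging the two.

```python
def strip_to_triangles(strip):
    """Convert one triangle strip (a list of indices) into a list of triangles.

    Every other triangle is flipped so the winding stays consistent,
    which is what the a/b/c versus a/c/b branches in the script above do.
    """
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        if i % 2 == 0:
            triangles.append((a, b, c))
        else:
            triangles.append((a, c, b))
    return triangles

# two made-up strips of 4 indices each (2 triangles per strip)
strip1 = [0, 1, 2, 3]
strip2 = [10, 11, 12, 13]

separate = strip_to_triangles(strip1) + strip_to_triangles(strip2)
joined = strip_to_triangles(strip1 + strip2)

# triangles that only exist because the two strips were glued together
bogus = [t for t in joined if t not in separate]

print(len(separate))  # 4
print(len(joined))    # 6
print(bogus)          # [(2, 3, 10), (3, 11, 10)]
```

Those two bogus triangles connect vertices that were never meant to be connected, which is the kind of stray geometry you see when strip boundaries are ignored.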

Step #5: Dividing the Model Into Surfaces

So now we have to figure out how the model is divided into surfaces. But before we do this, I'm going to write a little Python code in another file to ease the pain of all those silly struct.unpack calls. Create a file called xentax.py and add the following code to it.

Code: Select all

import struct
import math
import array

#
# READING LITTLE ENDIAN
#

def LE_read_sint08(ifile):
    return struct.unpack('<b', ifile.read(1))[0];
	
def LE_read_uint08(ifile):
    return struct.unpack('<B', ifile.read(1))[0];
	
def LE_read_sint16(ifile):
    return struct.unpack('<h', ifile.read(2))[0];
	
def LE_read_uint16(ifile):
    return struct.unpack('<H', ifile.read(2))[0];

def LE_read_sint32(ifile):
    return struct.unpack('<i', ifile.read(4))[0];
	
def LE_read_uint32(ifile):
    return struct.unpack('<I', ifile.read(4))[0];
	
def LE_read_sint64(ifile):
    return struct.unpack('<q', ifile.read(8))[0];
	
def LE_read_uint64(ifile):
    return struct.unpack('<Q', ifile.read(8))[0];

def LE_read_float16(ifile):

    # read unsigned short
    value = struct.unpack('<H', ifile.read(2))[0]; # read unsigned short

    # parse unsigned short into floating-point components
    temp = array.array('H', [0, 0, 0])
    temp[0] = (value & 0x8000);       # sign
    temp[1] = (value & 0x7C00) >> 10; # exponent
    temp[2] = (value & 0x03FF);       # mantissa

    # set sign constant
    sgn = 0.0;
    if temp[0] == 0:
       sgn = +1.0
    else:
       sgn = -1.0

    # if exponent is 0
    if temp[1] == 0:
       if temp[2] == 0:
          return 0.0
       else:
          return sgn*(math.pow(2.0, -14.0)*(temp[2]/1024.0));

    # if exponent is less than 31 (normal number)
    if temp[1] < 31:
       return sgn*math.pow(2.0, temp[1] - 15.0)*(1.0 + (temp[2]/1024.0));

    # exponent is 31 and mantissa is 0: infinity
    if temp[2] == 0:
       return sgn*float('inf');

    # exponent is 31 and mantissa is nonzero: not a number
    return float('nan');

def LE_read_float32(ifile):
    return struct.unpack('<f', ifile.read(4))[0];
	
def LE_read_float64(ifile):
    return struct.unpack('<d', ifile.read(8))[0];
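
If you want to convince yourself that a half-float reader is right, one option is to compare it against Python's own struct format code 'e', which decodes IEEE 754 half floats. This is only a sketch under a couple of assumptions: 'e' needs Python 3.6 or newer (so not the Python 3.2 used in this thread), and decode_half is a hypothetical stand-alone version of the same bit-twiddling, not the function from xentax.py.

```python
import math
import struct

def decode_half(value):
    """Decode a 16-bit integer as an IEEE 754 half float."""
    sign = -1.0 if (value & 0x8000) else 1.0
    exponent = (value & 0x7C00) >> 10
    mantissa = value & 0x03FF
    if exponent == 0:
        # zero or subnormal number
        return sign * math.pow(2.0, -14.0) * (mantissa / 1024.0)
    if exponent == 31:
        # infinity (mantissa 0) or NaN (mantissa nonzero)
        return sign * float('inf') if mantissa == 0 else float('nan')
    # normal number
    return sign * math.pow(2.0, exponent - 15.0) * (1.0 + mantissa / 1024.0)

# compare against struct's 'e' format for every finite 16-bit pattern
mismatches = 0
for value in range(0x10000):
    expected = struct.unpack('<e', struct.pack('<H', value))[0]
    if math.isnan(expected) or math.isinf(expected):
        continue
    if decode_half(value) != expected:
        mismatches += 1

print('mismatches =', mismatches)
```

Infinities and NaNs are skipped in the loop because NaN never compares equal to itself; they are easy to check by hand (0x7C00 should come out as +inf).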

Now, going back to tekken.py, let's forget about the triangle code for a bit and start from scratch. Open Nina's file in your python code and move the file pointer to 0x40. This appears to be where the data actually starts (the 0xFFFFFFFF appears to be a marker to let you know the header is finished). Look at the following highlighted 0x1C bytes. There is also a similar pattern starting at offset 0x104. Let's read the data.

Code: Select all

import xentax

#
# open input file 
#
ifile = open('nina1u.chr', 'rb');

#
# move to 0x40 and read some data
#
ifile.seek(0x40);
param01 = xentax.LE_read_uint32(ifile);
param02 = xentax.LE_read_uint32(ifile);
param03 = xentax.LE_read_uint32(ifile);
param04 = xentax.LE_read_uint32(ifile);
param05 = xentax.LE_read_uint32(ifile);
param06 = xentax.LE_read_uint32(ifile);
param07 = xentax.LE_read_uint32(ifile);
print(param01);
print(param02);
print(param03);
print(param04);
print(param05);
print(param06);
print(param07);

#
# close file
#
ifile.close()

Code: Select all

C:\Python32>python tekken.py
0
0
67305985
555753246
908403490
976828471
1715354683

The last five values are definitely not 32-bit unsigned integers. After a little experimenting, you'll see that once again Tekken uses half-floats, and that the data is structured as follows:

Code: Select all

import xentax

#
# open input file 
#
ifile = open('nina1u.chr', 'rb');

#
# move to 0x40 and read some data
#
ifile.seek(0x40);
param01 = xentax.LE_read_uint32(ifile);
param02 = xentax.LE_read_uint32(ifile);
param03 = xentax.LE_read_uint16(ifile);
param04 = xentax.LE_read_uint16(ifile);
param05 = xentax.LE_read_float16(ifile);
param06 = xentax.LE_read_float16(ifile);
param07 = xentax.LE_read_float16(ifile);
param08 = xentax.LE_read_float16(ifile);
param09 = xentax.LE_read_float16(ifile);
param10 = xentax.LE_read_float16(ifile);
param11 = xentax.LE_read_float16(ifile);
param12 = xentax.LE_read_float16(ifile);
print(param01);
print(param02);
print(param03);
print(param04);
print(param05);
print(param06);
print(param07);
print(param08);
print(param09);
print(param10);
print(param11);
print(param12);

#
# close file
#
ifile.close()

Code: Select all

C:\Python32>python tekken.py
0
0
513
1027
0.00695037841796875
0.010009765625
0.0139312744140625
0.384033203125
0.52685546875
0.77783203125
1.0576171875
1598.0

So what is this data? Probably some kind of bounding box + miscellaneous crap that's not too important. For now let's ignore it and move on to the next piece of data. From where we left off at offset 0x5C, highlight the next 0x38 bytes until the next 0x01. This is some kind of pattern. How do I know? Well, look at the 0x44 in the highlighted bytes. Select from that 0x44 to the next 0x44 that you see, then from that one to the next. You'll see that each 0x44 is 0x38 bytes away from the next one. Let's try reading these 0x38 bytes as unsigned 32-bit integers first.

Code: Select all

#
# read next 0x38 bytes of data
#
d01 = xentax.LE_read_uint32(ifile); # 0x04 bytes
d02 = xentax.LE_read_uint32(ifile); # 0x08 bytes
d03 = xentax.LE_read_uint32(ifile); # 0x0C bytes
d04 = xentax.LE_read_uint32(ifile); # 0x10 bytes
d05 = xentax.LE_read_uint32(ifile); # 0x14 bytes
d06 = xentax.LE_read_uint32(ifile); # 0x18 bytes
d07 = xentax.LE_read_uint32(ifile); # 0x1C bytes
d08 = xentax.LE_read_uint32(ifile); # 0x20 bytes
d09 = xentax.LE_read_uint32(ifile); # 0x24 bytes
d10 = xentax.LE_read_uint32(ifile); # 0x28 bytes
d11 = xentax.LE_read_uint32(ifile); # 0x2C bytes
d12 = xentax.LE_read_uint32(ifile); # 0x30 bytes
d13 = xentax.LE_read_uint32(ifile); # 0x34 bytes
d14 = xentax.LE_read_uint32(ifile); # 0x38 bytes
print(d01);
print(d02);
print(d03);
print(d04);
print(d05);
print(d06);
print(d07);
print(d08);
print(d09);
print(d10);
print(d11);
print(d12);
print(d13);
print(d14);
print('');

Code: Select all

I highly doubt that 1784414276 is a uint32. Let's separate it into two uint16 values instead.
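
A quick way to test that hunch, assuming little-endian data like everywhere else in this file: the low 16 bits of the uint32 are the first uint16 in the file and the high 16 bits are the second. This is just a throwaway sketch using that one value.

```python
import struct

value = 1784414276

# repack the uint32 as its 4 little-endian bytes, then reread it as two uint16s
raw = struct.pack('<I', value)
lo, hi = struct.unpack('<HH', raw)

print(hex(value))  # 0x6a5c0044
print(lo, hi)      # 68 27228
```

68 is 0x44, the same value used to spot the repeating pattern, which is a good sign that the 16-bit split is the right one.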

Code: Select all

#
# read next 0x38 bytes of data
#
d01 = xentax.LE_read_uint32(ifile); # 0x04 bytes
d02 = xentax.LE_read_uint32(ifile); # 0x08 bytes
d03 = xentax.LE_read_uint32(ifile); # 0x0C bytes
d04 = xentax.LE_read_uint16(ifile); # 0x0E bytes
d05 = xentax.LE_read_uint16(ifile); # 0x10 bytes
d06 = xentax.LE_read_uint32(ifile); # 0x14 bytes
d07 = xentax.LE_read_uint32(ifile); # 0x18 bytes
d08 = xentax.LE_read_uint32(ifile); # 0x1C bytes
d09 = xentax.LE_read_uint32(ifile); # 0x20 bytes
d10 = xentax.LE_read_uint32(ifile); # 0x24 bytes
d11 = xentax.LE_read_uint32(ifile); # 0x28 bytes
d12 = xentax.LE_read_uint32(ifile); # 0x2C bytes
d13 = xentax.LE_read_uint32(ifile); # 0x30 bytes
d14 = xentax.LE_read_uint32(ifile); # 0x34 bytes
d15 = xentax.LE_read_uint32(ifile); # 0x38 bytes 
print(d01);
print(d02);
print(d03);
print(d04);
print(d05);
print(d06);
print(d07);
print(d08);
print(d09);
print(d10);
print(d11);
print(d12);
print(d13);
print(d14);
print(d15);
print('');

Code: Select all

Now it's looking good. Now go back to the file and read the next 0x38 bytes. In fact, do it in a loop three times. If you look at the data after the third time, the next piece of data is another one of those bounding box things that we read.

Code: Select all

for i in range(0, 3):
    d01 = xentax.LE_read_uint32(ifile); # 0x04 bytes
    d02 = xentax.LE_read_uint32(ifile); # 0x08 bytes
    d03 = xentax.LE_read_uint32(ifile); # 0x0C bytes
    d04 = xentax.LE_read_uint16(ifile); # 0x0E bytes
    d05 = xentax.LE_read_uint16(ifile); # 0x10 bytes
    d06 = xentax.LE_read_uint32(ifile); # 0x14 bytes
    d07 = xentax.LE_read_uint32(ifile); # 0x18 bytes
    d08 = xentax.LE_read_uint32(ifile); # 0x1C bytes
    d09 = xentax.LE_read_uint32(ifile); # 0x20 bytes
    d10 = xentax.LE_read_uint32(ifile); # 0x24 bytes
    d11 = xentax.LE_read_uint32(ifile); # 0x28 bytes
    d12 = xentax.LE_read_uint32(ifile); # 0x2C bytes
    d13 = xentax.LE_read_uint32(ifile); # 0x30 bytes
    d14 = xentax.LE_read_uint32(ifile); # 0x34 bytes
    d15 = xentax.LE_read_uint32(ifile); # 0x38 bytes 
    print(d01);
    print(d02);
    print(d03);
    print(d04);
    print(d05);
    print(d06);
    print(d07);
    print(d08);
    print(d09);
    print(d10);
    print(d11);
    print(d12);
    print(d13);
    print(d14);
    print(d15);
    print('');

Code: Select all

Now notice something about the last two numbers of each set. (0 + 878) = 878. (878 + 1266) = 2144. (2144 + 878) = ??? You can probably guess where the next set is going to start. Because of this, my guess is that the last two values are the starting index and the number of indices to read for each surface. We won't know for sure until we read all the surface data, but let's check now.
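
Spelled out: if the last two numbers of an entry are (start, count), then start + count of one entry should equal the start of the next. A small sketch of that check, using only the three pairs from the sums quoted above (check_contiguous is just a helper name for illustration):

```python
def check_contiguous(entries):
    """Return True if each (start, count) pair ends where the next one begins."""
    for (start, count), (next_start, _) in zip(entries, entries[1:]):
        if start + count != next_start:
            return False
    return True

# (start, count) pairs from the first three surface entries
entries = [(0, 878), (878, 1266), (2144, 878)]

# where the fourth entry should start if the guess holds
predicted_next = entries[-1][0] + entries[-1][1]

print(check_contiguous(entries))  # True
print(predicted_next)             # 3022
```

So if the guess is right, the fourth entry should start at 3022.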

Remember when we loaded the bounding box? What was the first number? What was param01? It was 0. For these surface entries, what is the first number of the three entries we just loaded? 1. So the first uint32 looks like an entry type: 0 for a bounding box, 1 for a surface entry. Let's write a loop that reads the type first and then reads the matching entry.

Code: Select all

import xentax

#
# open input file 
#
ifile = open('nina1u.chr', 'rb');

#
# move to 0x40
#
ifile.seek(0x40);

#
# read bounding box and surface entries
#
while True:

    type = xentax.LE_read_uint32(ifile);

    if type == 0:
       print('BOUNDING BOX');
       param02 = xentax.LE_read_uint32(ifile);
       param03 = xentax.LE_read_uint16(ifile);
       param04 = xentax.LE_read_uint16(ifile);
       param05 = xentax.LE_read_float16(ifile);
       param06 = xentax.LE_read_float16(ifile);
       param07 = xentax.LE_read_float16(ifile);
       param08 = xentax.LE_read_float16(ifile);
       param09 = xentax.LE_read_float16(ifile);
       param10 = xentax.LE_read_float16(ifile);
       param11 = xentax.LE_read_float16(ifile);
       param12 = xentax.LE_read_float16(ifile);
       print(param02);
       print(param03);
       print(param04);
       print(param05);
       print(param06);
       print(param07);
       print(param08);
       print(param09);
       print(param10);
       print(param11);
       print(param12);
       print('');
    elif type == 1:
       print('SURFACE ENTRY');
       d02 = xentax.LE_read_uint32(ifile); # 0x08 bytes
       d03 = xentax.LE_read_uint32(ifile); # 0x0C bytes
       d04 = xentax.LE_read_uint16(ifile); # 0x0E bytes
       d05 = xentax.LE_read_uint16(ifile); # 0x10 bytes
       d06 = xentax.LE_read_uint32(ifile); # 0x14 bytes
       d07 = xentax.LE_read_uint32(ifile); # 0x18 bytes
       d08 = xentax.LE_read_uint32(ifile); # 0x1C bytes
       d09 = xentax.LE_read_uint32(ifile); # 0x20 bytes
       d10 = xentax.LE_read_uint32(ifile); # 0x24 bytes
       d11 = xentax.LE_read_uint32(ifile); # 0x28 bytes
       d12 = xentax.LE_read_uint32(ifile); # 0x2C bytes
       d13 = xentax.LE_read_uint32(ifile); # 0x30 bytes
       d14 = xentax.LE_read_uint32(ifile); # 0x34 bytes
       d15 = xentax.LE_read_uint32(ifile); # 0x38 bytes 
       print(d02);
       print(d03);
       print(d04);
       print(d05);
       print(d06);
       print(d07);
       print(d08);
       print(d09);
       print(d10);
       print(d11);
       print(d12);
       print(d13);
       print(d14);
       print(d15);
       print('');
    else:
       print('UNKNOWN ENTRY TYPE:');
       print(type);
       print(ifile.tell());
       break;

#
# close file
#
ifile.close()

Oh hell yeah! The data is too long to display here but it works and works well. Very consistent data. Only problem is at the end it terminates with:

Code: Select all

UNKNOWN ENTRY TYPE:
2
2588

Let's go to offset 2588 (0xA1C) and see what's going on. Since the type had already been read at that point, let's actually move to offset 0xA18 instead. As you can see from the picture below, if the first value we read is 2, then we only need to read 0x10 more bytes per entry. Wonder what is in store for 2? Let's add it to the code. We'll treat all 0x10 bytes as four uint32 values.

Code: Select all

import xentax

#
# open input file 
#
ifile = open('nina1u.chr', 'rb');

#
# move to 0x40
#
ifile.seek(0x40);

#
# read bounding box and surface entries
#
while True:

    type = xentax.LE_read_uint32(ifile);

    if type == 0:
       print('BOUNDING BOX');
       param02 = xentax.LE_read_uint32(ifile);
       param03 = xentax.LE_read_uint16(ifile);
       param04 = xentax.LE_read_uint16(ifile);
       param05 = xentax.LE_read_float16(ifile);
       param06 = xentax.LE_read_float16(ifile);
       param07 = xentax.LE_read_float16(ifile);
       param08 = xentax.LE_read_float16(ifile);
       param09 = xentax.LE_read_float16(ifile);
       param10 = xentax.LE_read_float16(ifile);
       param11 = xentax.LE_read_float16(ifile);
       param12 = xentax.LE_read_float16(ifile);
       print(param02);
       print(param03);
       print(param04);
       print(param05);
       print(param06);
       print(param07);
       print(param08);
       print(param09);
       print(param10);
       print(param11);
       print(param12);
       print('');
    elif type == 1:
       print('SURFACE ENTRY');
       d02 = xentax.LE_read_uint32(ifile); # 0x08 bytes
       d03 = xentax.LE_read_uint32(ifile); # 0x0C bytes
       d04 = xentax.LE_read_uint16(ifile); # 0x0E bytes
       d05 = xentax.LE_read_uint16(ifile); # 0x10 bytes
       d06 = xentax.LE_read_uint32(ifile); # 0x14 bytes
       d07 = xentax.LE_read_uint32(ifile); # 0x18 bytes
       d08 = xentax.LE_read_uint32(ifile); # 0x1C bytes
       d09 = xentax.LE_read_uint32(ifile); # 0x20 bytes
       d10 = xentax.LE_read_uint32(ifile); # 0x24 bytes
       d11 = xentax.LE_read_uint32(ifile); # 0x28 bytes
       d12 = xentax.LE_read_uint32(ifile); # 0x2C bytes
       d13 = xentax.LE_read_uint32(ifile); # 0x30 bytes
       d14 = xentax.LE_read_uint32(ifile); # 0x34 bytes
       d15 = xentax.LE_read_uint32(ifile); # 0x38 bytes 
       print(d02);
       print(d03);
       print(d04);
       print(d05);
       print(d06);
       print(d07);
       print(d08);
       print(d09);
       print(d10);
       print(d11);
       print(d12);
       print(d13);
       print(d14);
       print(d15);
       print('');
    elif type == 2:
       print('WHAT IS THIS?');
       u02 = xentax.LE_read_uint32(ifile);
       u03 = xentax.LE_read_uint32(ifile);
       u04 = xentax.LE_read_uint32(ifile);
       u05 = xentax.LE_read_uint32(ifile);
       print(u02);
       print(u03);
       print(u04);
       print(u05);
       print('');
    else:
       print('UNKNOWN ENTRY TYPE:');
       print(type);
       print(ifile.tell());
       break;

#
# close file
#
ifile.close()

Code: Select all

SURFACE ENTRY
0
0
68
0
0
0
0
0
0
2
19
0
10966
26

WHAT IS THIS?
20
65536
10992
33

WHAT IS THIS?
19
0
11025
5

WHAT IS THIS?
20
65536
11030
41

WHAT IS THIS?
19
0
11071
36

Excellent! Appears to work. Once again, too much data to display, but it appears that this is a surface entry as well: the last two numbers of each entry add up to the second-to-last number of the next entry. This might be a special surface or something... we won't know until we load the model later. Notice that the code above still ends in an error.

Code: Select all

UNKNOWN ENTRY TYPE:
4294967295
4884

Let's go to offset 4884 (0x1314). The type we just read sits in the 4 bytes before it, at 0x1310. See the 0xFFFFFFFF? That means we hit the end of the surface data and we are finished.
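
If you'd rather check this from Python than from a hex editor, a few uint32 values around an offset can be dumped like so. This is a sketch on made-up bytes: peek_uint32 is a hypothetical helper, and the sample data only imitates the layout described here (a 0xFFFFFFFF marker followed by a 0).

```python
import struct

def peek_uint32(data, offset, count):
    """Return `count` little-endian uint32 values starting at `offset`."""
    return list(struct.unpack_from('<%dI' % count, data, offset))

# fake data: one leftover uint32, the end marker, then the start of a type-0 entry
data = struct.pack('<4I', 26, 0xFFFFFFFF, 0, 0)

# look at the marker and the value right after it
values = peek_uint32(data, 4, 2)
for v in values:
    print(hex(v))
```

On the real file you would read the whole thing into data and pass the offset where the marker starts, which is 4 bytes before whatever ifile.tell() reported.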

However, notice that after the 0xFFFFFFFF it looks like there are more surfaces to read! I see more 0x44's around. Since the first uint32 after the 0xFFFFFFFF is a 0, the next entry is a bounding box. Let's ignore the 0xFFFFFFFF and continue processing surfaces. Here's the code:

Code: Select all

import xentax

#
# open input file 
#
ifile = open('nina1u.chr', 'rb');

#
# move to 0x40
#
ifile.seek(0x40);

#
# read bounding box and surface entries
#
while True:

    type = xentax.LE_read_uint32(ifile);

    if type == 0:
       print('BOUNDING BOX');
       param02 = xentax.LE_read_uint32(ifile);
       param03 = xentax.LE_read_uint16(ifile);
       param04 = xentax.LE_read_uint16(ifile);
       param05 = xentax.LE_read_float16(ifile);
       param06 = xentax.LE_read_float16(ifile);
       param07 = xentax.LE_read_float16(ifile);
       param08 = xentax.LE_read_float16(ifile);
       param09 = xentax.LE_read_float16(ifile);
       param10 = xentax.LE_read_float16(ifile);
       param11 = xentax.LE_read_float16(ifile);
       param12 = xentax.LE_read_float16(ifile);
       print(param02);
       print(param03);
       print(param04);
       print(param05);
       print(param06);
       print(param07);
       print(param08);
       print(param09);
       print(param10);
       print(param11);
       print(param12);
       print('');
    elif type == 1:
       print('SURFACE ENTRY TYPE 1');
       d02 = xentax.LE_read_uint32(ifile); # 0x08 bytes
       d03 = xentax.LE_read_uint32(ifile); # 0x0C bytes
       d04 = xentax.LE_read_uint16(ifile); # 0x0E bytes
       d05 = xentax.LE_read_uint16(ifile); # 0x10 bytes
       d06 = xentax.LE_read_uint32(ifile); # 0x14 bytes
       d07 = xentax.LE_read_uint32(ifile); # 0x18 bytes
       d08 = xentax.LE_read_uint32(ifile); # 0x1C bytes
       d09 = xentax.LE_read_uint32(ifile); # 0x20 bytes
       d10 = xentax.LE_read_uint32(ifile); # 0x24 bytes
       d11 = xentax.LE_read_uint32(ifile); # 0x28 bytes
       d12 = xentax.LE_read_uint32(ifile); # 0x2C bytes
       d13 = xentax.LE_read_uint32(ifile); # 0x30 bytes
       d14 = xentax.LE_read_uint32(ifile); # 0x34 bytes
       d15 = xentax.LE_read_uint32(ifile); # 0x38 bytes 
       print(d02);
       print(d03);
       print(d04);
       print(d05);
       print(d06);
       print(d07);
       print(d08);
       print(d09);
       print(d10);
       print(d11);
       print(d12);
       print(d13);
       print(d14);
       print(d15);
       print('');
    elif type == 2:
       print('SURFACE ENTRY TYPE 2');
       u02 = xentax.LE_read_uint32(ifile);
       u03 = xentax.LE_read_uint32(ifile);
       u04 = xentax.LE_read_uint32(ifile);
       u05 = xentax.LE_read_uint32(ifile);
       print(u02);
       print(u03);
       print(u04);
       print(u05);
       print('');
    elif type == 0xFFFFFFFF:
       print('SECTION ENDING 0XFFFFFFFF');
       print('');
    else:
       print('UNKNOWN ENTRY TYPE:');
       print(type);
       print(ifile.tell());
       break;

#
# close file
#
ifile.close()

Code: Select all

UNKNOWN ENTRY TYPE:
1065353215
5312

Of course, we still get an error when it comes to the end of the surface data, at offset 5312 (0x14C0). Can we detect the end of the surface entries? Yes. Look up at the top of the file at the uint32 stored at 0x20. Its value is the offset where the surface entries end. So when we read a 0xFFFFFFFF type, we can check whether the file pointer has reached that offset. If so, we can terminate the loop without worrying about errors. Once finished, let's also sum up the last number on each surface entry and see what they add up to. Here is the code:

Code: Select all

import xentax

#
# open input file 
#
ifile = open('nina1u.chr', 'rb');

#
# read header offsets
#
ifile.seek(0x10);
offset01 = xentax.LE_read_uint32(ifile);
offset02 = xentax.LE_read_uint32(ifile);
offset03 = xentax.LE_read_uint32(ifile);
offset04 = xentax.LE_read_uint32(ifile);
offset05 = xentax.LE_read_uint32(ifile); # offset to bone data
offset06 = xentax.LE_read_uint32(ifile);
offset07 = xentax.LE_read_uint32(ifile);
offset08 = xentax.LE_read_uint32(ifile);
offset09 = xentax.LE_read_uint32(ifile);
offset10 = xentax.LE_read_uint32(ifile);
offset11 = xentax.LE_read_uint32(ifile);

#
# move to 0x40
#
ifile.seek(0x40);

#
# sum of surface entries
#
surface_sum = 0;

#
# read bounding box and surface entries
#
while True:

    type = xentax.LE_read_uint32(ifile);

    if type == 0:
       print('BOUNDING BOX');
       param02 = xentax.LE_read_uint32(ifile);
       param03 = xentax.LE_read_uint16(ifile);
       param04 = xentax.LE_read_uint16(ifile);
       param05 = xentax.LE_read_float16(ifile);
       param06 = xentax.LE_read_float16(ifile);
       param07 = xentax.LE_read_float16(ifile);
       param08 = xentax.LE_read_float16(ifile);
       param09 = xentax.LE_read_float16(ifile);
       param10 = xentax.LE_read_float16(ifile);
       param11 = xentax.LE_read_float16(ifile);
       param12 = xentax.LE_read_float16(ifile);
       print(param02);
       print(param03);
       print(param04);
       print(param05);
       print(param06);
       print(param07);
       print(param08);
       print(param09);
       print(param10);
       print(param11);
       print(param12);
       print('');
    elif type == 1:
       print('SURFACE ENTRY TYPE 1');
       d02 = xentax.LE_read_uint32(ifile); # 0x08 bytes
       d03 = xentax.LE_read_uint32(ifile); # 0x0C bytes
       d04 = xentax.LE_read_uint16(ifile); # 0x0E bytes
       d05 = xentax.LE_read_uint16(ifile); # 0x10 bytes
       d06 = xentax.LE_read_uint32(ifile); # 0x14 bytes
       d07 = xentax.LE_read_uint32(ifile); # 0x18 bytes
       d08 = xentax.LE_read_uint32(ifile); # 0x1C bytes
       d09 = xentax.LE_read_uint32(ifile); # 0x20 bytes
       d10 = xentax.LE_read_uint32(ifile); # 0x24 bytes
       d11 = xentax.LE_read_uint32(ifile); # 0x28 bytes
       d12 = xentax.LE_read_uint32(ifile); # 0x2C bytes
       d13 = xentax.LE_read_uint32(ifile); # 0x30 bytes
       d14 = xentax.LE_read_uint32(ifile); # 0x34 bytes
       d15 = xentax.LE_read_uint32(ifile); # 0x38 bytes 
       print(d02);
       print(d03);
       print(d04);
       print(d05);
       print(d06);
       print(d07);
       print(d08);
       print(d09);
       print(d10);
       print(d11);
       print(d12);
       print(d13);
       print(d14);
       print(d15);
       surface_sum += d15;
       print('');
    elif type == 2:
       print('SURFACE ENTRY TYPE 2');
       u02 = xentax.LE_read_uint32(ifile);
       u03 = xentax.LE_read_uint32(ifile);
       u04 = xentax.LE_read_uint32(ifile);
       u05 = xentax.LE_read_uint32(ifile);
       print(u02);
       print(u03);
       print(u04);
       print(u05);
       surface_sum += u05;
       print('');
    elif type == 0xFFFFFFFF:
       print('SECTION ENDING 0XFFFFFFFF');
       print('');
       if ifile.tell() == offset05:
          break;
    else:
       print('UNKNOWN ENTRY TYPE:');
       print(type);
       print(ifile.tell());
       break;

print('The number of surface values summed up = ');
print(surface_sum);
	   
#
# close file
#
ifile.close()

Code: Select all

The number of surface values summed up =
13566

What do you know? 13,566 is exactly the number of indices that we have (see previous steps where we loaded the geometry). We now have our surfaces and we can insert them into an array. The following code loads all surface data and displays it.

Code: Select all

import xentax

class SurfaceInfo:
    def __init__(self, si, sl):
        self.startIndex = si;
        self.stripLength = sl;

#
# open input file 
#
ifile = open('nina1u.chr', 'rb');

#
# read header offsets
#
ifile.seek(0x10);
offset01 = xentax.LE_read_uint32(ifile);
offset02 = xentax.LE_read_uint32(ifile);
offset03 = xentax.LE_read_uint32(ifile);
offset04 = xentax.LE_read_uint32(ifile);
offset05 = xentax.LE_read_uint32(ifile); # offset to bone data
offset06 = xentax.LE_read_uint32(ifile);
offset07 = xentax.LE_read_uint32(ifile);
offset08 = xentax.LE_read_uint32(ifile);
offset09 = xentax.LE_read_uint32(ifile);
offset10 = xentax.LE_read_uint32(ifile);
offset11 = xentax.LE_read_uint32(ifile);

#
# READ SURFACE DATA
#
surface_array = [];
ifile.seek(0x40);
while True:
    type = xentax.LE_read_uint32(ifile);
    if type == 0:
       param02 = xentax.LE_read_uint32(ifile);
       param03 = xentax.LE_read_uint16(ifile);
       param04 = xentax.LE_read_uint16(ifile);
       param05 = xentax.LE_read_float16(ifile);
       param06 = xentax.LE_read_float16(ifile);
       param07 = xentax.LE_read_float16(ifile);
       param08 = xentax.LE_read_float16(ifile);
       param09 = xentax.LE_read_float16(ifile);
       param10 = xentax.LE_read_float16(ifile);
       param11 = xentax.LE_read_float16(ifile);
       param12 = xentax.LE_read_float16(ifile);
    elif type == 1:
       d02 = xentax.LE_read_uint32(ifile);
       d03 = xentax.LE_read_uint32(ifile);
       d04 = xentax.LE_read_uint16(ifile);
       d05 = xentax.LE_read_uint16(ifile);
       d06 = xentax.LE_read_uint32(ifile);
       d07 = xentax.LE_read_uint32(ifile);
       d08 = xentax.LE_read_uint32(ifile);
       d09 = xentax.LE_read_uint32(ifile);
       d10 = xentax.LE_read_uint32(ifile);
       d11 = xentax.LE_read_uint32(ifile);
       d12 = xentax.LE_read_uint32(ifile);
       d13 = xentax.LE_read_uint32(ifile);
       d14 = xentax.LE_read_uint32(ifile);
       d15 = xentax.LE_read_uint32(ifile); 
       surface_array.append(SurfaceInfo(d14, d15));
    elif type == 2:
       u02 = xentax.LE_read_uint32(ifile);
       u03 = xentax.LE_read_uint32(ifile);
       u04 = xentax.LE_read_uint32(ifile);
       u05 = xentax.LE_read_uint32(ifile);
       surface_array.append(SurfaceInfo(u04, u05));
    elif type == 0xFFFFFFFF:
       if ifile.tell() == offset05:
          break;
    else:
       print('UNKNOWN ENTRY TYPE:');
       print(type);
       print(ifile.tell());
       break;

#
# LIST SURFACES
#
print('SURFACE INDICES');
for item in surface_array:
    print(item.stripLength);
 
#
# close file
#
ifile.close()

Code: Select all

C:\Python32>python tekken.py
SURFACE INDICES
878
1266
878
577
88
95
31
126
41
577
96
99
115
178
116
743
12
21
154
34
18
14
102
72
26
1251
98
743
110
113
271
162
173
30
186
60
212
1064
34
102
26
33
5
41
36
12
12
28
28
28
28
18
23
38
38
34
30
22
22
317
8
164
20
94
15
36
17
36
16
16
30
4
27
8
40
11
22
9
20
25
28
8
16
14
30
6
40
33
76
8
58
302
186
18
122
18
200

Well now, here comes the hard part... putting stuff from this tutorial section and the previous tutorial section all together! All we have to do is loop through the surface information and save tristrips of specific lengths determined by the data we saved in SurfaceInfo.stripLength. Here is the code to load nina1u.chr by individual surfaces. Of course, even though we divided the tristrips into surfaces, the geometry is still messed up! See the picture after the code.

Code: Select all

LOL 60,000 character post limit oh well :-).
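
Since the full listing didn't fit, here is a rough sketch of the idea only, not the original script. It assumes the index buffer has already been read into a plain list of 0-based indices and that each surface is a (start index, strip length) pair like the SurfaceInfo objects above. The name faces_for_surfaces, the 'g' group lines, and the skipping of degenerate triangles are my own additions for illustration.

```python
def faces_for_surfaces(indices, surfaces):
    """Build OBJ face lines, treating each surface as its own triangle strip.

    indices  -- flat list of 0-based vertex indices (the whole index buffer)
    surfaces -- list of (start_index, strip_length) pairs
    """
    lines = []
    for n, (start, length) in enumerate(surfaces):
        lines.append('g surface_%d' % n)
        strip = indices[start:start + length]
        for i in range(len(strip) - 2):
            a, b, c = strip[i], strip[i + 1], strip[i + 2]
            # flip every other triangle to keep the winding consistent
            if i % 2 == 1:
                b, c = c, b
            # skip degenerate triangles (a repeated index)
            if a == b or b == c or a == c:
                continue
            # OBJ indices are 1-based; v/vt/vn all share the same index here
            corners = ['%d/%d/%d' % (k + 1, k + 1, k + 1) for k in (a, b, c)]
            lines.append('f ' + ' '.join(corners))
    return lines

# tiny made-up example: two strips packed into one index buffer
indices = [0, 1, 2, 3, 4, 5, 6]
surfaces = [(0, 4), (4, 3)]
result = faces_for_surfaces(indices, surfaces)
for line in result:
    print(line)
```

Each surface restarts the strip, so no triangle ever bridges two surfaces. As said above, this alone does not fix the geometry for nina1u.chr; it only shows the per-surface loop.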

finale00 · Post by **finale00** » Mon Jan 09, 2012 5:26 am

Use noesis.

Models + textures plz.

Whenever I think console games, I imagine complicated and awkward data structures.

But that tekken game doesn't seem too bad. Do a lot of PS3 games look like that?

howfie · Post by **howfie** » Mon Jan 09, 2012 7:35 am

No, not all PS3 games look this easy. Take for instance Ni no Kuni. The dat file contains zarc files. Zarc files contain hpk files. Hpk files contain pkchr files. Pkchr files contain all kinds of other files. These other files contain crazy vertex and image formats. It would probably take me a full month to figure out how to rip stuff from that game, so I just move on.

Post by **MrAdults** » Mon Jan 09, 2012 7:37 am

howfie wrote:Step #2: See If Rich Has Already Done It

First thing to do is to see if Rich has already done it. In general, I try not to step on something somebody else is working on or has already done. Tekken is a AAA-game and I tend to leave these alone since usually other guys tend to pick them up before I can even save up enough money to buy the game! I tried loading the CHR files in Noesis and it was a no go so I think we're good here.

Haha. Tekken 6 is the only Tekken I've looked at, and it was mostly because it happened to use a format very similar to Soul Calibur 4. While I haven't looked, I'd bet money that anything Tekken 5 or earlier has no real resemblance.

gjinka wrote:Well you have to think both ways: over the last 10 years more complex compression and encryption algorithms were probably created. And also I think as game budgets rise the developers become more likely to reinvent the wheel by creating custom encryption/compression, or add extra protection.

The real reason a lot of games don't use serious compression/encryption now is the same reason they didn't back then - it still imposes a totally unnecessary load time (and possibly memory) overhead. The trend of console vendors offering compression/encryption libraries to developers has been on the rise, but in that case, it tends to be very universal and we only need to figure out how to crack it once. It's otherwise pretty hard as a game developer to justify the runtime overhead of rolling your own compression/encryption scheme, and at most competent developers will typically take the minimal approach there. (FF13-2 being a recent example, they encrypted their file table but didn't bother encrypting actual data probably because of the rather large overhead you would have if you wanted to AES128-decrypt all of your data on load)

On the whole, older games are usually way more difficult to figure out. Especially games that came around before everyone was developing for fixed function graphics pipelines, because they could store and render data in pretty much any form that they wanted, and some even went so far as doing crazy shit like directly scanline-rendering basic primitive types and patches. We're going to be moving back toward that in another generation or 2, but for now it's smooth sailing and we can pretty much always expect to find runtime data in the form of raw triangles and vertices. You also generally don't even need to know about anything particularly next-gen to figure out next-gen data. It helps if you can recognize a tangent vector/matrix when you see one, for example, but you really don't need to know what it is if all you're looking to do is get the raw triangle/vertex data out for whatever your purposes may be. If you want to actually create an exporter *to* that format, though, then that's a lot different and you'll need to have a pretty good idea of what every piece of data in the file actually is.
