In the first file I tried (client green - file '0.bkt') I found that the separator between the FQN of the file and the zlib-ed content might or might not be 00 00.
After the FQN qst.test.location.dromund_kaas.bronze.jungle_raid there is only one 00 before the zlib header.
The sad thing is that even after unpacking the zlib-ed content you are faced with more encoded/compressed stuff. There are some human-readable things in there, but there is clearly a custom data format at work here. I can see some byte sequences repeating in some of the uncompressed content, as if they belong to the same group of data.
Could it be that the data is split into chunks, perhaps across all bucket files, and that the headers define the chunks?
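For anyone who wants to try this themselves, here is a rough Python sketch of how the zlib streams in a bucket file could be located without knowing whether the separator is 00 or 00 00. It simply scans for a plausible zlib header and tries to decompress from there. The function name extract_zlib_streams is made up for illustration, and the approach is a brute-force guess, not a description of the real bucket layout.

```python
import zlib


def extract_zlib_streams(data: bytes) -> list[tuple[int, bytes]]:
    """Return (offset, decompressed_bytes) for every zlib stream found in data.

    Assumption: streams use the common 0x78 first byte. A valid zlib header
    is also a multiple of 31 when read as a big-endian 16-bit number.
    """
    streams = []
    pos = 0
    while pos < len(data) - 1:
        header = (data[pos] << 8) | data[pos + 1]
        if data[pos] == 0x78 and header % 31 == 0:
            decomp = zlib.decompressobj()
            try:
                payload = decomp.decompress(data[pos:])
            except zlib.error:
                pos += 1  # false positive, keep scanning
                continue
            if decomp.eof:
                # unused_data holds everything after the end of this stream
                consumed = len(data) - pos - len(decomp.unused_data)
                streams.append((pos, payload))
                pos += consumed
                continue
        pos += 1
    return streams
```

Because the scan does not care what sits between two streams, it works the same whether the FQN is followed by one or two zero bytes. It is slow on large files since it re-slices the buffer on every candidate.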
Star Wars - The Old Republic - Beta
Re: Star Wars - The Old Republic - Beta
I also wanted to note that the uncompressed content of bucket file chunks usually starts with Ï@ (ASCII). This "string" repeats throughout the content, which leads me to believe it stands for the < of <xml>.
I am also 99% sure that there are either compressed or encoded XML files in there. I found one quest XML whose FQN is qst.location.alderaan.class.bounty_hunter.kingmaker_for_a_day, but it is all garbled due to compression. I am not sure if they are using a proprietary compression method or something already known, because I am not very experienced in these things.
BTW - If you want me to share the file I used to uncompress content of bucket files I can, but be warned - it is a PHP script
Re: Star Wars - The Old Republic - Beta
stalja wrote: The sad thing is that even after unpacking the zlib-ed content you are faced with more encoded/compressed stuff.
I have only unpacked a few files, but those that I have unpacked were regular XML files. Either I was lucky with these files or you used a wrong zlib library. Anyway, I will extract some more files and look at them.
It would be great if you could share your PHP code, I have some experience with PHP and could use it as well. So far, I have been using VB.net for all of my scripts related to TOR but they are not yet worth sharing.
DKK wrote: I'm wondering if bytes 4-5 and 6-7 are some kind of version tag, as my bkt files (as linked in my previous post) are all 02 00 05 00 as opposed to your 02 00 04 00. The same applies to the 32-bit integer following the first DBLB, where yours read 01 00 00 00 and mine read 02 00 00 00.
That seems very possible. It is probably best that we first find a way to identify the version of our assets files before we continue any analysis, so that we don't also face the problem of having different versions of the file formats.
I looked into the main TOR folder that contains launcher.exe and found the following XML files (these files are dated approximately August 2011):
movies_en_us.version: <Version>3</Version><Name>3</Name><InProgress>FALSE</InProgress>
retailclient_he601.version: <Version>23</Version><Name>23</Name><InProgress>FALSE</InProgress>
swtor_assets_en_us.version: <Version>6</Version><Name>6</Name><InProgress>FALSE</InProgress>
swtor_assets_main.version: <Version>6</Version><Name>6</Name><InProgress>FALSE</InProgress>
patcher.version: <Version>93</Version><Name>93</Name><InProgress>FALSE</InProgress>
It would be great if everyone who is contributing to this topic could share the version numbers from their copies of these files so that we know which versions we have.
I also found the file /resources/localcacheversioninfo.bin which in my case is 33 88 71 64 4A 12 00 00 79 97 79 03 00 00 00 00 00 00 00 00.
Last edited by SWTOR fan on Mon Nov 21, 2011 6:07 pm, edited 1 time in total.
Re: Star Wars - The Old Republic - Beta
Finally, I looked into the metadata.bin file that can be found in the folder you extracted the assets to (this is the folder that contains the resources/ folder). Judging by its file name, it seems to contain metadata on some files. From my analysis, it is likely that it contains the metadata for the .bnk/BKHD files in the assets_locale_en_us_X.tor files.
If you open the file in a hex editor and set the line width to 32 bytes then you can see a special pattern for each of the 96,616 lines. There is no file header.
It seems like byte number 24 stands for the number of the archive, running from 1 to 17 (in my copy, there are 17 locale archives). Also, the number of lines for each archive seems to coincide with the number of .bnk files in this archive.
metadata.bin specification:
LOOP (for each 32-byte line) {
--- 0-7: 8 bytes, different for each line, maybe a CRC checksum
--- 8-11: 4 bytes, always the same: 78 97 79 03
--- 12-15: 4 bytes, either 10 00 00 00 (archives 01-0A), C0 BD F0 FF (archives 0B, 0E, 10-11) or A0 F1 12 00 (archives 0C-0D, 0F)
--- 16-23: 8 bytes, different for each line, maybe a CRC checksum
--- 24: 1 byte, the archive number, running from 01 to 11 hex (i.e. from 1 to 17)
--- 25-26: 2 bytes, either 00 00 (for 01-0A) or EC 12 (for 0B-11)
--- 27: 1 byte, always 00
--- 28: 1 byte, usually ending in 0, like 10, 40, 50, A0, C0, except for 7C
--- 29: 1 byte, different for each archive, except 5B (for 0C, 0D, 0F) or EB (for 0B, 0E, 10-11)
--- 30-31: 2 bytes, either B2 41 (for 01-0A), 49 78 (for 0C, 0D, 0F) or 12 00 (for 0B, 0E, 10-11)
} END OF LOOP
And here is the raw data I used for the analysis, with the CRC bytes replaced by XX:
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Assets 1 to 10:
XX XX XX XX XX XX XX XX 78 97 79 03 10 00 00 00 XX XX XX XX XX XX XX XX 01 00 00 00 40 3E B2 41
XX XX XX XX XX XX XX XX 78 97 79 03 10 00 00 00 XX XX XX XX XX XX XX XX 02 00 00 00 40 48 B2 41
XX XX XX XX XX XX XX XX 78 97 79 03 10 00 00 00 XX XX XX XX XX XX XX XX 03 00 00 00 10 56 B2 41
XX XX XX XX XX XX XX XX 78 97 79 03 10 00 00 00 XX XX XX XX XX XX XX XX 04 00 00 00 80 6D B2 41
XX XX XX XX XX XX XX XX 78 97 79 03 10 00 00 00 XX XX XX XX XX XX XX XX 05 00 00 00 C0 78 B2 41
XX XX XX XX XX XX XX XX 78 97 79 03 10 00 00 00 XX XX XX XX XX XX XX XX 06 00 00 00 B0 87 B2 41
XX XX XX XX XX XX XX XX 78 97 79 03 10 00 00 00 XX XX XX XX XX XX XX XX 07 00 00 00 60 92 B2 41
XX XX XX XX XX XX XX XX 78 97 79 03 10 00 00 00 XX XX XX XX XX XX XX XX 08 00 00 00 70 BD B2 41
XX XX XX XX XX XX XX XX 78 97 79 03 10 00 00 00 XX XX XX XX XX XX XX XX 09 00 00 00 A0 C8 B2 41
XX XX XX XX XX XX XX XX 78 97 79 03 10 00 00 00 XX XX XX XX XX XX XX XX 0A 00 00 00 50 D1 B2 41
Assets 12, 13 and 15:
XX XX XX XX XX XX XX XX 78 97 79 03 A0 F1 12 00 XX XX XX XX XX XX XX XX 0C EC 12 00 7C 5B 49 78
XX XX XX XX XX XX XX XX 78 97 79 03 A0 F1 12 00 XX XX XX XX XX XX XX XX 0D EC 12 00 7C 5B 49 78
XX XX XX XX XX XX XX XX 78 97 79 03 A0 F1 12 00 XX XX XX XX XX XX XX XX 0F EC 12 00 7C 5B 49 78
Assets 11, 14, 16 and 17:
XX XX XX XX XX XX XX XX 78 97 79 03 C0 BD F0 FF XX XX XX XX XX XX XX XX 0B EC 12 00 E0 EB 12 00
XX XX XX XX XX XX XX XX 78 97 79 03 C0 BD F0 FF XX XX XX XX XX XX XX XX 0E EC 12 00 E0 EB 12 00
XX XX XX XX XX XX XX XX 78 97 79 03 C0 BD F0 FF XX XX XX XX XX XX XX XX 10 EC 12 00 E0 EB 12 00
XX XX XX XX XX XX XX XX 78 97 79 03 C0 BD F0 FF XX XX XX XX XX XX XX XX 11 EC 12 00 E0 EB 12 00
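To check whether the number of lines per archive really matches the number of .bnk files, the counting could be done with a small Python sketch like the one below. It only relies on what is described above (32-byte records, no file header, archive number in byte 24, constant 78 97 79 03 in bytes 8-11). The function name count_records_per_archive is hypothetical, and the constant check is just a sanity test based on my copy of the file.

```python
from collections import Counter

RECORD_SIZE = 32                          # one "line" in the hex editor
CONSTANT = bytes.fromhex("78977903")      # bytes 8-11, same in every record seen so far


def count_records_per_archive(data: bytes) -> Counter:
    """Count metadata.bin records per archive number (byte 24 of each record)."""
    if len(data) % RECORD_SIZE != 0:
        raise ValueError("file size is not a multiple of 32 bytes")
    counts = Counter()
    for offset in range(0, len(data), RECORD_SIZE):
        record = data[offset:offset + RECORD_SIZE]
        if record[8:12] != CONSTANT:
            raise ValueError(f"unexpected bytes 8-11 in record at offset {offset:#x}")
        counts[record[24]] += 1
    return counts
```

Reading the whole file with open("metadata.bin", "rb").read() and passing it in should give one counter entry per archive, which can then be compared with the file counts of the locale archives.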
Re: Star Wars - The Old Republic - Beta
I have not yet seen this format but it definitely is a binary file. Are you sure that it is from The Old Republic?
It seems that pow2h is a nickname of a pro gamer of Halo 2 Vista (H2V) but I do not know if this has any relation with this file (It is far-fetched, but maybe one of the Bioware developers plays H2V).
In order for us to analyse the file, we need a bigger part of it, at least 30 times as much, or more than 1 MB. [...]
It is better to have the original file than a copy pasted into the forum, because a lot of information is lost by copying.
EDIT: Restored post in an edited form to comply with forum rules.
Last edited by SWTOR fan on Mon Apr 16, 2012 2:23 pm, edited 1 time in total.
Re: Star Wars - The Old Republic - Beta
I looked at the .p2v file and the format is completely different from the current MYP archives.
Most importantly, the filelist contains not hashes but actual filenames.
I am not sure what those bytes starting at 0x2040 mean, they seem to be 4-bit flags that can either be set to 0, 8 or 9.
Starting at position 0x012000, the filelist begins. Each file is described in a block of 512 bytes.
See below for the beginning of each block:
00 00 00 00 00 00 00 00 02 01 00 00 FF FF FF FF 01 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
95 9A 01 00 02 98 01 00 04 01 00 00 FE 36 04 00 02 46 1B 00 DF 05 0F 00 00 00 00 00 08 00 00 00 80 D8 2D 07
50 2B 00 00 8A 20 00 00 06 01 00 00 1E 02 04 00 02 46 57 00 13 86 04 00 00 00 00 00 0D 00 00 00 60 1F DE 07
E8 01 00 00 0A 01 00 00 08 01 00 00 5B 01 04 00 02 46 35 00 EC 20 19 00 00 00 00 00 06 00 00 00 45 5E E5 6F
58 01 00 00 A7 00 00 00 0A 01 00 00 37 01 04 00 02 46 69 00 88 63 17 00 00 00 00 00 07 00 00 00 45 5E E5 6F
F0 55 05 00 DB 17 02 00 0C 01 00 00 FE 75 04 00 02 46 49 00 FB 70 03 00 00 00 00 00 44 00 00 00 80 D8 2D 07
68 7A 00 00 90 49 00 00 0E 01 00 00 BE 06 04 00 02 46 25 00 0A 05 12 00 00 00 00 00 07 00 00 00 80 D8 2D 07
B9 02 00 00 64 01 00 00 10 01 00 00 0F 01 04 00 02 46 35 00 60 04 1A 00 00 00 00 00 05 00 00 00 45 5E E5 6F
__ __ 0_ 00|__ __ 0- 00|XX 01 00 00|-- --|04 00 02 46|-- 00|-- __ __ 00|00 00 00 00|-- 00 00 00|-- -- -- --|
Normally, typical bytes in a filelist include file offset, file size (both compressed and uncompressed) and possibly a checksum.
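The file you uploaded is too small for us to be able to fully analyse the format. However, the bytes originally highlighted in red (marked XX in the pattern line above, i.e. the values 02, 04, 06, ... at offset 8) do stand out because they are increasing. They could either be the file offset or just the ID of the file.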
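To compare these fields across many blocks, the first 36 bytes of each 512-byte block could be split up as in the following Python sketch. The field boundaries follow the | separators of the pattern line; the little-endian byte order and all field names (field_a, counter and so on) are my assumptions, since nobody knows yet what the fields mean.

```python
import struct
from typing import NamedTuple

# Field widths taken from the pattern line: 4|4|4|2|4|2|4|4|4|4 = 36 bytes.
# "<" means little-endian without padding (an assumption).
HEADER = struct.Struct("<IIIH4sHIIII")

FILELIST_START = 0x012000   # where the filelist begins in the sample
BLOCK_SIZE = 512            # one block per file


class BlockHeader(NamedTuple):
    field_a: int      # unknown
    field_b: int      # unknown
    counter: int      # the increasing value at offset 8
    field_c: int      # unknown
    constant: bytes   # 04 00 02 46 in every block except the first
    field_d: int      # unknown
    field_e: int      # unknown
    zero: int         # 00 00 00 00 in all sample blocks
    field_f: int      # unknown
    field_g: int      # unknown


def parse_block_header(raw: bytes) -> BlockHeader:
    """Split the first 36 bytes of a filelist block into its fields."""
    return BlockHeader(*HEADER.unpack(raw[:HEADER.size]))


def iter_block_headers(path: str):
    """Yield a BlockHeader for every full 512-byte block from 0x012000 on."""
    with open(path, "rb") as handle:
        handle.seek(FILELIST_START)
        while len(block := handle.read(BLOCK_SIZE)) == BLOCK_SIZE:
            yield parse_block_header(block)
```

Note that iter_block_headers does not know where the filelist ends, so it will keep going into the file contents; it is only meant for dumping the first few thousand blocks and looking at how the counter and the other fields behave.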
The filenames are similar to the ones in the current archives and all file formats are still in use, like /bnk/streamed/430210007.ogg or /art/static/area/all_all/arch/slum/texture/all_arch_slum_med_hut_int_r1_wall_s.tiny.dds.
@badmp3: I doubt there is any possibility for you to upload the whole 7GB file, right? Instead, what you can do is this: you can quickly scroll through the file and look for the position where the filename format ends and the contents of the files start. It should be pretty easy to recognize, at that point all those dots on the right side will be replaced by some strange letters and symbols.
If you have found this position, please upload a file with maybe 50 lines before and after this position. And please don't use FileFactory for uploading again; it took me at least five tries until I was able to download the file. A good list of uploading sites can be found in this topic.
If I have time for it, I may write a program that goes through the file and outputs all filenames. This could be useful for figuring out the remaining filenames belonging to the hashes. Otherwise, there is not much we can do with an incomplete specification, sorry.
Re: Star Wars - The Old Republic - Beta
The contents of this post was deleted because of possible forum rules violation.
Re: Star Wars - The Old Republic - Beta
I have to correct myself, I have not yet found any plain xml files in the bucket files. (I confused it with the zlib files from the MYP archives)
Anyways, the files in the bucket archives really are compiled/binary versions of the raw xml files. They seem to be somewhat similar to the PROT/.node files I already analysed (they are just missing a header), so I should be able to post a specification soon.
Here is some information in advance:
I recommend looking at the integer format specification for variable-sized integers (see page 1 of this topic). The compiled files make frequent use of them.
Most files start with some zero bytes (00 00 00 00) but I found at least one compiled MPN file that starts with 86 BE BC 67 C1 DF 00 E0.
The general format of these files is like this:
0-...: varying number of zero bytes (0 to at least 7 bytes)
One variable-sized integer, eg. 15, 2F or C9 01 59 (15 = mapnote, 59 = item)
Another variable-sized integer, eg. 06, 05 or C9 00 14
LOOP (for all nodes) {
--- ID of the current node as variable-sized integer (most files, including PROT) or as 4-byte integer (compiled MPN files)
--- Node type, one byte.
--- More data, different for each node type
} END LOOP
Eg., one node type is the string type and is identified by the byte 06 (most files, including PROT) or 85 (compiled MPN files).
Example analysis for one string node:
CF 40 00 00 02 F5 E6 DC 5A (Ï@...õæÜZ) → node id, in this case: 4611686031142870106 (0x40000002F5E6DC5A)
06 (.) → node type #6 = string
25 (%) → length of string (37 bytes)
65 70 70 2E 73 74 61 74 65 2E 73 77 69 74 63 68 5F 77 65 61 70 6F 6E 5F 73 74 61 6E 63 65 2E 70 69 73 74 6F 6C
→ the actual string, in this case epp.state.switch_weapon_stance.pistol
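The string node analysis above can be reproduced with a short Python sketch. The variable-sized integer rule is specified on page 1 of this topic and is not repeated here, so the decoder below is only inferred from the two examples in this post (C9 followed by 2 bytes, CF followed by 8 bytes): a byte below C0 is taken as the value itself, and a prefix C8..CF is taken to be followed by (prefix - C7) big-endian bytes. Treating the string length as such an integer is also a guess, and both function names are hypothetical.

```python
def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode one variable-sized integer; return (value, next_position).

    Assumed rule, inferred from the examples in this post only.
    """
    first = buf[pos]
    if first < 0xC0:
        return first, pos + 1
    if 0xC8 <= first <= 0xCF:
        size = first - 0xC7               # C9 -> 2 bytes, CF -> 8 bytes
        value = int.from_bytes(buf[pos + 1:pos + 1 + size], "big")
        return value, pos + 1 + size
    raise ValueError(f"prefix byte {first:#04x} is not handled by this sketch")


def read_string_node(buf: bytes, pos: int = 0) -> tuple[int, str, int]:
    """Parse one string node (type 06); return (node_id, text, next_position)."""
    node_id, pos = read_varint(buf, pos)
    node_type = buf[pos]
    pos += 1
    if node_type != 0x06:
        raise ValueError(f"node type {node_type:#04x} is not a string node")
    length, pos = read_varint(buf, pos)   # 25 hex = 37 bytes in the example
    text = buf[pos:pos + length].decode("ascii")
    return node_id, text, pos + length
```

Feeding it the bytes from the example gives the node id 0x40000002F5E6DC5A and the string epp.state.switch_weapon_stance.pistol. Compiled MPN files would need a different reader, since they use a 4-byte node id and 85 as the string type.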
There are very many node types and I have figured out the number of bytes for most of them, but unfortunately there are still many unknowns. However, we may be able to figure out more things by comparing these files to their respective XML equivalents.
Re: Star Wars - The Old Republic - Beta
I guess I could build something that connects the node names with the node names I have from the uncompressed files from before. I don't have much time at the moment, though. I'll try to get some work done and share the results as soon as possible. So you think it is just a "byte code = node type" kind of compression here? There is probably a position byte in there somewhere (position of data).
Re: Star Wars - The Old Republic - Beta
stalja wrote: I guess I could build something that will connect the node names with the node names I have from uncompressed files from before. I don't have much time at the moment though. I'll try and get some work done and share the results as soon as possible. So you think it is just a byte code = node type of compression here? There is probably a position byte in there somewhere (position of data).
I fear you misunderstood me: the compiled files are not 1:1 translations of the XML files. They seem to contain the same information, but in a completely different order, so unfortunately it is not that easy to write a translator.
For those who do not yet have the newest beta version, here is an updated list of all TOR files. As XHD already wrote, the assets are now grouped by their contents, so this could help us with the file names.
And scratch what I said earlier about the version files; the newest version files are password-protected ZIP files and cannot be read so easily. However, there are two version files in the Assets folder that seem to contain the version number (Assets/assets_swtor_main_version.txt and Assets/assets_swtor_XX_XX_version.txt).
Main assets:
swtor_main_anim_creature_a_1.tor
swtor_main_anim_creature_b_1.tor
swtor_main_anim_creature_npc_1.tor
swtor_main_anim_humanoid_bfab_1.tor
swtor_main_anim_humanoid_bfns_1.tor
swtor_main_anim_humanoid_bmaf_1.tor
swtor_main_anim_humanoid_bmns_1.tor
swtor_main_anim_misc_1.tor
swtor_main_areadat_1.tor
swtor_main_areadat_epsilon_1.tor
swtor_main_area_alderaan_1.tor
swtor_main_area_balmorra_1.tor
swtor_main_area_belsavis_1.tor
swtor_main_area_corellia_1.tor
swtor_main_area_coruscant_1.tor
swtor_main_area_dromund_kaas_1.tor
swtor_main_area_epsilon_1.tor
swtor_main_area_hoth_1.tor
swtor_main_area_hutta_1.tor
swtor_main_area_ilum_1.tor
swtor_main_area_korriban_1.tor
swtor_main_area_misc_1.tor
swtor_main_area_nar_shaddaa_1.tor
swtor_main_area_open_worlds_1.tor
swtor_main_area_ord_mantell_1.tor
swtor_main_area_quesh_1.tor
swtor_main_area_raid_1.tor
swtor_main_area_taris_1.tor
swtor_main_area_tatooine_1.tor
swtor_main_area_tython_1.tor
swtor_main_area_voss_1.tor
swtor_main_area_zed_1.tor
swtor_main_art_area_all_arch_a_1.tor
swtor_main_art_area_all_arch_b_1.tor
swtor_main_art_area_all_item_1.tor
swtor_main_art_creature_a_1.tor
swtor_main_art_creature_b_1.tor
swtor_main_art_creature_npc_1.tor
swtor_main_art_dynamic_cape_1.tor
swtor_main_art_dynamic_chest_1.tor
swtor_main_art_dynamic_chest_tight_1.tor
swtor_main_art_dynamic_hand_1.tor
swtor_main_art_dynamic_head_1.tor
swtor_main_art_dynamic_lower_1.tor
swtor_main_art_dynamic_mags_1.tor
swtor_main_art_fx_1.tor
swtor_main_art_harvesting_1.tor
swtor_main_art_misc_1.tor
swtor_main_art_space_combat_1.tor
swtor_main_art_vehicle_1.tor
swtor_main_art_weapon_1.tor
swtor_main_art_zed_1.tor
swtor_main_bnk_audiodata_1.tor
swtor_main_bnk_audio_1.tor
swtor_main_bnk_location_1.tor
swtor_main_bnk_streamed_a_1.tor
swtor_main_bnk_streamed_b_1.tor
swtor_main_bnk_streamed_c_1.tor
swtor_main_bnk_voc_1.tor
swtor_main_cnv_alien_1.tor
swtor_main_gamedata_1.tor
swtor_main_gfx_1.tor
swtor_main_global_1.tor
swtor_main_systemgenerated_gom_1.tor
swtor_main_zed_1.tor
Locale assets (XX-XX stands for the locale code):
swtor_XX-XX_area_alderaan_1.tor
swtor_XX-XX_area_balmorra_1.tor
swtor_XX-XX_area_belsavis_1.tor
swtor_XX-XX_area_corellia_1.tor
swtor_XX-XX_area_coruscant_1.tor
swtor_XX-XX_area_dromund_kaas_1.tor
swtor_XX-XX_area_hoth_1.tor
swtor_XX-XX_area_hutta_1.tor
swtor_XX-XX_area_ilum_1.tor
swtor_XX-XX_area_korriban_1.tor
swtor_XX-XX_area_misc_1.tor
swtor_XX-XX_area_nar_shaddaa_1.tor
swtor_XX-XX_area_open_worlds_1.tor
swtor_XX-XX_area_ord_mantell_1.tor
swtor_XX-XX_area_quesh_1.tor
swtor_XX-XX_area_raid_1.tor
swtor_XX-XX_area_taris_1.tor
swtor_XX-XX_area_tatooine_1.tor
swtor_XX-XX_area_tython_1.tor
swtor_XX-XX_area_voss_1.tor
swtor_XX-XX_cnv_comp_chars_imp_1.tor
swtor_XX-XX_cnv_comp_chars_rep_1.tor
swtor_XX-XX_cnv_misc_1.tor
swtor_XX-XX_cnv_transitions_1.tor
swtor_XX-XX_global_1.tor
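Since the archives are now grouped by content, the archive names themselves can be split into their parts, which may help when guessing the paths of the files inside. Below is a small Python sketch for that. The regular expression and the function name split_archive_name are only my reading of the two lists above (swtor, then either main or a locale code, then the content group, then a number), not a documented naming scheme.

```python
import re

# Assumed naming scheme: swtor_<main or locale>_<content group>_<number>.tor
ARCHIVE_NAME = re.compile(r"^swtor_(main|[A-Za-z]{2}-[A-Za-z]{2})_(.+)_(\d+)\.tor$")


def split_archive_name(filename: str) -> tuple[str, str, int]:
    """Split an archive name into (scope, content_group, number)."""
    match = ARCHIVE_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a recognised archive name: {filename}")
    scope, group, number = match.groups()
    return scope, group, int(number)


def top_category(filename: str) -> str:
    """Return the first part of the content group, e.g. 'area', 'art' or 'bnk'."""
    return split_archive_name(filename)[1].split("_", 1)[0]
```

For example, swtor_main_area_dromund_kaas_1.tor splits into main, area_dromund_kaas and 1, with area as the top category.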
Re: Star Wars - The Old Republic - Beta
@badmp3: I have now put together some tools for analysing p2v archives. You can download them at http://www.sendspace.com/file/ykt0va. In it, you will find five files.
- P2V-Extractor.exe is a tool I wrote to extract the filenames from the p2v archives. Just run it, select the p2v file to open and a file to save the filelist to. Should be pretty easy to use. In case anyone is interested, I also included the source code in Form1.vb (written in VB.net).
- p2v-filenames-sorted.txt contains the filenames from the p2v file you have uploaded and will hopefully help us with guessing more file names. Thank you again very much for providing the p2v file, it is always great to see exclusive beta files!
- It is also possible to extract the contents of the p2v files with the tool offzip.exe. Offzip is an extractor written by aluigi (not by me) that extracts any zlib-encoded files in an archive. Just copy offzip.exe and extract.bat into the folder where you have the smallassets.p2v file stored and double-click on extract.bat. The program will then extract all files from the archive into a subfolder called "extracted". However, make sure that you have at least 20 GB of free disk space before you run the program, because these files will be very large!
At the moment it is more important to analyse the newer files but I may come back to the p2v archives later.
Re: Star Wars - The Old Republic - Beta
Hi all! I've been watching this thread closely since finding it, as somebody who is very excited about SWTOR and would love to see some of the level 50 content unearthed, such as boss models, high end mission text, etc.
I am a total novice with reverse-engineering games, although I would like to learn and help out. I have a few questions about the TOR files which have been analysed so I can better understand the process that is going on.
Firstly, is the 50% complete hash list tied to a specific version of the beta client? i.e. does a new hash list need to be produced every time the game files are altered?
Secondly, as the game uses zlib compression, do the extracted files need to be "decompressed", or does easymyp do that in the process of extracting them?
Thirdly, are the DEADBEEF files useful in any way? For example, can a DEADBEEF text file simply be renamed to the correct extension provided the file header gives away what kind of file it is? i.e. is renaming "DEADBEEF_FE26FD386BBA06DA.txt" to "somename.gr2" the equivalent to easymyp extracting it with the name "somename.gr2" based on the hash list?
Sorry to ask n00bish questions as many of you are well versed in this stuff already!
Re: Star Wars - The Old Republic - Beta
Glad to have you join in!
In my opinion, we need a TOR viewer rather than an extractor like easyMYP, because the extracted files can take up to 60 GB. It would be better to have a program similar to Windows Explorer that can be used to look into the archives, preview the files and extract them on an "on demand" basis.
The only problem then is how we can keep track of file changes during patches.
hazballs wrote: Firstly, is the 50% complete hash list tied to a specific version of the beta client?
Actually, until now I have only been working with older beta files, so I could not test the 50% hash list yet.
It is possible that the hash list no longer works because Bioware
a) changed all the file names, e.g. by renaming the top directory /resources/ to a different name. In that case it is no wonder that the file list no longer works.
b) changed the hash generation, so that the same filenames now produce different hashes.
Either case is bad news for us. Under a), we would need to find out the new filenames; under b), we would have to disassemble swtor.exe and fix the Easymyp algorithm.
By the way, the hash list stores neither version information nor the filenames of the TOR archives, so there shouldn't be any problem with that.
hazballs wrote: Secondly, as the game uses zlib compression, do the extracted files need to be "decompressed", or does easymyp do that in the process of extracting them?
EasyMYP already decompresses all files during extraction, so you do not need to worry about that. However, some proprietary files, like the bucket or BKHD files, are encoded twice, so we need to decompress/extract them once more before we can use them.
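For the twice-encoded files, the second pass could look roughly like the sketch below. It assumes the inner payloads are ordinary zlib streams sitting somewhere inside the extracted file (as reported earlier in this thread, where a zlib header follows the FQN and a 00 separator). The helper name find_zlib_streams is made up for illustration and is not part of EasyMYP.

```python
import zlib


def find_zlib_streams(blob: bytes):
    """Return (offset, decompressed_bytes) for every complete zlib stream in blob.

    Assumption: payloads use the common zlib header (CMF byte 0x78).
    Anything that fails to decompress is treated as a false positive.
    """
    results = []
    pos = 0
    while pos < len(blob) - 1:
        # A zlib header is two bytes whose big-endian value is a multiple of 31.
        looks_like_header = (
            blob[pos] == 0x78 and ((blob[pos] << 8) | blob[pos + 1]) % 31 == 0
        )
        if looks_like_header:
            decomp = zlib.decompressobj()
            try:
                data = decomp.decompress(blob[pos:])
            except zlib.error:
                pos += 1
                continue
            if decomp.eof:
                # Skip the bytes this stream consumed and keep scanning after it.
                consumed = len(blob) - pos - len(decomp.unused_data)
                results.append((pos, data))
                pos += consumed
                continue
        pos += 1
    return results
```

Whatever comes out of the inner streams may still be in a custom format, so this only peels off the zlib layer.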
hazballs wrote: Thirdly, are the DEADBEEF files useful in any way?
The DEADBEEF files are really just normal files; they just have a different filename. So you can open them in a hex editor, look for the file header, change the file extension accordingly and open the file in the correct program.
The only disadvantage of the DEADBEEF files is that for files like textures they are not of much use: if you have a directory of several thousand DEADBEEF files and don't know what filenames they have, good luck finding the right file.
But XML files work fine even if you do not know their filename, as the FQN is always specified in the first tag.
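The hex editor step can be automated for the common cases. Below is a minimal sketch that looks at the first bytes of a file and suggests an extension. The signature table is only a small sample that I am assuming is relevant here (DDS textures, PNG, XML, and the BKHD tag mentioned above); guess_extension and the returned extension strings are hypothetical names, not something EasyMYP defines.

```python
# Sample signatures only; extend this table as more formats are identified.
SIGNATURES = [
    (b"DDS ", "dds"),                # DirectDraw Surface texture
    (b"\x89PNG\r\n\x1a\n", "png"),   # PNG image
    (b"<?xml", "xml"),               # XML with a declaration
    (b"BKHD", "bkhd"),               # BKHD files; extension name is a placeholder
]


def guess_extension(data: bytes, default: str = "bin") -> str:
    """Return a suggested extension based on the leading bytes of a file."""
    for magic, extension in SIGNATURES:
        if data.startswith(magic):
            return extension
    return default


def guess_extension_of_file(path: str) -> str:
    """Read only the first few bytes of a file and guess its extension."""
    with open(path, "rb") as handle:
        return guess_extension(handle.read(16))
```

This only tells you the file type, of course. It does not recover the original filename.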
hazballs wrote: Fourth, does a new hash list need to be made every time the game files change?
Similar to question one, the hash list should work with all versions, so we do not need to update the format every time, unless BioWare changes the algorithm again.
However, there will always be old files that are deleted, new ones that are added and other files that are renamed or modified. Still, the majority of the files should stay the same, so we only need to update a few file names every time there is an update.
hazballs wrote: Finally, how is everyone else's research coming along on the "close to release" version of the game files?
Well, right now I am still working on recognizing all file formats and hope to fix the filename issue soon.
In my opinion, we need a TOR viewer rather than an extractor like EasyMYP, because the extracted files can take up to 60 GB. It would be better to have a program similar to Windows Explorer that can be used to look into the archives, preview the files and extract them on an on-demand basis.
The only problem then is how we can keep track of file changes during patches.
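One possible way to track changes, sketched under the assumption that a viewer can produce a listing per client version that maps each file's name hash to some content checksum (for example a CRC). The function name diff_listings and the listing layout are my own invention for illustration.

```python
from typing import Dict, List, Tuple


def diff_listings(
    old: Dict[str, str], new: Dict[str, str]
) -> Tuple[List[str], List[str], List[str]]:
    """Compare two {name_hash: checksum} listings.

    Returns (added, removed, modified) lists of name hashes, each sorted.
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(
        key for key in old.keys() & new.keys() if old[key] != new[key]
    )
    return added, removed, modified
```

Note that a renamed file would show up as one removed and one added entry, since its name hash changes.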
-
- beginner
- Posts: 32
- Joined: Sat Sep 12, 2009 11:33 am
- Has thanked: 10 times
- Been thanked: 5 times
Re: Star Wars - The Old Republic - Beta
The hash algorithm has not changed, to the best of my knowledge. I just extracted all the item icons using the hash algorithm and it worked fine.
What you need to know, and what is clearly stated in the EasyMYP documentation, is that the files you get from extraction are named like this:
CRC_Hash1Hash2.guessed_extension
So DEADBEEF is in fact the CRC of the file, which the EasyMYP author wanted to use for versioning purposes. What matters for matching filenames to hashes is the second part of the filename, the Hash1Hash2 part.
So I can confirm that, at least for the icons, the hashing algorithm still works.
They have changed the filenames. They have added a lot more of them as well, and removed some. If I were to guess, I would say that the old client and the new client share about 60% of their filenames, maybe more.
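To make the naming scheme concrete, here is a small sketch that splits an extracted filename into its parts. It assumes Hash1 and Hash2 have equal length, so the combined field can be cut in half; I have not checked that against the EasyMYP source. parse_extracted_name and the example filename in the usage note are made up.

```python
from typing import NamedTuple


class ExtractedName(NamedTuple):
    crc: str
    hash1: str
    hash2: str
    extension: str


def parse_extracted_name(filename: str) -> ExtractedName:
    """Split 'CRC_Hash1Hash2.guessed_extension' into its four parts."""
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        raise ValueError("no extension in %r" % filename)
    crc, underscore, hashes = stem.partition("_")
    if not underscore or not crc or not hashes or len(hashes) % 2:
        raise ValueError("unexpected name layout in %r" % filename)
    half = len(hashes) // 2  # assumption: both hashes have the same width
    return ExtractedName(
        crc.upper(), hashes[:half].upper(), hashes[half:].upper(), extension.lower()
    )
```

For a hypothetical file named DEADBEEF_0123456789ABCDEF.dds this gives crc DEADBEEF, hash1 01234567, hash2 89ABCDEF and extension dds. Only the two hash fields matter when looking a file up in the hash list.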