Bucket file format
As I wrote earlier, I have already analysed files that are similar to the ones inside the bucket files, so I will just post what I have. I hope the format has not changed too much since I analysed them.
Thank you very much for your bucket explorer, mvanderw! It is helping me a lot with the analysis. I hope you can improve the coloring with my information and we can improve the analysis of these files.
SWTOR fan wrote: I recommend looking at the integer format specification for variable-sized integers (see page 1 of this topic). The compiled files make frequent use of them.
The general format of these files is like this:
0-...: varying number of header bytes (0 to at least 7 bytes)
One variable-sized integer, e.g. 15, 2F or C9 01 59 (15 = mapnote, 59 = item) ??
Another variable-sized integer, e.g. 06, 05 or C9 00 14
LOOP (for all nodes) {
--- ID of the current node as variable-sized integer (most files, including PROT) or as 4-byte integer (compiled MPN files)
--- Node type, one or two bytes.
--- More data, different for each node type
} END LOOP
For example, one node type is the string type, which is identified by the byte 06 (most files, including PROT) or 85 (compiled MPN files).
Example analysis for one string node:
CF 40 00 00 02 F5 E6 DC 5A (Ï@...õæÜZ) → node id, in this case 0x40000002F5E6DC5A = 4611686031142870106 (the eight bytes after CF, read as a big-endian integer)
06 (.) → node type #6 = string
25 (%) → length of string (37 bytes)
65 70 70 2E 73 74 61 74 65 2E 73 77 69 74 63 68 5F 77 65 61 70 6F 6E 5F 73 74 61 6E 63 65 2E 70 69 73 74 6F 6C
→ the actual string, in this case epp.state.switch_weapon_stance.pistol
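To make the example above reproducible, here is a minimal Python sketch that decodes exactly those bytes. The integer decoding is an assumption inferred only from the examples in this post (a first byte below C0 is the value itself, a first byte C8..CF announces 1..8 big-endian bytes); it is not a copy of the specification on page 1 and may miss other prefix bytes. The function names and the Latin-1 decoding of the string are my own choices for illustration.

```python
def read_varint(buf, pos):
    """Decode one variable-sized integer; return (value, next_pos).

    ASSUMPTION (inferred from the examples in this post, not taken from
    the page 1 specification): a first byte below 0xC0 is the value
    itself; a first byte 0xC8..0xCF means (byte - 0xC7) big-endian
    bytes follow.
    """
    first = buf[pos]
    if first < 0xC0:
        return first, pos + 1
    if 0xC8 <= first <= 0xCF:
        end = pos + 1 + (first - 0xC7)
        if end > len(buf):
            raise ValueError("truncated variable-sized integer")
        return int.from_bytes(buf[pos + 1:end], "big"), end
    raise ValueError(f"prefix byte {first:#04x} is not covered by this sketch")


def parse_string_node(buf, pos=0):
    """Parse: node id, type byte 06, string length, string bytes."""
    node_id, pos = read_varint(buf, pos)
    if buf[pos] != 0x06:
        raise ValueError("not a string node (type byte is not 06)")
    length, pos = read_varint(buf, pos + 1)
    raw = buf[pos:pos + length]
    if len(raw) != length:
        raise ValueError("truncated string")
    # Latin-1 is an assumption; the example string is plain ASCII.
    return node_id, raw.decode("latin-1"), pos + length


# The bytes from the example analysis above.
EXAMPLE = bytes.fromhex(
    "CF 40 00 00 02 F5 E6 DC 5A"
    " 06"
    " 25"
    " 65 70 70 2E 73 74 61 74 65 2E 73 77 69 74 63 68 5F 77 65 61"
    " 70 6F 6E 5F 73 74 61 6E 63 65 2E 70 69 73 74 6F 6C"
)

node_id, text, end = parse_string_node(EXAMPLE)
print(hex(node_id), node_id, text, end)
```

Returning the next position together with the value makes it easy to chain reads while walking through a file.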
Below is a list of the different node types. Some nodes reference other node ids, some contain one string or a list of strings, and still others contain some kind of flags (e.g. 01=05, 02=DE, 03=AB, 04=01, ...). And of course there are many more node types whose meaning I do not know.
But at least with this information one can write a better format analyser.
Once we have that done, we can try to make some sense of the data. It seems that the node id has something to do with what the node stands for. For example, in dialogue files, the node id classifies a node to say which audio should be spoken, what the parent node is, which animations/cameras to play etc.
01 (reference to a different node id)
Variable-length integer: Node id
One byte: 01
Variable-length integer: Another node id
02 (reference to a different node id)
Variable-length integer: Node id
One byte: 02
Variable-length integer: Another node id
In dialogue files, if the first node id is 0x2C, then the second id stands for the string id that is to be spoken.
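The two reference node types above have the same shape, so one small reader covers both. This is only a sketch: the integer decoding is my assumption from the examples in this post (below C0 = literal value, C8..CF = 1..8 big-endian bytes follow), the function names are made up, and the sample bytes are invented for illustration rather than taken from a real file.

```python
def read_varint(buf, pos):
    """Return (value, next_pos). ASSUMED encoding, inferred from this post:
    first byte < 0xC0 is the value; 0xC8..0xCF means (byte - 0xC7)
    big-endian bytes follow."""
    first = buf[pos]
    if first < 0xC0:
        return first, pos + 1
    if 0xC8 <= first <= 0xCF:
        end = pos + 1 + (first - 0xC7)
        if end > len(buf):
            raise ValueError("truncated variable-sized integer")
        return int.from_bytes(buf[pos + 1:end], "big"), end
    raise ValueError(f"prefix byte {first:#04x} is not covered by this sketch")


def parse_reference_node(buf, pos=0):
    """Parse: node id, type byte 01 or 02, referenced node id."""
    node_id, pos = read_varint(buf, pos)
    kind = buf[pos]
    if kind not in (0x01, 0x02):
        raise ValueError("not a reference node (type byte is not 01 or 02)")
    target, pos = read_varint(buf, pos + 1)
    return node_id, kind, target, pos


# Invented sample: node id 2C, type 02, referenced id C9 01 59.
SAMPLE = bytes.fromhex("2C 02 C9 01 59")
node_id, kind, target, end = parse_reference_node(SAMPLE)
if node_id == 0x2C and kind == 0x02:
    # Per the note above: in dialogue files this would be the spoken string id.
    print("string id to be spoken:", hex(target))
```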
03 (short node, end of section?)
Variable-length integer: Node id
One byte: 03
One byte: 01
04 02 (short node)
Variable-length integer: Node id
One byte: 04
One byte: 02
04 03 (short node)
Variable-length integer: Node id
One byte: 04
One byte: 03
04 02 02 (unknown)
Variable-length integer: Node id
Three bytes: 04 02 02
Four bytes: unknown
Three bytes: unknown
05 (short node)
Variable-length integer: Node id
One byte: 05
Two unknown bytes
06 (single string)
Variable-length integer: Node id
One byte: 06
Variable-length integer: Length of string
String with the aforementioned length
07 01 (multiple flags)
Variable-length integer: Node id
Two bytes: 07 01
Variable-length integer: Number of flags
Variable-length integer: Number of flags, same as above
LOOP {
--- Variable-length integer: Flag id
--- Variable-length integer: Flag value
} END LOOP
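A flag node like the one above could be read roughly as follows. Again a sketch under assumptions: the integer decoding is inferred from the examples in this post (below C0 = literal value, C8..CF = 1..8 big-endian bytes follow), the function names are hypothetical, and the sample bytes are invented. Because the count appears twice, the sketch treats a mismatch as a sign that the parser went off track.

```python
def read_varint(buf, pos):
    """Return (value, next_pos). ASSUMED encoding, inferred from this post:
    first byte < 0xC0 is the value; 0xC8..0xCF means (byte - 0xC7)
    big-endian bytes follow."""
    first = buf[pos]
    if first < 0xC0:
        return first, pos + 1
    if 0xC8 <= first <= 0xCF:
        end = pos + 1 + (first - 0xC7)
        if end > len(buf):
            raise ValueError("truncated variable-sized integer")
        return int.from_bytes(buf[pos + 1:end], "big"), end
    raise ValueError(f"prefix byte {first:#04x} is not covered by this sketch")


def parse_flag_list(buf, pos):
    """Parse: count, same count again, then (flag id, flag value) pairs."""
    count, pos = read_varint(buf, pos)
    repeat, pos = read_varint(buf, pos)
    if count != repeat:
        raise ValueError(f"flag counts differ: {count} vs {repeat}")
    flags = []
    for _ in range(count):
        flag_id, pos = read_varint(buf, pos)
        value, pos = read_varint(buf, pos)
        flags.append((flag_id, value))
    return flags, pos


def parse_flag_node(buf, pos=0, marker=b"\x07\x01"):
    """Parse: node id, two marker bytes, then the flag list."""
    node_id, pos = read_varint(buf, pos)
    if bytes(buf[pos:pos + 2]) != marker:
        raise ValueError("unexpected node type bytes")
    flags, pos = parse_flag_list(buf, pos + 2)
    return node_id, flags, pos


# Invented sample: node 0A, type 07 01, three flags 01=05, 02=AB, 04=01.
SAMPLE = bytes.fromhex("0A 07 01 03 03 01 05 02 AB 04 01")
print(parse_flag_node(SAMPLE))
```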
07 03 (short node)
Variable-length integer: Node id
Two bytes: 07 03
07 05 (multiple flags)
Variable-length integer: Node id
Two bytes: 07 05
Variable-length integer: Number of flags
Variable-length integer: Number of flags, same as above
LOOP {
--- Variable-length integer: Flag id
--- Variable-length integer: Flag value
} END LOOP
07 06 (multiple strings)
Variable-length integer: Node id
Two bytes: 07 06
Variable-length integer: Number of strings
Variable-length integer: Number of strings, same as above
LOOP {
--- Variable-length integer: String id
--- Variable-length integer: String length
--- String with aforementioned length
} END LOOP
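The string list layout above translates into a short loop. The sketch below makes the same assumption about the integer encoding as inferred from the examples in this post (below C0 = literal value, C8..CF = 1..8 big-endian bytes follow); the function names and sample bytes are invented. Strings are kept as raw bytes because I do not know the text encoding for certain.

```python
def read_varint(buf, pos):
    """Return (value, next_pos). ASSUMED encoding, inferred from this post:
    first byte < 0xC0 is the value; 0xC8..0xCF means (byte - 0xC7)
    big-endian bytes follow."""
    first = buf[pos]
    if first < 0xC0:
        return first, pos + 1
    if 0xC8 <= first <= 0xCF:
        end = pos + 1 + (first - 0xC7)
        if end > len(buf):
            raise ValueError("truncated variable-sized integer")
        return int.from_bytes(buf[pos + 1:end], "big"), end
    raise ValueError(f"prefix byte {first:#04x} is not covered by this sketch")


def parse_string_list(buf, pos):
    """Parse: count, same count again, then (string id, length, bytes)."""
    count, pos = read_varint(buf, pos)
    repeat, pos = read_varint(buf, pos)
    if count != repeat:
        raise ValueError(f"string counts differ: {count} vs {repeat}")
    strings = []
    for _ in range(count):
        string_id, pos = read_varint(buf, pos)
        length, pos = read_varint(buf, pos)
        raw = bytes(buf[pos:pos + length])
        if len(raw) != length:
            raise ValueError("truncated string")
        strings.append((string_id, raw))
        pos += length
    return strings, pos


def parse_string_list_node(buf, pos=0, marker=b"\x07\x06"):
    """Parse: node id, two marker bytes, then the string list."""
    node_id, pos = read_varint(buf, pos)
    if bytes(buf[pos:pos + 2]) != marker:
        raise ValueError("unexpected node type bytes")
    strings, pos = parse_string_list(buf, pos + 2)
    return node_id, strings, pos


# Invented sample: node 0B, type 07 06, strings 1="abc" and 2="" (length 0).
SAMPLE = bytes.fromhex("0B 07 06 02 02 01 03 61 62 63 02 00")
print(parse_string_list_node(SAMPLE))
```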
07 09 (unknown)
Variable-length integer: Node id
Two bytes: 07 09
Two or five unknown bytes
08 01 09 (unknown length)
Variable-length integer: Node id
Three bytes: 08 01 09
Variable-length integer: Unknown length
Variable-length integer: Same as above
08 02 02 (multiple flags)
Variable-length integer: Node id
Three bytes: 08 02 02
Variable-length integer: Number of flags
Variable-length integer: Number of flags, same as above
LOOP {
--- Variable-length integer: Flag id
--- Variable-length integer: Flag value
} END LOOP
08 02 06 (multiple strings)
Variable-length integer: Node id
Three bytes: 08 02 06
Variable-length integer: Number of strings
Variable-length integer: Number of strings, same as above
LOOP {
--- Variable-length integer: String id
--- Variable-length integer: String length
--- String with aforementioned length
} END LOOP
08 02 07 (beginning of multiple string nodes)
Variable-length integer: Node id
Three bytes: 08 02 07
Variable-length integer: Number of string nodes following this node
Variable-length integer: Same as above
08 02 08 (beginning of multiple string nodes)
Variable-length integer: Node id
Three bytes: 08 02 08
Variable-length integer: Number of string nodes following this node
Variable-length integer: Same as above
08 02 09 (unknown length)
Variable-length integer: Node id
Three bytes: 08 02 09
Variable-length integer: Unknown length
Variable-length integer: Unknown length, same as above
08 05 08 (multiple string sections)
Variable-length integer: Node id
Three bytes: 08 05 08
Variable-length integer: Number of string sections
Variable-length integer: Number of string sections, same as above
LOOP (for each section) {
--- Variable-length integer: Section number
--- Two bytes: XX XX
--- Variable-length integer: Number of strings per section
--- Variable-length integer: Number of strings per section, same as above
--- LOOP (for each string in this section) {
--- --- Variable-length integer: String id
--- --- Variable-length integer: String length
--- --- String with aforementioned length, sometimes length = 0
--- } END LOOP
} END LOOP
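The nested layout of 08 05 08 is easier to follow as code. This sketch assumes the integer encoding inferred from the examples in this post (below C0 = literal value, C8..CF = 1..8 big-endian bytes follow). The function names are hypothetical, the sample bytes are invented, and the two XX XX bytes are simply carried along since their meaning is unknown.

```python
def read_varint(buf, pos):
    """Return (value, next_pos). ASSUMED encoding, inferred from this post:
    first byte < 0xC0 is the value; 0xC8..0xCF means (byte - 0xC7)
    big-endian bytes follow."""
    first = buf[pos]
    if first < 0xC0:
        return first, pos + 1
    if 0xC8 <= first <= 0xCF:
        end = pos + 1 + (first - 0xC7)
        if end > len(buf):
            raise ValueError("truncated variable-sized integer")
        return int.from_bytes(buf[pos + 1:end], "big"), end
    raise ValueError(f"prefix byte {first:#04x} is not covered by this sketch")


def read_doubled_count(buf, pos):
    """Read a count that is stored twice and check both copies agree."""
    count, pos = read_varint(buf, pos)
    repeat, pos = read_varint(buf, pos)
    if count != repeat:
        raise ValueError(f"counts differ: {count} vs {repeat}")
    return count, pos


def parse_section_node(buf, pos=0):
    """Parse an 08 05 08 node into (node id, sections, next_pos)."""
    node_id, pos = read_varint(buf, pos)
    if bytes(buf[pos:pos + 3]) != b"\x08\x05\x08":
        raise ValueError("not an 08 05 08 node")
    section_count, pos = read_doubled_count(buf, pos + 3)
    sections = []
    for _ in range(section_count):
        number, pos = read_varint(buf, pos)
        unknown = bytes(buf[pos:pos + 2])  # the two XX XX bytes
        string_count, pos = read_doubled_count(buf, pos + 2)
        strings = []
        for _ in range(string_count):
            string_id, pos = read_varint(buf, pos)
            length, pos = read_varint(buf, pos)
            strings.append((string_id, bytes(buf[pos:pos + length])))
            pos += length
        sections.append((number, unknown, strings))
    return node_id, sections, pos


# Invented sample: node 0C, one section (number 1, XX XX = 00 00)
# holding the strings 1="hi" and 2="" (length 0).
SAMPLE = bytes.fromhex("0C 08 05 08 01 01 01 00 00 02 02 01 02 68 69 02 00")
print(parse_section_node(SAMPLE))
```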
08 05 (short node)
Variable-length integer: Node id
Two bytes: 08 05
08 06 03 (multiple strings)
Variable-length integer: Node id
Three bytes: 08 06 03
Variable-length integer: Number of strings
Variable-length integer: Number of strings, same as above
LOOP (for each string) {
--- Variable-length integer: String id
--- Variable-length integer: String length
--- String with aforementioned length, sometimes length = 0
--- Sometimes: one zero byte (00)
} END LOOP
08 06 08 (multiple strings)
Variable-length integer: Node id
Three bytes: 08 06 08
Variable-length integer: Number of strings
Variable-length integer: Number of strings, same as above
LOOP (for each string) {
--- Variable-length integer: String id
--- Variable-length integer: String length
--- String with aforementioned length, sometimes length = 0
} END LOOP
Four bytes: 06 03 01 01
Variable-length integer: String id
Variable-length integer: String length
String with aforementioned length, sometimes length = 0
One byte: 01
09, 39 or 3B (beginning of a section)
Variable-length integer: section number
One byte: 09, 39 or 3B (on dialogue files: 39=audio string, 09=metadata, 3B=unknown)
One unknown byte (on dialogue files: 05=metadata, 15=NPC speaking, 16=PC speaking, 16/17=unknown)
-----------------------------------------
The nodes that follow an 08 02 07 or 08 02 08 node are not regular string nodes; instead, they have the following format:
Following 08 02 07:
Variable-length integer: Node id
One byte: 06
Variable-length integer: Number of strings
Variable-length integer: Number of strings, same as above
LOOP (for each string) {
--- Variable-length integer: String id
--- Variable-length integer: String length
--- String with aforementioned length, sometimes length = 0
} END LOOP
Following 08 02 08:
Variable-length integer: Node id
One byte: 06
One byte: 07
Variable-length integer: Number of strings
Variable-length integer: Number of strings, same as above
LOOP (for each string) {
--- Variable-length integer: String id
--- Variable-length integer: String length
--- String with aforementioned length, sometimes length = 0
--- One unknown byte: 02 (maybe flag mode??)
--- Variable-length integer: Number of flags
--- Variable-length integer: Number of flags, same as above
--- LOOP (for each flag) {
--- --- Flag number
--- --- Flag value
--- } END LOOP
} END LOOP
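As a rough illustration of the last layout (a node following 08 02 08), here is a sketch that reads the strings together with their per-string flags. Several things are assumptions: the integer encoding is inferred from the examples in this post (below C0 = literal value, C8..CF = 1..8 big-endian bytes follow), flag number and flag value are read as variable-length integers although the list above does not say so, and the function names and sample bytes are invented.

```python
def read_varint(buf, pos):
    """Return (value, next_pos). ASSUMED encoding, inferred from this post:
    first byte < 0xC0 is the value; 0xC8..0xCF means (byte - 0xC7)
    big-endian bytes follow."""
    first = buf[pos]
    if first < 0xC0:
        return first, pos + 1
    if 0xC8 <= first <= 0xCF:
        end = pos + 1 + (first - 0xC7)
        if end > len(buf):
            raise ValueError("truncated variable-sized integer")
        return int.from_bytes(buf[pos + 1:end], "big"), end
    raise ValueError(f"prefix byte {first:#04x} is not covered by this sketch")


def read_doubled_count(buf, pos):
    """Read a count that is stored twice and check both copies agree."""
    count, pos = read_varint(buf, pos)
    repeat, pos = read_varint(buf, pos)
    if count != repeat:
        raise ValueError(f"counts differ: {count} vs {repeat}")
    return count, pos


def parse_flagged_string_node(buf, pos=0):
    """Parse: node id, bytes 06 07, then strings that each carry flags."""
    node_id, pos = read_varint(buf, pos)
    if bytes(buf[pos:pos + 2]) != b"\x06\x07":
        raise ValueError("expected the bytes 06 07 after the node id")
    string_count, pos = read_doubled_count(buf, pos + 2)
    entries = []
    for _ in range(string_count):
        string_id, pos = read_varint(buf, pos)
        length, pos = read_varint(buf, pos)
        raw = bytes(buf[pos:pos + length])
        pos += length
        mode = buf[pos]  # unknown byte, observed as 02
        flag_count, pos = read_doubled_count(buf, pos + 1)
        flags = []
        for _ in range(flag_count):
            number, pos = read_varint(buf, pos)  # ASSUMED to be a varint
            value, pos = read_varint(buf, pos)   # ASSUMED to be a varint
            flags.append((number, value))
        entries.append((string_id, raw, mode, flags))
    return node_id, entries, pos


# Invented sample: node 0D, one string 1="ok", mode 02, flags 01=05, 02=01.
SAMPLE = bytes.fromhex("0D 06 07 01 01 01 02 6F 6B 02 02 02 01 05 02 01")
print(parse_flagged_string_node(SAMPLE))
```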