I wanted to replace the music in metal gear solid twin snakes with the original ps1 music
on extracting the gc rom the files in the shared/audio/stream and the audio/stream folder are actually dsp files, all the level music and ingame voices(whos there,a surveillance camera,tighten security,mantis reading the memory card,title theme,snake snake SNAAAAAAAAAAAAAKE) I edited some of them in audacity and replaced cd 1 level music but then i ran into this problem
scanning the vox.dat and movies.dat with psound and solidus which are supposed to contain the codec convos and real time rendered video music results in no files found
as far as I understand the working of solidus and psound(I may be wrong) they look for specific file types in the dat archive so if the search is modified to find files without extensions then it can esily extract all the files present in the dat archive in this case fioles wthout extensions are music files
so can I have a versions of solidus that extracts files without extensions? pretty please
ps. if someone wants a list of music ouside the dat files let me know
Important information: this site is currently scheduled to go offline indefinitely by end of the year.
Metal gear solid Twin snakes .dat files and audio
-
- ultra-n00b
- Posts: 2
- Joined: Wed Jun 10, 2015 10:42 am
-
- mega-veteran
- Posts: 278
- Joined: Thu Apr 17, 2008 3:48 am
- Has thanked: 47 times
- Been thanked: 40 times
Re: Metal gear solid Twin snakes .dat files and audio
Solidus supposedly only worked with texture formats in twin snakes if i recall, but i never even got that working. psound is designed for sony ps1/2/psp audio formats, I'm pretty sure twin snakes uses ogg vorbis (weird on gamecube, I know, but the dev behind it was starting a big project on UE3, which ended up being their downfall) The only problem is, the audio you get is recognizable, but segments of audio are jumbled across inside all the files, and not coherent to one bit of audio. I can't really describe it better, you'll see when you get the files out.Speed Nukem wrote:I wanted to replace the music in metal gear solid twin snakes with the original ps1 music
on extracting the gc rom the files in the shared/audio/stream and the audio/stream folder are actually dsp files, all the level music and ingame voices(whos there,a surveillance camera,tighten security,mantis reading the memory card,title theme,snake snake SNAAAAAAAAAAAAAKE) I edited some of them in audacity and replaced cd 1 level music but then i ran into this problem
scanning the vox.dat and movies.dat with psound and solidus which are supposed to contain the codec convos and real time rendered video music results in no files found
as far as I understand the working of solidus and psound(I may be wrong) they look for specific file types in the dat archive so if the search is modified to find files without extensions then it can esily extract all the files present in the dat archive in this case fioles wthout extensions are music files
so can I have a versions of solidus that extracts files without extensions? pretty please
ps. if someone wants a list of music ouside the dat files let me know
- lionheartuk
- double-veteran
- Posts: 749
- Joined: Tue May 16, 2006 10:55 pm
- Location: Everywhere
- Has thanked: 34 times
- Been thanked: 42 times
Re: Metal gear solid Twin snakes .dat files and audio
Solidus could/can extract the textures but thats it, it can't do anything else with twin snakes.Pepper wrote:Solidus supposedly only worked with texture formats in twin snakes if i recall, but i never even got that working. psound is designed for sony ps1/2/psp audio formats, I'm pretty sure twin snakes uses ogg vorbis (weird on gamecube, I know, but the dev behind it was starting a big project on UE3, which ended up being their downfall) The only problem is, the audio you get is recognizable, but segments of audio are jumbled across inside all the files, and not coherent to one bit of audio. I can't really describe it better, you'll see when you get the files out.Speed Nukem wrote:I wanted to replace the music in metal gear solid twin snakes with the original ps1 music
on extracting the gc rom the files in the shared/audio/stream and the audio/stream folder are actually dsp files, all the level music and ingame voices(whos there,a surveillance camera,tighten security,mantis reading the memory card,title theme,snake snake SNAAAAAAAAAAAAAKE) I edited some of them in audacity and replaced cd 1 level music but then i ran into this problem
scanning the vox.dat and movies.dat with psound and solidus which are supposed to contain the codec convos and real time rendered video music results in no files found
as far as I understand the working of solidus and psound(I may be wrong) they look for specific file types in the dat archive so if the search is modified to find files without extensions then it can esily extract all the files present in the dat archive in this case fioles wthout extensions are music files
so can I have a versions of solidus that extracts files without extensions? pretty please
ps. if someone wants a list of music ouside the dat files let me know
Twin Snakes does use OGG Vorbis for its music, but not sure what it uses for its cutscene audio or basically non gameplay background music audio.
-
- mega-veteran
- Posts: 278
- Joined: Thu Apr 17, 2008 3:48 am
- Has thanked: 47 times
- Been thanked: 40 times
Re: Metal gear solid Twin snakes .dat files and audio
it's pretty much ogg vorbis for cutscene audio, it's just smashed into small bits of audio stored in each ogg file. I would guess the game somehow takes these 0.1-1 second chunks strewn about inside each ogg vorbis stream and reassembles them. Haven't got the faintest clue how it knows what to do what with, and the files just sound like a jumbled phonetic mess of syllables from recognizable voices.lionheartuk wrote:Solidus could/can extract the textures but thats it, it can't do anything else with twin snakes.Pepper wrote:Solidus supposedly only worked with texture formats in twin snakes if i recall, but i never even got that working. psound is designed for sony ps1/2/psp audio formats, I'm pretty sure twin snakes uses ogg vorbis (weird on gamecube, I know, but the dev behind it was starting a big project on UE3, which ended up being their downfall) The only problem is, the audio you get is recognizable, but segments of audio are jumbled across inside all the files, and not coherent to one bit of audio. I can't really describe it better, you'll see when you get the files out.Speed Nukem wrote:I wanted to replace the music in metal gear solid twin snakes with the original ps1 music
on extracting the gc rom the files in the shared/audio/stream and the audio/stream folder are actually dsp files, all the level music and ingame voices(whos there,a surveillance camera,tighten security,mantis reading the memory card,title theme,snake snake SNAAAAAAAAAAAAAKE) I edited some of them in audacity and replaced cd 1 level music but then i ran into this problem
scanning the vox.dat and movies.dat with psound and solidus which are supposed to contain the codec convos and real time rendered video music results in no files found
as far as I understand the working of solidus and psound(I may be wrong) they look for specific file types in the dat archive so if the search is modified to find files without extensions then it can esily extract all the files present in the dat archive in this case fioles wthout extensions are music files
so can I have a versions of solidus that extracts files without extensions? pretty please
ps. if someone wants a list of music ouside the dat files let me know
Twin Snakes does use OGG Vorbis for its music, but not sure what it uses for its cutscene audio or basically non gameplay background music audio.
-
- ultra-n00b
- Posts: 2
- Joined: Wed Jun 10, 2015 10:42 am
Re: Metal gear solid Twin snakes .dat files and audio
I don't think the music files are ogg I guess they are dsp files like the ones outside the DAT files
maybe the dsp and ogg have something common in their hex which results in the jumbled up audio
this obviously is a speculation
I have got a open source DAT extractor called dragon unpacker
modifying this program to find dsp files should give a answer
P.s.
I have replaced all the level music in mgs tts rom how do i make a patch?
maybe the dsp and ogg have something common in their hex which results in the jumbled up audio
this obviously is a speculation
I have got a open source DAT extractor called dragon unpacker
modifying this program to find dsp files should give a answer
P.s.
I have replaced all the level music in mgs tts rom how do i make a patch?
- Jonathan Ingram
- n00b
- Posts: 11
- Joined: Sun Jun 21, 2015 11:20 pm
- Been thanked: 2 times
Re: Metal gear solid Twin snakes .dat files and audio
The files found in the \audio\ or \shared\audio\ directories are standard Nintendo DSP streams and SPD/SPT banks.
You can use demux_dat_be.exe to extract the Vorbis streams from demo.dat and \shared\movie.dat and have them play back correctly. It will name them as .mtaf files, so change the extensions to .ogg after extraction.
Solidus can extract the files from stage.dat and \shared\face.dat. It can also convert the .texturefiles, but it won't work with the .kmy models.
I still haven't found anything that successfully extracts the Vorbis streams from vox.dat. demux_dat_be will only recognize the subtitle data. I've heard that Obscure Extractor GTK works on vox.dat as well as extracts the VP3 format movies from movie.dat, but I can never get the thing to work.
If anyone here knows how to get vox.dat's contents and in a way so that they play back correctly, please let me know.
You can use demux_dat_be.exe to extract the Vorbis streams from demo.dat and \shared\movie.dat and have them play back correctly. It will name them as .mtaf files, so change the extensions to .ogg after extraction.
Solidus can extract the files from stage.dat and \shared\face.dat. It can also convert the .texturefiles, but it won't work with the .kmy models.
I still haven't found anything that successfully extracts the Vorbis streams from vox.dat. demux_dat_be will only recognize the subtitle data. I've heard that Obscure Extractor GTK works on vox.dat as well as extracts the VP3 format movies from movie.dat, but I can never get the thing to work.
If anyone here knows how to get vox.dat's contents and in a way so that they play back correctly, please let me know.
-
- ultra-n00b
- Posts: 1
- Joined: Sat Jul 13, 2019 5:34 pm
Re: Metal gear solid Twin snakes .dat files and audio
Has there been progress in this project. I'm having some trouble with it. I found how identify wich audio modify from the audio/stream folder. Every audio are split left and right channel. How can i replace that? First i need to convert the psx audio in dsp format. Do i need to split them left and right? Or can i simple put the same audio in both channel?
It will be nice to have some tutorial for doing everything if someome as already did. Thanks a lot
It will be nice to have some tutorial for doing everything if someome as already did. Thanks a lot
-
- ultra-n00b
- Posts: 8
- Joined: Sun Nov 08, 2020 10:14 am
- Has thanked: 2 times
- Been thanked: 9 times
Re: Metal gear solid Twin snakes .dat files and audio
Hello!
I made some progress on the file extraction from .dat files. So far my tool works on both vox.dat, movie.dat and demo.dat. Other dat files use a different data structure as far as I know. My aim is to replace the audio files with the ones in another language provided by the PSX version. I don't really know if it will be feasible or not, but I think it's worth trying.
Before releasing the tool I want to add some basic features, like being able to only analyze the dat file without extracting every file, or extracting audio files only.
Anyway, I want to document here how the data structure works. I have understood like 90% of the entire structure. Hopefully, someone can help me understanding what the few "unknown" values actually represent.
Let's start. These are the first bytes of vox.dat from Disc 1:
Data is stored as a "dictionary", and is split into multiple "pages". Each page has the following structure:
Blue rows represent dictionary "definitions", which tells how many and what kind of data is stored in every dictionary.
Green rows represent data in the dictionary.
Red rows represent zeros used for padding.
Let's see the first dictionary definition in detail. Every definition is composed of 16 bytes organized as follows:
00 00 00 10 = 0x10 (in big endian) is an identifier for "dictionary definition"
00 00 00 10 = 0x10 = 16 bytes is the length of the definition
00 00 00 00 are all zeros in the definition
00 01 00 04 is the defined data type
Data types, in particular, are split into two word (2-bytes numbers). The second word (in big endian) is the type of datum stored. So far I encountered the following four types:
0x01 = audio
0x04 = subtitle
0x02 = unknown
0x06 = unknown
0x07 = unknown
The first word (again, in big endian) represents the language:
0x00 = default
0x01 = english
0x02 = french
0x03 = german
0x04 = italian
0x05 = spanish
0x07 = japanese
So the first definition actually defines a subtitle (0x04) in english (0x01). The second definition, instead, a subtitle (0x04) in japanese (0x07).
Let's now focus on the green part that represent the data, here called dictionary entry. Each entry is identified with a 16-bytes header. The first entry has
00 01 00 04 = 0x04 (subtitle) and 0x01 (english) identifies the data
00 00 00 40 = 0x40 = 64 bytes is the length of data (with header)
00 00 00 00 (unknown1) are all zeros here
00 00 00 00 (unknown2) are all zeros here
The second entry is very similar. The only difference is in the subtitle language (jap instead of eng).
Finally, a padding section is found in the red area:
00 00 00 F0 = 0xF0 identifies the padding
00 00 00 10 = 0x10 = 16 bytes is the length of this header
00 00 01 2C (unknown1) I don't know what this represents. It is not the number of 00 used for padding.
00 00 00 00 (unknown2) are all zeros here
The zeros used for padding end at the file offset 0x800 and mark the end of a dictionary page. Every page in the file is always aligned to 0x800 (so for example, it can end at 0x800, 0x1000, 0x1800 and so on).
Since I'm interested in audio files, I want to show how they are stored. In vox.dat, the first audio file appears at the offset 0x26b090:
Audio files are organized as chunks of 8192 bytes. However, before the first chunk one can find the following header:
00 00 00 01 = 0x01 (audio) and 0x00 (default language)
00 00 00 20 = 0x20 = 32 bytes is the length of data (with header)
00 00 00 00 (unknown1) are all zeros here
00 00 00 08 (unknown2) = 0x08
which contains the following data, that I'm not able to interpret:
00 00 00 07 = 0x07 (data type unknown)?
00 00 20 00 = 0x2000 = 8192 bytes?
66 66 66 66 (unknown1)
66 66 66 66 (unknown2)
But then, after this first chunk, one can find:
00 00 00 01 = 0x01 (audio) and 0x00 (default language)
00 00 20 10 = 0x2010 = 8208 bytes is the length of data (with header) -> 8192 + 16
00 00 00 00 (unknown1) are all zeros here
00 00 20 00 = 0x2000 = 8192 bytes. For audio files, this represents the encapsulated data length.
Finally, the first 8192-byte chunk of audio data occurs.
The analysis on this page reveals the following data:
where "Unk" is actually "unknown1". As you can see, "unknown1" increases by 1 up to 6, then it increases by (random?) amounts, which in principle do not have any relationship with the data size. The same behaviour can be observed with audio files in the other pages. So hopefully someone can help me understanding, in particular, the meaning of this value, which may be important to replace the audio files inside the .dat.
Thank you for the attention (if you read up to this point )
Big thanks to the the authors of "demux_dat", which was very helpful at the beginning of my research.
I made some progress on the file extraction from .dat files. So far my tool works on both vox.dat, movie.dat and demo.dat. Other dat files use a different data structure as far as I know. My aim is to replace the audio files with the ones in another language provided by the PSX version. I don't really know if it will be feasible or not, but I think it's worth trying.
Before releasing the tool I want to add some basic features, like being able to only analyze the dat file without extracting every file, or extracting audio files only.
Anyway, I want to document here how the data structure works. I have understood like 90% of the entire structure. Hopefully, someone can help me understanding what the few "unknown" values actually represent.
Let's start. These are the first bytes of vox.dat from Disc 1:
Data is stored as a "dictionary", and is split into multiple "pages". Each page has the following structure:
Blue rows represent dictionary "definitions", which tells how many and what kind of data is stored in every dictionary.
Green rows represent data in the dictionary.
Red rows represent zeros used for padding.
Let's see the first dictionary definition in detail. Every definition is composed of 16 bytes organized as follows:
00 00 00 10 = 0x10 (in big endian) is an identifier for "dictionary definition"
00 00 00 10 = 0x10 = 16 bytes is the length of the definition
00 00 00 00 are all zeros in the definition
00 01 00 04 is the defined data type
Data types, in particular, are split into two word (2-bytes numbers). The second word (in big endian) is the type of datum stored. So far I encountered the following four types:
0x01 = audio
0x04 = subtitle
0x02 = unknown
0x06 = unknown
0x07 = unknown
The first word (again, in big endian) represents the language:
0x00 = default
0x01 = english
0x02 = french
0x03 = german
0x04 = italian
0x05 = spanish
0x07 = japanese
So the first definition actually defines a subtitle (0x04) in english (0x01). The second definition, instead, a subtitle (0x04) in japanese (0x07).
Let's now focus on the green part that represent the data, here called dictionary entry. Each entry is identified with a 16-bytes header. The first entry has
00 01 00 04 = 0x04 (subtitle) and 0x01 (english) identifies the data
00 00 00 40 = 0x40 = 64 bytes is the length of data (with header)
00 00 00 00 (unknown1) are all zeros here
00 00 00 00 (unknown2) are all zeros here
The second entry is very similar. The only difference is in the subtitle language (jap instead of eng).
Finally, a padding section is found in the red area:
00 00 00 F0 = 0xF0 identifies the padding
00 00 00 10 = 0x10 = 16 bytes is the length of this header
00 00 01 2C (unknown1) I don't know what this represents. It is not the number of 00 used for padding.
00 00 00 00 (unknown2) are all zeros here
The zeros used for padding end at the file offset 0x800 and mark the end of a dictionary page. Every page in the file is always aligned to 0x800 (so for example, it can end at 0x800, 0x1000, 0x1800 and so on).
Since I'm interested in audio files, I want to show how they are stored. In vox.dat, the first audio file appears at the offset 0x26b090:
Audio files are organized as chunks of 8192 bytes. However, before the first chunk one can find the following header:
00 00 00 01 = 0x01 (audio) and 0x00 (default language)
00 00 00 20 = 0x20 = 32 bytes is the length of data (with header)
00 00 00 00 (unknown1) are all zeros here
00 00 00 08 (unknown2) = 0x08
which contains the following data, that I'm not able to interpret:
00 00 00 07 = 0x07 (data type unknown)?
00 00 20 00 = 0x2000 = 8192 bytes?
66 66 66 66 (unknown1)
66 66 66 66 (unknown2)
But then, after this first chunk, one can find:
00 00 00 01 = 0x01 (audio) and 0x00 (default language)
00 00 20 10 = 0x2010 = 8208 bytes is the length of data (with header) -> 8192 + 16
00 00 00 00 (unknown1) are all zeros here
00 00 20 00 = 0x2000 = 8192 bytes. For audio files, this represents the encapsulated data length.
Finally, the first 8192-byte chunk of audio data occurs.
The analysis on this page reveals the following data:
Code: Select all
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 32 Data size 16 Unk 0 Data (no pad) 8 at file offset 0x90
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 0 Data (no pad) 8192 at file offset 0xb0
I: Chunk Type unk (0x0006) Lang eng (0x0001) Length 256 Data size 240 Unk 0 Data (no pad) 0 at file offset 0x20c0
I: Chunk Type unk (0x0006) Lang ger (0x0003) Length 256 Data size 240 Unk 0 Data (no pad) 0 at file offset 0x21c0
I: Chunk Type unk (0x0006) Lang fra (0x0002) Length 256 Data size 240 Unk 0 Data (no pad) 0 at file offset 0x22c0
I: Chunk Type unk (0x0006) Lang ita (0x0004) Length 256 Data size 240 Unk 0 Data (no pad) 0 at file offset 0x23c0
I: Chunk Type unk (0x0006) Lang esp (0x0005) Length 256 Data size 240 Unk 0 Data (no pad) 0 at file offset 0x24c0
I: Chunk Type unk (0x0006) Lang jap (0x0007) Length 256 Data size 240 Unk 0 Data (no pad) 0 at file offset 0x25c0
I: Chunk Type unk (0x0007) Lang def (0x0000) Length 624 Data size 608 Unk 1 Data (no pad) 0 at file offset 0x26c0
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 1 Data (no pad) 8192 at file offset 0x2930
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 2 Data (no pad) 8192 at file offset 0x4940
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 3 Data (no pad) 8192 at file offset 0x6950
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 4 Data (no pad) 8192 at file offset 0x8960
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 5 Data (no pad) 8192 at file offset 0xa970
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 6 Data (no pad) 8192 at file offset 0xc980
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 576 Data (no pad) 8192 at file offset 0xe990
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 1209 Data (no pad) 8192 at file offset 0x109a0
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 1588 Data (no pad) 8192 at file offset 0x129b0
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 2081 Data (no pad) 8192 at file offset 0x149c0
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 2518 Data (no pad) 8192 at file offset 0x169d0
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 3022 Data (no pad) 8192 at file offset 0x189e0
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 3650 Data (no pad) 8192 at file offset 0x1a9f0
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 4028 Data (no pad) 8192 at file offset 0x1ca00
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 4611 Data (no pad) 8192 at file offset 0x1ea10
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 5328 Data (no pad) 8192 at file offset 0x20a20
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 5749 Data (no pad) 8192 at file offset 0x22a30
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 6229 Data (no pad) 8192 at file offset 0x24a40
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 6699 Data (no pad) 8192 at file offset 0x26a50
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 7270 Data (no pad) 8192 at file offset 0x28a60
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 7765 Data (no pad) 8192 at file offset 0x2aa70
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 7995 Data (no pad) 8192 at file offset 0x2ca80
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 8369 Data (no pad) 8192 at file offset 0x2ea90
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 8925 Data (no pad) 8192 at file offset 0x30aa0
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 9441 Data (no pad) 8192 at file offset 0x32ab0
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 10032 Data (no pad) 8192 at file offset 0x34ac0
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 10554 Data (no pad) 8192 at file offset 0x36ad0
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 8208 Data size 8192 Unk 10977 Data (no pad) 8192 at file offset 0x38ae0
I: Chunk Type audio (0x0001) Lang def (0x0000) Length 2448 Data size 2432 Unk 11392 Data (no pad) 2419 at file offset 0x3aaf0
Thank you for the attention (if you read up to this point )
Big thanks to the the authors of "demux_dat", which was very helpful at the beginning of my research.
- DKDave
- ultra-veteran
- Posts: 357
- Joined: Mon May 06, 2019 6:07 pm
- Location: On board the USS Callister
- Has thanked: 9 times
- Been thanked: 167 times
Re: Metal gear solid Twin snakes .dat files and audio
Just to add a little bit more from what I found out when I looked into the MGS games a while ago when trying to do my own extractor.
As you've already got, a block type of 1 is audio. If the block size is 0x20, then it's an audio header, otherwise it's audio data.
Your unknown2 value in the audio header is the audio type, in this case 8, which identifies the audio data as Ogg format - IIRC, I think this is the only format used in the Twin Snakes dat files. If the audio type is 0 (as in PSX MGS1 and PS2 MGS2), this indicates the audio data is headerless and then the rest of that header block contains the sample rate, channels and the actual audio codec. The 0x66 values in the header are values that are not required for Ogg as it's a self-contained format, not sure what the 7 represents.
As you've already got, a block type of 1 is audio. If the block size is 0x20, then it's an audio header, otherwise it's audio data.
Your unknown2 value in the audio header is the audio type, in this case 8, which identifies the audio data as Ogg format - IIRC, I think this is the only format used in the Twin Snakes dat files. If the audio type is 0 (as in PSX MGS1 and PS2 MGS2), this indicates the audio data is headerless and then the rest of that header block contains the sample rate, channels and the actual audio codec. The 0x66 values in the header are values that are not required for Ogg as it's a self-contained format, not sure what the 7 represents.
I see a vision rising, dreary, Fading in as children play twilight games, In the town called Ordinary, An eye of light reveals a gateway to doomsday
-
- ultra-n00b
- Posts: 8
- Joined: Sun Nov 08, 2020 10:14 am
- Has thanked: 2 times
- Been thanked: 9 times
Re: Metal gear solid Twin snakes .dat files and audio
Thank you DKDave for explaining the meaning of that audio header.
I linked here the source of my tool with a precompiled executable with cygwin. I tried to be as clear as possible in my code, but it's been a while since I programmed so much, so it may look a bit messy at times.
It can extract the contents of movie.dat, demo.dat and vox.dat successfully. Also, it can "decode" (nothing fancy here ) the following files:
I would like to test repacking the data file, but first I need to understand the meaning behind that "unknown" value. As shown in the linked debug logs of movie, demo and vox, "Unk" increases with every new chunk of data in a page, then it gets reset to 0 or another value. I think the increment depends on the number of chunks in every page, maybe the order of data types and the (total?) size of chunks. Anyway, two relationships are always valid:
I linked here the source of my tool with a precompiled executable with cygwin. I tried to be as clear as possible in my code, but it's been a while since I programmed so much, so it may look a bit messy at times.
It can extract the contents of movie.dat, demo.dat and vox.dat successfully. Also, it can "decode" (nothing fancy here ) the following files:
- audio files: audio header is removed and padding zeros are stripped
- subtitle files: subtitle lines are interpreted with the correct timestamp and the speaker name (note: I didn't search for all the speaker codes, many are still missing; also, japanese subtitles are not decoded properly as there is additional data used to most probably store/encode japanese fonts) and exported as .srt files
- so-called "sync" files are also correctly interpreted as subtitles. These are files of type 0x06 that store timestamps and speaker names, but without text data. Indeed, text data are stored in codec.dat, but I'm unable to understand the link between the two files, that is, how is that sync file pointing to the correct location in codec.dat?
- movie files: as far as I know these are VP3 (version 1) movies. Padding zeros are stripped, but I still cannot decode them via ffmpeg. Most probably there is some kind of unexpected header that prevents ffmpeg to correctly decode the file.
I would like to test repacking the data file, but first I need to understand the meaning behind that "unknown" value. As shown in the linked debug logs of movie, demo and vox, "Unk" increases with every new chunk of data in a page, then it gets reset to 0 or another value. I think the increment depends on the number of chunks in every page, maybe the order of data types and the (total?) size of chunks. Anyway, two relationships are always valid:
- Unk in padding chunk (type 0xF0) is always equal to Unk of the previous chunk + 300 (dec)
- Unk of the last chunk is always equal to Unk of the previous chunk + 1
Re: Metal gear solid Twin snakes .dat files and audio
Hi giofrida.
Could you explain a little how to run your tool when extracting dat files?
Thanks.
Could you explain a little how to run your tool when extracting dat files?
Thanks.