Frostbite 2 sound extraction research

daemon1
MEGAVETERAN
Posts: 2647
Joined: Tue Mar 24, 2015 8:12 pm
Has thanked: 65 times
Been thanked: 2871 times

Re: Frostbite 2 sound extraction research

Post by daemon1 »

All right, I have changed the BF3 script so that it now unpacks full DAI archives. It gave me 12000+ audio .ebx files; now proceeding with them.

The fb3decoder.py script now extracts 23000 files (340 XAS, the rest Ealayer3) from 9100 source chunks. It definitely doesn't recognize music chunks yet. Proceeding.

It seems the music .ebx files have something different in them. The current scripts are attached. I can't guarantee they will work, but try them. They should produce a lot of chunks.
Vosvoy
veteran
Posts: 127
Joined: Fri Feb 18, 2011 4:58 pm
Has thanked: 15 times
Been thanked: 15 times

Post by Vosvoy »

The scripts require a certain version of Python, but I don't remember which one. Does anyone know?

Post by daemon1 »

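Vosvoy wrote:The scripts require a certain version of Python, but I don't remember which one. Does anyone know?
It must be 2.7. The ebx script uses Python 2-only constructs (xrange, cStringIO, the print statement), so Python 3 will not run it unmodified.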
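A minimal sketch of a startup guard for the version requirement, assuming the scripts target Python 2.7 only (as the `#Requires Python 2.7` comment in the ebx script suggests). The name `check_python` is hypothetical and is not part of the original scripts.

```python
import sys

def check_python(version_info=None):
    """Return True if the interpreter is Python 2.7, the version the scripts name."""
    # Accept an explicit tuple so the check can be exercised without switching interpreters.
    major, minor = tuple(version_info or sys.version_info)[:2]
    return (major, minor) == (2, 7)

if __name__ == "__main__":
    if not check_python():
        sys.stderr.write("Warning: these scripts were written for Python 2.7\n")
```

The guard itself runs on both Python 2 and Python 3, so it can warn before any Python 2-only import fails.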

Post by Vosvoy »

All right.

I just noticed that the ebx script is missing from the OrangeC archive. Does anyone have it?

EDIT: Never mind, you can get it here. I think we need floattostring.dll as well, but I can't find it anywhere.

It gives me this error:

Code:

Traceback (most recent call last):
  File "H:\BF3SNDS\Script\final\ebx.py", line 407, in <module>
    main()
  File "H:\BF3SNDS\Script\final\ebx.py", line 45, in main
    createGuidTable()
  File "H:\BF3SNDS\Script\final\ebx.py", line 62, in createGuidTable
    dbx=Dbx(f,relPath)
  File "H:\BF3SNDS\Script\final\ebx.py", line 200, in __init__
    self.fieldDescriptors=[FieldDescriptor(self.unpack("IHHii",f.read(16)), self.keywordDict) for i in xrange(self.header.numField)]
  File "H:\BF3SNDS\Script\final\ebx.py", line 149, in __init__
    self.name            = keywordDict[varList[0]]
KeyError: 291404853
>>> 
On the other hand, the script says "#The floattostring.dll requires 32bit Python to write floating point numbers in a succinct manner, but the dll is not required to run this script.", so I don't know what to think.
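The KeyError in the traceback is raised when a field descriptor's name hash has no entry in keywordDict. Here is a minimal Python 3 sketch of that lookup, assuming the hash described in the ebx script later in this post (seed 5381, multiply by 33, XOR each byte). `build_keyword_dict` and `lookup_name` are hypothetical helper names, and the placeholder string is only an illustration, not what the original script does.

```python
def hasher(keyword: bytes) -> int:
    """32-bit hash: start at 5381, multiply by 33, XOR in each byte."""
    h = 5381
    for byte in keyword:
        # Masking every step gives the same low 32 bits as masking once at the end.
        h = ((h * 33) ^ byte) & 0xFFFFFFFF
    return h

def build_keyword_dict(name_section: bytes) -> dict:
    """Map hash -> keyword for every NUL-separated name in the name section."""
    return {hasher(k): k.decode("ascii")
            for k in name_section.split(b"\x00") if k}

def lookup_name(keyword_dict: dict, name_hash: int) -> str:
    """Return the keyword, or a readable placeholder instead of raising KeyError."""
    return keyword_dict.get(name_hash, "*unknownHash_%08x*" % name_hash)

if __name__ == "__main__":
    table = build_keyword_dict(b"Name\x00Volume\x00")
    print(lookup_name(table, hasher(b"Name")))
    print(lookup_name({}, 291404853))
```

A missing hash usually means the name section or the descriptors were read at the wrong offsets. That would fit a script written for one .ebx format being run on another, but this is a guess and not something the traceback proves.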

EDIT2: OK, I tried again with the Battlefield 3 Open Beta files and it worked, though not with the ebx script linked above, only with this one:

Code:

#Requires Python 2.7
import string
import sys
from binascii import hexlify
from struct import unpack
import os
from cStringIO import StringIO
import cProfile
import cPickle

#adjust input and output folders here
inputFolder=r"H:\BF3SNDS\DUMP\bundles\ebx\sound"
outputFolder=r"H:\BF3SNDS\EBX"
guidTableName="guidTable bf" #name of the guid table file

EXTENSION=".txt"
SEP="    "

#the script can use the actual filenames in the explorer for the guid table (fast)
#or it can parse almost the entire file to retrieve the filename (slow, but necessary when the explorer names are just hashes)
#in case #2, create a separate guidTable file, in case#1 do not create that file. 
#True/False
useExplorerNames=True


#ignore all instances and fields with these names when converting to text:
IGNOREINSTANCES=[]
IGNOREFIELDS=[]
##IGNOREINSTANCES=["ShaderAdjustmentData","SocketData","WeaponSkinnedSocketObjectData","WeaponRegularSocketObjectData"]
##IGNOREFIELDS=["Mesh3pTransforms","Mesh3pRigidMeshSocketObjectTransforms"]


#run createGuidTable or dumpText, or both (preferably in the right order)
#When using explorer names, do not change anything below.
#When not using explorer names you might want to make the guid table first, then restart the script to dump text only,
#though it should work fine without change too.
def main():
    createGuidTable()
    dumpText()




##############################################################
##############################################################
if useExplorerNames:
    def createGuidTable(): #guid vs filename
        for dir0, dirs, ff in os.walk(inputFolder):
            for fname in ff:
                path=os.path.join(dir0,fname)
                f=open(path,"rb")
                if f.read(4)!="\xCE\xD1\xB2\x0F":
                    f.close()
                    continue
                #grab the file guid directly, absolute offset 48 bytes
                f.seek(48)
                fileguid=f.read(16)
                f.close()
                filename=path[len(inputFolder):-4].replace("\\","/")
                guidTable[fileguid]=filename
else:
    def createGuidTable():
        for dir0, dirs, ff in os.walk(inputFolder):
            for fname in ff:
                f=open(dir0+"\\"+fname,"rb")
                if f.read(4)!="\xCE\xD1\xB2\x0F":
                    f.close()
                    continue
                dbx=Dbx(f)
                guidTable[dbx.fileGUID]=dbx.trueFilename
        f5=open(guidTableName,"wb") #write the table
        cPickle.dump(guidTable,f5)
        f5.close()

def dumpText():
    for dir0, dirs, ff in os.walk(inputFolder):
        for fname in ff:
            f=open(dir0+"\\"+fname,"rb")
            if f.read(4)!="\xCE\xD1\xB2\x0F":
                f.close()
                continue
            dbx=Dbx(f)
            dbx.dump(outputFolder)
            
try:
    from ctypes import *
    floatlib = cdll.LoadLibrary("floattostring")
    def formatfloat(num):
        bufType = c_char * 100
        buf = bufType()
        bufpointer = pointer(buf)
        floatlib.convertNum(c_double(num), bufpointer, 100)
        rawstring=(buf.raw)[:buf.raw.find("\x00")]
        if rawstring[:2]=="-.": return "-0."+rawstring[2:]
        elif rawstring[0]==".": return "0."+rawstring[1:]
        elif "e" not in rawstring and "." not in rawstring: return rawstring+".0"
        return rawstring
except:
    def formatfloat(num):
        return str(num)
def hasher(keyword): #32bit FNV-1-style hash (multiply, then XOR) with offset basis 5381 and prime 33; this is the XOR variant of djb2
    hash = 5381
    for byte in keyword:
        hash = (hash*33) ^ ord(byte)
    return hash & 0xffffffff # use & because Python promotes the num instead of intended overflow
class Header:
    def __init__(self,varList): ##all 4byte unsigned integers
        self.absStringOffset     = varList[0]  ## absolute offset for string section start
        self.lenStringToEOF      = varList[1]  ## length from string section start to EOF
        self.numGUID             = varList[2]  ## number of external GUIDs
        self.null                = varList[3]  ## 00000000
        self.numInstanceRepeater = varList[4]
        self.numComplex          = varList[5]  ## number of complex entries
        self.numField            = varList[6]  ## number of field entries
        self.lenName             = varList[7]  ## length of name section including padding
        self.lenString           = varList[8]  ## length of string section including padding
        self.numArrayRepeater    = varList[9]
        self.lenPayload          = varList[10] ## length of normal payload section; the start of the array payload section is absStringOffset+lenString+lenPayload
class FieldDescriptor:
    def __init__(self,varList,keywordDict):
        self.name            = keywordDict[varList[0]]
        self.type            = varList[1]
        self.ref             = varList[2] #the field may contain another complex
        self.offset          = varList[3] #offset in payload section; relative to the complex containing it
        self.secondaryOffset = varList[4]
class ComplexDescriptor:
    def __init__(self,varList,keywordDict):
        self.name            = keywordDict[varList[0]]
        self.fieldStartIndex = varList[1] #the index of the first field belonging to the complex
        self.numField        = varList[2] #the total number of fields belonging to the complex
        self.alignment       = varList[3]
        self.type            = varList[4]
        self.size            = varList[5] #total length of the complex in the payload section
        self.secondarySize   = varList[6] #seems deprecated
class InstanceRepeater:
    def __init__(self,varList):
        self.null            = varList[0] #called "internalCount", seems to be always null
        self.repetitions     = varList[1] #number of instance repetitions
        self.complexIndex    = varList[2] #index of complex used as the instance
class arrayRepeater:
    def __init__(self,varList):
        self.offset          = varList[0] #offset in array payload section
        self.repetitions     = varList[1] #number of array repetitions
        self.complexIndex    = varList[2] #not necessary for extraction
class Complex:
    def __init__(self,desc):
        self.desc=desc
class Field:
    def __init__(self,desc):
        self.desc=desc

numDict={0x0035:("I",4),0xc10d:("I",4),0xc14d:("d",8),0xc0ad:("?",1),0xc0fd:("i",4),0xc0bd:("B",1),0xc0ed:("h",2), 0xc0dd:("H",2), 0xc13d:("f",4)}

class Dbx:
    def __init__(self, f):
        #metadata
        self.trueFilename=""
        self.header=Header(unpack("11I",f.read(44)))
        self.arraySectionstart=self.header.absStringOffset+self.header.lenString+self.header.lenPayload
        self.fileGUID, self.primaryInstanceGUID = f.read(16), f.read(16)    
        self.externalGUIDs=[(f.read(16),f.read(16)) for i in xrange(self.header.numGUID)]
        self.keywords=str.split(f.read(self.header.lenName),"\x00")
        self.keywordDict=dict((hasher(keyword),keyword) for keyword in self.keywords)
        self.fieldDescriptors=[FieldDescriptor(unpack("IHHII",f.read(16)), self.keywordDict) for i in xrange(self.header.numField)]
        self.complexDescriptors=[ComplexDescriptor(unpack("IIBBHHH",f.read(16)), self.keywordDict) for i in xrange(self.header.numComplex)]
        self.instanceRepeaters=[InstanceRepeater(unpack("3I",f.read(12))) for i in xrange(self.header.numInstanceRepeater)] 
        while f.tell()%16!=0: f.seek(1,1) #padding
        self.arrayRepeaters=[arrayRepeater(unpack("3I",f.read(12))) for i in xrange(self.header.numArrayRepeater)]

        #payload
        f.seek(self.header.absStringOffset+self.header.lenString)
        self.internalGUIDs=[]
        self.instances=[] # (guid, complex)
        for instanceRepeater in self.instanceRepeaters:
            for repetition in xrange(instanceRepeater.repetitions):
                instanceGUID=f.read(16)
                self.internalGUIDs.append(instanceGUID)
                if instanceGUID==self.primaryInstanceGUID: self.isPrimaryInstance=True
                else: self.isPrimaryInstance=False
                
                self.instances.append( (instanceGUID,self.readComplex(instanceRepeater.complexIndex,f)) )
        f.close()
        
        

    def readComplex(self, complexIndex,f):
        complexDesc=self.complexDescriptors[complexIndex]
        cmplx=Complex(complexDesc)
        
        startPos=f.tell()                 
        cmplx.fields=[]
        for fieldIndex in xrange(complexDesc.fieldStartIndex,complexDesc.fieldStartIndex+complexDesc.numField):
            f.seek(startPos+self.fieldDescriptors[fieldIndex].offset)
            cmplx.fields.append(self.readField(fieldIndex,f))
        
        f.seek(startPos+complexDesc.size)
        return cmplx


    def readField(self,fieldIndex,f):
        fieldDesc = self.fieldDescriptors[fieldIndex]
        field=Field(fieldDesc)
        
        if fieldDesc.type in (0x0029, 0xd029,0x0000):
            field.value=self.readComplex(fieldDesc.ref,f)
        elif fieldDesc.type==0x0041:
            arrayRepeater=self.arrayRepeaters[unpack("I",f.read(4))[0]]
            arrayComplexDesc=self.complexDescriptors[fieldDesc.ref]

##            if arrayRepeater.repetitions==0: field.value = "*nullArray*"
            f.seek(self.arraySectionstart+arrayRepeater.offset)
            arrayComplex=Complex(arrayComplexDesc)
            arrayComplex.fields=[self.readField(arrayComplexDesc.fieldStartIndex,f) for repetition in xrange(arrayRepeater.repetitions)]
            field.value=arrayComplex
            
        elif fieldDesc.type in (0x407d, 0x409d):
            startPos=f.tell()
            f.seek(self.header.absStringOffset+unpack("I",f.read(4))[0])
            string=""
            while 1:
                a=f.read(1)
                if a=="\x00": break
                else: string+=a
            f.seek(startPos+4)
            
            if len(string)==0: field.value="*nullString*" #actually the string is ""
            else: field.value=string
            
            if self.isPrimaryInstance and self.trueFilename=="" and fieldDesc.name=="Name": self.trueFilename=string
            
                   
        elif fieldDesc.type==0x0089: #incomplete implementation, only gives back the selected string
            compareValue=unpack("I",f.read(4))[0] 
            enumComplex=self.complexDescriptors[fieldDesc.ref]

            if enumComplex.numField==0:
                field.value="*nullEnum*"
            else:
                field.value=compareValue #fallback: keep the raw value if no enum entry matches, so field.value is always set
                for fieldIndex in xrange(enumComplex.fieldStartIndex,enumComplex.fieldStartIndex+enumComplex.numField):
                    if self.fieldDescriptors[fieldIndex].offset==compareValue:
                        field.value=self.fieldDescriptors[fieldIndex].name
                        break
        elif fieldDesc.type==0xc15d:
            field.value=f.read(16)
        else:
            (typ,length)=numDict[fieldDesc.type]
            num=unpack(typ,f.read(length))[0]
            field.value=num
        
        return field
        

    def dump(self,outputFolder):
        dirName=os.path.dirname(outputFolder+self.trueFilename)
        if not os.path.isdir(dirName): os.makedirs(dirName)
        if not self.trueFilename: self.trueFilename=hexlify(self.fileGUID)
        f2=open(outputFolder+self.trueFilename+EXTENSION,"wb")
        print self.trueFilename
        
        for (guid,instance) in self.instances:
            if instance.desc.name not in IGNOREINSTANCES: #############
                if guid==self.primaryInstanceGUID: f2.write(instance.desc.name+" "+hexlify(guid)+ " #primary instance\r\n")
                else: f2.write(instance.desc.name+" "+hexlify(guid)+ "\r\n")
                self.recurse(instance.fields,f2,0)
        f2.close()

    def recurse(self, fields, f2, lvl): #over fields
        lvl+=1
        for field in fields:
            if field.desc.type in (0xc14d, 0xc0fd, 0xc10d, 0xc0ed, 0xc0dd, 0xc0bd, 0xc0ad, 0x407d, 0x409d, 0x0089):
                f2.write(lvl*SEP+field.desc.name+" "+str(field.value)+"\r\n")
            elif field.desc.type == 0xc13d:
                f2.write(lvl*SEP+field.desc.name+" "+formatfloat(field.value)+"\r\n")
            elif field.desc.type == 0xc15d:
                f2.write(lvl*SEP+field.desc.name+" "+hexlify(field.value).upper()+"\r\n") #upper case=> chunk guid
            elif field.desc.type == 0x0035:
                towrite=""
                if field.value>>31:
                    extguid=self.externalGUIDs[field.value&0x7fffffff]
                    try: towrite=guidTable[extguid[0]]+"/"+hexlify(extguid[1])
                    except: towrite=hexlify(extguid[0])+"/"+hexlify(extguid[1])
                elif field.value==0: towrite="*nullGuid*"
                else: towrite=hexlify(self.internalGUIDs[field.value-1])
                f2.write(lvl*SEP+field.desc.name+" "+towrite+"\r\n") 
            elif field.desc.type==0x0041 and len(field.value.fields)==0:
                f2.write(lvl*SEP+field.desc.name+" "+"*nullArray*"+"\r\n")
            else:
                if field.desc.name not in IGNOREFIELDS: #############
                    f2.write(lvl*SEP+field.desc.name+"::"+field.value.desc.name+"\r\n")
                    self.recurse(field.value.fields,f2,lvl)


if outputFolder[-1] not in ("/","\\"): outputFolder+="/"
if inputFolder[-1] not in ("/","\\"): inputFolder+="/"


#if there's a guid table already, use it
try:
    f5=open(guidTableName,"rb")
    guidTable=cPickle.load(f5)
    f5.close()
except:
    guidTable=dict()


main()
No luck with the sound decoder though (almost the same error as above). I presume it doesn't work on BF3 anymore.
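For reference, the start of the file as the ebx script above reads it can be sketched in Python 3: a 4-byte magic, eleven unsigned 32-bit header fields, then the file GUID at absolute offset 48. The little-endian format string is an assumption (the script uses native byte order, which is little-endian on a PC), and `EbxHeader`, `parse_ebx_prefix` and `array_section_start` are hypothetical names.

```python
import struct
from collections import namedtuple

EBX_MAGIC = b"\xCE\xD1\xB2\x0F"

# Field names follow the Header class in the script above.
EbxHeader = namedtuple("EbxHeader", [
    "absStringOffset", "lenStringToEOF", "numGUID", "null",
    "numInstanceRepeater", "numComplex", "numField", "lenName",
    "lenString", "numArrayRepeater", "lenPayload",
])

def parse_ebx_prefix(data: bytes):
    """Parse magic, header and the two GUIDs from the first 80 bytes.

    Returns (header, file_guid, primary_instance_guid), or None when the
    data is too short or the magic does not match.
    """
    if len(data) < 80 or data[:4] != EBX_MAGIC:
        return None
    header = EbxHeader(*struct.unpack_from("<11I", data, 4))
    file_guid = data[48:64]              # 4 + 44 = 48, the offset the script seeks to
    primary_instance_guid = data[64:80]
    return header, file_guid, primary_instance_guid

def array_section_start(header) -> int:
    """Start of the array payload section, as computed in Dbx.__init__."""
    return header.absStringOffset + header.lenString + header.lenPayload
```

Because it returns None on a bad magic instead of raising, a caller can skip non-ebx files the same way the script's `continue` does.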

Post by daemon1 »

Vosvoy wrote:All right.

I just noticed that the ebx script is missing from the OrangeC archive. Does anyone have it?
It IS in the OrangeC archive, and it's called fb3decoder. It works for DAI.

p.s. I was wrong. The script recognizes all music chunks. I just didn't count the conversations.
So in total there are:

- 9101 chunks for /sound/
- 1465 chunks for /designcontent/conversations/
- 1 chunk for /vo/ss_inquisitor_vostream
= 10567 chunks. And this is exactly the total number of audio chunks found by dumper.
Win! We have extracted ALL audio chunks from DAI, with proper names and all segments. This gives us about 100,000 audio files.
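The tally above can be checked with a few lines of Python. The numbers are the ones quoted in this post; `missing_chunks` is a hypothetical helper name.

```python
# Chunk counts per source folder, as listed in the post.
chunks_by_source = {
    "/sound/": 9101,
    "/designcontent/conversations/": 1465,
    "/vo/ss_inquisitor_vostream": 1,
}
dumper_total = 10567  # audio chunks reported by the dumper

def missing_chunks(found_by_source, expected_total):
    """Return how many dumped chunks are not referenced by any .ebx file."""
    return expected_total - sum(found_by_source.values())

if __name__ == "__main__":
    print(missing_chunks(chunks_by_source, dumper_total))
```

A result of zero means every chunk found by the dumper is accounted for.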

p.p.s. Ealayer3 needed to be corrected again to support this. Even with this version it seems to fail somewhere after 90,000 files. I will correct it later.

Post by daemon1 »

OK. It seems Ealayer3 fails to decode only 2 small files in the whole game. That is interesting and I will investigate it later, but for now here's the current version of the audio extraction script (all .dlls and my version of ealayer3 included). It will skip those 2 small files (1 ironbull roar and 1 claws scratching).

It converts all 10567 audio chunks into 101600 mp3 and wav files. Enjoy.

Later I will also make an extractor version to extract audio only, and put it all in one new thread.

Post by daemon1 »

Does anybody have the old script to unpack the audio from BF3? It should be called fb2decoder.py or fb2audio, and I can't find it anywhere now. We have all the other scripts: both dumpers for BF3 & BF4, both ebxtotxt scripts for the old and new .ebx formats, and the audio extractor for BF4.

Post by Vosvoy »

That's what I told you. The new script doesn't work at all on BF3 anymore. I don't think that the current .ebx script works on BF3 either. I've lost all my data lately, so I don't have the scripts anymore (my HDD is beeping; the head is probably stuck inside the drive, so I figure I have a 50/50 chance of recovering my stuff).

Post by daemon1 »

Vosvoy wrote:That's what I told you. The new script doesn't work at all on BF3 anymore. I don't think that the current .ebx script works on BF3 either.
Yes, I know that. BF3 has a different .ebx format, so the new script can't work with it.

I thought you were talking about Dragon Age, which is a different story. It has the new .ebx format, but its files are packed with the old packer. That's why I was able to change the "dump" script to unpack it and then use the new audio script to decode the audio.

I think I can write a script to unpack BF3 again, but first let's see, maybe someone has it saved.
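The combinations described in this thread can be written down as a small lookup table. This is a sketch under assumptions: the "old"/"new" labels and the BF4 row are inferred from the posts (two dumpers, two ebxtotxt scripts, fb3decoder working on DAI), and `Toolchain`, `TOOLCHAINS` and `audio_script_for` are hypothetical names.

```python
from collections import namedtuple

Toolchain = namedtuple("Toolchain", ["packer", "ebx_format", "audio_script"])

# Inferred from the thread, not taken from the scripts themselves.
TOOLCHAINS = {
    "BF3": Toolchain(packer="old", ebx_format="old", audio_script="fb2audio"),
    "DAI": Toolchain(packer="old", ebx_format="new", audio_script="fb3decoder"),
    "BF4": Toolchain(packer="new", ebx_format="new", audio_script="fb3decoder"),
}

def audio_script_for(game):
    """Return the name of the audio script that matches the game's .ebx format."""
    return TOOLCHAINS[game].audio_script

if __name__ == "__main__":
    for game in sorted(TOOLCHAINS):
        print(game, TOOLCHAINS[game])
```

The table shows why DAI is the odd one out: it shares its packer with BF3 but its .ebx format with BF4.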

Post by Vosvoy »

daemon1 wrote:I think I can write a script to unpack BF3 again, but first let's see, maybe someone has it saved.
Sure. Maybe I can recover my stuff from the HDD that keeps beeping (that's where all my extraction tools are), but I'll need some extra tools for that. Maybe next week.

Post by daemon1 »

Never mind. I've already made a new script and it works. I combined different parts of all his scripts and redid the part where the .ebx is analysed, and it seems to give proper audio files. I just need to check whether everything was extracted.

p.s. Checked. As in DAI, all files extracted OK except one; I don't know why.

Post by Vosvoy »

So? You did it? Don't need the old scripts?

Post by daemon1 »

Vosvoy wrote:So? You did it? Don't need the old scripts?
Yes, I did it. I only checked it on one game, but I think it will work for all Frostbite 2 games. I also want to replace that XAS.dll so that it produces 16-bit sounds, which will save a lot of space. There's no need to make 32-bit waves when the quality is long gone: the audio was packed into 4-bit ADPCM (XAS).
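To illustrate the space saving, here is a minimal Python 3 sketch that narrows 32-bit samples to 16-bit PCM. It assumes the 32-bit output is little-endian float in the range [-1.0, 1.0], which this thread does not confirm for XAS.dll; `float32_to_int16` is a hypothetical helper and not part of the attached scripts.

```python
import struct

def float32_to_int16(pcm_float32: bytes) -> bytes:
    """Convert little-endian 32-bit float samples in [-1.0, 1.0] to 16-bit PCM."""
    count = len(pcm_float32) // 4
    samples = struct.unpack("<%df" % count, pcm_float32[:count * 4])
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, s))          # clamp out-of-range samples
        out.append(int(round(s * 32767)))   # scale to the int16 range
    return struct.pack("<%dh" % count, *out)

if __name__ == "__main__":
    data = struct.pack("<4f", 0.0, 0.5, -0.5, 1.0)
    narrowed = float32_to_int16(data)
    print(len(data), len(narrowed))
```

Each sample shrinks from 4 bytes to 2, so the sample data takes half the space. Since the source was 4-bit ADPCM, 16 bits per sample still leaves plenty of headroom.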
Last edited by daemon1 on Wed Aug 12, 2015 6:23 pm, edited 1 time in total.

Post by daemon1 »

The "new" scripts to dump and decode frostbite 2 games:

dumper, ebxtotxt, fb2audio

All .dlls and ealayer3 included

Post by Vosvoy »

Oh man, you know your shit. Multitask guy heh?

Jokes aside, the script works with BF3OpenBeta. Thanks mate.