Script format.

szevvy
n00b
Posts: 15
Joined: Thu Sep 21, 2006 7:20 am

Post by szevvy »

First, you're going to have to make some assumptions for me. :)

1. Assume that I've been working on a game extractor since May, and it's starting to come together.
2. Assume I've released game extractors before and thus know how to code, and -
3. Assume I'm posting this here only because this seems to be the best game archive community around, and not because I want to ride on anyone's coat-tails or take advantage of these forums (or Mr. Mouse, whom I very much respect for the things he does).

OK.

Now, this game extractor I'm working on supports scripting, but it does it differently to Mex. It has the advantage of being (IMO) easier to read, but the disadvantage of being less powerful in the things it can do.

A few samples:

Code:

driver "Eve Online"

file name = "*.stuff" {
	packed simple

	unsigned32 fileCount

	FileEntry [fileCount] (
		unsigned32 size
		unsigned32 nameSize		
		string name
	)
}

file name = "*.txt" {
	text
}

file name = "*.ogg" {
	sound OGG
}

file name = "*.dds" {
	image DDS
}

Code:

driver "Cars (THQ)"

file name = "*.pak" {
	packed simple

	unsigned32 fileCount

	FileEntry[fileCount] (
		FixedString name 100
		unsigned32 offset_from_end_of_header
		offset: (fileCount * 108) + 4 + offset_from_end_of_header
		unsigned32 size
	)
}
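
For readers who think in code, here is a rough Python sketch of what the "Cars (THQ)" script above describes. It is only an illustration, not the extractor's actual implementation: the function name `read_cars_pak_index` is made up, and it assumes little-endian integers and NUL-padded names, neither of which the script states.

```python
import io
import struct

ENTRY_SIZE = 108  # 100-byte fixed name + u32 offset + u32 size


def read_cars_pak_index(stream):
    """Return a list of (name, absolute_offset, size) tuples."""
    # unsigned32 fileCount (little-endian is an assumption)
    (file_count,) = struct.unpack("<I", stream.read(4))
    # Same formula as the script: (fileCount * 108) + 4
    header_end = file_count * ENTRY_SIZE + 4
    entries = []
    for _ in range(file_count):
        raw_name, rel_offset, size = struct.unpack(
            "<100sII", stream.read(ENTRY_SIZE)
        )
        # Assumption: the fixed-size name is padded with NUL bytes
        name = raw_name.split(b"\0", 1)[0].decode("ascii", errors="replace")
        entries.append((name, header_end + rel_offset, size))
    return entries
```

Each entry's stored offset is relative to the end of the header, which is why the header length is added back before the offset is used.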

Code:

file name = "sc2000.dat" {
	packed simple

	FileEntry[filePos < dataStart] (
		FixedString name 12
		unsigned32 offset
	)
}

file name = "title.raw" { #title image is special
	data unknown 4
	image VGA (
		width: 640
		height: 480
	)
}
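
The `FileEntry[filePos < dataStart]` form reads entries until the file position reaches the start of the data. A possible Python reading of that loop is sketched below. How `dataStart` is obtained is not spelled out in the post, so this sketch assumes it is the smallest offset seen so far; the function name and the little-endian byte order are also assumptions.

```python
import io
import struct

ENTRY_SIZE = 16  # 12-byte fixed name + u32 offset


def read_until_data_start(stream):
    """Read (name, offset) entries until the position reaches the data."""
    entries = []
    data_start = None  # assumption: lowest offset seen marks the data start
    while data_start is None or stream.tell() < data_start:
        chunk = stream.read(ENTRY_SIZE)
        if len(chunk) < ENTRY_SIZE:
            break  # truncated file: stop rather than raise
        raw_name, offset = struct.unpack("<12sI", chunk)
        name = raw_name.split(b"\0", 1)[0].decode("ascii", errors="replace")
        entries.append((name, offset))
        data_start = offset if data_start is None else min(data_start, offset)
    return entries
```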

You can use any variable anywhere in a script, so you can do things like:

Code:

file name = "*.pack" {
    unsigned16 fileCount

    FileEntry[fileCount + 2](
        unsigned32 size
        name: "Dunno" + size
        someFlag = true
    )
}

file (someFlag = 1) OR (name = "Something") {
    image RAW24 (
        width: 24
        height: size / ( width * 3)
    )
}

Things like LZSS compression and XOR encryption will be included as standard wrappers that you can 'wrap around' other bits of data.
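
As an idea of what such a wrapper amounts to, here is a minimal Python sketch of a repeating-key XOR. The name `xor_bytes` and the repeating-key scheme are assumptions for illustration; the post does not say which XOR variant the extractor supports.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))
```

Because XOR is its own inverse, running the same function again with the same key restores the original bytes, so one wrapper can cover both reading and writing.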

Some of the things you might ask for that work already:

- Archives with no names. You can:

Code:

 file name = "*.pak" {
    u8 fileCount
    packedName: name #store the packed file's name

    FileEntry[fileCount] (
        u32 size
        u32 offset
        name: filenames["name"]
    )
}

table filenames(
    -packedName- name
    "foo.pak"        booPakFiles["name"]
    "bar.pak"        barPakFiles["name"]
)

table booPakFiles (
    -fileNum-    name
    0               "blah.bmp"
    ..etc..
)

- Files with only an offset/size listed: these are handled automatically by the "simple" archive reader, although you can define the "size" and "offset" variables yourself if the archive isn't a header followed by file data.

- Data types: u8, s8, u16, s16, u32, s32, string (a null-terminated string), fixedString (a string of fixed size, with or without a null terminator), ansiString (a u8 length followed by that many bytes of data), and dirtyString (a u8 length, followed by string data, padded out to a certain length with no null terminator); u64/s64 and floats might come later if they're required.

- Hex viewer: still very crap, but it works.

- Modding: not yet, but it's definitely planned. Given that the program reads a file in according to the given structure, it should be possible to reverse the process quite easily (in some cases).
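
To make the four string types in the list above concrete, here is a hedged Python sketch of how each might be read. The function names are invented, and one detail is a guess: for dirtyString the padded length is assumed not to include the size byte.

```python
import io


def read_string(stream):
    """string: bytes up to (not including) a null terminator."""
    out = bytearray()
    while True:
        b = stream.read(1)
        if not b or b == b"\0":
            return bytes(out)
        out += b


def read_fixed_string(stream, size):
    """fixedString: exactly `size` bytes, cut at the first null if present."""
    return stream.read(size).split(b"\0", 1)[0]


def read_ansi_string(stream):
    """ansiString: a u8 length followed by that many bytes."""
    length = stream.read(1)[0]
    return stream.read(length)


def read_dirty_string(stream, padded_size):
    """dirtyString: a u8 length, then `padded_size` bytes of which only
    the first `length` are meaningful (assumed layout, see lead-in)."""
    length = stream.read(1)[0]
    return stream.read(padded_size)[:length]
```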

Now, the reason I'm posting this is the following: this thing's fairly far into development, and I can see that the way I chose to implement it does, in fact, work (against my better judgement :)). So, now that I can see results and things work, I can start to massage in things that people want. What would you like to see in something like this? What would you like the scripts to look like? What would you find easiest to read? Would you prefer variables to be written as "var name: 'Something' + size" to set them apart more clearly from data in a file? Once again, I emphasise that I'm trying to write something that people will find a use for, not to blow my own trumpet or prove anything.

Cheers guys.
Mr.Mouse
Site Admin
Posts: 4073
Joined: Wed Jan 15, 2003 6:45 pm
Location: Dungeons of Doom
Has thanked: 450 times
Been thanked: 682 times
Contact:

Post by Mr.Mouse »

I like the idea, but I haven't had time to really consider your work yet, to see potential pitfalls or advantages other than those you mention.

However, it seems you might want to check out http://hachoir.org/ as this is a tool that is similar to your idea and far in development I believe. I am really intrigued by Hachoir.
Rheini
Moderator
Posts: 652
Joined: Wed Oct 18, 2006 9:48 pm
Location: Germany
Has thanked: 19 times
Been thanked: 46 times
Contact:

Post by Rheini »

The goal for a good tool is to support complicated and extensive computations, such as those that some encryption algorithms and compression techniques require.
A good way would be a scripting language like Lua, I think. It's an easy language (because it's similar to C), easy to implement and quite powerful.
szevvy
n00b
Posts: 15
Joined: Thu Sep 21, 2006 7:20 am

Post by szevvy »

Lua is problematic, as it doesn't natively support integers or bitwise operations. If there's a version out there that does, please let me know, but for now I'm more inclined towards Python, as it supports both of the above.
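
A small sketch of why native integers and bitwise operators matter for this kind of work. The field layout below (8 flag bits above a 24-bit size) is hypothetical and not taken from any real format:

```python
def split_packed_u32(value):
    """Split a 32-bit value into (flags, size).

    Hypothetical layout: top 8 bits are flags, low 24 bits are a size.
    """
    flags = (value >> 24) & 0xFF
    size = value & 0x00FFFFFF
    return flags, size
```

For example, `split_packed_u32(0x80000400)` gives flags `0x80` and a size of 1024, with no floating-point rounding to worry about.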
Rheini
Moderator
Posts: 652
Joined: Wed Oct 18, 2006 9:48 pm
Location: Germany
Has thanked: 19 times
Been thanked: 46 times
Contact:

Post by Rheini »

You can get a library for bitwise operations here
Rahly
VVIP member
VVIP member
Posts: 411
Joined: Thu Aug 05, 2004 10:17 am
Been thanked: 1 time

Post by Rahly »

And I'm trying to get Mr. Mouse to move away from script, silly boys
"By nature men are alike. Through practice they have become far apart." Confucius (Analect 17:2)
Rheini
Moderator
Posts: 652
Joined: Wed Oct 18, 2006 9:48 pm
Location: Germany
Has thanked: 19 times
Been thanked: 46 times
Contact:

Post by Rheini »

It depends on which scripting language is used. The one Mex Commander uses is indeed unsuitable because it is complicated and restrictive. If you have a struct like in C

Code:

typedef unsigned char byte; /* "byte" is not a built-in C type */

struct header {
    int a;
    byte b[64];
    int c;
};

you can directly read out the structure of the file, and every C programmer understands it.
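
For comparison, roughly the same header can be described with Python's standard `struct` module. This is only a sketch: it assumes a 4-byte little-endian `int` and no padding between fields, which the C declaration by itself does not guarantee.

```python
import struct

# int a; byte b[64]; int c;  ("<" = little-endian, no alignment padding)
HEADER = struct.Struct("<i64si")


def read_header(buf):
    """Unpack the header from the start of a bytes-like object."""
    a, b, c = HEADER.unpack_from(buf)
    return {"a": a, "b": b, "c": c}
```

`HEADER.size` is 72 bytes here (4 + 64 + 4), which is a quick sanity check against the layout of the file being read.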
Mr.Mouse
Site Admin
Posts: 4073
Joined: Wed Jan 15, 2003 6:45 pm
Location: Dungeons of Doom
Has thanked: 450 times
Been thanked: 682 times
Contact:

Post by Mr.Mouse »

MexScript was not intended to recreate C or other such low-level languages.

It was intended for people with little knowledge of programming that wish to support new formats themselves, with many important underlying functions being performed by MultiEx. Thus, it is very high-level, and therefore highly restricted. But that's a choice.

There is not much point in creating a script like C, you can just as well program a plugin in actual C then (or other language for that matter).

You will find that if you start doing it in a C like manner, that you will need more and more statements, functions and commands to support all formats. Again, you might as well start coding in C right from the beginning; it's got all you need.

Rahly is in the process of creating an interpreter for MexScript that will handle old scripts but also improve much of MexScript. He still has another idea for this type of thing, but without using a script. Hold your breath for that one.
Rahly
VVIP member
VVIP member
Posts: 411
Joined: Thu Aug 05, 2004 10:17 am
Been thanked: 1 time

Post by Rahly »

If you are going to do this, I'd suggest you write up a grammar for that syntax. Grammars are helpful in planning out the language and make it 1000x easier to write the parser for it. A lot of the time you'll find pitfalls early and figure out a way around them, or find it easier to start over, before you've written code that becomes convoluted and makes new features harder to add later.
Rheini
Moderator
Posts: 652
Joined: Wed Oct 18, 2006 9:48 pm
Location: Germany
Has thanked: 19 times
Been thanked: 46 times
Contact:

Post by Rheini »

Mr.Mouse wrote: It was intended for people with little knowledge of programming that wish to support new formats themselves
Yeah, but I doubt that such people are able to figure out a file format. ;)
Mr.Mouse
Site Admin
Posts: 4073
Joined: Wed Jan 15, 2003 6:45 pm
Location: Dungeons of Doom
Has thanked: 450 times
Been thanked: 682 times
Contact:

Post by Mr.Mouse »

Rheini wrote:
Mr.Mouse wrote: It was intended for people with little knowledge of programming that wish to support new formats themselves
Yeah, but I doubt that such people are able to figure out a file format. ;)
That's why I invited Watto to join me in writing this tutorial at the time, eh. :)

http://wiki.xentax.com/index.php/DGTEFF
Dinoguy1000
Site Admin
Posts: 786
Joined: Mon Sep 13, 2004 1:55 am
Has thanked: 154 times
Been thanked: 163 times

Post by Dinoguy1000 »

And also why we have this support forum. And there's always the list of unsupported formats on the WIKI...
Rahly
VVIP member
VVIP member
Posts: 411
Joined: Thu Aug 05, 2004 10:17 am
Been thanked: 1 time

Post by Rahly »

C*-like languages are rather ugly
Rheini
Moderator
Posts: 652
Joined: Wed Oct 18, 2006 9:48 pm
Location: Germany
Has thanked: 19 times
Been thanked: 46 times
Contact:

Post by Rheini »

That's a matter of taste. ^^
I hate Pascal-like languages.
Rahly
VVIP member
VVIP member
Posts: 411
Joined: Thu Aug 05, 2004 10:17 am
Been thanked: 1 time

Post by Rahly »

It doesn't really have anything to do with taste; it has to do with semantics. Pascal is very readable, except for a couple of keywords, but I think they were more lazy, like C* coders.

Except for a single command, you can do a 1-to-1 mapping between C and Pascal, since they use the same constructs.