Difference between revisions of "Cine/Specifications"

(Revised compression format part's introduction a bit.)
(Added info about the Delphine's compression format's parsing.)
Revision as of 16:40, 16 June 2008

Cine Specifications (version ε)

This is a place intended for information about the Cinématique engine's internals, file formats etc. Adding any additional information here is encouraged. The information is probably inaccurate in places, so if you have more accurate information, please add it here.

File formats

Part file format

NOTE: This applies to both Future Wars and Operation Stealth.

Part file's start:

    Byte  Meaning
    ----- -----------------------------------------------------------
     0-1  Number of elements in this part file (Uint16BE)
     2-3  Entry size (Uint16BE). Normally 0x1E i.e. 30
    ----- -----------------------------------------------------------

Then comes the info for each element (each entry is Entry size bytes long):

    Byte   Meaning
    ------ -----------------------------------------------------------
     0-13  Name (ASCIIZ string)
    14-17  Data's starting offset in this part file (Uint32BE)
    18-21  Packed size (Uint32BE)
    22-25  Unpacked size (Uint32BE)
    26-29  ???
    ------ -----------------------------------------------------------

After that it's the data for all the elements contained in this part file.
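
As a rough illustration of the layout above, here is a minimal Python sketch that reads the part file's directory. The names `PartEntry` and `read_part_directory` are made up for this example, and the unknown bytes 26-29 of each entry are simply skipped.

```python
import struct
from typing import NamedTuple


class PartEntry(NamedTuple):
    # Hypothetical container for one directory entry of a part file.
    name: str
    offset: int
    packed_size: int
    unpacked_size: int


def read_part_directory(data: bytes) -> list:
    """Parse the element table at the start of a part file."""
    # Bytes 0-1: number of elements, bytes 2-3: entry size (both Uint16BE).
    count, entry_size = struct.unpack_from(">HH", data, 0)
    entries = []
    for i in range(count):
        base = 4 + i * entry_size
        # 14-byte ASCIIZ name followed by three Uint32BE fields.
        raw_name, offset, packed, unpacked = struct.unpack_from(">14sIII", data, base)
        name = raw_name.split(b"\0", 1)[0].decode("ascii", errors="replace")
        entries.append(PartEntry(name, offset, packed, unpacked))
    return entries
```

Stepping by the entry size read from the file, rather than a hard-coded 30, keeps the sketch working if some file uses a different entry size.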

VOL.CNF

NOTE: This file is specific to Operation Stealth.

This file contains a list of resource files (e.g. PROCS10, RSC04, SONS2, LABYBASE etc) and a list of the files contained in each resource file (e.g. AUTO00.PRC and MASKG.REL in PROCS00).

If the file starts with "ABASECP" the file is compressed, otherwise it's uncompressed.

Compressed file's start:

    Byte  Meaning
    ----- -----------------------------------------------------------
     0-7  Magic header ("ABASECP" string with the trailing zero)
     8-11 Unpacked size of the data after this header (Uint32BE)
    12-15 Packed size of the data after this header (Uint32BE)
    ----- -----------------------------------------------------------

For a compressed file, just read the rest of the file after the header and uncompress it with Delphine's unpacking routine. The result is the same data an uncompressed file would have contained (the compressed file's header is not part of it, of course).
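
The check for the compressed variant could look roughly like the following Python sketch. The function name `split_volcnf` and its return shape are assumptions made for this example; the payload it returns still has to be run through the unpacking routine.

```python
import struct

VOLCNF_MAGIC = b"ABASECP\0"  # 8 bytes: the string plus its trailing zero


def split_volcnf(data: bytes):
    """Return (is_compressed, unpacked_size, packed_size, payload)."""
    if data.startswith(VOLCNF_MAGIC):
        # Bytes 8-11: unpacked size, bytes 12-15: packed size (both Uint32BE).
        unpacked_size, packed_size = struct.unpack_from(">II", data, 8)
        return True, unpacked_size, packed_size, data[16:]
    # No magic header: the whole file is already the uncompressed data.
    return False, None, None, data
```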

Uncompressed file's start:

    Byte  Meaning
    ----- -----------------------------------------------------------
     0-1  Resource files count (Uint16BE)
     2-3  Entry size (Uint16BE). 0x14 in all tested files so far
    ----- -----------------------------------------------------------

Then come the resource file info structs. There are resource files count of them and each one is entry size bytes long:

Resource file info struct:
x = entry size - 1

    Byte  Meaning
    ----- -----------------------------------------------------------
     0-7  Resource file name string (Possibly no trailing zero!)
     8-x  Unknown data
    ----- -----------------------------------------------------------

Then for each resource file comes a list of files that are in it.

Each list begins with an unsigned 32-bit big endian integer telling the size of the entry. After that come (size / 11) or (size / 13) filenames of length 11 or 13 respectively. Almost as a rule compressed files use filename length of 11 and uncompressed files use filename length of 13 but it's not always so (At least some Amiga version used a compressed 'vol.cnf' file but still used filenames of length 13).

Filenames of length 11 have no separator between the basename and the extension, so we have to convert them first. There's no trailing zero in them either and they're always of the full length 11, padded with spaces. The extension can always be found at offset 8 onwards. Filenames of length 13 are okay as they are, no need for converting them.
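
A minimal Python sketch of that conversion follows. The function name `fix_volcnf_filename` is made up here, and stripping the padding with `rstrip` assumes that spaces only ever appear as trailing padding.

```python
def fix_volcnf_filename(raw: str) -> str:
    """Convert an 11-character padded name into 'BASENAME.EXT' form."""
    if len(raw) != 11:
        raise ValueError("expected a filename of exactly 11 characters")
    base = raw[:8].rstrip(" ")  # basename, padded with spaces up to 8 chars
    ext = raw[8:].rstrip(" ")   # extension always starts at offset 8
    # Names without an extension get no dot at all.
    return f"{base}.{ext}" if ext else base
```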

Examples of filename mappings:

    "AEROPORTMSG" -> "AEROPORT.MSG"
    "MITRAILLHP " -> "MITRAILL.HP" (Notice the trailing space after the extension)
    "BOND10     " -> "BOND10"
    "GIRL    SET" -> "GIRL.SET"

Compression format

The compression algorithm used by all Delphine's adventure games uses sliding window compression (Quite like LZ77) combined with a fixed non-adaptive entropy coding scheme (Not of any type I could recognize).

The compressed data is stored as big endian 32-bit chunks and is read backwards, starting from the end of the buffer.

Compression format:

NOTE: As the whole data consists of unsigned big endian 32-bit integers, I use indexing
      in 32-bit addresses here. By -1 I mean the last 32-bits of the data
      (i.e. bytes src[srcLen-4], src[srcLen-3], src[srcLen-2] and src[srcLen-1]),
      by -2 the second to last 32-bits etc.

    Dword     Meaning
    --------- ------------------------------------------------------------------------
    -1        Unpacked length (Uint32BE).
    -2        Error code (Uint32BE). Xor of the whole packed data in Uint32BE chunks.
    0 to -3   The packed data (In Uint32BE chunks).
    --------- ------------------------------------------------------------------------
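
To make the negative dword indexing concrete, here is a small Python sketch that splits a compressed buffer into its parts. The name `read_trailer` is hypothetical, and the length check assumes the buffer really is a whole number of 32-bit chunks, as the note above says.

```python
import struct


def read_trailer(src: bytes):
    """Return (unpacked_length, error_code, packed_data) for a compressed buffer."""
    if len(src) < 8 or len(src) % 4 != 0:
        raise ValueError("buffer must consist of at least two 32-bit chunks")
    # Dword -2 is the error code, dword -1 is the unpacked length.
    error_code, unpacked_length = struct.unpack_from(">II", src, len(src) - 8)
    packed_data = src[:-8]  # dwords 0 to -3
    return unpacked_length, error_code, packed_data
```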

Bit sequences in the compressed stream

The first bit of a command always tells how many bits the whole command takes. If the first bit is zero, the whole command (including the first bit) takes two bits; if it is one, the command takes three bits.

There are two possible types of commands:

  • unpackRawBytes(N)
    • Copies N bytes straight from the source stream and writes them to the destination
  • copyRelocatedBytes(OFFSET, N)
    • Copies N bytes from position OFFSET in the sliding window (i.e. from the unpacked buffer)

Commands may have predefined values or a restricted value range for some of their parameters.

Here are all the possible commands (Including their first bit that tells the command's length):

Bits  => Action:
0 0   => unpackRawBytes(3 bits + 1)              i.e. unpackRawBytes(1..8)
1 1 1 => unpackRawBytes(8 bits + 9)              i.e. unpackRawBytes(9..264)
0 1   => copyRelocatedBytes(8 bits, 2)           i.e. copyRelocatedBytes(0..255, 2)
1 0 0 => copyRelocatedBytes(9 bits, 3)           i.e. copyRelocatedBytes(0..511, 3)
1 0 1 => copyRelocatedBytes(10 bits, 4)          i.e. copyRelocatedBytes(0..1023, 4)
1 1 0 => copyRelocatedBytes(12 bits, 8 bits + 1) i.e. copyRelocatedBytes(0..4095, 1..256)

Some examples of parsing the commands:

  • We read one bit from the stream, it's 0. Okay, so the command has length of two bits. We read the second bit, it's 0, so the command is unpackRawBytes(3 bits + 1). That means we read 3 bits from the source stream, let's call that number X. Then we call unpackRawBytes(X + 1).
  • We read one bit from the stream, it's 1. So the command has length of three bits. We read two more bits, they're 1 and 0 (i.e. 10b). So we've got command copyRelocatedBytes(12 bits, 8 bits + 1). Let's first read 12 bits from the source stream and call that OFFSET. Then let's read 8 bits from the source stream and call that X. Then we'll call copyRelocatedBytes(OFFSET, X + 1).
  • We read one bit from the stream, it's 1. So the command has length of three bits. We read two more bits, they're 1 and 1 (i.e. 11b). So we've got command unpackRawBytes(8 bits + 9). Let's then read 8 bits from the source stream and call that X. Then we'll call unpackRawBytes(X + 9).
  • We read one bit from the stream, it's 1. So the command has length of three bits. We read two more bits, they're 0 and 0 (i.e. 00b). So we've got command copyRelocatedBytes(9 bits, 3). Let's read 9 bits from the source stream and call that OFFSET. Then we'll call copyRelocatedBytes(OFFSET, 3).
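
The parsing walked through above can be sketched in Python as below. This only decodes a command and its parameters; it doesn't perform the copying. The names `read_bits`, `parse_command` and `bits_from` are made up for this example, the bits are taken from a plain iterator because how they are pulled out of the 32-bit chunks isn't covered here, and treating the first bit read as the most significant bit of a multi-bit field is an assumption.

```python
def read_bits(bits, count):
    """Assemble `count` bits into an integer (assumption: first bit read is the MSB)."""
    value = 0
    for _ in range(count):
        value = (value << 1) | next(bits)
    return value


def parse_command(bits):
    """Decode one command from an iterator that yields 0 or 1."""
    if next(bits) == 0:  # two-bit commands
        if next(bits) == 0:  # 0 0
            return ("unpackRawBytes", read_bits(bits, 3) + 1)
        return ("copyRelocatedBytes", read_bits(bits, 8), 2)  # 0 1
    selector = read_bits(bits, 2)  # three-bit commands: 1 x x
    if selector == 0b11:  # 1 1 1
        return ("unpackRawBytes", read_bits(bits, 8) + 9)
    if selector == 0b10:  # 1 1 0: OFFSET first, then the count, as in the text
        offset = read_bits(bits, 12)
        count = read_bits(bits, 8) + 1
        return ("copyRelocatedBytes", offset, count)
    # 1 0 0 and 1 0 1: 9 or 10 offset bits, fixed count of 3 or 4
    return ("copyRelocatedBytes", read_bits(bits, 9 + selector), 3 + selector)


def bits_from(text):
    """Helper for experiments: turn a string like '00101' into a bit iterator."""
    return iter(int(ch) for ch in text)
```

For instance, `parse_command(bits_from("00101"))` decodes the two-bit command `0 0` followed by the three bits `101`, giving `("unpackRawBytes", 6)`.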

End of decompression

The unpacking ends when we've written unpacked length bytes into the destination buffer. If at that point the exclusive-or of all the source stream chunks read equals the error code stored in dword -2 of the source stream (i.e. the exclusive-or of those chunks and the error code together is zero), the packed data is probably intact. Only probably, because the exclusive-or of 32-bit chunks isn't a very good error detection method: it lets errors through relatively easily compared to more robust error detection methods like SHA-1 or even CRC.
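
The check itself is cheap to express. The following Python sketch verifies it up front on the whole buffer, which assumes the unpacker ends up reading every packed chunk; the name `checksum_ok` is made up for this example.

```python
import struct
from functools import reduce
from operator import xor


def checksum_ok(src: bytes) -> bool:
    """Check that the packed chunks xor'ed with the error code give zero."""
    if len(src) < 8 or len(src) % 4 != 0:
        return False
    words = struct.unpack(f">{len(src) // 4}I", src)
    # words[-1] is the unpacked length and is not part of the check;
    # words[-2] is the error code, everything before it is packed data.
    return reduce(xor, words[:-1], 0) == 0
```

Note that swapping two packed chunks leaves the result unchanged, which is one example of the kind of error this check lets through.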