AGI/Specifications/Formats

From ScummVM :: Wiki
Revision as of 02:58, 23 January 2011 by Clone2727 (talk | contribs) (colour -> color)

Formats of the resource files

Directory files

Written by Lance Ewing (Last updated: 31 August 1997).

All AGI games have either one directory file, or more commonly, four. AGI version 2 games will have the files logdir, picdir, viewdir, and snddir. Games that use version 3 of the AGI interpreter will have a single file called *dir where the star is the initials of the game (e.g. BC, GR, MH2, MH1, KQ4). This single file is basically the four version 2 files joined together except that it has an 8 byte header giving the position of each directory within the single file.

The directory files give the location of the data types within the VOL files. The type of directory determines the type of data. For example, the logdir gives the locations of the LOGIC files. For a brief introduction to the different data types, see section General AGI overview.

Note: In this description and elsewhere in documents written by me, the AGI data called LOGIC, PICTURE, VIEW, and SOUND data are referred to as files even though they are part of a single VOL file. I think of the VOL file as a sort of virtual storage device that holds many files. Some documents call the files contained in VOL files "resources".

Version 2 directories

Every directory file has the same format: a sequence of three byte entries, no more than 256 of them. The file size varies with the number of files of the type that the directory file points to. Dividing the file size by three gives the number of entries, so the highest file number of that type is one less than that, since numbering starts at 0. Each entry is of the following format:

        Byte 1           Byte 2           Byte 3
    7 6 5 4 3 2 1 0  7 6 5 4 3 2 1 0  7 6 5 4 3 2 1 0
    V V V V P P P P  P P P P P P P P  P P P P P P P P

where V = VOL number and P = position (offset into VOL file).

The entry number itself gives the number of the data file that it is pointing to. For example, if the following three byte entry is entry number 45 in the SOUND directory file,

   12 3D FE

then sound.45 is located at position 0x23DFE in the vol.1 file. The first entry number is entry 0.

If the three bytes contain the value 0xFFFFFF, then the resource does not exist.
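The entry layout above can be sketched in a few lines of Python. This is only an illustration of the bit layout described here, not code from any interpreter; the function names `decode_dir_entry` and `read_directory` are made up for this example.

```python
def decode_dir_entry(entry):
    """Decode one three byte directory entry.

    Returns (vol_number, offset), or None if the resource does not exist.
    """
    b0, b1, b2 = entry[0], entry[1], entry[2]
    if (b0, b1, b2) == (0xFF, 0xFF, 0xFF):
        return None                      # 0xFFFFFF marks a missing resource
    vol = b0 >> 4                        # top four bits: VOL number
    offset = ((b0 & 0x0F) << 16) | (b1 << 8) | b2   # remaining 20 bits
    return vol, offset


def read_directory(data):
    """Decode every complete three byte entry in a directory file."""
    usable = len(data) - len(data) % 3   # ignore any trailing partial entry
    return [decode_dir_entry(data[i:i + 3]) for i in range(0, usable, 3)]


# The example entry from the text: 12 3D FE
print(decode_dir_entry(bytes([0x12, 0x3D, 0xFE])))   # vol 1, offset 0x23DFE
```

The index of an entry in the list returned by `read_directory` is the resource number, so entry 45 of the SOUND directory describes sound.45.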

Version 3 directories

In the case of version 3 of the AGI interpreter, the logdir, picdir, viewdir, and snddir files are concatenated together in that order with an eight byte header giving the starting offset of each directory.

Header
    Byte 0 1 2 3 4 5 6 7
         L L P P V V S S

where L = offset of logdir, P = offset of picdir, V = offset of viewdir and S = offset of snddir.

Each offset is two bytes in length where the first byte is the low byte and the second byte is the high byte, as is the case in the whole AGI system. For example, the first two bytes will always be 08 00 (the value 0x0008) since the header is a fixed size of eight bytes.

The format of each of the individual directory sections then follows as above for AGI v2.
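As a rough sketch, the combined file can be split back into its four sections as follows. The function name `split_v3_directory` is hypothetical, and the code assumes that each section runs up to the start of the next one and that the SOUND section runs to the end of the file, which follows from the concatenation order given above but is not stated explicitly.

```python
import struct


def split_v3_directory(data):
    """Split a version 3 *dir file into its four directory sections."""
    # Four little-endian 16 bit offsets: logdir, picdir, viewdir, snddir.
    offsets = struct.unpack_from("<4H", data, 0)
    # Assumption: a section ends where the next begins; the last ends at EOF.
    bounds = list(offsets) + [len(data)]
    names = ("logdir", "picdir", "viewdir", "snddir")
    return {name: data[bounds[i]:bounds[i + 1]] for i, name in enumerate(names)}


# A tiny made-up file: one entry per directory, two in the sound section.
header = struct.pack("<4H", 8, 11, 14, 17)
body = b"\x00\x00\x00" + b"\x10\x00\x05" + b"\xff\xff\xff" + b"\x12\x3d\xfe\x00\x00\x01"
sections = split_v3_directory(header + body)
print({name: len(section) // 3 for name, section in sections.items()})
```

Each section returned here has the version 2 layout, so its entries can be decoded exactly as described for version 2 directories.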


Version 2 volume format

Written by Lance Ewing (Last updated: 31 August 1997).

Volumes are the main data files for AGI games. They contain four types of data: LOGIC, PICTURE, VIEW, and SOUND data. A VOL file is a collection of a large number of these "resource" files, which can be in any order. The directory files give the starting position of each resource.

The start of every resource file has a five byte header.

    Byte  Meaning
    ----- -----------------------------------------------------------
     0-1  Signature (0x12, 0x34)
      2   Vol number that the resource is contained in
     3-4  Length of the resource taken from after the header (LO-HI)
    ----- -----------------------------------------------------------

The data after the header depends on the type of resource file. These formats are documented elsewhere.
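A minimal sketch of reading one version 2 resource, given the contents of a VOL file and an offset taken from a directory entry. The name `read_v2_resource` is made up for this example, and the length is read low byte first on the assumption that it follows the byte order used throughout the AGI system.

```python
import struct

SIGNATURE = b"\x12\x34"


def read_v2_resource(vol_data, offset):
    """Return (vol_number, payload) for the resource starting at offset."""
    if vol_data[offset:offset + 2] != SIGNATURE:
        raise ValueError("no resource signature at offset 0x%X" % offset)
    # Byte 2 is the vol number; bytes 3-4 are the length (assumed LO-HI).
    vol, length = struct.unpack_from("<BH", vol_data, offset + 2)
    start = offset + 5                   # payload begins after the header
    return vol, vol_data[start:start + length]


# Made-up VOL contents: 4 filler bytes, then a 3 byte resource in vol 1.
vol_data = b"\x00" * 4 + b"\x12\x34\x01\x03\x00" + b"abc" + b"zz"
print(read_v2_resource(vol_data, 4))     # (1, b'abc')
```

Checking the signature before trusting the length is a cheap way to catch a wrong offset or a directory entry that points into the wrong VOL file.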

Version 3 resource storage

Written by Lance Ewing (Last updated: 27 January 1997).

AGIv3 stores resources in a slightly different way from AGIv2. The first significant difference is in the length of the resource header which is now seven bytes.

    Byte  Meaning
    ----- -----------------------------------------------------------
     0-1  Signature (0x12, 0x34)
      2   Vol number that the resource is contained in
     3-4  Uncompressed resource size (LO-HI)
     5-6  Compressed resource size (LO-HI)
    ----- -----------------------------------------------------------

Instead of the single resource size used in AGIv2, there are now two sizes. Most of the resources in AGIv3 games are compressed with a form of LZW, but some are not. The interpreter determines whether a resource is compressed by comparing the two sizes given in the header. If they are equal, the resource is stored uncompressed. However, if the sizes do not match, this does not necessarily mean that the resource is compressed with LZW. If the resource is a PICTURE, it is stored with its own limited form of compression. This is why the top bit of the third byte in the header (byte 2, the vol number) is used to tell the interpreter that the resource is a PICTURE; otherwise it would think that the resource was compressed with LZW.

Note: As far as I can tell, none of the PICTUREs are compressed with LZW. This may well be possible though. It could also be possible for the PICTURE to be totally uncompressed (i.e. it wouldn't use the PICTURE compression method), but I haven't seen any examples of either of the above two cases. (L.E.)
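The decision described above might be sketched like this. The names `V3Header` and `parse_v3_header` are hypothetical. Two details are assumptions rather than facts from this document: the vol number is taken to be the low seven bits of byte 2, and a header with the PICTURE bit set is treated as picture-compressed even if the two sizes happen to be equal, a case the text does not cover.

```python
import struct
from typing import NamedTuple


class V3Header(NamedTuple):
    vol: int
    uncompressed_size: int
    compressed_size: int
    storage: str            # "picture", "raw" or "lzw"


def parse_v3_header(vol_data, offset):
    """Parse the seven byte AGIv3 resource header at offset."""
    if vol_data[offset:offset + 2] != b"\x12\x34":
        raise ValueError("no resource signature at offset 0x%X" % offset)
    vol_byte = vol_data[offset + 2]
    # Both sizes are stored low byte first.
    uncompressed, compressed = struct.unpack_from("<HH", vol_data, offset + 3)
    if vol_byte & 0x80:
        storage = "picture"     # top bit set: PICTURE compression (assumed
                                # to win even when the sizes are equal)
    elif uncompressed == compressed:
        storage = "raw"         # equal sizes: stored uncompressed
    else:
        storage = "lzw"         # different sizes, no PICTURE bit: LZW
    # Assumption: the remaining seven bits hold the vol number.
    return V3Header(vol_byte & 0x7F, uncompressed, compressed, storage)


print(parse_v3_header(b"\x12\x34\x82\x20\x00\x18\x00", 0))
```

The resource data itself starts seven bytes after the offset and is `compressed_size` bytes long.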

LZW compression

The compression used with version 3 games is an adaptive form of LZW. The LZW algorithm is not explained here, but it basically compresses data by representing previous strings by single codes. When these strings are encountered again, the code can be stored instead. The following information states how the AGIv3 algorithm differs from the standard LZW algorithm. There are plenty of places on the net where you can find a description of the LZW algorithm if you are not familiar with it.

AGIv3 uses an adaptive form of LZW that starts by using 9 bit codes and, when the code space is full, progresses on to 10 bits and so on. As with normal LZW, codes 0-255 represent the single byte values 0-255. The next two codes have a special meaning:

  • 256 is used as a start over code. The table is cleared, the number of bits set back to 9, and the process begins again with the next code being 258.
  • 257 tells the interpreter that it has reached the end of the resource.

Code 256 seems to be the first code stored in all compressed resources. This is probably just to make sure everything is initialized for beginning the decompression process. As was mentioned above, the first code used for the LZW table itself is code 258. From there it stores pairs of prefix codes and appended characters for each table entry until it reaches code 512, at which stage it switches to storing the codes using 10 bits, and then 11 and so on. It appears that it will never get to 12 bits because code 256 always seems to turn up just before it needs to switch up to 12 bits, i.e. when code 2048 is required. Carl Muckenhoupt's decrypt routine for SCI games specifically prevents it from switching to 12 bits anyway. Whether there is ever a case where code 256 does not intervene has not yet been determined.


Note: I should point out that Carl and I both arrived at the above algorithm independently, which confirms that the compression used in the early SCI games was identical to that used in AGIv3.
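The following is a minimal decoder sketch built only from the rules given in this section; it is not the interpreter's routine or any of the programs mentioned here. Two things are assumptions that the text does not settle: codes are taken to be packed into bytes starting from the least significant bit, and the code width grows as soon as the next free table index no longer fits in the current width. The names `lzw_expand` and `pack_codes` are made up, and `pack_codes` exists only to build a small test stream.

```python
CLEAR_CODE = 256    # "start over" code
END_CODE = 257      # end of resource
FIRST_FREE = 258    # first code available for the table
START_BITS = 9
MAX_BITS = 11       # per the text, 12 bits never seems to be reached


def lzw_expand(data, max_bits=MAX_BITS):
    """Expand an LZW stream using the rules described above."""
    out = bytearray()
    table = {}                      # code -> byte string, for codes >= 258
    width, next_code, prev = START_BITS, FIRST_FREE, None
    bitbuf = bitcount = pos = 0
    while True:
        while bitcount < width:     # refill; assumed LSB-first packing
            if pos >= len(data):
                return bytes(out)   # input ended without an end code
            bitbuf |= data[pos] << bitcount
            bitcount += 8
            pos += 1
        code = bitbuf & ((1 << width) - 1)
        bitbuf >>= width
        bitcount -= width
        if code == END_CODE:
            return bytes(out)
        if code == CLEAR_CODE:      # clear table, back to 9 bit codes
            table.clear()
            width, next_code, prev = START_BITS, FIRST_FREE, None
            continue
        if code < 256:
            entry = bytes([code])
        elif code in table:
            entry = table[code]
        elif code == next_code and prev is not None:
            entry = prev + prev[:1]  # code defined by the step in progress
        else:
            raise ValueError("unexpected LZW code %d" % code)
        out += entry
        if prev is not None:
            table[next_code] = prev + entry[:1]
            next_code += 1
            # Assumed timing of the switch to wider codes.
            if next_code >= (1 << width) and width < max_bits:
                width += 1
        prev = entry


def pack_codes(codes, width=START_BITS):
    """Example-only helper: pack fixed width codes LSB-first into bytes."""
    bitbuf = bitcount = 0
    out = bytearray()
    for code in codes:
        bitbuf |= code << bitcount
        bitcount += width
        while bitcount >= 8:
            out.append(bitbuf & 0xFF)
            bitbuf >>= 8
            bitcount -= 8
    if bitcount:
        out.append(bitbuf & 0xFF)
    return bytes(out)


stream = pack_codes([CLEAR_CODE, ord("A"), ord("B"), 258, END_CODE])
print(lzw_expand(stream))           # b'ABAB'
```

In the example, code 258 is the first table entry, created from "A" followed by "B", so the stream expands to "ABAB". Before using a routine like this on real game data, the bit order and the exact moment of the width change should be checked against known resources.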

Picture compression

Pictures in AGI version 3 use a simple form of compression to shrink their size by a tiny amount. The interpreter coders evidently recognized that four bits were being wasted for picture codes 0xF0 and 0xF2. These are the two codes that change the visual and the priority color respectively. Since there are only 16 colors, there is no need to set aside a whole byte for storing the color. All the picture compression does is store these colors in 4 bits rather than 8.

Example:

   Original picture codes:   F0 06 F8 12 45 F0 07 F2 05 F8 14 67 ...
   Compressed picture codes: F0 6F 81 24 5F 07 F2 5F 81 46 7 ...
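The example can be reproduced by treating the compressed data as a stream of 4 bit nibbles, as in the sketch below. This is only an illustration of the scheme described above, with a made-up function name `expand_v3_picture`. It assumes that 0xF0 and 0xF2 never occur as argument bytes of other picture codes, and it simply stops when fewer than two nibbles remain.

```python
def expand_v3_picture(data):
    """Expand AGIv3 picture data so every color takes a whole byte again."""
    # Split every byte into its high and low nibble.
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)

    out = bytearray()
    i = 0
    while i + 1 < len(nibbles):
        value = (nibbles[i] << 4) | nibbles[i + 1]   # next full byte
        i += 2
        out.append(value)
        # After a color change code, the color is stored in one nibble.
        if value in (0xF0, 0xF2) and i < len(nibbles):
            out.append(nibbles[i])
            i += 1
    return bytes(out)


# The compressed example from the text, padded with a zero nibble.
packed = bytes.fromhex("F06F81245F07F25F814670")
print(expand_v3_picture(packed).hex(" ").upper())
# F0 06 F8 12 45 F0 07 F2 05 F8 14 67
```

Every color code shifts the rest of the stream by one nibble, which is why the bytes after the first 0xF0 in the compressed example no longer line up with the original ones.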

Sample code

The following examples are available in the distribution package:

  • agifiles.c by Lance Ewing: routines to handle loading of resources
  • agifiles.h by Lance Ewing: header file for agifiles.c
  • general.h by Lance Ewing: general definitions
  • volx2.c by Lance Ewing, Joakim Mueller and Martin Tillenius: program to extract resources from AGI version 2 games (UNIX version available in the AGI Utils package).
  • xv3.pas by Lance Ewing: program to extract resources from AGI version 3 games (UNIX version available in the AGI Utils package).
  • agiver.pas by Jeremy Hayes: displays version number of game and interpreter