Difference between revisions of "AGI/Specifications/Formats"

Wikify and fill in differences from SGML. Work complete
(Wikify and fill in differences from SGML. Work complete)
Line 6: Line 6:
Written by Lance Ewing (Last updated: 31 August 1997).
Written by Lance Ewing (Last updated: 31 August 1997).


All AGI games have either one directory file, or more commonly, four. AGI version 2 games will have the files logdir, picdir, viewdir, and snddir. Games that use version 3 of the AGI interpreter will have a single file called *dir where the star is the initials of the game (e.g. BC, GR, MH2, MH1, KQ4). This single file is basically the four version 2 files joined together except that it has an 8 byte header giving the position of each directory within the single file.
All AGI games have either one directory file, or more commonly, four. AGI version 2 games will have the files '''logdir''', '''picdir''', '''viewdir''', and '''snddir'''. Games that use version 3 of the AGI interpreter will have a single file called '''*dir''' where the star is the initials of the game (e.g. BC, GR, MH2, MH1, KQ4). This single file is basically the four version 2 files joined together except that it has an 8 byte header giving the position of each directory within the single file.


The directory files give the location of the data types within the VOL files. The type of directory determines the type of data. For example, the logdir gives the locations of the LOGIC files. For a brief introduction to the different data types, see section General AGI overview.
The directory files give the location of the data types within the VOL files. The type of directory determines the type of data. For example, the '''logdir''' gives the locations of the LOGIC files. For a brief introduction to the different data types, see [[AGI/Specifications/Overview | section General AGI overview]].


Note: In this description and elsewhere in documents written by me, the AGI data called LOGIC, PICTURE, VIEW, and SOUND are referred to as files even though they are part of a single VOL file. I think of the VOL file as a sort of virtual storage device in itself that holds many files. Some documents call the files contained in VOL files "resources".
Note: In this description and elsewhere in documents written by me, the AGI data called LOGIC, PICTURE, VIEW, and SOUND are referred to as files even though they are part of a single VOL file. I think of the VOL file as a sort of virtual storage device in itself that holds many files. Some documents call the files contained in VOL files "resources".
Version 2 directories
 
===Version 2 directories===


Each directory file is of the same format. They contain a finite number of three byte entries, no more than 256. The size will vary depending on the number of files of the type that the directory file is pointing to. Dividing the filesize by three gives the maximum file number of that type of data file. Each entry is of the following format:
Each directory file is of the same format. They contain a finite number of three byte entries, no more than 256. The size will vary depending on the number of files of the type that the directory file is pointing to. Dividing the filesize by three gives the maximum file number of that type of data file. Each entry is of the following format:
Line 30: Line 31:


If the three bytes contain the value 0xFFFFFF, then the resource does not exist.
If the three bytes contain the value 0xFFFFFF, then the resource does not exist.
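
The two rules stated here (entry count is the file size divided by three, and 0xFFFFFF marks a missing resource) can be sketched as follows. This is a minimal illustration, not the interpreter's code: the function name <code>list_entries</code> and the sample bytes are hypothetical, and the entries are left undecoded.

```python
# Sketch only: list_entries and the sample data are illustrative assumptions.
MISSING = b"\xff\xff\xff"  # 0xFFFFFF: the resource does not exist


def list_entries(dir_bytes):
    """Split a version 2 directory into (resource number, raw entry) pairs.

    Entries equal to 0xFFFFFF are reported as None. The three bytes are
    returned undecoded.
    """
    count = len(dir_bytes) // 3  # file size divided by three
    entries = []
    for number in range(count):
        raw = dir_bytes[3 * number:3 * number + 3]
        entries.append((number, None if raw == MISSING else raw))
    return entries


# Hypothetical nine-byte directory: resource 1 is absent.
sample = bytes([0x00, 0x00, 0x10, 0xFF, 0xFF, 0xFF, 0x10, 0x02, 0x00])
entries = list_entries(sample)
```

A real tool would read the bytes from '''logdir''', '''picdir''', '''viewdir''', or '''snddir''' and then decode each non-missing entry according to the entry format.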
Version 3 directories


In the case of version 3 of the AGI interpreter, the logdir, picdir, viewdir, and snddir files are concatenated together in that order with an eight byte header giving the starting offset of each directory.
===Version 3 directories===
 
In the case of version 3 of the AGI interpreter, the '''logdir''', '''picdir''', '''viewdir''', and '''snddir''' files are concatenated together in that order with an eight byte header giving the starting offset of each directory.


<pre>
<pre>
Line 40: Line 42:
</pre>
</pre>


where L = offset of logdir, P = offset of picdir, V = offset of viewdir and S = offset of snddir.
where L = offset of '''logdir''', P = offset of '''picdir''', V = offset of '''viewdir''' and S = offset of '''snddir'''.


Each offset is two bytes in length, where the first byte is the low byte and the second byte is the high byte, as is the case in the whole AGI system. For example, the first two bytes will always be 0x08 0x00 (the value 8 stored low byte first), since the header is a fixed size of eight bytes.
Each offset is two bytes in length, where the first byte is the low byte and the second byte is the high byte, as is the case in the whole AGI system. For example, the first two bytes will always be 0x08 0x00 (the value 8 stored low byte first), since the header is a fixed size of eight bytes.
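
As a rough sketch of the header described above, the four low-byte-first offsets can be read and used to cut the combined file back into its four directories. The function name <code>split_v3_directory</code> and the sample data are hypothetical, and the sketch assumes each directory runs up to the start of the next one, with '''snddir''' running to the end of the file.

```python
import struct


def split_v3_directory(data):
    """Split a version 3 *dir file into its four directories.

    Assumes the eight byte header holds four little-endian 16-bit offsets
    in the order logdir, picdir, viewdir, snddir.
    """
    log_off, pic_off, view_off, snd_off = struct.unpack_from("<4H", data, 0)
    return {
        "logdir": data[log_off:pic_off],
        "picdir": data[pic_off:view_off],
        "viewdir": data[view_off:snd_off],
        "snddir": data[snd_off:],  # assumed to extend to the end of the file
    }


# Hypothetical file: header (offsets 8, 11, 14, 17) plus one entry per directory.
sample = bytes([8, 0, 11, 0, 14, 0, 17, 0]) + b"AAABBBCCCDDD"
parts = split_v3_directory(sample)
```

Each slice can then be handled exactly like a version 2 directory file.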
Line 48: Line 50:


<span id="Vol2"></span>
<span id="Vol2"></span>
==Format of Vol files (version 2)==
==Version 2 volume format==


Written by Lance Ewing (Last updated: 31 August 1997).
Written by Lance Ewing (Last updated: 31 August 1997).


Vol files are the main data files for AGI games. They contain four types of data: LOGIC, PICTURE, VIEW, and SOUND data. A Vol file is a collection of a large number of these "resource" files which can be in any order. The directory files determine the start of each resource.
Volumes are the main data files for AGI games. They contain four types of data: LOGIC, PICTURE, VIEW, and SOUND data. A Vol file is a collection of a large number of these "resource" files which can be in any order. The directory files determine the start of each resource.


The start of every resource file has a five byte header.
The start of every resource file has a five byte header.
Line 68: Line 70:


<span id="Vol3"></span>
<span id="Vol3"></span>
==Format of Vol files (version 3)==
==Version 3 resource storage==


Written by Lance Ewing (Last updated: 27 January 1997).
Written by Lance Ewing (Last updated: 27 January 1997).
Line 86: Line 88:
Instead of one resource size as in AGIv2, there are now two sizes. Most of the resources in AGIv3 games are compressed with a form of LZW. Some of them are not though. The interpreter determines whether the resource is compressed by comparing the values of the two sizes given in the header information. If they are equal, then it knows that the resource is stored uncompressed. However, if the sizes do not match, this does not mean that the file is compressed with LZW. If the file is a PICTURE file, then it is stored with its own limited form of compression. This is why the top bit of the third byte in the header is used to tell the interpreter that the resource is a PICTURE file, otherwise it would think that the resource was compressed with LZW.
Instead of one resource size as in AGIv2, there are now two sizes. Most of the resources in AGIv3 games are compressed with a form of LZW. Some of them are not though. The interpreter determines whether the resource is compressed by comparing the values of the two sizes given in the header information. If they are equal, then it knows that the resource is stored uncompressed. However, if the sizes do not match, this does not mean that the file is compressed with LZW. If the file is a PICTURE file, then it is stored with its own limited form of compression. This is why the top bit of the third byte in the header is used to tell the interpreter that the resource is a PICTURE file, otherwise it would think that the resource was compressed with LZW.
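
The decision described in the paragraph above can be written as a small function. This is a sketch that follows the order of the text, not a transcription of the interpreter: the name <code>storage_method</code> is hypothetical, and the caller is assumed to have already extracted the third header byte and the two sizes from the header.

```python
def storage_method(third_header_byte, size_a, size_b):
    """Classify how a version 3 resource is stored.

    third_header_byte: the third byte of the resource header; its top bit
                       marks a PICTURE resource.
    size_a, size_b:    the two sizes given in the header.
    """
    if size_a == size_b:
        return "uncompressed"       # equal sizes: stored as-is
    if third_header_byte & 0x80:
        return "picture"            # PICTURE's own limited compression
    return "lzw"                    # everything else: adaptive LZW


# Hypothetical header values, for illustration only.
kind = storage_method(0x81, 1200, 1100)
```

Checking the sizes before the PICTURE bit mirrors the wording above; an implementation could equally test the bit first.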


As far as I can tell, none of the PICTUREs are compressed with LZW. This may well be possible though. It could also be possible for the PICTURE to be totally uncompressed (i.e. it wouldn't use the PICTURE compression method), but I haven't seen any examples of either of the above two cases.
<!-- Footnote --> ''Note: As far as I can tell, none of the PICTUREs are compressed with LZW. This may well be possible though. It could also be possible for the PICTURE to be totally uncompressed (i.e. it wouldn't use the PICTURE compression method), but I haven't seen any examples of either of the above two cases. (L.E.)''
LZW compression
 
===LZW compression===


The compression used with version 3 games is an adaptive form of LZW. The LZW algorithm is not explained here, but it basically compresses data by representing previous strings by single codes. When these strings are encountered again, the code can be stored instead. The following information states how the AGIv3 algorithm differs from the standard LZW algorithm. There are plenty of places on the net where you can find a description of the LZW algorithm if you are not familiar with it.
The compression used with version 3 games is an adaptive form of LZW. The LZW algorithm is not explained here, but it basically compresses data by representing previous strings by single codes. When these strings are encountered again, the code can be stored instead. The following information states how the AGIv3 algorithm differs from the standard LZW algorithm. There are plenty of places on the net where you can find a description of the LZW algorithm if you are not familiar with it.
Line 98: Line 101:
Code 256 seems to be the first code stored in all compressed resources. This is probably just to make sure everything is initialized for beginning the compression process. As was mentioned above, the first code used for the LZW table itself is code 258. From there it stores pairs of prefix codes and appended characters for each table entry until it reaches code 512 at which stage it switches to storing the codes using 10 bits and then 11 and so on. It appears that it will never get to 12 bits because code 256 always seems to turn up just before it needs to switch up to 12 bits, i.e. when code 2048 is required. Carl Muckenhoupt's decrypt routine for SCI games specifically prevents it from switching to 12 bits anyway. Whether there is ever a case where code 256 does not intervene has not yet been determined.
Code 256 seems to be the first code stored in all compressed resources. This is probably just to make sure everything is initialized for beginning the compression process. As was mentioned above, the first code used for the LZW table itself is code 258. From there it stores pairs of prefix codes and appended characters for each table entry until it reaches code 512 at which stage it switches to storing the codes using 10 bits and then 11 and so on. It appears that it will never get to 12 bits because code 256 always seems to turn up just before it needs to switch up to 12 bits, i.e. when code 2048 is required. Carl Muckenhoupt's decrypt routine for SCI games specifically prevents it from switching to 12 bits anyway. Whether there is ever a case where code 256 does not intervene has not yet been determined.
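
The growth of the code width described above can be sketched as a helper that maps the next table code to a bit width. The name <code>code_width</code> is hypothetical, and two points are assumptions rather than facts from this text: that the width changes exactly when the next code equals 512 or 1024, and that the width is capped at 11 bits in the way Carl Muckenhoupt's routine is described as doing.

```python
FIRST_TABLE_CODE = 258  # first code used for the LZW table itself


def code_width(next_code, max_bits=11):
    """Return the number of bits used to store codes once the table has
    grown to next_code.

    Starts at 9 bits, moves to 10 at code 512 and to 11 at code 1024.
    The cap at max_bits is an assumption.
    """
    bits = 9
    while next_code >= (1 << bits) and bits < max_bits:
        bits += 1
    return bits


widths = [code_width(c) for c in (FIRST_TABLE_CODE, 511, 512, 1023, 1024, 2048)]
```

A decoder would call such a helper after each table insertion and return to 9 bits whenever code 256 is read.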


Note: I should point out that Carl and I both arrived at the above algorithm independently, which confirms that the compression used in the early SCI games was identical to that used in AGIv3.
 
Picture compression
<!-- Footnote -->''Note: I should point out that Carl and I both arrived at the above algorithm independently, which confirms that the compression used in the early SCI games was identical to that used in AGIv3.''
 
===Picture compression===


Pictures in AGI version 3 use a simple form of compression to shrink their size by a tiny amount. It was obviously recognised by the interpreter coders that four bits were being wasted for picture codes 0xF0 and 0xF2. These are the two codes that change the visual and the priority colour respectively. Since there are only 16 colours, there need not be a whole byte set aside for storing the colour. All the picture compression does is store these colours in 4 bits rather than 8.
Pictures in AGI version 3 use a simple form of compression to shrink their size by a tiny amount. It was obviously recognised by the interpreter coders that four bits were being wasted for picture codes 0xF0 and 0xF2. These are the two codes that change the visual and the priority colour respectively. Since there are only 16 colours, there need not be a whole byte set aside for storing the colour. All the picture compression does is store these colours in 4 bits rather than 8.
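
One way to picture the effect is to treat the packed data as a stream of 4-bit nibbles and expand the colour after each 0xF0 or 0xF2 back to a full byte. The sketch below does that under stated assumptions: the name <code>expand_v3_picture</code> is hypothetical, the high nibble of each byte is assumed to come first, and a single leftover nibble at the end is assumed to be padding.

```python
COLOUR_CODES = (0xF0, 0xF2)  # change visual colour, change priority colour


def expand_v3_picture(packed):
    """Expand version 3 picture data to one byte per value.

    Assumption: within each byte the high nibble is read first.
    """
    nibbles = []
    for byte in packed:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)

    out = bytearray()
    i = 0
    while i + 1 < len(nibbles):
        code = (nibbles[i] << 4) | nibbles[i + 1]  # an ordinary 8-bit value
        i += 2
        out.append(code)
        if code in COLOUR_CODES and i < len(nibbles):
            out.append(nibbles[i])  # colour stored in 4 bits, widened to 8
            i += 1
    return bytes(out)


# Hypothetical 7-byte packed stream that expands to 8 bytes.
expanded = expand_v3_picture(bytes.fromhex("f0af81020f24ff"))
```

In this example the colours 0x0A and 0x04 each occupy a single nibble in the packed form, which is where the one byte saving comes from.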
Line 116: Line 121:
* agifiles.h by Lance Ewing: header file for agifiles.c
* agifiles.h by Lance Ewing: header file for agifiles.c
* general.h by Lance Ewing: general definitions
* general.h by Lance Ewing: general definitions
* volx2.c by Lance Ewing, Joakim Möller and Martin Tillenius: program to extract resources from AGI version 2 games (UNIX version available in the AGI Utils package).
* volx2.c by Lance Ewing, Joakim Mueller and Martin Tillenius: program to extract resources from AGI version 2 games (UNIX version available in the AGI Utils package).
* xv3.pas by Lance Ewing: program to extract resources from AGI version 3 games (UNIX version available in the AGI Utils package).
* xv3.pas by Lance Ewing: program to extract resources from AGI version 3 games (UNIX version available in the AGI Utils package).
* agiver.pas by Jeremy Hayes: displays version number of game and interpreter
* agiver.pas by Jeremy Hayes: displays version number of game and interpreter