SCI/Specifications/Introduction

From ScummVM :: Wiki

Preface

Throughout the documentation, the term SCI will be used to describe the original Sierra Creative Interpreter, in any version. SCI0 will refer to all games using the SCI version 0.xxx, except for those games that use the 'in-between' game engine referred to as SCI01 (such as Quest for Glory 2). SCI1 will refer to the interpreter version 1.xxx. FreeSCI will refer specifically to either implementation details of the FreeSCI engine or to extensions of the original SCI engine specific to FreeSCI.

I would like to take this opportunity to thank the members of the FreeSCI and SCI Decoding Projects and their supporters, as well as Carl Muckenhoupt, who took the first steps of SCI decoding, for their valuable help and support.

Please note that some of the text contributions have been cut, reformatted or slightly modified in an attempt to improve the general quality of this document.


The basics

The Sierra Creative Interpreter is a stack-based virtual machine ("P-Machine"). In addition to its roughly 125 basic opcodes, it provides a set of extended functions for displaying graphics, playing sound, receiving input, writing and reading data to and from the hard disk, and handling complex arithmetical and logical functions. In version 0.xxx of the interpreter, Sierra split the game data into nine different types of information:


  • script data : SCI scripts and local data
  • vocab data : Parser data and debug information
  • patch data : Information pertaining to specific audio output devices
  • sound data : MIDI music tracks
  • cursor data : Mouse pointer shapes
  • view data : Sets of sets of image and hotspot information
  • pic data : Background images and metadata
  • font data : Bitmap fonts
  • text data : Plain text information


Each game may contain up to 1000 different elements of each data type; these elements are referred to as "resources". The index numbers of the various resources need not be in sequence; they are usually assigned arbitrarily. [1]

Resource storage

Individual resources can be stored in one of two ways: Either in resource files (which, surprisingly, are called something like "resource.000" or "resource.001"), or in external patch files (not to be confused with "patch" resources). The external files are called something like "pic.100" or "script.000", and they take precedence over data from resource files.
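As an illustration of the naming scheme, the following minimal sketch splits an external patch file name into its resource type and number. The function name is hypothetical, and the requirement of exactly three digits after the dot is an assumption based only on the examples above (it also happens to match the limit of 1000 resources per type).

```python
# Resource type names as listed for SCI0 in this document.
SCI0_TYPE_NAMES = {
    "script", "vocab", "patch", "sound", "cursor",
    "view", "pic", "font", "text",
}


def parse_sci0_patch_name(filename):
    """Split a name such as "pic.100" into ("pic", 100).

    Returns None if the name does not look like an external patch file.
    """
    type_name, dot, number = filename.lower().partition(".")
    if not dot or type_name not in SCI0_TYPE_NAMES:
        return None
    # Assumption: the extension is a three-digit decimal resource number.
    if len(number) != 3 or not (number.isascii() and number.isdigit()):
        return None
    return type_name, int(number)
```

With this sketch, "script.000" is recognised as script number 0, while "resource.001" and "resource.map" are rejected because "resource" is not a resource type name.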

There is also a file called "resource.map", which contains a lookup table for the individual resources, and another file, "resource.cfg", which contains configuration information; neither of those is used by FreeSCI.

Resource information stored in external patch files is not compressed and therefore easily readable. It is, however, preceded by two bytes: The first byte contains the resource type ORed with 0x80; the purpose of the second byte is unknown (but it appears to be ignored by the original SCI version 0 engine).
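The two-byte prefix can be stripped as in the following sketch. The function name is hypothetical; the second byte is returned untouched, since its purpose is unknown.

```python
def split_patch_file(data):
    """Return (resource_type, second_byte, payload) for an external patch file."""
    if len(data) < 2:
        raise ValueError("patch file is shorter than its two-byte header")
    first, second = data[0], data[1]
    if not first & 0x80:
        raise ValueError("first byte does not have bit 0x80 set")
    resource_type = first & 0x7F  # undo the OR with 0x80
    return resource_type, second, data[2:]
```

Everything after the first two bytes is the uncompressed resource data.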

As stated before, external patch files take precedence over resource files. Applying those external files as patches has been an option since FreeSCI version 0.2.2.
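The precedence rule amounts to a two-step lookup. The sketch below uses plain dictionaries keyed by (type, number) as stand-ins; neither the function name nor these containers reflect how FreeSCI actually stores its resources.

```python
def find_resource(key, patch_files, resource_files):
    """Look up `key`, e.g. ("pic", 100), preferring external patch files.

    Both containers are plain dicts in this sketch (an assumption made
    for illustration only).
    """
    if key in patch_files:
        return patch_files[key]      # external patch file wins
    return resource_files.get(key)   # otherwise fall back; None if absent
```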

The resource files, however, are more complicated. Each of them contains a sequence of resources preceded by a header; these resources may be compressed. It is also quite common to find resources shared by several resource files. The reason for this appears to be that, back when hard disks were rare and hard to come by, the games had to be playable from floppy disks. To prevent unnecessary disk-jockeying, common stuff was placed in several resource files, each of which was then stored on one disk.


The individual resources: A summary

The resource types of SCI0 can be roughly grouped into four sets:

  • Graphics (pic, view, font, cursor)
  • Sound (patch, sound)
  • Logic (script, vocab)
  • Text

Text resources are nothing more than a series of ASCIIZ strings; but the other resources deserve further discussion.
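Reading a text resource therefore comes down to splitting the data at its NUL bytes, as in this minimal sketch. The function name is hypothetical, and keeping an unterminated trailing string (rather than discarding it) is a choice made here, not documented behaviour.

```python
def split_asciiz(data, encoding="ascii"):
    """Split a text resource into its NUL-terminated strings."""
    strings = []
    start = 0
    while start < len(data):
        end = data.find(b"\0", start)
        if end == -1:  # unterminated tail: keep it rather than drop it
            end = len(data)
        strings.append(data[start:end].decode(encoding, errors="replace"))
        start = end + 1
    return strings
```

For example, the bytes "Hello", NUL, "World", NUL yield the two strings "Hello" and "World".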

Graphical resources summarized

The screen graphics are composed of the four graphics resources. The background pictures are drawn using vector-oriented commands from at least one pic resource (several resources may be overlaid). The fact that vector graphics were used for SCI0 allows for several interesting picture quality improvements. Pic resources also include two additional "maps": The priority map, which marks parts of the pictures with a certain priority, so that other things with less priority can be fully or partially covered by them even if they are drawn at a later time, and the control map, which delimits the walking area and some special places used by the game logic. FreeSCI uses a fourth auxiliary map during drawing (this is a heritage from Carl Muckenhoupt's original code).
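The effect of the priority map can be sketched per pixel as follows. This is only an illustration of the idea: the function name, the list-of-rows map layout, and the exact comparison (larger numbers mean higher priority, ties go to the newly drawn pixel) are all assumptions, not taken from the interpreter.

```python
def draw_with_priority(screen, priority_map, x, y, color, priority):
    """Plot one pixel unless the background has higher priority there.

    `screen` and `priority_map` are lists of rows (assumed layout).
    Returns True if the pixel was drawn.
    """
    if priority >= priority_map[y][x]:  # assumed comparison
        screen[y][x] = color
        return True
    return False  # covered by a higher-priority part of the picture
```

A pixel drawn later with a lower priority than the map entry at that position is simply skipped, which is what lets parts of the background cover it.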

View resources contain most of the games' pixmaps (multi-color bitmaps). Each view contains a list of loops, and each loop contains a list of cels. The cels themselves contain the actual image information: RLE encoded pixmaps with transparency information, and relative offsets.
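The view, loop and cel nesting might be modelled as shown below. All class and field names are hypothetical, and the on-disk layout (including the RLE encoding) is deliberately left out; `pixels` stands for an already decoded pixmap.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cel:
    width: int
    height: int
    x_offset: int           # relative offset (hypothetical field name)
    y_offset: int
    transparent_color: int  # colour treated as "not drawn"
    pixels: bytes = b""     # decoded pixmap, one byte per pixel (assumed)


@dataclass
class Loop:
    cels: List[Cel] = field(default_factory=list)


@dataclass
class View:
    loops: List[Loop] = field(default_factory=list)

    def cel(self, loop_index, cel_index):
        """Fetch one cel by its loop and cel index."""
        return self.loops[loop_index].cels[cel_index]
```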

View resources are used for foreground images as well as for background images (for example, the "Spielburg" sign in QfG1 (EGA) is stored in a view resource and added to the background picture after it is drawn).

The cursor resource contains simple bitmaps for drawing the mouse pointer. It only allows for black, white, and transparent pixels in SCI0.

The fourth graphics resource is font data. It contains bitmapped fonts which are used to draw most of the text in the games. Text is used in one of four places: text boxes, text input fields, the title bar menu, and occasionally on-screen.


Sound resources summarized

SCI0 uses two types of resources for sound: Patch resources, and sound resources. Sound resources contain a rather simple header, and music data stored in a slightly modified version of the MIDI standard.

Patch resources contain device-dependent instrument mapping information for the instruments used in the sound resources. SCI0 sound resources do not adhere to the General MIDI (GM) standard (which was, to my knowledge, written several years after the first SCI0 game was released), though later SCI versions may do so.


Logic resources summarized

Whenever the parser needs to look up a word, it looks for it in one of the vocab resources. This is not the sole purpose of the vocab resources, though; they provide information required by the debugger, including the help text for the debugger help menu and the names of the various SCI opcodes and kernel functions.

Script resources are the heart (or, rather, the brains) of the game. Consequently, they are also its most complex aspect, containing class and object information, local data, pointer relocation tables, and, of course, SCI bytecode.

To run the game, scripts are loaded onto the SCI heap, their pointers are relocated appropriately, and their functions are executed by a virtual machine. They use a set of 0x7d opcodes, each of which may take either 8 or 16 bit parameters (so, effectively, there are twice as many commands). The functions may refer to global data, local temporary data, local function parameter data, or object data (selectors). They may, additionally, indirectly refer to "hunk" data, which is stored outside of the SCI heap. Since the whole design is object oriented, functions may re-use or overload the functions of their superclasses.
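Doubling 0x7d opcodes gives 250 distinct commands, which still fits in a single byte. The sketch below shows one way such a byte could be taken apart; that the lowest bit selects the parameter width (set for 8 bit, clear for 16 bit) is an assumption not stated in the text above, and the function name is hypothetical.

```python
OPCODE_COUNT = 0x7D  # 125 basic opcodes, as stated above


def decode_opcode_byte(byte):
    """Return (opcode, operand_size_in_bytes) for one bytecode byte.

    Assumption: bit 0 set means 8-bit parameters, clear means 16-bit.
    """
    opcode = byte >> 1
    if opcode >= OPCODE_COUNT:
        raise ValueError(f"0x{byte:02x} is outside the opcode range")
    return opcode, 1 if byte & 1 else 2
```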


SCI01 extensions

SCI01 differs only in very few respects: It uses different compression algorithms (all of which are supported since FreeSCI 0.2.1), and a different type of sound resources, which may contain digitized sound effects (PCM data). The basic music data, however, still resembles MIDI data.

Also, scripts are split into two parts when loaded: A dynamic part, which resides in the heap as before, and a static part, which is stored externally to conserve heap space. [2]


SCI1 extensions

SCI1, which is not covered by FreeSCI at the moment, introduces new concepts like palettes, scaled bitmap images and several new compression algorithms. The resource limit was first increased to 16383 in SCI1.0 [3], and then to 65535 in SCI1.1. Because of the inherent limitations of the FAT file system, to which the primary target OS of Sierra's SCI interpreter was limited, patch file names were altered accordingly, with the (unpadded) resource number before the dot and a three-letter resource ID after it; examples are "0.scr" or "100.v56".

The complete list of suffixes is as follows (the leading numbers are the resource type numbers in hexadecimal):


  • 80: v56: 256 color views
  • 81: p56: 256 color background pictures
  • 82: scr: Scripts (static data)
  • 83: tex: Texts (apparently deprecated in favor of messages)
  • 84: snd: Sound data (MIDI music)
  • 86: voc: Vocabulary (not used) [4]
  • 87: fon: Fonts
  • 88: cur: Mouse cursors (deprecated in favor of v56-based cursors)
  • 89: pat: Audio patch files
  • 8a: bit: Bitmap files (purpose unknown)
  • 8b: pal: 256 color palette files
  • 8c: cda: CD Audio resources
  • 8d: aud: Audio resources (probably sound effects)
  • 8e: syn: Sync (purpose unknown)
  • 8f: msg: Message resources: Text plus metadata
  • 90: map: Map (purpose unknown)
  • 91: hep: Heap resources: Dynamic script data
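The list above can be turned into a small lookup table for SCI1 patch file names such as "0.scr" or "100.v56". This is a sketch only: the table and function names are hypothetical, and the parser is lenient in that it also accepts padded numbers.

```python
# Suffix -> type number, copied from the list above.
SCI1_SUFFIX_TYPES = {
    "v56": 0x80, "p56": 0x81, "scr": 0x82, "tex": 0x83, "snd": 0x84,
    "voc": 0x86, "fon": 0x87, "cur": 0x88, "pat": 0x89, "bit": 0x8A,
    "pal": 0x8B, "cda": 0x8C, "aud": 0x8D, "syn": 0x8E, "msg": 0x8F,
    "map": 0x90, "hep": 0x91,
}


def parse_sci1_patch_name(filename):
    """Split a name such as "100.v56" into (100, 0x80).

    Returns None if the name does not match the SCI1 naming scheme.
    """
    number, dot, suffix = filename.lower().partition(".")
    if not dot or suffix not in SCI1_SUFFIX_TYPES:
        return None
    if not (number.isascii() and number.isdigit()):
        return None
    return int(number), SCI1_SUFFIX_TYPES[suffix]
```

Note that "resource.map" is rejected even though "map" is a valid suffix, because the part before the dot is not a number.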

Apparently, the script resource split introduced in SCI01 was incorporated into the actual resource layout in SCI1.

Notes

  1. With several notable exceptions, such as script 0 and most vocab resources.
  2. The background for this is that heap space started running out in Quest for Glory 2. In order to compensate for this, changes were made to both the script library and the interpreter.
  3. This appears to be the limit; none of the SCI1.0 games I tested used resource numbers beyond 16383.
  4. Type 0x85 resources are 'memory' resources, which are only used internally.