SCI/Specifications/SCI virtual machine/Introduction

From ScummVM :: Wiki
Revision as of 06:03, 7 January 2009 by Timofonic (talk | contribs) (Merging of the SCI documentation)

Introduction

Script resources

Like any processor, the SCI virtual machine is virtually useless without code to execute. This code is provided by script resources, which constitute the logic behind any SCI game.

In order to operate on script resources, they first have to be loaded onto the heap. The heap is the only memory space that the VM can work on directly (with some restrictions); all other memory spaces have to be used implicitly or explicitly through kernel calls. The heap also contains a stack, which is heavily used by SCI bytecode.

Each script resource may contain one or more blocks of the following types:

  • Type 1: Object
  • Type 2: Code
  • Type 3: Synonym word lists
  • Type 4: Said specs
  • Type 5: Strings
  • Type 6: Class
  • Type 7: Exports
  • Type 8: Relocation table
  • Type 9: Preload text (a flag, rather than a real section)
  • Type 10: Local variables

Standard SCI0 scripts (of post-0.000.396 SCI0, approximately) consist of a sequence of blocks, each made up of a four-byte header followed by its data:

  • [00][01]: Block type as LE 16 bit value, or 0 to terminate script resource
  • [02][03]: Block size as LE 16 bit value; includes header size
  • [04]...: Data
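
The block layout above can be walked with a short loop. The following is a minimal Python sketch, not FreeSCI's actual loader; the function name iter_blocks and the hand-built sample bytes are hypothetical, and the sketch assumes the input is the raw script resource starting at its first block header.

```python
import struct

def iter_blocks(script: bytes):
    """Yield (block_type, data) for each block until the type-0 terminator."""
    pos = 0
    while pos + 2 <= len(script):
        (block_type,) = struct.unpack_from("<H", script, pos)
        if block_type == 0:
            break  # a block type of 0 terminates the script resource
        if pos + 4 > len(script):
            raise ValueError(f"truncated block header at offset {pos}")
        (block_size,) = struct.unpack_from("<H", script, pos + 2)
        # The size field includes the four header bytes themselves.
        if block_size < 4 or pos + block_size > len(script):
            raise ValueError(f"bad block size {block_size} at offset {pos}")
        yield block_type, script[pos + 4 : pos + block_size]
        pos += block_size

# Hypothetical sample: one strings block (type 5) of size 8, then the terminator.
sample = struct.pack("<HH", 5, 8) + b"hi\x00\x00" + struct.pack("<H", 0)
blocks = list(iter_blocks(sample))
```

Because the size field counts the header, advancing by the block size lands directly on the next block header.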

The code blocks contain the SCI bytecode that actually gets executed. The export block (of which there may be at most one) contains script-relative pointers to exported functions, which can be called by the SCI operations calle and callb. The local variables block, which stores one of the four variable types, is used to share variables among the objects and classes of one script.

The most important script members, however, are Objects and Classes. As in the usual OOP terms, Classes refer to object prototypes, and Objects are instantiated Classes. Unlike most OOP languages, though, SCI treats classes very similarly to objects, so that they may actually get called by SCI bytecode. Therefore, they also have their own space for selectors (see below). Also, each object or class knows which class it inherits from and, in the case of objects, which class it was instantiated from.

Note that all script segments are optional and 16 bit aligned; they are described in more detail below:


Object segments

Objects look like this (LE 16 bit values):

  • [00][01]: Magic number 0x1234
  • [02][03]: Local variable offset (filled in at run-time)
  • [04][05]: Offset of the function selector list, relative to its own position
  • [06][07]: Number of variable selectors (= #vs)
  • [08][09]: The 'species' selector
  • [0a][0b]: The 'superClass' selector
  • [0c][0d]: The '-info-' selector
  • [0e][0f]: The 'name' selector (object/class name)
  • [10]...: (#vs-4) more variable selectors
  • [08 + #vs*2][09 + #vs*2]: Number of function selectors (= #fs)
  • [0a + #vs*2]...: Selector IDs for the functions
  • [08 + #vs*2 + #fs*2][09 + #vs*2 + #fs*2]: zero
  • [0a + #vs*2 + #fs*2]...: Function selector code pointers

For objects, the selectors are simply the values for the selector IDs specified in their species class. The species is given either by its offset (in memory) or by its class ID (in the script); the same applies to the species' superclass (the superClass selector). The -info- selector typically has one of the following values (although this does not appear to be relevant for SCI):

  • 0x0000: Normal (static) object
  • 0x0001: Clone
  • 0x8000: Class

Other values are used, but do not appear to be of relevance.[1]
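
To illustrate the fixed part of the object layout, here is a minimal Python sketch that reads the header and the variable selectors and names the -info- value. It is an illustration under stated assumptions rather than the interpreter's own code: ObjectHeader, parse_object_header, and describe_info are hypothetical names, the sample values are made up, and the function selector area is deliberately left unparsed.

```python
import struct
from typing import NamedTuple, Tuple

OBJECT_MAGIC = 0x1234
# The three -info- values listed above; anything else is reported as "other".
INFO_MEANINGS = {0x0000: "normal object", 0x0001: "clone", 0x8000: "class"}

class ObjectHeader(NamedTuple):
    funcsel_offset: int          # raw value of the field at [04][05]
    species: int
    super_class: int
    info: int
    name: int
    extra_varselectors: Tuple[int, ...]

def parse_object_header(data: bytes) -> ObjectHeader:
    """Read the magic number, header fields, and #vs variable selectors."""
    magic, _locals_offset, funcsel_offset, num_vs = struct.unpack_from("<4H", data, 0)
    if magic != OBJECT_MAGIC:
        raise ValueError(f"bad magic number 0x{magic:04x}")
    if num_vs < 4:
        raise ValueError("expected at least species, superClass, -info-, name")
    selectors = struct.unpack_from(f"<{num_vs}H", data, 0x08)
    species, super_class, info, name = selectors[:4]
    return ObjectHeader(funcsel_offset, species, super_class, info, name, selectors[4:])

def describe_info(info: int) -> str:
    return INFO_MEANINGS.get(info, "other")

# Hypothetical object with exactly four variable selectors.
sample = struct.pack("<8H", 0x1234, 0, 0, 4, 3, 2, 0x0000, 0x0040)
header = parse_object_header(sample)
```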


Code segments

Code segments contain free-form SCI bytecode. Pointers into this code are held by objects, classes, and export entries; these entries are, in turn, referenced in the export segment.


Synonym word list segments

Inside these, synonyms for certain words may be found. A synonym is a tuple (a, b), where both a and b are word groups, and b is the replacement for a if this synonym is in use. They are stored as 16 bit LE values in sequence (first a, then b). Synonyms must be set explicitly by the kernel function SetSynonyms() (as described in Section 5.5.2.39). It is not possible to enable only some of a script's synonyms.
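
As a rough sketch of the pair encoding described above, the following Python reads a synonym block's data as consecutive (a, b) word-group pairs. The name parse_synonyms and the sample values are hypothetical, and the sketch assumes the data consists of whole pairs only; any terminator or padding the real format may use is not handled.

```python
import struct

def parse_synonyms(data: bytes):
    """Return a list of (a, b) word-group pairs, where b replaces a."""
    pair_count = len(data) // 4  # each pair is two 16 bit LE values
    return [struct.unpack_from("<2H", data, i * 4) for i in range(pair_count)]

# Hypothetical sample: word group 0x0123 is replaced by 0x0456, 0x0007 by 0x0008.
sample = struct.pack("<4H", 0x0123, 0x0456, 0x0007, 0x0008)
pairs = parse_synonyms(sample)
```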


Said spec segments

This section contains said specs (explained in Section 6.2.4), tightly grouped.

String segments

This segment contains a sequence of ASCIIZ (zero-terminated ASCII) strings describing class and object names, debug information, and (occasionally) game text.
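
Since other parts of a script refer to strings by position, it can be convenient to index them by their offset within the segment. A minimal Python sketch, assuming the segment data holds only zero-terminated strings; strings_by_offset and the sample names are hypothetical.

```python
def strings_by_offset(data: bytes) -> dict:
    """Map each string's offset within the segment data to its decoded text."""
    result = {}
    pos = 0
    while pos < len(data):
        end = data.find(b"\x00", pos)
        if end == -1:
            end = len(data)  # tolerate a missing final terminator
        if end > pos:
            result[pos] = data[pos:end].decode("ascii", errors="replace")
        pos = end + 1
    return result

# Hypothetical sample: two names followed by one padding byte.
sample = b"ego\x00Door\x00\x00"
table = strings_by_offset(sample)
```

Empty strings (such as alignment padding) are skipped here; a tool that needs them could record them as well.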

Class segments

Classes look similar to objects:

  • [00][01]: Magic number 0x1234
  • [02][03]: Local variable offset (filled in at run-time)
  • [04][05]: Offset of the function selector list, relative to its own position
  • [06][07]: Number of variable selectors (= #vs)
  • [08][09]: The 'species' selector
  • [0a][0b]: The 'superClass' selector
  • [0c][0d]: The '-info-' selector
  • [0e][0f]: The 'name' selector (object/class name)
  • [10]...: (#vs-4) more variable selectors
  • [08 + #vs*2][09 + #vs*2]: Selector ID of the first varselector (0)
  • [0a + #vs*2]...: Selector ID of the second etc. varselectors
  • [08 + #vs*4][09 + #vs*4]: Number of function selectors (#fs)
  • [0a + #vs*4]...: Function selector code pointers
  • [08 + #vs*4 + #fs*2][09 + #vs*4 + #fs*2]: 0
  • [0a + #vs*4 + #fs*2]...: Selector ID of the first etc. funcselectors

Simply put, they look like objects with each selector section followed by a list of selector IDs.
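
The variable selector part of a class can be read by pairing each selector ID with its value. The Python sketch below covers only that part and leaves the function selector area alone; class_varselectors is a hypothetical name, and the selector IDs and values in the sample are invented for illustration.

```python
import struct

def class_varselectors(data: bytes) -> dict:
    """Map selector ID -> value for the variable selectors of a class."""
    (magic,) = struct.unpack_from("<H", data, 0)
    if magic != 0x1234:
        raise ValueError(f"bad magic number 0x{magic:04x}")
    (num_vs,) = struct.unpack_from("<H", data, 0x06)
    # #vs values start at 0x08; the matching #vs selector IDs follow directly.
    values = struct.unpack_from(f"<{num_vs}H", data, 0x08)
    ids = struct.unpack_from(f"<{num_vs}H", data, 0x08 + num_vs * 2)
    return dict(zip(ids, values))

# Hypothetical class with four variable selectors whose IDs are 0..3.
sample = struct.pack(
    "<12H",
    0x1234, 0, 0, 4,          # magic, locals offset, funcselector offset, #vs
    7, 5, 0x8000, 0x0020,     # species, superClass, -info-, name
    0, 1, 2, 3,               # selector IDs for the four values above
)
selectors = class_varselectors(sample)
```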

Export segments

External symbols are contained herein; their number is given by the first (16 bit LE) value in the segment. All the values that follow point to addresses that the program counter will jump to when a calle operation is invoked. An exception is script 0, entry 0, which points to the first object whose ’play’ method should be invoked during startup (a magical entry point like C’s ’main()’ function).
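
Reading the export table amounts to reading a count followed by that many 16 bit values. A minimal Python sketch, assuming the input is the segment's data without the four-byte block header; parse_exports and the sample addresses are hypothetical.

```python
import struct

def parse_exports(data: bytes) -> list:
    """Return the list of exported script-relative addresses."""
    (count,) = struct.unpack_from("<H", data, 0)
    if 2 + count * 2 > len(data):
        raise ValueError("export table is shorter than its entry count claims")
    return list(struct.unpack_from(f"<{count}H", data, 2))

# Hypothetical table with two exports.
sample = struct.pack("<3H", 2, 0x0040, 0x0112)
exports = parse_exports(sample)
```

For script 0, entry 0 of the returned list would be the address of the startup object rather than a jump target.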

Relocation tables

This section contains script-relative pointers pointing to pointers inside the script. These refer to script-relative addresses and need to be relocated when the script is loaded to the heap; this is done by adding the offset of the first byte of the script on the heap to each of the values referenced in this section.[2] The section itself starts with a 16 bit LE value containing the number of pointers that follow; each of the script-relative 16 bit pointers after it has the semantics described above.
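
The relocation step can be sketched as follows in Python. This is an illustration of the described arithmetic, not the interpreter's code: relocate, heap_base, and the sample script are hypothetical, and the sketch assumes table entries are offsets from the first byte of the script and that results wrap at 16 bits.

```python
import struct

def relocate(script: bytearray, reloc_data: bytes, heap_base: int) -> None:
    """Add heap_base to every 16 bit value the relocation table points at."""
    (count,) = struct.unpack_from("<H", reloc_data, 0)
    for ptr in struct.unpack_from(f"<{count}H", reloc_data, 2):
        (value,) = struct.unpack_from("<H", script, ptr)
        struct.pack_into("<H", script, ptr, (value + heap_base) & 0xFFFF)

# Hypothetical script: the word at offset 2 holds the script-relative address 6.
script = bytearray(struct.pack("<4H", 0, 0x0006, 0, 0))
table = struct.pack("<2H", 1, 2)   # one entry, pointing at offset 2
relocate(script, table, heap_base=0x1000)
```

After the call, the word at offset 2 holds the heap address 0x1006, while words not named in the table are left untouched.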

The Preload Text flag

This is an actual script section, although it is always of size 4 (i.e. it consists only of the four-byte block header). It is only checked for presence; if script.x is loaded and contains this section, the text.x resource is also loaded implicitly along with it.[3]

Local variable segments

This section contains the script’s local variable segment, which consists of a sequence of 16 bit little-endian values.

  1. See SQ3’s inventory objects for an example.
  2. Thanks to Francois Boyer for this information.
  3. This is ignored by FreeSCI at this moment, since all resources are present in memory all the time.