SCI/Specifications/SCI in action/The message subsystem


Document conversion incomplete. Work in progress.

The message subsystem

The message subsystem developed out of a desire to lessen the amount of coordinative work between dialogue writers and programmers. The text resource of early SCI suffered from the limitation of using a tuple ⟨module, message-id⟩[1] as index into the text resources. Worse, text resources were often generated by using special syntax in the source code, thus making the ordering prone to change as the code was extended and reorganized.

In 1990, Sierra released its first icon-driven game, King's Quest V. The icon-based interface required a new approach to event handling. Each clickable object was represented as an SCI instance. When an object was clicked, a method of the corresponding instance was called with a parameter specifying which icon was used. The action linked to each icon was called a verb, and the method doVerb.

Later, when creating the message interface, it was natural to re-use this notion, and couple the module and verb with a unique number for each clickable object (since the instance addresses are unpredictable). Although later versions of the message system were more complicated, these are the essentials of its first iteration.

The concept of stage directions was introduced soon afterwards. These are directions to the voice talents in CD-ROM games. The important bit here is that the stage directions are still present in the shipped message files, and the interpreter must know how to remove them. Any string in parentheses that contains no lower case characters or digits is considered to be a stage direction; it is stripped, together with any whitespace following it, before the script sees it. Character escapes (either in the form of literal-character escapes, such as \( , or escapes using the ASCII value in hex, such as \30 ) were supported in some versions.
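The stripping rule can be illustrated with a short Python sketch. This is not the interpreter's actual code: the function name is hypothetical, only ASCII lower case letters are considered, and nested parentheses and character escapes are deliberately left out.

```python
import re

# A parenthesised run containing no lower case letters or digits,
# followed by any amount of whitespace. Escapes such as \( are NOT
# handled in this sketch.
_STAGE_DIRECTION = re.compile(r"\([^()a-z0-9]*\)\s*")

def strip_stage_directions(text: str) -> str:
    """Remove stage directions and the whitespace that follows them."""
    return _STAGE_DIRECTION.sub("", text)
```

With this rule, a string such as "(ANGRILY) Get out!" loses its leading direction, while "(see page 2)" is left alone because it contains lower case characters and a digit.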

Many versions of the message resource also support longer comments, meant for writer/coder communication. It is unclear how to derive their offsets in the resource, though.

The writers soon realised that the indexing model presented above was too simplistic. Developments in the game plot were still not handled adequately by the system, and required programmer assistance. In addition, the response to each action had to fit in one message box. Therefore, Sierra's programmers added two fields to the indexing model, namely condition (sometimes known as case) and sequence. The condition signified the state of the noun with respect to the game plot, in a manner of speaking. The sequence number allowed writers to write more than one screenful of text. Also, a piece of satellite data was introduced, namely the talker. The game might use this to display the face of the speaker on screen. One talker value was reserved for narrated parts, which don't display a face (but this is game-specific, and really outside the domain of the interpreter).
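To make the extended indexing model concrete, here is a minimal Python sketch of a message table keyed by the full tuple, with the talker carried as satellite data. All names are hypothetical, and the assumption that sequence numbers start at 1 and run without gaps is mine, not something stated above.

```python
from typing import Dict, List, NamedTuple

class MessageKey(NamedTuple):
    module: int
    noun: int
    verb: int
    cond: int
    seq: int

class Message(NamedTuple):
    talker: int  # satellite data; not part of the index
    text: str

def collect_sequence(table: Dict[MessageKey, Message],
                     module: int, noun: int, verb: int, cond: int) -> List[Message]:
    """Gather every screenful of one response, in sequence order."""
    messages = []
    seq = 1  # assumption: sequences are numbered from 1
    while MessageKey(module, noun, verb, cond, seq) in table:
        messages.append(table[MessageKey(module, noun, verb, cond, seq)])
        seq += 1
    return messages
```

A game script would display the returned messages one box at a time, using each talker value to decide whose face, if any, to show.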

Even later, recursion was added. Actually, two kinds of recursion, which it is necessary to distinguish, were added. One involved resource-internal recursion, in which the writer decides to re-use a part of another dialogue by including a reference to it. This type of recursion was limited, though; it was impossible to refer to other modules, and the reference always pointed to the first message in a sequence. The other kind of recursion was controlled by the script, and was useful for such things as cut-scenes. The two types of recursion could be mixed freely, which is why there are both message stacks and message stack stacks in Sierra SCI (no kidding! see the included error message file INTERP.ERR or SIERRA.ERR).

Ties to audio

Of course, the story does not end there. CD-ROM games contain audio, and having two addressing schemes would have been a mess. So naturally, the same scheme was used. However, this poses another problem: individual resources are usually addressed by just a type and a number, not by a message tuple like the one we saw above. Sierra's solution was to add a new resource type, along with further resource files and maps beside the main ones. The extra resource files are called something like RESOURCE.AUD and RESOURCE.SFX, and their maps are contained in map resources, either in the main resource file or separately as patches. The resource type is called audio36, and there is a sync36 type as well, which provides cueing capabilities to these resources (like sync does to ordinary audio resources, see section ???).

SCI message resources and their capabilities

Version   StD  Cond/Seq  Rec  SRec
early[2]
2.101     V
3.340     V    V         V
3.411     V    V         V
4.000     V    V         V
4.010     V    V         V
4.211     V    V         V
4.321     V    V         V    V
5.000     V    V         V    V


The maps are indexed by module (room number), so that 100.map contains map entries for all the message tuples that have the module number 100. Map number 65535 (2^16 - 1) is special: it indexes ordinary audio resources.

To patch resources of this kind, Sierra used a base-36 encoding of the message tuple as the file name. Since this results in odd-looking names, the patch files, if any, are usually stored in a separate directory on the CD.
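The exact layout of these file names is not given here, so the following Python sketch only shows the base-36 building block: converting one tuple field to a fixed-width string of digits 0-9 and A-Z. The function name, the use of upper case, and the zero padding are assumptions made for illustration.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base36(value: int, width: int) -> str:
    """Encode a non-negative integer as zero-padded base-36 text."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = ""
    while value:
        value, digit = divmod(value, 36)
        out = DIGITS[digit] + out
    if len(out) > width:
        raise ValueError("value does not fit in the requested width")
    return out.rjust(width, "0")
```

Python's built-in int(text, 36) performs the reverse conversion, which is handy for checking names found on a CD against the tuples they are supposed to encode.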

File formats

As can be seen from figure 6.1, the message file format and interfaces changed quite a bit over time. Interestingly, message resource files are perhaps the only part of the SCI system to incorporate a version number. The one exception is the format marked 'early' in the table. It is still possible to tell such a resource from a corrupt one, though.

The version numbers given in the table are the stored version tags divided by 1000, which yields a real-numbered representation; thus, the message format represented as 2.101 has a version tag of 0x835 (2101 in decimal).
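As a quick check of this arithmetic, the conversion can be written as a small Python helper. The function name is hypothetical; integer arithmetic is used rather than floating-point division so that trailing zeros (as in 4.010) survive.

```python
def format_version(tag: int) -> str:
    """Render a 16-bit version tag in the x.yyy form used in the table."""
    return f"{tag // 1000}.{tag % 1000:03d}"
```

Applied to the tags listed in the format descriptions, 0x835 gives "2.101", 0xd53 gives "3.411" and 0xfaa gives "4.010".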

All versions can be said to follow the general pattern given below; on the following pages, specific file formats are given for each version.

  • HEADER
  • MESSAGE OFFSETS
  • ACTUAL TEXT
  • COMMENTS/DEBUG

early

The exact file format is still not known, but seems to be the same as 2.101, without either the version number or the zero (Drantin?).

Version 2.101

The message resource begins with a 6-byte header, laid out thus:

Offset  Size (bytes)  Description
0       2             Version number (== 0x835)
2       2             Always zero
4       2             Number of messages in file (n)

The n offset records are laid out as follows:

Offset  Size (bytes)  Description
0       1             Noun
1       1             Verb
2       2             Offset to text (from beginning of resource)
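A reader for this layout might look like the Python sketch below. Several things in it are assumptions rather than facts from the description above: that multi-byte fields are little-endian, that the offset records follow the header directly, that each text is NUL-terminated, and that decoding as Latin-1 is acceptable. The function name is hypothetical.

```python
import struct
from typing import Dict, Tuple

def parse_v2101(data: bytes) -> Dict[Tuple[int, int], str]:
    """Map (noun, verb) to message text for a version 2.101 resource."""
    # Header: version tag, always-zero word, message count (6 bytes).
    version, _zero, count = struct.unpack_from("<HHH", data, 0)
    if version != 0x835:
        raise ValueError(f"unexpected version tag {version:#x}")
    messages = {}
    for i in range(count):
        # Each offset record is 4 bytes: noun, verb, 16-bit text offset.
        noun, verb, offset = struct.unpack_from("<BBH", data, 6 + 4 * i)
        end = data.index(b"\0", offset)  # assumption: NUL-terminated text
        messages[(noun, verb)] = data[offset:end].decode("latin-1")
    return messages
```

Stage directions are left in the returned text; stripping them would be a separate step.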

Version 3.411

The message resource begins with an 8-byte header, laid out thus:


Offset  Size (bytes)  Description
0       2             Version number (== 0xd53)
2       2             Always zero
4       2             Pointer to first byte past text data, not counting this header
6       2             Number of messages in file (n)

The n offset records are laid out as follows:

Offset  Size (bytes)  Description
0       1             Noun
1       1             Verb
2       1             Condition
3       1             Sequence
4       1             Talker
5       2             Offset to text (from beginning of resource)
7       3             Unknown
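The 10-byte offset records of this version can be unpacked as in the following Python sketch. It assumes little-endian fields and that the records follow the 8-byte header directly; the type and function names are hypothetical. The three unknown bytes are kept verbatim rather than interpreted.

```python
import struct
from typing import List, NamedTuple

class IndexRecord(NamedTuple):
    noun: int
    verb: int
    cond: int
    seq: int
    talker: int
    text_offset: int
    unknown: bytes  # three bytes of unknown meaning

HEADER = struct.Struct("<HHHH")   # version, zero, end-of-text pointer, count
RECORD = struct.Struct("<5BH3s")  # 5 + 2 + 3 = 10 bytes per record

def parse_v3411_index(data: bytes) -> List[IndexRecord]:
    """Return the offset records of a version 3.411 message resource."""
    version, _zero, _text_end, count = HEADER.unpack_from(data, 0)
    if version != 0xD53:
        raise ValueError(f"unexpected version tag {version:#x}")
    return [
        IndexRecord(*RECORD.unpack_from(data, HEADER.size + i * RECORD.size))
        for i in range(count)
    ]
```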

Version 4.010

The message resource begins with an 8-byte header, laid out thus:


Offset  Size (bytes)  Description
0       2             Version number (== 0xfaa)
2       2             Always zero
4       2             Pointer to first byte past text data, not counting this header
6       2             Number of messages in file (n)

The n offset records are laid out as follows:


Offset  Size (bytes)  Description
0       1             Noun
1       1             Verb
2       1             Condition
3       1             Sequence
4       1             Talker
5       2             Offset to text (from beginning of resource)
7       1             Noun of referenced message
8       1             Verb of referenced message
9       1             Condition of referenced message

The reference fields are set to zero if no reference is intended.
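How a reader might follow such references is sketched below in Python, working on records that have already been parsed. The names are hypothetical, and the sketch rests on several assumptions: that a record with a non-zero reference acts purely as a redirect, that the first message of a sequence has sequence number 1, and that a depth limit is an acceptable guard against reference loops. None of this is confirmed behaviour of Sierra's interpreter.

```python
from typing import Dict, NamedTuple, Tuple

Key = Tuple[int, int, int, int]  # (noun, verb, cond, seq)

class Record(NamedTuple):
    talker: int
    text: str
    ref: Tuple[int, int, int]  # (noun, verb, cond); all zero means none

def resolve(table: Dict[Key, Record], key: Key, max_depth: int = 10) -> Record:
    """Follow reference fields until a record without a reference is found."""
    for _ in range(max_depth):
        record = table[key]
        if record.ref == (0, 0, 0):
            return record
        # A reference always points at the first message of a sequence.
        key = (*record.ref, 1)
    raise RuntimeError("reference chain too deep (possible loop)")
```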


Notes

  1. While module was traditionally called room or resource number, I have chosen to use the terminology of the later message interface here.
  2. May not include a version number (needs confirmation)