Difference between revisions of "SCI/Specifications/SCI virtual machine/The Sierra PMachine"

From ScummVM :: Wiki
m (Fixed wiki syntax typos)
(Updated `lea` bytes)
 
(12 intermediate revisions by 7 users not shown)
Line 31: Line 31:


;IP           
;IP           
:The instruction pointer.10 Points to the currently executing instruction
:The instruction pointer. Points to the currently executing instruction. In ScummVM this is called the "Program Counter" or PC, which is the more general term.
Vars an array of 4 values, pointing to the current variables of each mentioned type Object points to the currently executing object.
 
;Vars
:An array of four values, each pointing to the current variables of each mentioned type
 
;Object
:Points to the currently executing object.


;SP
;SP
:The current stack pointer. Note that the stack in the original SCI interpreter is used
:The current stack pointer. Note that the stack in the original SCI interpreter is used bottom-up instead of the more usual top-down.
bottom-up instead of the more usual top-down.


The PMachine, apart from the actual instruction pointer, keeps a record of which object is currently executing.
The PMachine, apart from the actual instruction pointer, keeps a record of which object is currently executing.


==The instruction set==
==The instruction set==
Line 56: Line 59:
Certain instructions (in particular, branching ones) take relative addresses as a parameter. The actual address is calculated based on the instruction after the branching instruction itself. In this example, the <tt>bnt</tt> instruction, if the branch is made, jumps over the <tt>ldi</tt> instruction.
Certain instructions (in particular, branching ones) take relative addresses as a parameter. The actual address is calculated based on the instruction after the branching instruction itself. In this example, the <tt>bnt</tt> instruction, if the branch is made, jumps over the <tt>ldi</tt> instruction.


<syntax type="assembler">
<syntaxhighlight lang="asm">
     eq?
     eq?
     bnt +2
     bnt +2
     ldi byte 2
     ldi byte 2
     push
     push
</syntax>
</syntaxhighlight>


Relative addresses are signed values.
Relative addresses are signed values.


===Dispatch addresses==
===Dispatch addresses===
The <tt>callb</tt> and <tt>calle</tt> instructions take a so-called dispatch index as a parameter. This index is used to look up an actual script address, using the so-called dispatch table. The dispatch table is located in script block type 7 in the script file. It is a series of words - the first one, as in so many other places in the script file, is the number of entries.
The <tt>callb</tt> and <tt>calle</tt> instructions take a so-called dispatch index as a parameter. This index is used to look up an actual script address, using the so-called dispatch table. The dispatch table is located in script block type 7 in the script file. It is a series of words - the first one, as in so many other places in the script file, is the number of entries.


Line 71: Line 74:
In every call instruction, a value is included which determines the size of the parameter list, as an offset into the stack. This value discounts the list size pushed by the SCI code. For instance, consider this example from real SCI code:
In every call instruction, a value is included which determines the size of the parameter list, as an offset into the stack. This value discounts the list size pushed by the SCI code. For instance, consider this example from real SCI code:


<syntax type="assembler">
<syntaxhighlight lang="asm">
     pushi 3 ; three parameters passed
     pushi 3 ; three parameters passed
     pushi 4 ; the screen flag
     pushi 4 ; the screen flag
Line 77: Line 80:
     pTos y ; push the y property
     pTos y ; push the y property
     callk OnControl, 6
     callk OnControl, 6
</syntax>
</syntaxhighlight>


Notice that, although the <tt>callk</tt> line specifies 6 bytes of parameters, the kernel routine has access to the list size (which is at offset 8)!
Notice that, although the <tt>callk</tt> line specifies 6 bytes of parameters, the kernel routine has access to the list size (which is at offset 8)!
Line 95: Line 98:




<syntax type="C">
<syntaxhighlight lang="C">
pop(): sp -= 2; return *sp;
pop(): sp -= 2; return *sp;
push(x): *sp = x; sp += 2; return x;
push(x): *sp = x; sp += 2; return x;
</syntax>
</syntaxhighlight>


The following rules apply to opcodes:
The following rules apply to opcodes:
Line 113: Line 116:
;<nowiki>op 0x01: bnot (1 byte)</nowiki>
;<nowiki>op 0x01: bnot (1 byte)</nowiki>
:Binary not
:Binary not
<syntax type="C">
<syntaxhighlight lang="C">
acc ^= 0xffff;
acc ^= 0xffff;
</syntax>
</syntaxhighlight>




Line 121: Line 124:
;<nowiki>op 0x03: add (1 byte)</nowiki>
;<nowiki>op 0x03: add (1 byte)</nowiki>
:Addition:
:Addition:
<syntax type="C">
<syntaxhighlight lang="C">
acc += pop();
acc += pop();
</syntax>
</syntaxhighlight>




Line 129: Line 132:
;<nowiki>op 0x05: sub (1 byte)</nowiki>
;<nowiki>op 0x05: sub (1 byte)</nowiki>
:Subtraction:
:Subtraction:
<syntax type="C">
<syntaxhighlight lang="C">
acc = pop() - acc;
acc = pop() - acc;
</syntax>
</syntaxhighlight>




Line 137: Line 140:
;<nowiki>op 0x07: mul (1 byte)</nowiki>
;<nowiki>op 0x07: mul (1 byte)</nowiki>
:Multiplication:
:Multiplication:
<syntax type="C">
<syntaxhighlight lang="C">
acc *= pop();
acc *= pop();
</syntax>
</syntaxhighlight>




Line 145: Line 148:
;<nowiki>op 0x09: div (1 byte)</nowiki>
;<nowiki>op 0x09: div (1 byte)</nowiki>
:Division:
:Division:
<syntax type="C">
<syntaxhighlight lang="C">
acc = pop() / acc;
acc = pop() / acc;
</syntax>
</syntaxhighlight>
Division by zero is caught => acc = 0.
Division by zero is caught => acc = 0.


Line 154: Line 157:
;<nowiki>op 0x0b: mod (1 byte)</nowiki>
;<nowiki>op 0x0b: mod (1 byte)</nowiki>
:Modulo:
:Modulo:
<syntax type="C">
<syntaxhighlight lang="C">
acc = pop() % acc;
acc = pop() % acc;
</syntax>
</syntaxhighlight>
Modulo by zero is caught => acc = 0.
Modulo by zero is caught => acc = 0.


Line 163: Line 166:
;<nowiki>op 0x0d: shr (1 byte)</nowiki>
;<nowiki>op 0x0d: shr (1 byte)</nowiki>
:Shift Right logical:
:Shift Right logical:
<syntax type="C">
<syntaxhighlight lang="C">
acc = pop() >> acc;
acc = pop() >> acc;
</syntax>
</syntaxhighlight>




Line 171: Line 174:
;<nowiki>op 0x0f: shl (1 byte)</nowiki>
;<nowiki>op 0x0f: shl (1 byte)</nowiki>
:Shift Left logical:
:Shift Left logical:
<syntax type="C">
<syntaxhighlight lang="C">
acc = pop() << acc;
acc = pop() << acc;
</syntax>
</syntaxhighlight>




Line 179: Line 182:
;<nowiki>op 0x11: xor (1 byte)</nowiki>
;<nowiki>op 0x11: xor (1 byte)</nowiki>
:Exclusive or:
:Exclusive or:
<syntax type="C">
<syntaxhighlight lang="C">
acc ^= pop();
acc ^= pop();
</syntax>
</syntaxhighlight>




Line 187: Line 190:
;<nowiki>op 0x13: and (1 byte)</nowiki>
;<nowiki>op 0x13: and (1 byte)</nowiki>
:Logical and:
:Logical and:
<syntax type="C">
<syntaxhighlight lang="C">
acc &= pop();
acc &= pop();
</syntax>
</syntaxhighlight>




Line 195: Line 198:
;<nowiki>op 0x15: or (1 byte)</nowiki>
;<nowiki>op 0x15: or (1 byte)</nowiki>
:Logical or:
:Logical or:
<syntax type="C">
<syntaxhighlight lang="C">
acc |= pop();
acc |= pop();
</syntax>
</syntaxhighlight>




Line 203: Line 206:
;<nowiki>op 0x17: neg (1 byte)</nowiki>
;<nowiki>op 0x17: neg (1 byte)</nowiki>
:Sign negation:
:Sign negation:
<syntax type="C">
<syntaxhighlight lang="C">
acc = -acc;
acc = -acc;
</syntax>
</syntaxhighlight>




Line 211: Line 214:
;<nowiki>op 0x19: not (1 byte)</nowiki>
;<nowiki>op 0x19: not (1 byte)</nowiki>
:Boolean not:
:Boolean not:
<syntax type="C">
<syntaxhighlight lang="C">
acc = !acc;
acc = !acc;
</syntax>
</syntaxhighlight>




Line 219: Line 222:
;<nowiki>op 0x1b: eq? (1 byte)</nowiki>
;<nowiki>op 0x1b: eq? (1 byte)</nowiki>
:Equals?:
:Equals?:
<syntax type="C">
<syntaxhighlight lang="C">
prev = acc;
prev = acc;
acc = (acc == pop());
acc = (acc == pop());
</syntax>
</syntaxhighlight>




Line 228: Line 231:
;<nowiki>op 0x1d: ne? (1 byte)</nowiki>
;<nowiki>op 0x1d: ne? (1 byte)</nowiki>
:Is not equal to?
:Is not equal to?
<syntax type="C">
<syntaxhighlight lang="C">
prev = acc;
prev = acc;
acc = !(acc == pop());
acc = !(acc == pop());
</syntax>
</syntaxhighlight>




Line 237: Line 240:
;<nowiki>op 0x1f: gt? (1 byte)</nowiki>
;<nowiki>op 0x1f: gt? (1 byte)</nowiki>
:Greater than?
:Greater than?
<syntax type="C">
<syntaxhighlight lang="C">
prev = acc;
prev = acc;
acc = (pop() > acc);
acc = (pop() > acc);
</syntax>
</syntaxhighlight>




Line 246: Line 249:
;<nowiki>op 0x21: ge? (1 byte)</nowiki>
;<nowiki>op 0x21: ge? (1 byte)</nowiki>
:Greater than or equal to?
:Greater than or equal to?
<syntax type="C">
<syntaxhighlight lang="C">
prev = acc;
prev = acc;
acc = (pop() >= acc);
acc = (pop() >= acc);
</syntax>
</syntaxhighlight>




Line 255: Line 258:
;<nowiki>op 0x23: lt? (1 byte)</nowiki>
;<nowiki>op 0x23: lt? (1 byte)</nowiki>
:Less than?
:Less than?
<syntax type="C">
<syntaxhighlight lang="C">
prev = acc;
prev = acc;
acc = (pop() < acc);
acc = (pop() < acc);
</syntax>
</syntaxhighlight>




Line 264: Line 267:
;<nowiki>op 0x25: le? (1 byte)</nowiki>
;<nowiki>op 0x25: le? (1 byte)</nowiki>
:Less than or equal to?
:Less than or equal to?
<syntax type="C">
<syntaxhighlight lang="C">
prev = acc;
prev = acc;
acc = (pop() <= acc);
acc = (pop() <= acc);
</syntax>
</syntaxhighlight>




Line 273: Line 276:
;<nowiki>op 0x27: ugt? (1 byte)</nowiki>
;<nowiki>op 0x27: ugt? (1 byte)</nowiki>
:Unsigned: Greater than?
:Unsigned: Greater than?
<syntax type="C">
<syntaxhighlight lang="C">
acc = (pop() > acc);
acc = (pop() > acc);
</syntax>
</syntaxhighlight>




Line 281: Line 284:
;<nowiki>op 0x29: uge? (1 byte)</nowiki>
;<nowiki>op 0x29: uge? (1 byte)</nowiki>
:Unsigned: Greater than or equal to?
:Unsigned: Greater than or equal to?
<syntax type="C">
<syntaxhighlight lang="C">
acc = (pop() >= acc);
acc = (pop() >= acc);
</syntax>
</syntaxhighlight>




Line 289: Line 292:
;<nowiki>op 0x2b: ult? (1 byte)</nowiki>
;<nowiki>op 0x2b: ult? (1 byte)</nowiki>
:Unsigned: Less than?
:Unsigned: Less than?
<syntax type="C">
<syntaxhighlight lang="C">
acc = (pop() < acc);
acc = (pop() < acc);
</syntax>
</syntaxhighlight>




Line 297: Line 300:
;<nowiki>op 0x2d: ule? (1 byte)</nowiki>
;<nowiki>op 0x2d: ule? (1 byte)</nowiki>
:Unsigned: Less than or equal to?
:Unsigned: Less than or equal to?
<syntax type="C">
<syntaxhighlight lang="C">
acc = (pop() <= acc);
acc = (pop() <= acc);
</syntax>
</syntaxhighlight>




Line 305: Line 308:
;<nowiki>op 0x2f: bt B relpos (2 bytes)</nowiki>
;<nowiki>op 0x2f: bt B relpos (2 bytes)</nowiki>
:Branch relative if true
:Branch relative if true
<syntax type="C">
<syntaxhighlight lang="C">
if (acc) pc += relpos;
if (acc) pc += relpos;
</syntax>
</syntaxhighlight>




Line 313: Line 316:
;<nowiki>op 0x31: bnt B relpos (2 bytes)</nowiki>
;<nowiki>op 0x31: bnt B relpos (2 bytes)</nowiki>
:Branch relative if not true
:Branch relative if not true
<syntax type="C">
<syntaxhighlight lang="C">
if (!acc) pc += relpos;
if (!acc) pc += relpos;
</syntax>
</syntaxhighlight>




Line 321: Line 324:
;<nowiki>op 0x33: jmp B relpos (2 bytes)</nowiki>
;<nowiki>op 0x33: jmp B relpos (2 bytes)</nowiki>
:Jump
:Jump
<syntax type="C">
<syntaxhighlight lang="C">
pc += relpos;
pc += relpos;
</syntax>
</syntaxhighlight>




Line 329: Line 332:
;<nowiki>op 0x35: ldi B data (2 bytes)</nowiki>
;<nowiki>op 0x35: ldi B data (2 bytes)</nowiki>
:Load data immediate
:Load data immediate
<syntax type="C">
<syntaxhighlight lang="C">
acc = data;
acc = data;
</syntax>
</syntaxhighlight>
:Sign extension is done for 0x35 if required.
:Sign extension is done for 0x35 if required.


Line 338: Line 341:
;<nowiki>op 0x37: push (1 byte)</nowiki>
;<nowiki>op 0x37: push (1 byte)</nowiki>
:Push to stack
:Push to stack
<syntax type="C">
<syntaxhighlight lang="C">
push(acc)
push(acc)
</syntax>
</syntaxhighlight>




Line 346: Line 349:
;<nowiki>op 0x39: pushi B data (2 bytes)</nowiki>
;<nowiki>op 0x39: pushi B data (2 bytes)</nowiki>
:Push immediate
:Push immediate
<syntax type="C">
<syntaxhighlight lang="C">
push(data)
push(data)
</syntax>
</syntaxhighlight>
:Sign extension for 0x39 is performed where required.
:Sign extension for 0x39 is performed where required.


Line 355: Line 358:
;<nowiki>op 0x3b: toss (1 byte)</nowiki>
;<nowiki>op 0x3b: toss (1 byte)</nowiki>
:TOS subtract
:TOS subtract
<syntax type="C">
<syntaxhighlight lang="C">
pop();
pop();
</syntax>
</syntaxhighlight>
:For confirmation: Yes, this simply tosses the TOS value away.
:For confirmation: Yes, this simply tosses the TOS value away.


Line 364: Line 367:
;<nowiki>op 0x3d: dup (1 byte)</nowiki>
;<nowiki>op 0x3d: dup (1 byte)</nowiki>
:Duplicate TOS element
:Duplicate TOS element
<syntax type="C">
<syntaxhighlight lang="C">
push(*TOS);
push(*TOS);
</syntax>
</syntaxhighlight>




;<nowiki>op 0x3e: link W size (3 bytes)</nowiki>
;<nowiki>op 0x3e: link W size (3 bytes)</nowiki>
;<nowiki>op 0x3f: link B size (2 bytes)</nowiki>
;<nowiki>op 0x3f: link B size (2 bytes)</nowiki>
<syntax type="C">
<syntaxhighlight lang="C">
sp += (size * 2);
sp += (size * 2);
</syntax>
</syntaxhighlight>




Line 380: Line 383:
:Call inside script.
:Call inside script.
:(See description below)
:(See description below)
<syntax type="C">
<syntaxhighlight lang="C">
sp -= (framesize + 2 + &rest_modifier);
sp -= (framesize + 2 + &rest_modifier);
&rest_modifier = 0;
&rest_modifier = 0;
</syntax>
</syntaxhighlight>
:This calls a script subroutine at the relative position relpos, setting up the <tt>ParmVar</tt> pointer first. <tt>ParmVar points</tt> to sp-<tt>''framesize''</tt> (but see also the <tt>&rest</tt> operation). The number of parameters is stored at word offset <tt>-1</tt> relative to <tt>ParmVar</tt>.
:This calls a script subroutine at the relative position relpos, setting up the <tt>ParmVar</tt> pointer first. <tt>ParmVar points</tt> to sp-<tt>''framesize''</tt> (but see also the <tt>&rest</tt> operation). The number of parameters is stored at word offset <tt>-1</tt> relative to <tt>ParmVar</tt>.


Line 390: Line 393:
;<nowiki>op 0x42: callk W kfunct, B kparams (4 bytes)</nowiki>
;<nowiki>op 0x42: callk W kfunct, B kparams (4 bytes)</nowiki>
;<nowiki>op 0x43: callk B kfunct, B kparams (3 bytes)</nowiki>
;<nowiki>op 0x43: callk B kfunct, B kparams (3 bytes)</nowiki>
:Call kernel function (see the Kernel functions section)
:Call kernel function (see the [[SCI/Specifications/SCI virtual machine/Kernel functions|Kernel functions]] section)
<syntax type="C">
<syntaxhighlight lang="C">
sp -= (kparams + 2 + &rest_modifier);
sp -= (kparams + 2 + &rest_modifier);
&rest_modifier = 0;
&rest_modifier = 0;
(call kernel function kfunct)
(call kernel function kfunct)
</syntax>
</syntaxhighlight>




Line 402: Line 405:
:Call base script
:Call base script
:(See description below)
:(See description below)
<syntax type="C">
<syntaxhighlight lang="C">
sp -= (framesize + 2 + &rest_modifier);
sp -= (framesize + 2 + &rest_modifier);
&rest_modifier = 0;
&rest_modifier = 0;
</syntax>
</syntaxhighlight>
:This operation starts a new execution loop at the beginning of script 0, public method <tt>dispindex</tt> (Each script comes with a dispatcher list (type 7) that identifies public methods). Parameters are handled as in the call operation.
:This operation starts a new execution loop at the beginning of script 0, public method <tt>dispindex</tt> (Each script comes with a dispatcher list (type 7) that identifies public methods). Parameters are handled as in the call operation.




;<nowiki>op 0x46: calle W script, W dispindex, B framesize (5 bytes)</nowiki>
;<nowiki>op 0x46: calle W script, W dispindex, B framesize (6 bytes)</nowiki>
;<nowiki>op 0x47: calle B script, B dispindex, B framesize (4 bytes)</nowiki>
;<nowiki>op 0x47: calle B script, B dispindex, B framesize (4 bytes)</nowiki>
:Call external script
:Call external script
:(See description below)
:(See description below)
<syntax type="C">
<syntaxhighlight lang="C">
sp -= (framesize + 2 + &rest_modifier);
sp -= (framesize + 2 + &rest_modifier);
&rest_modifier = 0;
&rest_modifier = 0;
</syntax>
</syntaxhighlight>
:This operation performs a function call (implicitly placing the current program counter on the execution stack) to an ``external'' procedure of a script. More precisely, exported procedure <tt>dispindex</tt> of script <tt>script</tt> is invoked, where <tt>dispindex</tt> is an offset into the script's Exports list (i.e., <tt>dispindex = ''n'' * 2</tt> references the ''n''th exported procedure).
:This operation performs a function call (implicitly placing the current program counter on the execution stack) to an ``external'' procedure of a script. More precisely, exported procedure <tt>dispindex</tt> of script <tt>script</tt> is invoked, where <tt>dispindex</tt> is an offset into the script's Exports list (i.e., <tt>dispindex = ''n'' * 2</tt> references the ''n''th exported procedure).
:The ``Exports list'' is defined in the script's type 7 object (cf. section Script resources). It is an error to invoke a script which does not exist or which does not provide an Exports list, or to use a dispatch index which does not point into an even address within the Exports list.
:The ``Exports list'' is defined in the script's type 7 object (cf. section [[SCI/Specifications/SCI_virtual_machine/Introduction#Script_resources|Script resources]]). It is an error to invoke a script which does not exist or which does not provide an Exports list, or to use a dispatch index which does not point into an even address within the Exports list.




Line 431: Line 434:
:Send looks up the supplied selector(s) in the object pointed to by the accumulator. If the selector is a variable selector, it is read (to the accumulator) if it was sent for with zero parameters. If a parameter was supplied, this selector is set to that parameter. Method selectors are called with the specified parameters.
:Send looks up the supplied selector(s) in the object pointed to by the accumulator. If the selector is a variable selector, it is read (to the accumulator) if it was sent for with zero parameters. If a parameter was supplied, this selector is set to that parameter. Method selectors are called with the specified parameters.
:The selector(s) and parameters are retrieved from the stack frame. Send first looks up the selector ID at the bottom of the frame, then retrieves the number of parameters, and, eventually, the parameters themselves. This algorithm is iterated until all of the stack frame has been "used up". Example:
:The selector(s) and parameters are retrieved from the stack frame. Send first looks up the selector ID at the bottom of the frame, then retrieves the number of parameters, and, eventually, the parameters themselves. This algorithm is iterated until all of the stack frame has been "used up". Example:
<syntax type="assembler">
<syntaxhighlight lang="asm">
; This is an example for usage of the SCI send operation
; This is an example for usage of the SCI send operation
   pushi x      ; push the selector ID of x
   pushi x      ; push the selector ID of x
Line 443: Line 446:
   push0        ; This will read foo and return the value in acc.
   push0        ; This will read foo and return the value in acc.
   send 12      ; This operation does three quite different things.
   send 12      ; This operation does three quite different things.
</syntax>
</syntaxhighlight>




Line 480: Line 483:
:<tt>function a(y,z,...) and function b(x,y,z,...)</tt>
:<tt>function a(y,z,...) and function b(x,y,z,...)</tt>
:Since lsp does not support register indirection, we can't just push the variables in a loop (as we would in C). Instead this function is used. In this case, the instruction would be &rest 2, since we want the copying to start from y (inclusive), the second parameter.
:Since lsp does not support register indirection, we can't just push the variables in a loop (as we would in C). Instead this function is used. In this case, the instruction would be &rest 2, since we want the copying to start from y (inclusive), the second parameter.
:Note that the values are copied to the stack '''immediately'''. The <tt><nowiki>&rest_@modifier</nowiki></tt> is set to the number of variables pushed afterwards.
:Note that the values are copied to the stack '''immediately'''. The <tt><nowiki>&rest_modifier</nowiki></tt> is set to the number of variables pushed afterwards.






;<nowiki>op 0x5a: lea W type, W index ( bytes)</nowiki>
;<nowiki>op 0x5a: lea W type, W index (5 bytes)</nowiki>
;<nowiki>op 0x5b: lea B type, B index ( bytes)</nowiki>
;<nowiki>op 0x5b: lea B type, B index (3 bytes)</nowiki>
:Load Effective Address
:Load Effective Address
:The variable type is a bit-field used as follows:
:The variable type is a bit-field used as follows:
Line 501: Line 504:
::set if the accumulator is to be used as additional index  
::set if the accumulator is to be used as additional index  
:::Because it is so hard to explain, I have made a transcription of it here:
:::Because it is so hard to explain, I have made a transcription of it here:
<syntax type="C">
<syntaxhighlight lang="c">
short *vars[4];
short *vars[4];


Line 508: Line 511:
int lea(int vt, int vi)
int lea(int vt, int vi)
{
{
   return &((vars[(vt >> 1) &amp; 3])[vt &amp; 0x10 ? vi+acc : vi]);
   return &((vars[(vt >> 1) & 3])[vt & 0x10 ? vi+acc : vi]);
}
}
</syntax>
</syntaxhighlight>




Line 516: Line 519:
;<nowiki>op 0x5d: selfID (1 bytes)</nowiki>
;<nowiki>op 0x5d: selfID (1 bytes)</nowiki>
:Get 'self' identity: SCI uses heap pointers to identify objects, so this operation sets the accumulator to the address of the current object.
:Get 'self' identity: SCI uses heap pointers to identify objects, so this operation sets the accumulator to the address of the current object.
<syntax type="C">acc = object</syntax>
<syntaxhighlight lang="C">acc = object</syntaxhighlight>




Line 527: Line 530:
;<nowiki>op 0x61: pprev (1 bytes)</nowiki>
;<nowiki>op 0x61: pprev (1 bytes)</nowiki>
:Push prev: Pushes the value of the prev register, set by the last comparison bytecode (eq?, lt?, etc.), on the stack.
:Push prev: Pushes the value of the prev register, set by the last comparison bytecode (eq?, lt?, etc.), on the stack.
<syntax type="C">push(prev)</syntax>
<syntaxhighlight lang="C">push(prev)</syntaxhighlight>




Line 573: Line 576:
;<nowiki>op 0x73: lofsa B offset (2 bytes)</nowiki>
;<nowiki>op 0x73: lofsa B offset (2 bytes)</nowiki>
:Load Offset to Accumulator:
:Load Offset to Accumulator:
<syntax type="C">acc = pc + offset</syntax>
<syntaxhighlight lang="C">acc = pc + offset</syntaxhighlight>
:Adds a value to the post-operation pc and stores the result in the accumulator.
:Adds a value to the post-operation pc and stores the result in the accumulator.


Line 580: Line 583:
;<nowiki>op 0x75: lofss B offset (2 bytes)</nowiki>
;<nowiki>op 0x75: lofss B offset (2 bytes)</nowiki>
:Load Offset to Stack:
:Load Offset to Stack:
<syntax type="C">push(pc + offset)</syntax>
<syntaxhighlight lang="C">push(pc + offset)</syntaxhighlight>
:Adds a value to the post-operation pc and pushes the result on the stack.
:Adds a value to the post-operation pc and pushes the result on the stack.


Line 587: Line 590:
;<nowiki>op 0x77: push0 (1 bytes)</nowiki>
;<nowiki>op 0x77: push0 (1 bytes)</nowiki>
:Push 0:
:Push 0:
<syntax type="C">push(0)</syntax>
<syntaxhighlight lang="C">push(0)</syntaxhighlight>




Line 593: Line 596:
;<nowiki>op 0x79: push1 (1 bytes)</nowiki>
;<nowiki>op 0x79: push1 (1 bytes)</nowiki>
:Push 1:
:Push 1:
<syntax type="C">push(1)</syntax>
<syntaxhighlight lang="C">push(1)</syntaxhighlight>




Line 599: Line 602:
;<nowiki>op 0x7b: push2 (1 bytes)</nowiki>
;<nowiki>op 0x7b: push2 (1 bytes)</nowiki>
:Push 2:
:Push 2:
<syntax type="C">push(2)</syntax>
<syntaxhighlight lang="C">push(2)</syntaxhighlight>




Line 605: Line 608:
;<nowiki>op 0x7d: pushSelf (1 bytes)</nowiki>
;<nowiki>op 0x7d: pushSelf (1 bytes)</nowiki>
:Push self:
:Push self:
<syntax type="C">push(object)</syntax>
<syntaxhighlight lang="C">push(object)</syntaxhighlight>





Latest revision as of 09:03, 6 February 2022

The Sierra PMachine

Original document by Lars Skovlund, Dark Minister and Christoph Reichenbach

This document describes the design of the Sierra PMachine (the virtual CPU used for executing SCI programs). It is a special CPU in the sense that it is designed for object-oriented programs. There are three kinds of memory in SCI: variables, objects, and stack space. The stack space is used in a Last-In-First-Out manner, primarily as temporary space within a routine and for passing data from one routine to another. Note that the original interpreter uses the stack space bottom-up instead of the more usual top-down. I don’t know if this has any significance for us.

Scripts are loaded into the PMachine by creating a memory image of them on the heap. For this reason, the script file format may seem a bit obscure at times. It is optimized for in-memory performance, not readability. It should be mentioned here that the interpreter performs a lot of fixups at load time. In the script files, all addresses are specified as script-relative. These are converted to absolute offsets. The species and superClass fields of all objects are converted into pointers to the actual classes.

There are four types of variables. These are called global, local, temporary, and parameter. All four types are simple arrays of 16-bit words. A pointer is kept for each type, pointing to the list that is currently active. In fact, only the global variable list is constant in memory. The other pointers are changed frequently, as scripts are loaded/unloaded, routines called, etc. The variables are always referenced as an index into the variable list. I’ll explain the four types below - the names in parentheses will be used occasionally in the rest of the text:

Local variables (LocalVar)

This variable type is called "local" because it belongs to a specific script. Each script may have its own set of local variables, defined by script block type 10. As long as the code from a specific script is running, the local variables for that script are "active" (pointed to by the mentioned pointer).

Global variables

These, like the local variables, reside in script space (in fact, they are the local variables of script 0!). But the pointer to them remains constant for the whole duration of the program.

Temporary variables

These are allocated by specific subroutines in a script. They reside on the PMachine stack and are allocated by the link opcode. The temp variables are automatically discarded when the subroutine returns.

Parameter variables

These variables also reside on the stack. They contain information passed from one routine to another. Any routine in SCI is capable of taking a variable number of parameters, if need be. This is possible because a list size is pushed as the first thing before calling a routine. In addition to this, a frame size is passed to the call* functions.

Objects

While two adjacent variables may be entirely unrelated, the contents of an object are always related to one task. The object, like the variable tables, provides storage space. These storage slots are called properties. Depending on the instructions used, a property can be referred to by index into the object structure, or by property ID (PID). For instance, the name property has the PID 17h, but the offset 6. The property IDs are assigned by the SCI compiler, and using them is the "compatible" way of accessing object data. Whereas the offset method is used only internally by an object to access its own data, the PID method is used externally by objects to read/write the data fields of other objects. The PID method is also used to call methods in an object, either by the object itself, by another object, or by the SCI interpreter. Yes, this really happens sometimes.

The PMachine “registers”

The PMachine can be said to have a number of registers, although none of them can be accessed explicitly by script code. They are used/changed implicitly by the script opcodes:

Acc
The accumulator. Used for result storage and input for a number of opcodes.
IP
The instruction pointer. Points to the currently executing instruction. In ScummVM this is called the "Program Counter" or PC, which is the more general term.
Vars
An array of four values, each pointing to the current variables of one of the four variable types mentioned above.
Object
Points to the currently executing object.
SP
The current stack pointer. Note that the stack in the original SCI interpreter is used bottom-up instead of the more usual top-down.

The PMachine, apart from the actual instruction pointer, keeps a record of which object is currently executing.

The instruction set

The PMachine CPU potentially has 128 instructions (however, a couple of these are invalid and generate an error). Some of these instructions have a flag which specifies whether the opcode has byte- or word-sized operands (I will refer to this as variably-sized parameters, as opposed to constant parameters). Other instructions have only one calling form. These instructions simply disregard the operand size flag. Ideally, however, all script instructions should be prepared to take variably-sized operands. Yet another group of instructions takes both a constant parameter and a variably-sized parameter. The format of an opcode byte is as follows:

bit 7-1 opcode number
bit 0 operand size flag
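The bit layout above can be sketched in C as follows. The struct and function names are made up for illustration; only the split into a 7-bit opcode number and a 1-bit size flag comes from the text. Judging by the opcode listing, the odd-numbered byte (flag set) is the byte-sized form, e.g. 0x2f for bt B relpos.

```c
#include <stdint.h>

/* Hypothetical container for a decoded opcode byte. */
typedef struct {
    uint8_t number;     /* bits 7-1: opcode number, 0..127 */
    uint8_t size_flag;  /* bit 0: operand size flag */
} DecodedOp;

/* Split a raw opcode byte into its two fields. */
static DecodedOp decode_opcode_byte(uint8_t raw)
{
    DecodedOp op;
    op.number = (uint8_t)(raw >> 1);
    op.size_flag = (uint8_t)(raw & 1);
    return op;
}
```

With this split, both encodings of an instruction (for example 0x2e and 0x2f) share one opcode number and differ only in the flag.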

Relative addresses

Certain instructions (in particular, branching ones) take relative addresses as a parameter. The actual address is calculated based on the instruction after the branching instruction itself. In this example, the bnt instruction, if the branch is made, jumps over the ldi instruction.

    eq?
    bnt +2
    ldi byte 2
    push

Relative addresses are signed values.
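As a rough sketch of the address calculation (the helper names are mine, and 16-bit script addresses are an assumption): the offset is sign-extended if it was read as a single byte, then added to the address of the instruction following the branch.

```c
#include <stdint.h>

/* Sign-extend a byte-sized operand to 16 bits. */
static int16_t sign_extend_byte(uint8_t operand)
{
    return (operand & 0x80) ? (int16_t)(operand - 0x100) : (int16_t)operand;
}

/* next_pc is the address of the instruction AFTER the branch;
 * relpos is the (already sign-extended) relative address. */
static uint16_t branch_target(uint16_t next_pc, int16_t relpos)
{
    return (uint16_t)(next_pc + relpos);
}
```

In the example above, bnt +2 is followed by the two-byte ldi instruction, so a taken branch lands on push.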

Dispatch addresses

The callb and calle instructions take a so-called dispatch index as a parameter. This index is used to look up an actual script address, using the so-called dispatch table. The dispatch table is located in script block type 7 in the script file. It is a series of words - the first one, as in so many other places in the script file, is the number of entries.
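A minimal lookup sketch, under several assumptions that the text does not confirm: the table has already been read into an array of 16-bit words, the dispatch index is treated as a zero-based entry number, and an out-of-range index is reported as -1. The function name is hypothetical.

```c
#include <stdint.h>

/* table[0] holds the number of entries; the entries follow it.
 * Returns the script address for the given entry, or -1 if the
 * index is out of range (an assumption; the original interpreter
 * may handle this differently). */
static int32_t dispatch_lookup(const uint16_t *table, uint16_t index)
{
    uint16_t count = table[0];
    if (index >= count)
        return -1;
    return (int32_t)table[1 + index];
}
```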

Frame sizes

In every call instruction, a value is included which determines the size of the parameter list, as an offset into the stack. This value discounts the list size pushed by the SCI code. For instance, consider this example from real SCI code:

     pushi 3 ; three parameters passed
     pushi 4 ; the screen flag
     pTos x ; push the x property
     pTos y ; push the y property
     callk OnControl, 6

Notice that, although the callk line specifies 6 bytes of parameters, the kernel routine has access to the list size (which is at offset 8)!

PErrors

These are internal errors in the interpreter. They are usually caused by buggy script code. The PErrors end up displaying an "Oops!" box in the original interpreter (it is interesting to see how Sierra likes to believe that PErrors are caused by the user - judging by the message "You did something we weren’t expecting"!). In the original interpreter, specifying -d on the command line causes it to give more detailed information about PErrors, as well as activating the internal debugger if one occurs.

Class numbers and addresses

The key to finding a specific class lies in the class table. This class table resides in VOCAB.996, and contains the numbers of scripts that carry classes. If a script has more than one class definition, the script number is repeated as necessary. Notice how each script number is followed by a zero word? When the interpreter loads a script, it checks to see if the script has classes. If it does, a pointer to the object structure is put in this empty space.


The instructions

The instructions are described below. I have used Dark Minister's text on the subject as a starting point, but many things have changed; stuff explained more thoroughly, errors corrected, etc. The first 23 instructions (up to, but not including, bt) take no parameters.

These functions are used in the pseudocode explanations:


pop(): sp -= 2; return *sp;
push(x): *sp = x; sp += 2; return x;
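
As a minimal sketch, the two helpers can be written in C as follows. The backing array `stack_mem` and its size are assumptions made for the illustration; `sp` is a pointer to 16-bit words here, so moving it by one element corresponds to the 2-byte step in the pseudocode.

```c
#include <stdint.h>

static int16_t stack_mem[64];      /* hypothetical backing store */
static int16_t *sp = stack_mem;    /* points at the next free slot */

/* The stack grows upwards: pop steps back one word, then reads it. */
static int16_t pop(void)
{
    sp -= 1;
    return *sp;
}

/* push writes at sp, then advances by one word. */
static int16_t push(int16_t x)
{
    *sp = x;
    sp += 1;
    return x;
}
```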

The following rules apply to opcodes:

  1. Parameters are signed, unless stated otherwise. Sign extension is performed.
  2. Jumps are relative to the position of the next operation.
  3. *TOS refers to the TOS (Top Of Stack) element.
  4. "tmp" refers to a temporary register that is used for explanation purposes only.
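
Rule 1 can be illustrated with a short C sketch. The helper name `read_param` is hypothetical, and the little-endian byte order of 16-bit parameters is an assumption that the text above does not state.

```c
#include <stdint.h>

/* Reads one opcode parameter. `wide` is non-zero for the W (16-bit) form.
 * Assumption: 16-bit parameters are stored low byte first. */
static int16_t read_param(const uint8_t *code, int wide)
{
    if (wide)
        return (int16_t)(uint16_t)(code[0] | (code[1] << 8));
    return (int16_t)(int8_t)code[0];   /* sign-extend the B (8-bit) form */
}
```

With this helper, the single byte 0xfe reads as -2, while the word 0x00fe reads as 254.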



op 0x00: bnot (1 byte)
op 0x01: bnot (1 byte)
Binary not
acc ^= 0xffff;


op 0x02: add (1 byte)
op 0x03: add (1 byte)
Addition:
acc += pop();


op 0x04: sub (1 byte)
op 0x05: sub (1 byte)
Subtraction:
acc = pop() - acc;


op 0x06: mul (1 byte)
op 0x07: mul (1 byte)
Multiplication:
acc *= pop();


op 0x08: div (1 byte)
op 0x09: div (1 byte)
Division:
acc = pop() / acc;

Division by zero is caught => acc = 0.


op 0x0a: mod (1 byte)
op 0x0b: mod (1 byte)
Modulo:
acc = pop() % acc;

Modulo by zero is caught => acc = 0.
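
A possible C rendering of the guarded division and modulo, as a sketch only: the function names are hypothetical, `tos` stands for the popped value, and C's truncating behaviour for negative operands is assumed rather than confirmed for the original interpreter.

```c
#include <stdint.h>

/* acc = pop() / acc, with division by zero yielding 0. */
static int16_t op_div(int16_t tos, int16_t acc)
{
    return acc == 0 ? 0 : (int16_t)(tos / acc);
}

/* acc = pop() % acc, with modulo by zero yielding 0. */
static int16_t op_mod(int16_t tos, int16_t acc)
{
    return acc == 0 ? 0 : (int16_t)(tos % acc);
}
```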


op 0x0c: shr (1 byte)
op 0x0d: shr (1 byte)
Shift Right logical:
acc = pop() >> acc;


op 0x0e: shl (1 byte)
op 0x0f: shl (1 byte)
Shift Left logical:
acc = pop() << acc;


op 0x10: xor (1 byte)
op 0x11: xor (1 byte)
Exclusive or:
acc ^= pop();


op 0x12: and (1 byte)
op 0x13: and (1 byte)
Logical and:
acc &= pop();


op 0x14: or (1 byte)
op 0x15: or (1 byte)
Logical or:
acc |= pop();


op 0x16: neg (1 byte)
op 0x17: neg (1 byte)
Sign negation:
acc = -acc;


op 0x18: not (1 byte)
op 0x19: not (1 byte)
Boolean not:
acc = !acc;


op 0x1a: eq? (1 byte)
op 0x1b: eq? (1 byte)
Equals?:
prev = acc;
acc = (acc == pop());


op 0x1c: ne? (1 byte)
op 0x1d: ne? (1 byte)
Is not equal to?
prev = acc;
acc = !(acc == pop());


op 0x1e: gt? (1 byte)
op 0x1f: gt? (1 byte)
Greater than?
prev = acc;
acc = (pop() > acc);


op 0x20: ge? (1 byte)
op 0x21: ge? (1 byte)
Greater than or equal to?
prev = acc;
acc = (pop() >= acc);


op 0x22: lt? (1 byte)
op 0x23: lt? (1 byte)
Less than?
prev = acc;
acc = (pop() < acc);


op 0x24: le? (1 byte)
op 0x25: le? (1 byte)
Less than or equal to?
prev = acc;
acc = (pop() <= acc);


op 0x26: ugt? (1 byte)
op 0x27: ugt? (1 byte)
Unsigned: Greater than?
acc = (pop() > acc);


op 0x28: uge? (1 byte)
op 0x29: uge? (1 byte)
Unsigned: Greater than or equal to?
acc = (pop() >= acc);


op 0x2a: ult? (1 byte)
op 0x2b: ult? (1 byte)
Unsigned: Less than?
acc = (pop() < acc);


op 0x2c: ule? (1 byte)
op 0x2d: ule? (1 byte)
Unsigned: Less than or equal to?
acc = (pop() <= acc);
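
The difference between the signed and unsigned comparisons only shows up when one operand has its top bit set. The following C sketch (hypothetical function names, 16-bit operands assumed) compares gt? with ugt? and ult?:

```c
#include <stdint.h>

/* gt?: both operands are treated as signed 16-bit values. */
static int signed_gt(int16_t tos, int16_t acc)
{
    return tos > acc;
}

/* ugt?: the same bit patterns, reinterpreted as unsigned. */
static int unsigned_gt(int16_t tos, int16_t acc)
{
    return (uint16_t)tos > (uint16_t)acc;
}

/* ult?: unsigned less than. */
static int unsigned_lt(int16_t tos, int16_t acc)
{
    return (uint16_t)tos < (uint16_t)acc;
}
```

For instance, -1 is not greater than 1 as a signed value, but its bit pattern 0xffff is greater than 1 when compared unsigned.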


op 0x2e: bt W relpos (3 bytes)
op 0x2f: bt B relpos (2 bytes)
Branch relative if true
if (acc) pc += relpos;


op 0x30: bnt W relpos (3 bytes)
op 0x31: bnt B relpos (2 bytes)
Branch relative if not true
if (!acc) pc += relpos;


op 0x32: jmp W relpos (3 bytes)
op 0x33: jmp B relpos (2 bytes)
Jump
pc += relpos;
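
Since jumps are relative to the position of the next operation, the target of bt, bnt and jmp can be computed as in this sketch. The function name and the 16-bit wrap-around of addresses are assumptions made for illustration.

```c
#include <stdint.h>

/* op_addr: byte offset of the branch opcode itself.
 * op_size: 2 for the B form, 3 for the W form.
 * relpos:  the (sign-extended) relative position parameter. */
static uint16_t branch_target(uint16_t op_addr, int op_size, int16_t relpos)
{
    uint16_t next = (uint16_t)(op_addr + op_size);  /* pc after the fetch */
    return (uint16_t)(next + relpos);
}
```

A W-form jump with relpos -3 therefore branches back to itself, and a B-form branch with relpos 2 skips one 2-byte instruction.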


op 0x34: ldi W data (3 bytes)
op 0x35: ldi B data (2 bytes)
Load data immediate
acc = data;
Sign extension is done for 0x35 if required.


op 0x36: push (1 byte)
op 0x37: push (1 byte)
Push to stack
push(acc)


op 0x38: pushi W data (3 bytes)
op 0x39: pushi B data (2 bytes)
Push immediate
push(data)
Sign extension for 0x39 is performed where required.


op 0x3a: toss (1 byte)
op 0x3b: toss (1 byte)
TOS subtract
pop();
For confirmation: Yes, this simply tosses the TOS value away.


op 0x3c: dup (1 byte)
op 0x3d: dup (1 byte)
Duplicate TOS element
push(*TOS);


op 0x3e: link W size (3 bytes)
op 0x3f: link B size (2 bytes)
sp += (size * 2);


op 0x40: call W relpos, B framesize (4 bytes)
op 0x41: call B relpos, B framesize (3 bytes)
Call inside script.
(See description below)
sp -= (framesize + 2 + &rest_modifier);
&rest_modifier = 0;
This calls a script subroutine at the relative position relpos, setting up the ParmVar pointer first. ParmVar points to sp-framesize (but see also the &rest operation). The number of parameters is stored at word offset -1 relative to ParmVar.


op 0x42: callk W kfunct, B kparams (4 bytes)
op 0x43: callk B kfunct, B kparams (3 bytes)
Call kernel function (see the Kernel functions section)
sp -= (kparams + 2 + &rest_modifier);
&rest_modifier = 0;
(call kernel function kfunct)


op 0x44: callb W dispindex, B framesize (4 bytes)
op 0x45: callb B dispindex, B framesize (3 bytes)
Call base script
(See description below)
sp -= (framesize + 2 + &rest_modifier);
&rest_modifier = 0;
This operation starts a new execution loop at the beginning of script 0, public method dispindex (each script comes with a dispatcher list (type 7) that identifies public methods). Parameters are handled as in the call operation.


op 0x46: calle W script, W dispindex, B framesize (6 bytes)
op 0x47: calle B script, B dispindex, B framesize (4 bytes)
Call external script
(See description below)
sp -= (framesize + 2 + &rest_modifier);
&rest_modifier = 0;
This operation performs a function call (implicitly placing the current program counter on the execution stack) to an "external" procedure of a script. More precisely, exported procedure dispindex of script script is invoked, where dispindex is an offset into the script's Exports list (i.e., dispindex = n * 2 references the nth exported procedure).
The "Exports" list is defined in the script's type 7 object (cf. section Script resources). It is an error to invoke a script which does not exist or which does not provide an Exports list, or to use a dispatch index which does not point into an even address within the Exports list.
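
The checks described above could look roughly like this in C. This is a sketch under several assumptions that the text leaves open: words are little-endian, the list starts with its entry count, dispindex is counted in bytes from the first entry after that count, and n is zero-based. The name `exports_lookup` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* exports: the Exports list (count word, then `count` address words).
 * size:    number of bytes available at `exports`.
 * Returns 0 and stores the script address, or -1 on any error. */
static int exports_lookup(const uint8_t *exports, size_t size,
                          unsigned dispindex, uint16_t *address)
{
    if (exports == NULL || size < 2)
        return -1;                          /* no Exports list */
    unsigned count = exports[0] | (exports[1] << 8);
    if (size < 2 + 2 * (size_t)count)
        return -1;                          /* truncated list */
    if (dispindex & 1)
        return -1;                          /* not an even address */
    if (dispindex / 2 >= count)
        return -1;                          /* outside the list */
    const uint8_t *entry = exports + 2 + dispindex;
    *address = (uint16_t)(entry[0] | (entry[1] << 8));
    return 0;
}
```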


op 0x48: ret (1 byte)
op 0x49: ret (1 byte)
Return: returns from an execution loop started by call, calle, callb, send, self or super.


op 0x4a: send B framesize (2 bytes)
op 0x4b: send B framesize (2 bytes)
Send for one or more selectors. This is the most complex SCI operation (together with self and class).
Send looks up the supplied selector(s) in the object pointed to by the accumulator. If the selector is a variable selector, it is read (to the accumulator) if it was sent for with zero parameters. If a parameter was supplied, this selector is set to that parameter. Method selectors are called with the specified parameters.
The selector(s) and parameters are retrieved from the stack frame. Send first looks up the selector ID at the bottom of the frame, then retrieves the number of parameters, and, eventually, the parameters themselves. This algorithm is iterated until all of the stack frame has been "used up". Example:
; This is an example for usage of the SCI send operation
   pushi x      ; push the selector ID of x
   push1        ; 1 parameter: x is supposed to be set
   pushi 42     ; That's the value x will get set to
   pushi moveTo ; In this example, moveTo is a method selector.
   push2        ; It will get called with two parameters-
   push         ; The accumulator...
   lofss 17     ; ...and PC-relative address 17.
   pushi foo    ; Let's assume that foo is another variable selector.
   push0        ; This will read foo and return the value in acc.
   send 12      ; This operation does three quite different things.
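
The iteration over the stack frame can be sketched in C as follows. The names `walk_send_frame` and `message_fn` are hypothetical, the frame is modelled as an array of 16-bit words starting at the bottom of the frame, and &rest is ignored. The nine words pushed in the example above occupy 18 bytes (0x12) and make up three messages.

```c
#include <stddef.h>
#include <stdint.h>

/* Called once per selector found in the frame (may be NULL). */
typedef void (*message_fn)(uint16_t selector, uint16_t argc,
                           const uint16_t *argv, void *user);

/* Walks a send frame: selector ID, parameter count, parameters, repeated
 * until the frame is used up. Returns the number of messages, or -1 if
 * the frame is malformed. */
static int walk_send_frame(const uint16_t *frame, size_t frame_bytes,
                           message_fn fn, void *user)
{
    size_t words = frame_bytes / 2;
    size_t i = 0;
    int count = 0;

    while (i < words) {
        if (words - i < 2)
            return -1;                 /* selector without a count */
        uint16_t selector = frame[i];
        uint16_t argc = frame[i + 1];
        if (argc > words - i - 2)
            return -1;                 /* parameters run past the frame */
        if (fn)
            fn(selector, argc, &frame[i + 2], user);
        i += 2 + (size_t)argc;
        count++;
    }
    return count;
}
```

A callback would typically treat a variable selector with `argc == 0` as a read and with `argc >= 1` as a write, and invoke method selectors with the given parameters.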


op 0x4c
op 0x4d
op 0x4e
op 0x4f
These opcodes don't exist in SCI.


op 0x50: class W function (3 bytes)
op 0x51: class B function (2 bytes)
Get class address. Sets the accumulator to the memory address of the specified function of the current object.


op 0x52
op 0x53
These opcodes don't exist in SCI.


op 0x54: self B stackframe (2 bytes)
op 0x55: self B stackframe (2 bytes)
Send to self. This operation is the same as the send operation, except that it sends to the current object instead of the object pointed to by the accumulator.


op 0x56: super W class, B stackframe (4 bytes)
op 0x57: super B class, B stackframe (3 bytes)
Send to any class. This operation is the same as the send operation, except that it sends to an arbitrary class.


op 0x58: &rest W paramindex (3 bytes)
op 0x59: &rest B paramindex (2 bytes)
Pushes all or part of the ParmVar list on the stack. The number specifies the first parameter variable to be pushed. I'll give a small example. Suppose we have two functions:
function a(y,z) and function b(x,y,z)
function b wants to call function a with its own y and z parameters. Easy job, using the normal lsp instruction. Now suppose that both function a and b are designed to take a variable number of parameters:
function a(y,z,...) and function b(x,y,z,...)
Since lsp does not support register indirection, we can't just push the variables in a loop (as we would in C). Instead this function is used. In this case, the instruction would be &rest 2, since we want the copying to start from y (inclusive), the second parameter.
Note that the values are copied to the stack immediately. The &rest_modifier is set to the number of variables pushed afterwards.
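
A minimal C sketch of the copying step, with hypothetical names: `params[0]` is taken to be the first parameter, `argc` the number of parameters the current function received, and `first` the 1-based parameter number given to &rest. The return value is the count that would end up in &rest_modifier.

```c
#include <stdint.h>

/* Pushes parameters first..argc onto the stack at *sp (growing upwards)
 * and returns how many were pushed. */
static int rest(const int16_t *params, int argc, int first, int16_t **sp)
{
    int pushed = 0;

    if (first < 1)
        first = 1;   /* defensive guard, not taken from the text */
    for (int i = first; i <= argc; i++) {
        **sp = params[i - 1];
        (*sp)++;
        pushed++;
    }
    return pushed;
}
```

If function b was called with four parameters, `&rest 2` copies the last three of them, so the following call instruction has to account for three extra words.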


op 0x5a: lea W type, W index (5 bytes)
op 0x5b: lea B type, B index (3 bytes)
Load Effective Address
The variable type is a bit-field used as follows:
  bit 0: unused
  bits 1-2: the number of the variable list to use
    0 - globalVar
    2 - localVar
    4 - tempVar
    6 - parmVar
  bit 3: unused
  bit 4: set if the accumulator is to be used as additional index
Because it is so hard to explain, I have made a transcription of it here:
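short *vars[4];   /* globalVar, localVar, tempVar, parmVar */

int acc;

/* Returns the address of the selected variable. */
short *lea(int vt, int vi)
{
  return &((vars[(vt >> 1) & 3])[(vt & 0x10) ? vi+acc : vi]);
}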


op 0x5c: selfID (1 byte)
op 0x5d: selfID (1 byte)
Get 'self' identity: SCI uses heap pointers to identify objects, so this operation sets the accumulator to the address of the current object.
acc = object


op 0x5e
op 0x5f
These opcodes don't exist in SCI.


op 0x60: pprev (1 byte)
op 0x61: pprev (1 byte)
Push prev: Pushes the value of the prev register, set by the last comparison bytecode (eq?, lt?, etc.), on the stack.
push(prev)


op 0x62: pToa W offset (3 bytes)
op 0x63: pToa B offset (2 bytes)
Property To Accumulator: Copies the value of the specified property (in the current object) to the accumulator. The property is specified as an offset into the object structure.


op 0x64: aTop W offset (3 bytes)
op 0x65: aTop B offset (2 bytes)
Accumulator To Property: Copies the value of the accumulator into the specified property (in the current object). The property number is specified as an offset into the object structure.


op 0x66: pTos W offset (3 bytes)
op 0x67: pTos B offset (2 bytes)
Property To Stack: Same as pToa, but pushes the property value on the stack instead.


op 0x68: sTop W offset (3 bytes)
op 0x69: sTop B offset (2 bytes)
Stack To Property: Same as aTop, but gets the new property value from the stack instead.


op 0x6a: ipToa W offset (3 bytes)
op 0x6b: ipToa B offset (2 bytes)
Increment Property and copy To Accumulator: Increments the value of the specified property of the current object and copies it into the accumulator. The property number is specified as an offset into the object structure.


op 0x6c: dpToa W offset (3 bytes)
op 0x6d: dpToa B offset (2 bytes)
Decrement Property and copy To Accumulator: Decrements the value of the specified property of the current object and copies it into the accumulator. The property number is specified as an offset into the object structure.


op 0x6e: ipTos W offset (3 bytes)
op 0x6f: ipTos B offset (2 bytes)
Increment Property and push to Stack: Same as ipToa, but pushes the result on the stack instead.


op 0x70: dpTos W offset (3 bytes)
op 0x71: dpTos B offset (2 bytes)
Decrement Property and push to stack: Same as dpToa, but pushes the result on the stack instead.


op 0x72: lofsa W offset (3 bytes)
op 0x73: lofsa B offset (2 bytes)
Load Offset to Accumulator:
acc = pc + offset
Adds a value to the post-operation pc and stores the result in the accumulator.


op 0x74: lofss W offset (3 bytes)
op 0x75: lofss B offset (2 bytes)
Load Offset to Stack:
push(pc + offset)
Adds a value to the post-operation pc and pushes the result on the stack.


op 0x76: push0 (1 byte)
op 0x77: push0 (1 byte)
Push 0:
push(0)


op 0x78: push1 (1 byte)
op 0x79: push1 (1 byte)
Push 1:
push(1)


op 0x7a: push2 (1 byte)
op 0x7b: push2 (1 byte)
Push 2:
push(2)


op 0x7c: pushSelf (1 byte)
op 0x7d: pushSelf (1 byte)
Push self:
push(object)


op 0x7e
op 0x7f
These operations don't exist in SCI.


op 0x80 - 0xfe: [ls+-][as][gltp]i? W index (3 bytes)
op 0x81 - 0xff: [ls+-][as][gltp]i? B index (2 bytes)
The remaining SCI operations work on one of the four variable types. The variable index is retrieved by taking the heap pointer for the specified variable type, adding the index and possibly the accumulator, and executing the operation according to the following table:
  Bit 0
    Used as with all other opcodes with variably-sized parameters:
      0: 16 bit parameter
      1: 8 bit parameter
  Bits 1,2
    The type of variable to operate on:
      0: Global
      1: Local
      2: Temporary
      3: Parameter
  Bit 3
    Whether to use the accumulator or the stack for operations:
      0: Accumulator
      1: Stack
  Bit 4
    Whether to use the accumulator as a modifier to the supplied index:
      0: Don't use accumulator as an additional index
      1: Use the accumulator as an additional index
  Bits 5,6
    The type of execution to perform:
      0: Load the variable to the accumulator or stack
      1: Store the accumulator or stack in the variable
      2: Increment the variable, then load it into acc or on the stack
      3: Decrement the variable, then load it into acc or on the stack
  Bit 7
    Always 1 (identifier for these opcodes)
Example: "sagi 2" would Store the Accumulator in the Global variable indexed with 2 plus the current accumulator value (this rarely makes sense, obviously). "+sp 6" would increment the parameter at offset 6 (the third parameter, not counting the argument counter), and push it on the stack.
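
The bit layout above can be turned into a small decoder. The following C sketch uses hypothetical names (`VarOp`, `decode_var_op`) and builds the mnemonic from the pattern [ls+-][as][gltp]i? given in the opcode heading.

```c
#include <stdint.h>

typedef struct {
    int param_bytes;  /* 2 for the W form, 1 for the B form (bit 0) */
    int var_type;     /* 0 global, 1 local, 2 temporary, 3 parameter */
    int use_stack;    /* 0 accumulator, 1 stack (bit 3) */
    int acc_index;    /* 1 if acc is added to the index (bit 4) */
    int operation;    /* 0 load, 1 store, 2 increment, 3 decrement */
} VarOp;

/* Decodes opcodes 0x80..0xff. Writes the mnemonic (at most 4 characters
 * plus terminator) into name. Returns -1 if bit 7 is not set. */
static int decode_var_op(uint8_t opcode, VarOp *out, char name[5])
{
    int n = 0;

    if (!(opcode & 0x80))
        return -1;
    out->param_bytes = (opcode & 0x01) ? 1 : 2;
    out->var_type    = (opcode >> 1) & 3;
    out->use_stack   = (opcode >> 3) & 1;
    out->acc_index   = (opcode >> 4) & 1;
    out->operation   = (opcode >> 5) & 3;

    name[n++] = "ls+-"[out->operation];
    name[n++] = "as"[out->use_stack];
    name[n++] = "gltp"[out->var_type];
    if (out->acc_index)
        name[n++] = 'i';
    name[n] = '\0';
    return 0;
}
```

Under this layout, opcode 0xb0 decodes to sagi with a 16-bit index, and 0xcf decodes to +sp with an 8-bit index.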