SCUMM/Technical Reference/Charset resources

From ScummVM :: Wiki
Latest revision as of 14:32, 7 December 2023

Introduction

Character sets define the fonts used by SCUMM to draw text, such as dialogue, on the screen.

SCUMM V1 and V2

The V1 and V2 font format is identical to that found in V3 games; the big difference is that in V1 and V2, the font is not stored in the game data files, but rather is hardcoded into the executable.

Therefore, ScummVM has to include fonts for the various versions of the two affected games (Maniac Mansion and Zak McKracken). The fonts differ depending on the localization of the game.

Currently, ScummVM includes fonts for the English, French, German, Italian and Spanish game variants. It is not known whether other localizations exist; if you encounter one, please tell the team about it!

Another minor difference from the V3 format is that every character is exactly 8x8 pixels. The font is therefore always monospaced, unlike V3 fonts.

SCUMM V3

The header looks as follows:

Size     Type     Description
4                 unknown
1        byte     number of characters (nchar)
1        byte     height of the font
nchar    bytes    character width table (one byte for every char), starting at byte 6

After this header the character data starts. Every character in the charset takes up exactly 8 bytes, representing an 8x8 pixel cell that contains the actual character. The character's real width and height come from the width table and the font height in the charset header.
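The layout above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the width table is taken to hold one byte per character starting at byte 6, each of a glyph's 8 bytes is taken to be one row, and bit 7 is assumed to be the leftmost pixel. The function names and the sample data are made up for this example.

```python
from typing import List, Tuple


def parse_v3_charset(data: bytes) -> Tuple[int, List[int], List[bytes]]:
    """Split a V3 charset into (font height, width table, 8-byte glyph cells)."""
    num_chars = data[4]          # bytes 0..3 are unknown
    height = data[5]
    # Assumption: the width table holds num_chars bytes, starting at byte 6.
    widths = list(data[6:6 + num_chars])
    glyph_base = 6 + num_chars
    glyphs = [data[glyph_base + 8 * i:glyph_base + 8 * (i + 1)]
              for i in range(num_chars)]
    return height, widths, glyphs


def render_glyph(cell: bytes, width: int, height: int) -> List[str]:
    """Draw one glyph as text; assumes bit 7 of each byte is the leftmost pixel."""
    return ["".join("#" if row & (0x80 >> x) else "." for x in range(width))
            for row in cell[:height]]


# Synthetic one-character charset (not real game data): 5 pixels wide, 8 high.
demo = bytes([0, 0, 0, 0, 1, 8, 5]) + bytes(
    [0x20, 0x50, 0x88, 0xF8, 0x88, 0x88, 0x00, 0x00])
height, widths, glyphs = parse_v3_charset(demo)
picture = render_glyph(glyphs[0], widths[0], height)
print("\n".join(picture))
```

Only the leftmost `width` bits of each row are drawn, which is how a fixed 8x8 cell can still yield a proportional font.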

SCUMM V4

The header looks as follows:

Size    Type           Description
2                      unknown
15      bytes          color map
1       byte           number of bits per pixel
1       byte           height of the font
2       short (LE)     number of characters
1024    256*quad LE    character data offsets

Character glyphs may be 1, 2, 4 or 8 bits per pixel, and can be masked.

The color map contains the colors each pixel of the character glyph is drawn as. Pixel value 0 is used for transparency; the other values are mapped using the color map in the header.

The character data offsets are measured from the byte after the end of the color map (byte 17 in V4; byte 29 in V5 and V6, whose header is 12 bytes longer). An offset can be 0 if that particular character is not encoded in the character set. The character data itself is formatted as follows:

Size    Type        Description
1       byte        width of character
1       byte        height of character
1       byte        X offset
1       byte        Y offset
many    bytes...    glyph data bitstream

The X and Y offsets are added to the screen coordinates of the top-left corner of the glyph before drawing. This is useful for, say, shadowed text. Needless to say, glyphs don't all have to be the same size, although in all the examples I have they are the same height.
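As a rough sketch of how the offset table and the per-character record fit together, the following Python reads one record. It assumes the offsets are unsigned 32-bit little-endian values counted from the byte after the color map, and that the X and Y offsets are signed bytes (the table only says "byte"). `read_glyph_record` and the sample data are hypothetical names and values chosen for this example.

```python
import struct
from typing import Optional, Tuple


def read_glyph_record(data: bytes, base: int,
                      index: int) -> Optional[Tuple[int, int, int, int, int]]:
    """Return (width, height, x_off, y_off, bitstream_start) for one character.

    `base` is the position of the byte after the color map.
    Returns None when the character's offset is 0 (character not encoded).
    """
    (num_chars,) = struct.unpack_from("<H", data, base + 2)
    if not 0 <= index < num_chars:
        raise IndexError("character index out of range")
    # The offset table follows bits-per-pixel (1), height (1) and count (2).
    (offset,) = struct.unpack_from("<I", data, base + 4 + 4 * index)
    if offset == 0:
        return None
    # Assumption: X/Y offsets are signed bytes; width and height are unsigned.
    width, height, x_off, y_off = struct.unpack_from("<BBbb", data, base + offset)
    return width, height, x_off, y_off, base + offset + 4


# Synthetic data with only two characters (a real V4 table has 256 entries).
header = bytes(2) + bytes(range(1, 16))          # unknown + color map
fields = struct.pack("<BBH", 1, 8, 2)            # bpp, font height, nchar
offsets = struct.pack("<II", 0, 12)              # char 0 missing, char 1 present
record = struct.pack("<BBbb", 3, 2, 1, -1) + bytes([0b10101100])
demo = header + fields + offsets + record
base = len(header)                               # byte after the color map
print(read_glyph_record(demo, base, 0))
print(read_glyph_record(demo, base, 1))
```

The returned `bitstream_start` is the position of the first glyph data byte, so the caller can hand the rest of the record to a bitstream decoder.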

The data bitstream encodes the pixels in the glyph in left-to-right, top-to-bottom order. Multiple pixels are encoded per byte. The pixels are packed most significant bit first: the first pixel in the stream occupies the top bits of the first data byte, the next pixel the bits below that, and so on. In the diagrams below, each digit is the index of the pixel that the bit belongs to. For example, at one bit per pixel:

Bit position:  7      0 7      0 ...
Words of data: 01234567 89ABCDEF

At two bits per pixel:

Bit position:  7      0 7      0 ...
Words of data: 00112233 44556677

And at four bits per pixel:

Bit position:  7      0 7      0 ...
Words of data: 00001111 22223333
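The packing shown in these diagrams can be decoded with a small loop. The sketch below assumes, as the description suggests, that the stream is continuous with no padding at the end of a row. `unpack_pixels` is a hypothetical helper name, and the sample bytes are made up.

```python
from typing import List


def unpack_pixels(stream: bytes, width: int, height: int,
                  bpp: int) -> List[List[int]]:
    """Decode a most-significant-bit-first pixel stream into rows of values."""
    if bpp not in (1, 2, 4, 8):
        raise ValueError("bits per pixel must be 1, 2, 4 or 8")
    mask = (1 << bpp) - 1
    pixels = []
    bit_pos = 0
    for _ in range(width * height):
        byte = stream[bit_pos // 8]
        # bpp divides 8, so a pixel never straddles two bytes.
        shift = 8 - bpp - (bit_pos % 8)
        pixels.append((byte >> shift) & mask)
        bit_pos += bpp
    # Assumption: rows follow each other directly, without byte alignment.
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


two_bpp = unpack_pixels(bytes([0b00011011, 0b11100100]), 4, 2, 2)
one_bpp = unpack_pixels(bytes([0b10101100]), 3, 2, 1)
print(two_bpp)
print(one_bpp)
```

A pixel value of 0 would then be skipped as transparent, and any other value looked up in the color map.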

SCUMM V5 and V6

Like all other resources in V5 and later, the charset data is stored in a chunk, in this case a 'CHAR' chunk. The header looks as follows:

Size       Type             Description
8          chunk tag        CHAR chunk tag
4          quad LE          size-23
2          short            version? (always 0x6303 in DOTT)
15         bytes            color map
1          byte             number of bits per pixel
1          byte             height of the font
2          short (LE)       number of characters (nchar)
nchar*4    nchar*quad LE    character data offsets

Observe that this header is identical to the V4 header with 12 bytes (the chunk tag and the size field) added to the start of it; the 2-byte version field sits where V4 has its 2 unknown bytes. The charset format is otherwise identical to the V4 format described above.
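Because the two headers differ only in what precedes the color map, one routine can read both. The following is a minimal sketch based purely on the sizes in the tables. `PREFIX_LEN` and `parse_charset_header` are hypothetical names, and the routine reads as many offsets as the header's character count states.

```python
import struct
from typing import Dict

# Bytes before the color map: 2 in V4; 8 + 4 + 2 = 14 in V5/V6 (per the tables).
PREFIX_LEN = {4: 2, 5: 14, 6: 14}


def parse_charset_header(data: bytes, version: int) -> Dict[str, object]:
    """Read the fields shared by the V4 and V5/V6 charset headers."""
    color_map_start = PREFIX_LEN[version]
    base = color_map_start + 15      # byte after the color map: 17 or 29
    bpp, height, num_chars = struct.unpack_from("<BBH", data, base)
    offsets = struct.unpack_from("<%dI" % num_chars, data, base + 4)
    return {
        "color_map": data[color_map_start:base],
        "bits_per_pixel": bpp,
        "height": height,
        "num_chars": num_chars,
        "offsets": list(offsets),    # each relative to `base`; 0 = not encoded
        "base": base,
    }


# Synthetic headers (not real game data) sharing the same trailing fields.
common = bytes(range(1, 16)) + struct.pack("<BBH", 2, 10, 3) \
    + struct.pack("<3I", 0, 16, 40)
v4 = parse_charset_header(bytes(2) + common, 4)
v5 = parse_charset_header(b"CHAR" + bytes(10) + common, 5)
print(v4["base"], v5["base"])
```

Everything after `base` is then handled the same way for both versions.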

SCUMM V7 and V8

NUT (V7 & V8) charset format

In V7 and V8 (The Dig, Full Throttle, The Curse of Monkey Island), the fonts were stored in separate files with the extension "nut". We thus call the format used in these games the "NUT format".

Header of a NUT file:

Size    Type         Description
4       chunk tag    ANIM chunk tag
4       quad LE      size of the ANIM chunk (the AHDR chunk and all FRME chunks included)
4       chunk tag    AHDR chunk tag
4       quad LE      size of the AHDR chunk (the data up to the first FRME chunk)
2       short LE     number of chars

After the AHDR chunk there is one FRME chunk per character, as many as the number of chars given in the header:

Size    Type         Description
4       chunk tag    FRME chunk tag
4       quad LE      size of the FRME chunk (including the whole FOBJ chunk)
4       chunk tag    FOBJ chunk tag
4       quad LE      size of the FOBJ chunk
2       short LE     codec id (1, 21 or 44)
2       short LE     X display position of char
2       short LE     Y display position of char
2       short LE     width of char
2       short LE     height of char
2       short LE     unknown
2       short LE     unknown
rest    bytes        font graphics data; it fills the remainder of the FRME chunk
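The seven shorts that follow the FOBJ tag and size can be decoded in one step. This sketch covers only that fixed 14-byte part, exactly as listed in the table; it does not walk the chunk structure or decode any codec. `FobjHeader` and `parse_fobj_header` are hypothetical names, and treating every field as unsigned is an assumption.

```python
import struct
from typing import NamedTuple


class FobjHeader(NamedTuple):
    codec: int
    x: int
    y: int
    width: int
    height: int
    unknown1: int
    unknown2: int


def parse_fobj_header(payload: bytes) -> FobjHeader:
    """Decode the seven little-endian shorts that follow the FOBJ tag and size."""
    # Assumption: all fields are unsigned; the table only says "short LE".
    return FobjHeader(*struct.unpack_from("<7H", payload, 0))


# Synthetic FOBJ payload (not real game data): header plus 4 bytes of graphics.
demo = struct.pack("<7H", 44, 0, 2, 9, 12, 0, 0) + b"\x00" * 4
hdr = parse_fobj_header(demo)
gfx = demo[14:]                      # the font graphics data starts here
print(hdr, len(gfx))
```

How `gfx` is interpreted depends on the codec id, which this example leaves untouched.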