Supporting GUI Translation/Translations DAT Format


The translations.dat file is generated from the po/*.po files in the ScummVM source code repository. It contains the data needed by ScummVM to display a translated GUI.

The file is a binary file made up of the following parts, in this order:

  1. a header
  2. a block with the list of languages
  3. a block with the list of codepages (version 3 only)
  4. a block with the English messages
  5. a block with the translated messages for language 1
  6. a block with the translated messages for language 2
  7. ...
  8. a block with the translated messages for language n
  9. a block with the mapping for codepage 1 (version 3 only)
  10. a block with the mapping for codepage 2 (version 3 only)
  11. ...
  12. a block with the mapping for codepage m (version 3 only)

The latest file format version is 4. The description below covers version 4 and indicates which blocks were added, removed, or changed relative to versions 1, 2, and 3.

  • Version 2 adds context information in the translated messages.
  • Version 3 adds code page descriptions (for translations not using ASCII or ISO-8859-1).
  • Version 4 uses UTF-8 for translations and drops the code page descriptions.

The header

Type      Size  Order  Description
String    12           'TRANSLATIONS'
Byte      1            Version (of the file format)
uint16    2     BE     Number of translations
(uint16)  (2)   (BE)   Only in version 3: Number of code pages
uint32    4     BE     Size in bytes of block 1 (list of languages). In version 3 and below the type is uint16.
(uint16)  (2)   (BE)   Only in version 3: Size in bytes of block 2 (list of codepages)
uint32    4     BE     Size in bytes of block 3 (English messages). In version 3 and below the type is uint16.
uint32    4     BE     Size in bytes of block 4 (first translation). In version 3 and below the type is uint16.
uint32    4     BE     Size in bytes of block 5 (second translation). In version 3 and below the type is uint16.
...       ...   ...    ...
uint32    4     BE     Size in bytes of block n+3 (nth translation). In version 3 and below the type is uint16.

In version 3, which includes code page mapping information, the sizes of the codepage mapping blocks are not written to the header, since each of them is always 256 * 4 = 1024 bytes long.
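
As an illustration, the header layout described above could be read with the following Python sketch. The function name parse_header and the keys of the returned dictionary are hypothetical, chosen only for this example; the field order and widths follow the header table, and this is not ScummVM's own loader.

```python
import struct

MAGIC = b"TRANSLATIONS"


def parse_header(data: bytes) -> dict:
    """Parse a translations.dat header (hypothetical helper, versions 1 to 4)."""
    if data[:12] != MAGIC:
        raise ValueError("not a translations.dat file")
    version = data[12]
    pos = 13

    # Block sizes are uint32 in version 4 and uint16 in version 3 and below.
    size_fmt = ">I" if version >= 4 else ">H"

    def read(fmt: str) -> int:
        nonlocal pos
        (value,) = struct.unpack_from(fmt, data, pos)
        pos += struct.calcsize(fmt)
        return value

    num_translations = read(">H")
    # The code page count and code page list size exist only in version 3.
    num_codepages = read(">H") if version == 3 else 0
    languages_size = read(size_fmt)
    codepages_size = read(">H") if version == 3 else 0
    english_size = read(size_fmt)
    translation_sizes = [read(size_fmt) for _ in range(num_translations)]

    return {
        "version": version,
        "num_translations": num_translations,
        "num_codepages": num_codepages,
        "languages_size": languages_size,
        "codepages_size": codepages_size,
        "english_size": english_size,
        "translation_sizes": translation_sizes,
        "header_size": pos,  # number of bytes consumed by the header
    }
```

Assuming the blocks follow the header back to back in the order listed at the top of this page, the offset of any block can be obtained by adding header_size to the sizes of the blocks that precede it.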

List of Languages

For each translation there is the following entry:

Type      Size  Order  Description
uint16    2     BE     Size (in bytes) of the following string (including the terminating '\0')
String    ??           Language and country code (e.g. 'de_DE'). The country code is optional (e.g. 'eu').
uint16    2     BE     Size (in bytes) of the language name string (including the terminating '\0')
String    ??           Language name (e.g. 'Deutsch'). This is the name that appears in the GUI.
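
The length-prefixed string layout used by this block can be sketched as follows. The helper names read_string and parse_languages are hypothetical, and decoding the strings as UTF-8 is an assumption that matches version 4; older versions may use a different encoding for the language name.

```python
import struct


def read_string(block: bytes, pos: int) -> tuple[str, int]:
    """Read a uint16 BE size followed by a '\\0'-terminated string.

    Returns the decoded string and the position just after it.
    Assumption: the bytes are UTF-8 (true for version 4).
    """
    (size,) = struct.unpack_from(">H", block, pos)
    pos += 2
    raw = block[pos:pos + size]
    # The stored size includes the terminating '\0', which is dropped here.
    return raw.rstrip(b"\0").decode("utf-8"), pos + size


def parse_languages(block: bytes, count: int) -> list[tuple[str, str]]:
    """Return (language code, language name) pairs for `count` translations."""
    languages = []
    pos = 0
    for _ in range(count):
        code, pos = read_string(block, pos)
        name, pos = read_string(block, pos)
        languages.append((code, name))
    return languages
```

Here count is the number of translations taken from the header.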

List of Codepages

This block is only present in version 3 and was removed in version 4.

For each codepage there is the following entry:

Type      Size  Order  Description
uint16    2     BE     Size (in bytes) of the following string (including the terminating '\0')
String    ??           Codepage name (e.g. 'iso-8859-5')

English messages

Type      Size  Order  Description
uint16    2     BE     Number of messages
                       First message entry (see below)
                       Second message entry
...       ...   ...    ...
                       Last message entry

Each message entry has the following format:

Type      Size  Order  Description
uint16    2     BE     Size (in bytes) of the English message string (including the terminating '\0')
String    ??           English message

The messages are sorted in alphabetical order.
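
Because the messages are sorted, a reader can look up the index of an English message with a binary search instead of a linear scan. The sketch below shows one way to do this; the names parse_english_messages and find_message are hypothetical, and it assumes that "alphabetical order" means plain byte-wise ordering (as strcmp would give), which this page does not spell out.

```python
import struct
from bisect import bisect_left


def parse_english_messages(block: bytes) -> list[str]:
    """Read the message count, then each size-prefixed English message."""
    (count,) = struct.unpack_from(">H", block, 0)
    pos = 2
    messages = []
    for _ in range(count):
        (size,) = struct.unpack_from(">H", block, pos)
        pos += 2
        # The stored size includes the terminating '\0'.
        messages.append(block[pos:pos + size].rstrip(b"\0").decode("utf-8"))
        pos += size
    return messages


def find_message(messages: list[str], text: str) -> int:
    """Return the index of `text` in the sorted table, or -1 if absent.

    Assumption: the table is sorted byte-wise. Python compares str by
    code point, which gives the same order as comparing UTF-8 bytes.
    """
    i = bisect_left(messages, text)
    if i < len(messages) and messages[i] == text:
        return i
    return -1
```

The index returned by find_message is the value that translation entries use to refer back to the English message table.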

Translated messages

For each translation there is a block with the following format:

Type      Size  Order  Description
uint16    2     BE     Number of translated messages
(uint16)  (2)   (BE)   Only in version 3 and below: Size (in bytes) of the charset string (including the terminating '\0')
(String)  (??)         Only in version 3 and below: Charset (e.g. 'iso-8859-1'). In version 4 and above the charset is always UTF-8.
                       First translation entry (see below)
...       ...   ...    ...
                       Last translation entry

Each translation entry has the following format:

Type      Size  Order  Description
uint16    2     BE     Index of the entry in the English message table (index starts at 0)
uint16    2     BE     Size (in bytes) of the translated message string (including the terminating '\0')
String    ??           Translated message
uint16    2     BE     Size (in bytes) of the context string (including the terminating '\0'). Size is 0 when there is no context.
String    ??           Context string (if a context is defined).
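
A translation block could be decoded along the lines of the sketch below. The function name parse_translation_block is hypothetical. Two assumptions are made that this page does not state outright: the context fields are absent in version 1 (since version 2 is the one that adds context), and the charset name stored in version 3 and below is one that Python's codecs recognise.

```python
import struct


def parse_translation_block(block: bytes, version: int = 4):
    """Return a list of (english_index, translated_text, context_or_None)."""
    pos = 0

    def read_u16() -> int:
        nonlocal pos
        (value,) = struct.unpack_from(">H", block, pos)
        pos += 2
        return value

    def read_bytes(size: int) -> bytes:
        nonlocal pos
        raw = block[pos:pos + size]
        pos += size
        return raw.rstrip(b"\0")  # drop the terminating '\0'

    count = read_u16()
    charset = "utf-8"  # version 4 and above always use UTF-8
    if version <= 3:
        # Version 3 and below store the charset name before the entries.
        charset = read_bytes(read_u16()).decode("ascii")

    entries = []
    for _ in range(count):
        index = read_u16()
        text = read_bytes(read_u16()).decode(charset)
        context = None
        if version >= 2:  # assumption: version 1 has no context fields
            context_size = read_u16()
            if context_size:  # a size of 0 means there is no context string
                context = read_bytes(context_size).decode(charset)
        entries.append((index, text, context))
    return entries
```

Each returned index refers to a position in the English message table, so a translation is looked up by first finding the English message and then finding the entry with the matching index (and context, when one is used).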

Codepage mapping

This block is only present in version 3 and was removed in version 4.

For each codepage there is a block giving the mapping from each character of the 8-bit codepage (e.g. iso-8859-5) to the equivalent Unicode character.

Type      Size  Order  Description
uint32    4     BE     Mapping for a char value of 0
uint32    4     BE     Mapping for a char value of 1
...       ...   ...    ...
uint32    4     BE     Mapping for a char value of 255
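
For version 3 files, a mapping block can be applied to an 8-bit string as in the sketch below. The names parse_codepage_mapping and decode_with_mapping are hypothetical, and the sketch assumes that each uint32 value is a Unicode code point.

```python
import struct


def parse_codepage_mapping(block: bytes) -> tuple[int, ...]:
    """Unpack the 256 big-endian uint32 values of one mapping block."""
    if len(block) != 256 * 4:
        raise ValueError("a codepage mapping block must be 1024 bytes long")
    return struct.unpack(">256I", block)


def decode_with_mapping(raw: bytes, mapping: tuple[int, ...]) -> str:
    """Convert an 8-bit string to Unicode, one byte at a time.

    Assumption: each mapping value is a Unicode code point.
    """
    return "".join(chr(mapping[byte]) for byte in raw)
```

With a mapping for iso-8859-5, for example, the byte 0xB4 would be expected to map to U+0414 (Cyrillic capital letter De).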