Supporting GUI Translation/Translations DAT Format
The translations.dat file is generated from the po/*.po files in the ScummVM source code repository. It contains the data needed by ScummVM to display a translated GUI.
It is a binary file consisting of:
- a header
- a block with the list of languages
- a block with the list of codepages (version 3 only)
- a block with the English messages
- a block with the translated messages for language 1
- a block with the translated messages for language 2
- ...
- a block with the translated messages for language n
- a block with the mapping for codepage 1 (version 3 only)
- a block with the mapping for codepage 2 (version 3 only)
- ...
- a block with the mapping for codepage m (version 3 only)
The latest file format version is 4. The description below covers version 4 and notes which blocks were added, removed, or changed relative to versions 1, 2, and 3.
- Version 2 adds context information in the translated messages.
- Version 3 adds code page descriptions (for translations not using ASCII or ISO-8859-1).
- Version 4 uses UTF-8 for translations and drops the code page descriptions.
The header
Type | Size | Order | Description |
---|---|---|---|
String | 12 | | 'TRANSLATIONS' |
Byte | 1 | | Version (of the file format) |
uint16 | 2 | BE | Number of translations |
(uint16) | (2) | (BE) | Only in version 3: Number of code pages |
uint32 | 4 | BE | Size in bytes of block 1 (list of languages). In version 3 and below the type is uint16. |
(uint16) | (2) | (BE) | Only in version 3: Size in bytes of block 2 (list of codepages) |
uint32 | 4 | BE | Size in bytes of block 3 (English messages). In version 3 and below the type is uint16. |
uint32 | 4 | BE | Size in bytes of block 4 (first translation). In version 3 and below the type is uint16. |
uint32 | 4 | BE | Size in bytes of block 5 (second translation). In version 3 and below the type is uint16. |
... | ... | ... | ... |
uint32 | 4 | BE | Size in bytes of block n+2 (nth translation). In version 3 and below the type is uint16. |
In version 3, which includes codepage mapping information, the sizes of the codepage mapping blocks are not written, since each of them is exactly 256 * 4 bytes long.
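The header layout in the table above can be read with a few `struct` calls. The following is a minimal sketch, not ScummVM's own loader: the function name `parse_header` and the keys of the returned dictionary are made up for illustration, and the handling of versions below 4 simply follows the notes in the table.

```python
import struct

MAGIC = b"TRANSLATIONS"  # 12 bytes, as given in the header table


def parse_header(data: bytes) -> dict:
    """Parse the translations.dat header (hypothetical helper)."""
    if data[:12] != MAGIC:
        raise ValueError("missing 'TRANSLATIONS' magic string")
    version = data[12]
    pos = 13

    def read(fmt: str) -> int:
        # Read one big-endian integer and advance the cursor.
        nonlocal pos
        (value,) = struct.unpack_from(fmt, data, pos)
        pos += struct.calcsize(fmt)
        return value

    # Block sizes are uint32 in version 4, uint16 in version 3 and below.
    size_fmt = ">I" if version >= 4 else ">H"

    num_translations = read(">H")
    num_codepages = read(">H") if version == 3 else 0
    languages_size = read(size_fmt)
    codepages_size = read(">H") if version == 3 else 0
    english_size = read(size_fmt)
    translation_sizes = [read(size_fmt) for _ in range(num_translations)]

    return {
        "version": version,
        "num_translations": num_translations,
        "num_codepages": num_codepages,
        "languages_size": languages_size,
        "codepages_size": codepages_size,
        "english_size": english_size,
        "translation_sizes": translation_sizes,
        "header_size": pos,  # number of header bytes consumed
    }
```

Assuming the blocks follow the header back to back in the order listed at the top of this page, the offset of each block can be obtained by adding the preceding block sizes to `header_size`.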
List of Languages
For each translation there is the following entry:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Size (in bytes) of the following string (including the terminating '\0') |
String | ?? | | Language and country code (e.g. 'de_DE'). The country code is optional (e.g. 'eu'). |
uint16 | 2 | BE | Size (in bytes) of the language name string (including the terminating '\0') |
String | ?? | | Language name (e.g. 'Deutsch'). This is the name that appears in the GUI. |
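
Every string in the file uses the same encoding: a uint16 size followed by the bytes, with the terminating '\0' counted in the size. A possible way to read the language list is sketched below. The helper names `read_string` and `parse_languages` are illustrative only, the entry count is taken to be the number of translations from the header, and decoding as UTF-8 is an assumption that matches version 4.

```python
import struct


def read_string(data: bytes, pos: int):
    """Read one length-prefixed string; return (text, next position)."""
    (size,) = struct.unpack_from(">H", data, pos)
    pos += 2
    raw = data[pos:pos + size]
    # The stored size includes the terminating '\0'; drop it before decoding.
    if raw.endswith(b"\x00"):
        raw = raw[:-1]
    return raw.decode("utf-8"), pos + size


def parse_languages(block: bytes, count: int) -> list:
    """Return [(code, name), ...] for `count` entries of the language block."""
    languages = []
    pos = 0
    for _ in range(count):
        code, pos = read_string(block, pos)  # e.g. 'de_DE'
        name, pos = read_string(block, pos)  # e.g. 'Deutsch'
        languages.append((code, name))
    return languages
```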
List of Codepages
This block is only present in version 3 and was removed in version 4.
For each codepage there is the following entry:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Size (in bytes) of the following string (including the terminating '\0') |
String | ?? | | Codepage name (e.g. 'iso-8859-5') |
English messages
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Number of messages |
First message entry (see below) | | | |
Second message entry | | | |
... | ... | ... | ... |
Last message entry | | | |
Each message entry has the following format:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Size (in bytes) of the English message string (including the terminating '\0') |
String | ?? | | English message |
The messages are sorted in alphabetical order.
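Because the table is sorted, a message can be located with a binary search instead of a linear scan. The sketch below parses the block and looks a message up with `bisect`. The function names are hypothetical, and it assumes that "alphabetical order" means plain byte-wise ordering of the stored strings, which this page does not state explicitly.

```python
import bisect
import struct


def parse_messages(block: bytes) -> list:
    """Return the English messages as a list of byte strings."""
    (count,) = struct.unpack_from(">H", block, 0)
    pos = 2
    messages = []
    for _ in range(count):
        (size,) = struct.unpack_from(">H", block, pos)
        pos += 2
        raw = block[pos:pos + size]
        pos += size
        # Strip the terminating '\0' that the stored size includes.
        messages.append(raw[:-1] if raw.endswith(b"\x00") else raw)
    return messages


def find_message(messages: list, text: str) -> int:
    """Binary-search the sorted table; return the index, or -1 if absent."""
    key = text.encode("utf-8")
    i = bisect.bisect_left(messages, key)
    return i if i < len(messages) and messages[i] == key else -1
```

The index returned by `find_message` is the value that translation entries use to refer back to an English message.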
Translated messages
For each translation there is a block with the following format:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Number of translated messages |
(uint16) | (2) | (BE) | Only in version 3 and below: Size (in bytes) of the charset string (including the terminating '\0') |
(String) | (??) | | Only in version 3 and below: Charset (e.g. 'iso-8859-1'). In version 4 and above the charset is always UTF-8. |
First translation entry (see below) | | | |
... | ... | ... | ... |
Last translation entry | | | |
Each translation entry has the following format:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Index of the entry in the English message table (index starts at 0) |
uint16 | 2 | BE | Size (in bytes) of the translated message string (including the terminating '\0') |
String | ?? | | Translated message |
uint16 | 2 | BE | Size (in bytes) of the context string (including the terminating '\0'). Size is 0 when there is no context. |
String | ?? | | Context string (if a context is defined). |
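
A translated-messages block can be read entry by entry in the same way. The sketch below handles the version 4 layout only: it does not read the charset string that version 3 and below store after the message count, and it expects context fields in every entry (the version notes say context was added in version 2, so version 1 files presumably lack them). The name `parse_translation_block` is illustrative.

```python
import struct


def parse_translation_block(block: bytes) -> list:
    """Parse a version 4 translated-messages block.

    Returns a list of (english_index, translated_text, context) tuples;
    context is "" when the stored context size is 0.
    """
    pos = 0

    def read_u16() -> int:
        nonlocal pos
        (value,) = struct.unpack_from(">H", block, pos)
        pos += 2
        return value

    def read_string() -> str:
        nonlocal pos
        size = read_u16()
        raw = block[pos:pos + size]
        pos += size
        # Drop the terminating '\0' that the stored size includes.
        if raw.endswith(b"\x00"):
            raw = raw[:-1]
        return raw.decode("utf-8")

    count = read_u16()
    entries = []
    for _ in range(count):
        index = read_u16()       # index into the English message table
        text = read_string()     # translated message
        context = read_string()  # a size of 0 means no context bytes follow
        entries.append((index, text, context))
    return entries
```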
Codepage mapping
This block is only present in version 3 and was removed in version 4.
For each codepage there is a block giving the mapping from each character of the 8-bit codepage (e.g. iso-8859-5) to the equivalent Unicode character.
Type | Size | Order | Description |
---|---|---|---|
uint32 | 4 | BE | Mapping for a char value of 0 |
uint32 | 4 | BE | Mapping for a char value of 1 |
... | ... | ... | ... |
uint32 | 4 | BE | Mapping for a char value of 255 |
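
For version 3 files, such a block is a flat array of 256 big-endian uint32 values, so it can be unpacked in one call and then used as a lookup table. This is a sketch under the assumption that each value is a Unicode code point; the function names are hypothetical.

```python
import struct


def parse_codepage_mapping(block: bytes) -> tuple:
    """Unpack one codepage mapping block into a 256-entry tuple."""
    if len(block) != 256 * 4:
        raise ValueError("a codepage mapping block must be 256 * 4 bytes long")
    return struct.unpack(">256I", block)


def decode_with_mapping(raw: bytes, mapping) -> str:
    """Convert a string in the 8-bit codepage to a Python str."""
    # Each byte value indexes the table; the entry is taken as a code point.
    return "".join(chr(mapping[b]) for b in raw)
```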