Supporting GUI Translation/Translations DAT Format
Latest revision as of 13:32, 30 August 2020
The translations.dat file is generated from the po/*.po files in the ScummVM source code repository. It contains the data needed by ScummVM to display a translated GUI.
The file is a binary file with:
- a header
- a block with the list of languages
- a block with the list of codepages (version 3 only)
- a block with the English messages
- a block with the translated messages for language 1
- a block with the translated messages for language 2
- ...
- a block with the translated messages for language n
- a block with the mapping for codepage 1 (version 3 only)
- a block with the mapping for codepage 2 (version 3 only)
- ...
- a block with the mapping for codepage m (version 3 only)
The latest file format version is 4. The description below is for version 4, but indicates the blocks that were added, removed, or changed relative to versions 1, 2 and 3.
- Version 2 adds context information in the translated messages.
- Version 3 adds code page descriptions (for translations not using ASCII or ISO-8859-1).
- Version 4 uses UTF-8 for translations and drops the code page descriptions.
The header
Type | Size | Order | Description |
---|---|---|---|
String | 12 | | 'TRANSLATIONS' |
Byte | 1 | | Version (of the file format) |
uint16 | 2 | BE | Number of translations |
(uint16) | (2) | (BE) | Only in version 3: Number of code pages |
uint32 | 4 | BE | Size in bytes of block 1 (list of languages). In version 3 and below the type is uint16. |
(uint16) | (2) | (BE) | Only in version 3: Size in bytes of block 2 (list of codepages) |
uint32 | 4 | BE | Size in bytes of block 3 (English messages). In version 3 and below the type is uint16. |
uint32 | 4 | BE | Size in bytes of block 4 (first translation). In version 3 and below the type is uint16. |
uint32 | 4 | BE | Size in bytes of block 5 (second translation). In version 3 and below the type is uint16. |
... | ... | ... | ... |
uint32 | 4 | BE | Size in bytes of block n+2 (nth translation). In version 3 and below the type is uint16. |
In version 3, which includes code page mapping information, the sizes of the codepage mapping blocks are not written in the header, since each of them is always 256 * 4 bytes long.
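As a rough illustration of the header layout described above, the following Python sketch reads the fixed fields and the list of block sizes. The function name `parse_header` and the keys of the returned dictionary are invented for this example and are not taken from the ScummVM source. It assumes the whole file has already been read into memory.

```python
import struct

MAGIC = b"TRANSLATIONS"  # 12-byte signature at the start of the file


def parse_header(data: bytes) -> dict:
    """Parse the header of a translations.dat file (hypothetical helper)."""
    if data[:12] != MAGIC:
        raise ValueError("missing 'TRANSLATIONS' signature")
    version = data[12]
    offset = 13

    (num_translations,) = struct.unpack_from(">H", data, offset)
    offset += 2

    # Only version 3 stores the number of code pages.
    num_codepages = 0
    if version == 3:
        (num_codepages,) = struct.unpack_from(">H", data, offset)
        offset += 2

    # Block sizes are uint32 in version 4, uint16 in version 3 and below.
    size_fmt = ">I" if version >= 4 else ">H"
    size_len = struct.calcsize(size_fmt)

    def read_size() -> int:
        nonlocal offset
        (value,) = struct.unpack_from(size_fmt, data, offset)
        offset += size_len
        return value

    languages_size = read_size()
    codepages_size = read_size() if version == 3 else 0
    english_size = read_size()
    translation_sizes = [read_size() for _ in range(num_translations)]

    return {
        "version": version,
        "num_translations": num_translations,
        "num_codepages": num_codepages,
        "languages_size": languages_size,
        "codepages_size": codepages_size,
        "english_size": english_size,
        "translation_sizes": translation_sizes,
        "header_size": offset,  # the first block starts at this offset
    }
```

Since the blocks follow the header in the order listed at the top of this page, the start of each block can be found by adding the preceding block sizes to `header_size`.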
List of Languages
For each translation there is the following entry:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Size (in bytes) of the following string (including the terminating '\0') |
String | ?? | | Language and country code (e.g. 'de_DE'). The country code is optional (e.g. 'eu'). |
uint16 | 2 | BE | Size (in bytes) of the language name string (including the terminating '\0') |
String | ?? | | Language name (e.g. 'Deutsch'). This is the name that appears in the GUI. |
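The size-prefixed strings used in this block (and in the blocks below) can be read with a small helper. This is a minimal sketch: `read_string` and `parse_languages` are hypothetical names, and the strings are assumed to be UTF-8, which this page only states for version 4.

```python
import struct


def read_string(block: bytes, offset: int) -> tuple[str, int]:
    """Read one uint16-prefixed string; return (text, offset after it)."""
    (size,) = struct.unpack_from(">H", block, offset)
    offset += 2
    raw = block[offset:offset + size]
    # The stored size includes the terminating '\0', so strip it.
    # Assumption: the text is UTF-8 (true for version 4 files).
    return raw.rstrip(b"\0").decode("utf-8"), offset + size


def parse_languages(block: bytes, count: int) -> list[tuple[str, str]]:
    """Return (language code, language name) pairs, one per translation."""
    languages = []
    offset = 0
    for _ in range(count):
        code, offset = read_string(block, offset)
        name, offset = read_string(block, offset)
        languages.append((code, name))
    return languages
```

Here `count` is the number of translations read from the header.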
List of Codepages
This block is only present in version 3 and was removed in version 4.
For each codepage there is the following entry:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Size (in bytes) of the following string (including the terminating '\0') |
String | ?? | | Codepage name (e.g. 'iso-8859-5') |
English messages
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Number of messages |
| | | | First message entry (see below) |
| | | | Second message entry |
... | ... | ... | ... |
| | | | Last message entry |
Each message entry has the following format:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Size (in bytes) of the English message string (including the terminating '\0') |
String | ?? | | English message |
The messages are sorted in alphabetical order.
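The block format above can be read as in the following sketch. The function name `parse_english_messages` is hypothetical, and decoding as UTF-8 is an assumption (it also covers plain ASCII messages).

```python
import struct


def parse_english_messages(block: bytes) -> list[str]:
    """Return the English messages in the order they are stored."""
    (count,) = struct.unpack_from(">H", block, 0)
    offset = 2
    messages = []
    for _ in range(count):
        (size,) = struct.unpack_from(">H", block, offset)
        offset += 2
        raw = block[offset:offset + size]
        offset += size
        # The stored size includes the terminating '\0'.
        # Assumption: messages decode as UTF-8 (a superset of ASCII).
        messages.append(raw.rstrip(b"\0").decode("utf-8"))
    return messages
```

The position of a message in the returned list is its 0-based index in the English message table.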
Translated messages
For each translation there is a block with the following format:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Number of translated messages |
(uint16) | (2) | (BE) | Only in version 3 and below: Size (in bytes) of the charset string (including the terminating '\0') |
(String) | (??) | | Only in version 3 and below: Charset (e.g. 'iso-8859-1'). In version 4 and above the charset is always UTF-8. |
| | | | First translation entry (see below) |
... | ... | ... | ... |
| | | | Last translation entry |
Each translation entry has the following format:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Index of the entry in the English message table (index starts at 0) |
uint16 | 2 | BE | Size (in bytes) of the translated message string (including the terminating '\0') |
String | ?? | | Translated message |
uint16 | 2 | BE | Size (in bytes) of the context string (including the terminating '\0'). Size is 0 when there is no context. |
String | ?? | | Context string (if a context is defined). |
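A translation block could be read roughly as follows. This is only a sketch under several assumptions: the names `parse_translation_block` and `_read_raw` are invented here, version 1 files are assumed to have no context fields at all (inferred from "Version 2 adds context information"), and for version 3 and below the charset name is handed directly to Python's codecs, which only works for charsets Python knows.

```python
import struct


def _read_raw(block: bytes, offset: int) -> tuple[bytes, int]:
    """Read one uint16-prefixed string as bytes, without the final '\\0'."""
    (size,) = struct.unpack_from(">H", block, offset)
    offset += 2
    raw = block[offset:offset + size]
    return raw.rstrip(b"\0"), offset + size


def parse_translation_block(block: bytes, version: int = 4) -> list:
    """Return (english_index, translated_text, context_or_None) tuples."""
    (count,) = struct.unpack_from(">H", block, 0)
    offset = 2

    charset = "utf-8"  # always UTF-8 in version 4
    if version <= 3:
        # Version 3 and below store the charset name before the entries.
        raw_charset, offset = _read_raw(block, offset)
        charset = raw_charset.decode("ascii")

    entries = []
    for _ in range(count):
        (index,) = struct.unpack_from(">H", block, offset)
        offset += 2
        raw_text, offset = _read_raw(block, offset)
        context = None
        if version >= 2:  # assumption: version 1 has no context fields
            raw_context, offset = _read_raw(block, offset)
            if raw_context:  # size 0 means no context
                context = raw_context.decode(charset)
        entries.append((index, raw_text.decode(charset), context))
    return entries
```

Combining the returned indices with the list of English messages gives a lookup from (English message, context) to the translated string.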
Codepage mapping
This block is only present in version 3 and was removed in version 4.
For each codepage there is a block giving the mapping from each character of the 8-bit codepage (e.g. iso-8859-5) to the equivalent Unicode character.
Type | Size | Order | Description |
---|---|---|---|
uint32 | 4 | BE | Mapping for a char value of 0 |
uint32 | 4 | BE | Mapping for a char value of 1 |
... | ... | ... | ... |
uint32 | 4 | BE | Mapping for a char value of 255 |
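To show how such a mapping block might be used, here is a minimal sketch that loads the 256 entries and converts an 8-bit string to Unicode. The function names are hypothetical, and the sketch assumes each uint32 value is a Unicode code point.

```python
import struct


def parse_codepage_mapping(block: bytes) -> tuple:
    """Return the 256 mapped values of one codepage mapping block."""
    if len(block) != 256 * 4:
        raise ValueError("a codepage mapping block is 256 * 4 bytes long")
    # 256 big-endian uint32 values, one per 8-bit character value.
    return struct.unpack(">256I", block)


def decode_with_mapping(raw: bytes, mapping: tuple) -> str:
    """Convert an 8-bit string to Unicode using the mapping table.

    Assumption: every mapped value is a Unicode code point.
    """
    return "".join(chr(mapping[byte]) for byte in raw)
```

For example, with an iso-8859-5 mapping the byte 0xB0 would map to U+0410 (Cyrillic capital letter A).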