Difference between revisions of "Supporting GUI Translation/Translations DAT Format"
Latest revision as of 13:32, 30 August 2020
The translations.dat file is generated from the po/*.po files in the ScummVM source code repository. It contains the data needed by ScummVM to display a translated GUI.
The file is a binary file made up of:
- a header
- a block with the list of languages
- a block with the list of codepages (version 3 only)
- a block with the English messages
- a block with the translated messages for language 1
- a block with the translated messages for language 2
- ...
- a block with the translated messages for language n
- a block with the mapping for codepage 1 (version 3 only)
- a block with the mapping for codepage 2 (version 3 only)
- ...
- a block with the mapping for codepage m (version 3 only)
The latest file format version is 4. The description below is for version 4, but it indicates which blocks were added, removed, or changed relative to versions 1, 2 and 3.
- Version 2 adds context information in the translated messages.
- Version 3 adds code page descriptions (for translations not using ASCII or ISO-8859-1).
- Version 4 uses UTF-8 for translations and drops the code page descriptions.
The header
Type | Size | Order | Description |
---|---|---|---|
String | 12 | | 'TRANSLATIONS' |
Byte | 1 | | Version (of the file format) |
uint16 | 2 | BE | Number of translations |
(uint16) | (2) | (BE) | Only in version 3: Number of code pages |
uint32 | 4 | BE | Size in bytes of block 1 (list of languages). In version 3 and below the type is uint16. |
(uint16) | (2) | (BE) | Only in version 3: Size in bytes of block 2 (list of codepages) |
uint32 | 4 | BE | Size in bytes of block 3 (English messages). In version 3 and below the type is uint16. |
uint32 | 4 | BE | Size in bytes of block 4 (first translation). In version 3 and below the type is uint16. |
uint32 | 4 | BE | Size in bytes of block 5 (second translation). In version 3 and below the type is uint16. |
... | ... | ... | ... |
uint32 | 4 | BE | Size in bytes of block n+3 (nth translation). In version 3 and below the type is uint16. |
In version 3, which includes code page mapping information, the sizes of the codepage mapping blocks are not written in the header, since each of them is always 256 * 4 = 1024 bytes long.
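As an illustration of the table above, here is a minimal Python sketch that reads a version 4 header. The function name `parse_header_v4` and its return shape are hypothetical and not part of ScummVM; the sketch assumes that a version 4 file simply omits the two version 3 codepage fields, so the header holds one uint32 size for the language list, one for the English messages, and one per translation.

```python
import struct

MAGIC = b"TRANSLATIONS"

def parse_header_v4(data: bytes):
    """Return (version, number of translations, block sizes, header length)."""
    if data[:12] != MAGIC:
        raise ValueError("missing 'TRANSLATIONS' signature")
    version = data[12]
    if version != 4:
        raise ValueError(f"this sketch only handles version 4, got {version}")
    # Big-endian uint16: number of translations.
    (num_translations,) = struct.unpack_from(">H", data, 13)
    # Assumption: language list + English messages + one block per translation.
    num_blocks = 2 + num_translations
    sizes = struct.unpack_from(f">{num_blocks}I", data, 15)
    return version, num_translations, list(sizes), 15 + 4 * num_blocks

# Usage with a hand-built header describing two translations.
header = MAGIC + bytes([4]) + struct.pack(">H", 2) + struct.pack(">4I", 30, 100, 120, 90)
version, count, sizes, header_length = parse_header_v4(header)
```

The returned header length is the offset at which the first block (the list of languages) would start, and the block sizes can then be used to jump directly to any later block.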
List of Languages
For each translation there is the following entry:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Size (in bytes) of the following string (including the terminating '\0') |
String | ?? | | Language and country code (e.g. 'de_DE'). The country code is optional (e.g. 'eu'). |
uint16 | 2 | BE | Size (in bytes) of the language name string (including the terminating '\0') |
String | ?? | | Language name (e.g. 'Deutsch'). This is the name that appears in the GUI. |
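The entry layout above is a pair of length-prefixed, NUL-terminated strings. A minimal Python sketch of reading it follows; the helper names `read_sized_string`, `pack_sized_string` and `parse_languages` are hypothetical, and decoding the strings as UTF-8 is an assumption that matches version 4 only.

```python
import struct

def read_sized_string(data: bytes, offset: int):
    """Read a uint16 BE size followed by that many bytes (including '\\0')."""
    (size,) = struct.unpack_from(">H", data, offset)
    offset += 2
    raw = data[offset:offset + size]
    # Assumption: strings are UTF-8 (true for version 4 translations).
    return raw.rstrip(b"\0").decode("utf-8"), offset + size

def pack_sized_string(text: str) -> bytes:
    """Inverse of read_sized_string, used here only to build test data."""
    raw = text.encode("utf-8") + b"\0"
    return struct.pack(">H", len(raw)) + raw

def parse_languages(block: bytes, count: int):
    """Return a list of (language code, language name) pairs."""
    languages, offset = [], 0
    for _ in range(count):
        code, offset = read_sized_string(block, offset)
        name, offset = read_sized_string(block, offset)
        languages.append((code, name))
    return languages

# Usage: two entries, one with and one without a country code.
block = b"".join(pack_sized_string(s) for s in ("de_DE", "Deutsch", "eu", "Euskara"))
languages = parse_languages(block, 2)
```

The number of entries is not stored in the block itself; in this sketch it is passed in, on the assumption that it equals the number of translations given in the header.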
List of Codepages
This block is only present in version 3 and was removed in version 4.
For each codepage there is the following entry:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Size (in bytes) of the following string (including the terminating '\0') |
String | ?? | | Codepage name (e.g. 'iso-8859-5') |
English messages
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Number of messages |
 | | | First message entry (see below) |
 | | | Second message entry |
... | ... | ... | ... |
 | | | Last message entry |
Each message entry has the following format:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Size (in bytes) of the English message string (including the terminating '\0') |
String | ?? | | English message |
The messages are sorted in alphabetical order.
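Because the messages are stored sorted, a reader can locate a message by binary search instead of scanning the whole table. The Python sketch below shows one way to do this; the function names are hypothetical, and it assumes that "alphabetical order" corresponds to a plain byte-wise string comparison, which this page does not state explicitly.

```python
import bisect
import struct

def parse_english_messages(block: bytes):
    """Return the list of English messages stored in the block."""
    (count,) = struct.unpack_from(">H", block, 0)
    offset = 2
    messages = []
    for _ in range(count):
        (size,) = struct.unpack_from(">H", block, offset)
        offset += 2
        # The size includes the terminating '\0', which is dropped here.
        messages.append(block[offset:offset + size - 1].decode("utf-8"))
        offset += size
    return messages

def find_message_index(messages, text):
    """Binary search; returns the table index, or -1 if the text is absent."""
    i = bisect.bisect_left(messages, text)
    return i if i < len(messages) and messages[i] == text else -1

# Usage with a hand-built block of three sorted messages.
block = struct.pack(">H", 3)
for msg in (b"Cancel\0", b"OK\0", b"Quit\0"):
    block += struct.pack(">H", len(msg)) + msg
messages = parse_english_messages(block)
```

The index returned by the lookup is the same index that the translation entries use to refer back to this table.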
Translated messages
For each translation there is a block with the following format:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Number of translated messages |
(uint16) | (2) | (BE) | Only in version 3 and below: Size (in bytes) of the charset string (including the terminating '\0') |
(String) | (??) | | Only in version 3 and below: Charset (e.g. 'iso-8859-1'). In version 4 and above the charset is always UTF-8. |
 | | | First translation entry (see below) |
... | ... | ... | ... |
 | | | Last translation entry |
Each translation entry has the following format:
Type | Size | Order | Description |
---|---|---|---|
uint16 | 2 | BE | Index of the entry in the English message table (index starts at 0) |
uint16 | 2 | BE | Size (in bytes) of the translated message string (including the terminating '\0') |
String | ?? | | Translated message |
uint16 | 2 | BE | Size (in bytes) of the context string (including the terminating '\0'). Size is 0 when there is no context. |
String | ?? | | Context string (only present if a context is defined). |
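The two tables above can be combined into a small reader for a version 4 translation block. This Python sketch is illustrative only: `parse_translation_block_v4` is a hypothetical name, and the sketch assumes that when the context size is 0 no context bytes follow at all.

```python
import struct

def parse_translation_block_v4(block: bytes):
    """Return a list of (English message index, translation, context or None)."""
    (count,) = struct.unpack_from(">H", block, 0)
    offset = 2  # Version 4 has no charset string after the count.
    entries = []
    for _ in range(count):
        msg_index, msg_size = struct.unpack_from(">HH", block, offset)
        offset += 4
        message = block[offset:offset + msg_size].rstrip(b"\0").decode("utf-8")
        offset += msg_size
        (ctx_size,) = struct.unpack_from(">H", block, offset)
        offset += 2
        context = None
        if ctx_size:
            # Assumption: the context string is stored only when its size is non-zero.
            context = block[offset:offset + ctx_size].rstrip(b"\0").decode("utf-8")
            offset += ctx_size
        entries.append((msg_index, message, context))
    return entries

# Usage: one entry without a context and one with the context "menu".
block = (
    struct.pack(">H", 2)
    + struct.pack(">HH", 0, 10) + b"Abbrechen\0" + struct.pack(">H", 0)
    + struct.pack(">HH", 1, 6) + b"Datei\0" + struct.pack(">H", 5) + b"menu\0"
)
entries = parse_translation_block_v4(block)
```

For version 3 and below, a reader would additionally need to consume the charset string right after the message count and decode the messages with that charset instead of UTF-8.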
Codepage mapping
This block is only present in version 3 and was removed in version 4.
For each codepage there is a block giving the mapping from each character of the 8-bit codepage (e.g. iso-8859-5) to the equivalent Unicode character.
Type | Size | Order | Description |
---|---|---|---|
uint32 | 4 | BE | Mapping for a char value of 0 |
uint32 | 4 | BE | Mapping for a char value of 1 |
... | ... | ... | ... |
uint32 | 4 | BE | Mapping for a char value of 255 |
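To show how such a mapping block could be used, the following Python sketch reads the 256 entries and applies them to a byte string. The function names are hypothetical, and the sketch assumes that each uint32 value is a Unicode code point, which is the most natural reading of the description above but is not stated explicitly.

```python
import struct

def parse_codepage_mapping(block: bytes):
    """Return a tuple of 256 values, one per 8-bit character value."""
    if len(block) != 256 * 4:
        raise ValueError("a codepage mapping block is exactly 256 * 4 bytes")
    return struct.unpack(">256I", block)

def decode_with_mapping(raw: bytes, mapping) -> str:
    """Translate each byte through the mapping (assumed to hold code points)."""
    return "".join(chr(mapping[b]) for b in raw)

# Usage: an identity mapping with one override, 0xB0 -> U+0410
# (CYRILLIC CAPITAL LETTER A, as in iso-8859-5).
table = list(range(256))
table[0xB0] = 0x0410
block = struct.pack(">256I", *table)
mapping = parse_codepage_mapping(block)
text = decode_with_mapping(b"\xb0b", mapping)
```

Since every mapping block has the same fixed length, the m blocks can be read one after another without any size information from the header.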