Advanced Detector

Revision as of 09:07, 10 December 2020 by SupSuper (talk | contribs) (Replace AdvancedMetaEngine with AdvancedMetaEngineDetection)

If your engine supports a large number of games (or variants of games) then detecting them can be tricky.

Some game variants have files with the same name but different contents, so detection by filename alone is not sufficient. A fair number of variants differ only in small sections of data within a file, so opening the file and looking for a "magic header" is not reliable either.

So instead, most engines take a checksum (or even better, a hash) of the file to detect the exact version. To do that, you would need to write code in your custom MetaEngine to open the files and run this check...

Sounds like a lot of work?
Well, to avoid every engine author having to do this themselves (and the codebase ending up with the maintenance headache of 20+ implementations of this which are almost, but not exactly the same!), the ScummVM Infrastructure Team have provided the Advanced Detector!

This provides a standard framework for filename- and MD5-based game detection.

The code for this can be found in engines/advancedDetector.*

To use this, you will have to follow the instructions here within your engine's detection.h and detection.cpp.

All you have to provide is a standard data table of ADGameDescription entries describing each game variant. This table is usually placed in a separate detection_tables.h header, which is included in detection.cpp.

This table, plus other parameters, is passed to the AdvancedMetaEngineDetection constructor. Your constructor can also override the default detection parameters, e.g. _md5Bytes, the number of bytes of each file used for the MD5 hash.

It is suggested that you consult the code and header comments in engines/advancedDetector.* and look at the existing engines for more complete examples.

Game detection entry in ScummVM config file

When you look into your .scummvmrc or scummvm.ini (depending on the platform), you will find that it generally has the following structure:

[scummvm]
globalkey1=foo
globalkey2=bar
versioninfo=1.5.0git2516-g30a372d

[monkey2-vga]
description=Monkey Island 2: LeChuck's Revenge (DOS/English)
path=/Users/sev/games/scumm/monkey2
gameid=monkey2
language=en
platform=pc

What you see here is several sections, designated by identifiers in square brackets, and a set of key/value pairs belonging to each section.

The main section, with the predefined name 'scummvm', contains global options, which are mainly editable in the Options dialog in the GUI. Then there are sections for each separate game. Additionally, some ports define their own service sections.

The name of each game section is what we call the target. The target, monkey2-vga in the sample above, is a user-editable identifier unique to the specific user, and can be used for launching the game from the command line.

Each entry also has a description, which is user-editable as well, the path to the game, and a gameid. The gameid is a service name which identifies the game within the whole of ScummVM. There should be no clashes, and each engine knows which gameids it supports. The first engine which finds a match for a given gameid will be used to run the game. This is why it is important to keep this ID unique: there is no guarantee about the order in which ScummVM probes engines when launching a game.

The platform and language keys are used to narrow down the possible game candidates, but are fully optional.
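
The section and key/value layout described above can be read with very little code. The following is only a minimal sketch and not ScummVM's own configuration code; the names KeyValues, ConfigSections and parseConfig are made up for this illustration.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical types for illustration only; not ScummVM's configuration classes.
using KeyValues = std::map<std::string, std::string>;
using ConfigSections = std::map<std::string, KeyValues>;

// Parse "[section]" headers and "key=value" lines; anything else is skipped.
ConfigSections parseConfig(const std::string &text) {
	ConfigSections sections;
	std::string current;
	std::istringstream in(text);
	std::string line;
	while (std::getline(in, line)) {
		// Tolerate Windows line endings.
		if (!line.empty() && line.back() == '\r')
			line.pop_back();
		if (line.empty())
			continue;
		if (line.front() == '[' && line.back() == ']') {
			current = line.substr(1, line.size() - 2);
			sections[current]; // create the section even if it has no keys
			continue;
		}
		const std::string::size_type eq = line.find('=');
		if (eq != std::string::npos && !current.empty())
			sections[current][line.substr(0, eq)] = line.substr(eq + 1);
	}
	return sections;
}
```

With the sample file above, the target monkey2-vga becomes one section whose gameid key holds monkey2.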

How Advanced Detector works

Advanced Detector tries to match the files in the probed directory against lists of file characteristics provided in an array of ADGameDescription structures. It can take into account the md5sum of a part of the file (by default, its first several thousand bytes; see _md5Bytes), its size and its name. It then creates a list of candidates, which it tries to narrow down to a single ADGameDescription instance, unless it is told to do otherwise. In case of ambiguous matches, it returns a list of games.

It is important to know that there are currently two modes of the Advanced Detector.

The first one is used during game detection, when the user tries to add a game (detection mode), and the second one when the user launches an already detected game (running mode). Both modes call the same method, findGames(), which could potentially return a list of games.

In detection mode, the user is then presented with a list of games to choose from. In running mode, if the findGames() method returns more than one game, only the first one in the list will be used. This may lead to a situation where the game gets detected but doesn't run, so it is important to test detection and avoid any ambiguous situations.
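
As a rough illustration of the candidate list described above, the sketch below keeps every table entry whose files are all present with the expected MD5 and size. All types and names here (FileProps, SketchFile, SketchEntry, DirListing, findCandidates) are simplified, hypothetical stand-ins, and the MD5 strings are assumed to have been computed elsewhere; the real logic lives in engines/advancedDetector.*.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the real AD structures.
struct FileProps {
	std::string md5; // MD5 of the beginning of the file, computed elsewhere
	long size;       // file size in bytes
};

struct SketchFile {
	std::string name;
	std::string md5;
	long size;
};

struct SketchEntry {
	std::string gameid;
	std::vector<SketchFile> files;
};

// Contents of the probed directory, keyed by lowercase file name.
using DirListing = std::map<std::string, FileProps>;

// Return every entry whose files are all present with matching MD5 and size.
std::vector<const SketchEntry *> findCandidates(const std::vector<SketchEntry> &table,
                                                const DirListing &dir) {
	std::vector<const SketchEntry *> candidates;
	for (const SketchEntry &entry : table) {
		bool allMatch = true;
		for (const SketchFile &f : entry.files) {
			DirListing::const_iterator it = dir.find(f.name);
			if (it == dir.end() || it->second.md5 != f.md5 || it->second.size != f.size) {
				allMatch = false;
				break;
			}
		}
		if (allMatch)
			candidates.push_back(&entry);
	}
	return candidates;
}
```

If more than one entry survives, the result is the ambiguous case in which a list of games is returned.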

This is also the main reason for some of the features in Advanced Detector which are geared towards resolving such conflicts.

In the running mode Advanced Detector tries to match as much information stored in the config game entry as possible. The typical keys it matches against are gameid, platform and language, but it may also use extra when instructed to do so.
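
The narrowing step in running mode can be pictured as a filter over the candidates. This is a sketch under assumptions, not the actual implementation: Candidate and narrowByConfig are hypothetical names, platform and language are plain strings here, and an empty string stands for a key that is absent from the config entry.

```cpp
#include <string>
#include <vector>

// Hypothetical candidate record; the real detector works on ADGameDescription.
struct Candidate {
	std::string gameid;
	std::string platform;
	std::string language;
};

// Keep only candidates that agree with the config entry.
// An empty config value means the key is not present and is not checked.
std::vector<Candidate> narrowByConfig(const std::vector<Candidate> &candidates,
                                      const std::string &gameid,
                                      const std::string &platform,
                                      const std::string &language) {
	std::vector<Candidate> kept;
	for (const Candidate &c : candidates) {
		if (!gameid.empty() && c.gameid != gameid)
			continue;
		if (!platform.empty() && c.platform != platform)
			continue;
		if (!language.empty() && c.language != language)
			continue;
		kept.push_back(c);
	}
	return kept;
}
```

When the optional keys are missing, fewer candidates are filtered out, which is one way the "first one in the list" rule can end up picking the wrong variant.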

In case there are no matches in the ADGameDescription list, there are two additional fallback detection modes. One is file-based detection, which matches just the file names. The second one is a hook which gets called and can contain code of any complexity. The most prominent example of advanced fallback detection is the SCI engine.

Generated targets

Targets generated by Advanced Detector have the following structure:

 GAMEID-DEMO-CD-PLATFORM-LANG

The target generation is affected by AD flags. The flags which have influence are: ADGF_CD, ADGF_DEMO, ADGF_DROPLANGUAGE.
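
The structure above can be sketched as simple string concatenation. This is an illustration only: the flag constants below are hypothetical placeholders rather than the real ADGF_* values, platform and language are passed as ready-made short codes, and skipping empty codes is an assumption of this sketch.

```cpp
#include <string>

// Hypothetical flag values for this sketch; the real ADGF_* flags are
// defined in engines/advancedDetector.h.
enum SketchFlags : unsigned {
	kSketchDemo = 1u << 0,
	kSketchCD = 1u << 1,
	kSketchDropLanguage = 1u << 2
};

// Build GAMEID-DEMO-CD-PLATFORM-LANG. Empty platform or language codes are
// skipped, and the language is omitted when the drop-language flag is set.
std::string generateTarget(const std::string &gameid, unsigned flags,
                           const std::string &platform, const std::string &language) {
	std::string target = gameid;
	if (flags & kSketchDemo)
		target += "-demo";
	if (flags & kSketchCD)
		target += "-cd";
	if (!platform.empty())
		target += "-" + platform;
	if (!language.empty() && !(flags & kSketchDropLanguage))
		target += "-" + language;
	return target;
}
```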

PlainGameDescriptor table

struct PlainGameDescriptor {
	const char *gameid;
	const char *description;
};

This table contains all gameids which are known to the engine. Each gameid is paired with a full human-readable description, which is used to provide the description field in the ScummVM configuration file.

Only gameids which are present in this table can be used in the ADGameDescription table.

Typical PlainGameDescriptor table:

static const PlainGameDescriptor cineGames[] = {
	{"cine", "Cinematique evo.1 engine game"},
	{"fw", "Future Wars"},
	{"os", "Operation Stealth"},
	{0, 0}
};

Please note that it is NULL-terminated, and also contains the generic gameid cine which is used by fallback detection.
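
Because the table is NULL-terminated, it can be walked without knowing its length. A minimal sketch of such a lookup follows; findDescription is a hypothetical helper written for this example, not a ScummVM function.

```cpp
#include <cstring>

struct PlainGameDescriptor {
	const char *gameid;
	const char *description;
};

static const PlainGameDescriptor cineGames[] = {
	{"cine", "Cinematique evo.1 engine game"},
	{"fw", "Future Wars"},
	{"os", "Operation Stealth"},
	{0, 0}
};

// Hypothetical helper: walk the table until the {0, 0} terminator and
// return the description, or nullptr if the gameid is unknown.
const char *findDescription(const PlainGameDescriptor *table, const char *gameid) {
	for (const PlainGameDescriptor *g = table; g->gameid; ++g) {
		if (std::strcmp(g->gameid, gameid) == 0)
			return g->description;
	}
	return nullptr;
}
```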

ADGameDescription table

The ADGameDescription table has the following structure:

struct ADGameDescription {
	const char *gameid;
	const char *extra;
	ADGameFileDescription filesDescriptions[14];
	Common::Language language;
	Common::Platform platform;
	uint32 flags;
	const char *guioptions;
};

gameid -- This is the gameid. Mainly it is used for taking the game description from the PlainGameDescriptor table.

extra -- This is used to distinguish between different variants of a game. The content of this field is inserted in the generated description for the config file game entry. If the kADFlagUseExtraAsHint ADFlag is set, the contents of this field are stored in the config file and are used to additionally distinguish between game variants. Also, if the ADGF_USEEXTRAASTITLE game flag is set, the contents of this field will be put into the description instead of the one taken from the PlainGameDescriptor table.

filesDescriptions -- a list of individual file entries used for detection. 13 files (last is zero terminated) is the maximum number of files currently used in ScummVM. We are forced to specify a hardcoded number, due to a C++ limitation for defining const arrays.

language -- language of the game variant.

platform -- platform of the game variant.

flags -- game feature flags. Contains both engine-specific ones as well as global ones (see ADGameFlags)

guioptions -- game features which are user controllable. Basically this list reflects which features of the GUI should be turned on or off in order to minimize user confusion. For instance, there is no point in changing the game language in single-language games, or in having MIDI controls for a game which supports only digital music. (See GUI Options)


A typical ADGameDescription table looks as follows:

static const ADGameDescription gameDescriptions[] = {
	{
		"fw",
		"",
		AD_ENTRY1("part01", "61d003202d301c29dd399acfb1354310"),
		Common::EN_ANY,
		Common::kPlatformPC,
		ADGF_NO_FLAGS,
		GUIO0()
	},
	AD_TABLE_END_MARKER
};

ADGameFileDescription structure

struct ADGameFileDescription {
	const char *fileName;	///< Name of described file.
	uint16 fileType; ///< Optional. Not used during detection, only by engines.
	const char *md5; ///< MD5 of (the beginning of) the described file. Optional. Set to NULL to ignore.
	int32 fileSize;  ///< Size of the described file. Set to -1 to ignore.
};

fileName -- name of the file. It is case insensitive, but historically we use lowercase names.

fileType -- rarely used field, intended for engines which use the ADGameFileDescription structure themselves. May specify music file, script file, etc.

md5 -- MD5 of the file. Most often it is MD5 of the beginning of the file for performance reasons. See _md5Bytes setting of AdvancedMetaEngineDetection. If set to NULL, the md5 is not used in detection and the entry matches against any content.

fileSize -- file size in bytes. Optional too, set to -1 in order to match against any file size.
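
The "ignore" values described above (NULL for md5, -1 for fileSize) and the case-insensitive file name can be sketched as follows. SketchFileDescription mirrors the structure above with standard integer types, and equalsIgnoreCase and fileMatches are hypothetical helpers for this example; the MD5 of the probed file is assumed to be computed elsewhere.

```cpp
#include <cctype>
#include <cstdint>
#include <string>

// Mirror of ADGameFileDescription using standard integer types.
struct SketchFileDescription {
	const char *fileName;
	std::uint16_t fileType;
	const char *md5;       // nullptr: match any content
	std::int32_t fileSize; // -1: match any size
};

// Hypothetical helper: ASCII case-insensitive string comparison.
static bool equalsIgnoreCase(const std::string &a, const std::string &b) {
	if (a.size() != b.size())
		return false;
	for (std::string::size_type i = 0; i < a.size(); ++i) {
		if (std::tolower(static_cast<unsigned char>(a[i])) !=
		    std::tolower(static_cast<unsigned char>(b[i])))
			return false;
	}
	return true;
}

// Hypothetical helper: does a probed file satisfy one file description?
bool fileMatches(const SketchFileDescription &desc, const std::string &name,
                 const std::string &md5, std::int32_t size) {
	if (!equalsIgnoreCase(desc.fileName, name))
		return false;
	if (desc.md5 != nullptr && md5 != desc.md5)
		return false;
	if (desc.fileSize != -1 && desc.fileSize != size)
		return false;
	return true;
}
```

An entry with both md5 and fileSize left open matches on the file name alone, so it is best kept for cases where nothing more specific is available.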

Game Entry flags ADGameFlags

Game flags are used to tell the engine which features this particular game has. There are both engine-specific and Advanced Detector-specific game flags. The latter, besides being more or less universal, also affect the detection behaviour.

ADGF_ADDENGLISH -- Used for dual language games. In this case the user will be presented with a selection between localised and English version of the game. Affects GUIOs.

ADGF_CD -- Specifies a CD version. Generated target will get '-cd' suffix.

ADGF_DEMO -- Specifies a game demo. Generated target will get '-demo' suffix.

ADGF_DROPLANGUAGE -- the generated target will not have a language specified. Used mainly for multilanguage games which have language selector internally. Thus the language will be selected within the game and the setting stored in the config file, but the game entry will stay intact.

ADGF_MACRESFORK

ADGF_NO_FLAGS -- No flags are set.

ADGF_PIRATED -- Specifies a blacklisted game. The game will be detected but refuse to run.

Hacked variants of some games are known to exist in the wild. We used to ignore user reports on them, but with the number of engines growing, it became tough to remember that some particular game is really a hack. When a hack was widespread enough, we were getting recurrent reports that the game is not detected. To avoid this situation we now accept md5s of such games but mark them accordingly.

ADGF_TESTING -- Specifies a game which has been announced for public testing. The user will get a relevant warning when launching the game.

ADGF_UNSTABLE -- Specifies a game which is not publicly supported and is in heavy development. The user will get a relevant warning when launching the game.

ADGF_USEEXTRAASTITLE -- Instead of the description specified in the PlainGameDescriptor table, the extra field will be used as the game description. A good example is AGI fan games, where the game title is known but it is not feasible to add it to the PlainGameDescriptor table, or minor Composer engine demos with games combined, for the same reason.

Advanced Detector flags ADFlags

kADFlagUseExtraAsHint -- Specify this flag in situations where more than one game is stored in the same directory with the same gameid, i.e. there is no way to know which game the user wants to run without asking them. The typical example is the VGA version of Lure of the Temptress, which contained both EGA and VGA datafiles in the game directory.

Upgrading obsolete gameids

static const Engines::ObsoleteGameID obsoleteGameIDsTable[] = {
        {"simon1acorn", "simon1", Common::kPlatformAcorn},
        {"simon1amiga", "simon1", Common::kPlatformAmiga},
        {"simon2talkie", "simon2", Common::kPlatformPC},
        {"simon2mac", "simon2", Common::kPlatformMacintosh},
        {"simon2win", "simon2", Common::kPlatformWindows},
        {0, 0, Common::kPlatformUnknown}
};
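
The table above pairs an old gameid with its replacement and a platform. One way such a table could be consulted is sketched below; the platform enum, the structure and findUpgrade are hypothetical stand-ins for this example, and only three of the rows above are repeated.

```cpp
#include <cstring>

// Stand-in for Common::Platform, reduced to what this sketch needs.
enum SketchPlatform { kSketchUnknown, kSketchAcorn, kSketchAmiga, kSketchPC };

// Stand-in for Engines::ObsoleteGameID.
struct SketchObsoleteGameID {
	const char *from;
	const char *to;
	SketchPlatform platform;
};

static const SketchObsoleteGameID sketchObsoleteTable[] = {
	{"simon1acorn", "simon1", kSketchAcorn},
	{"simon1amiga", "simon1", kSketchAmiga},
	{"simon2talkie", "simon2", kSketchPC},
	{nullptr, nullptr, kSketchUnknown}
};

// Hypothetical helper: return the upgrade record for an obsolete gameid,
// or nullptr if the gameid is not listed as obsolete.
const SketchObsoleteGameID *findUpgrade(const SketchObsoleteGameID *table, const char *gameid) {
	for (const SketchObsoleteGameID *o = table; o->from; ++o) {
		if (std::strcmp(o->from, gameid) == 0)
			return o;
	}
	return nullptr;
}
```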

AdvancedMetaEngineDetection

AdvancedMetaEngineDetection is a generic MetaEngine wrapper which is aware of the Advanced Detector. It should be used whenever AD is used.

Engine constructor

AdvancedMetaEngineDetection(const void *descs, uint descItemSize, const PlainGameDescriptor *gameids);

descs must point to a list of ADGameDescription structures, or their supersets.

descItemSize is the sizeof of one element of descs; it is used for iterating over the list.

gameids must point to a list of PlainGameDescriptor structures defining supported gameids.
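
The reason for passing descItemSize separately is that descs may point to an engine-specific superset structure whose size the generic code cannot know. The sketch below shows the idea with made-up types: SketchDescription stands in for ADGameDescription, MyGameDescription is a hypothetical engine structure, and countEntries is a hypothetical helper that assumes a null gameid marks the end of the table.

```cpp
// Stand-in for ADGameDescription, reduced to one field for this sketch.
struct SketchDescription {
	const char *gameid;
};

// Hypothetical engine-specific superset: the base structure must come first.
struct MyGameDescription {
	SketchDescription desc;
	int features;
};

static const MyGameDescription myGames[] = {
	{{"fw"}, 1},
	{{"os"}, 2},
	{{nullptr}, 0} // end marker for this sketch
};

// Count entries by stepping descItemSize bytes at a time, the way generic
// code that only knows the base structure has to.
unsigned countEntries(const void *descs, unsigned descItemSize) {
	const unsigned char *p = static_cast<const unsigned char *>(descs);
	unsigned count = 0;
	while (reinterpret_cast<const SketchDescription *>(p)->gameid != nullptr) {
		++count;
		p += descItemSize;
	}
	return count;
}
```

Passing sizeof(SketchDescription) instead of sizeof(MyGameDescription) here would make the loop step into the middle of an element, which is why the two constructor arguments must agree.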

Additional Advanced MetaEngine parameters

_md5Bytes -- number of bytes used to compute the md5. If set to 0, the whole file will be used. Beware of doing this if your detection might encounter large files, since that can dramatically slow down the detection. Typically a sane value is 5000 bytes (the default), but experimentation is often required when many game variants have subtle differences.

_singleid -- Used to override the gameid. A recommended setting to prevent global gameid pollution. With this option set, the gameid effectively turns into an engineid.

In the past we started to have clashes in game names, so this option was introduced. It was also noted that, in an ideal world, it should be enough to point ScummVM at the game directory and have it correctly detect and run the game. This is a step in that direction. However, there are several cases where it is not possible to identify the game to run, particularly when more than one game is stored in a directory.

_flags -- same as individual game flags, but used for engine-wide settings. For instance, if all games in the engine are known to be unstable, the flag can be specified here instead of in every game entry.

_guioptions -- same as individual GUI options, but applied engine-wide. For example, when none of the games have speech, we may specify it here.

_maxScanDepth -- Maximum traversal depth for directories. The default is 1, that is, do not descend into subdirectories during detection.

_directoryGlobs -- Case-insensitive, null-terminated list of nested directory globs in which AD will search for games. Must be set if detection should go into subdirectories.

AdvancedMetaEngine usage

TODO