Cloud Storages API comparison
Revision as of 06:41, 2 May 2016
Overview
As part of the cloud integration project, ScummVM will use the open REST APIs of cloud storage providers such as Dropbox, Google Drive and OneDrive. These APIs let the user authenticate in order to receive an access token, which is then passed to an application (i.e. ScummVM). With this token, the application is allowed to make API calls and manage files in the user's cloud storage.
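A minimal sketch of how the token could accompany an API call. It assumes the token is sent as an OAuth 2.0 bearer token in the "Authorization" header; the helper name, the URL and the token value are placeholders for illustration, and nothing is sent over the network here.

```python
import urllib.request


def build_authorized_request(url, access_token):
    """Return a GET request that carries the access token.

    Assumption: the service accepts the token as an OAuth 2.0 bearer token.
    """
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", "Bearer " + access_token)
    return request


# Hypothetical usage: the URL and token are placeholders, the request is only built.
request = build_authorized_request("https://api.example.com/files", "ACCESS_TOKEN")
```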
These REST APIs define scopes (permissions) which an application can request. They are shown to users when they authorize the application, and the application is limited to the methods covered by the requested scopes.
Dropbox has only two scopes: one grants access to the user's whole storage, and the other to the application's special folder only.
Google Drive and OneDrive, on the other hand, have more fine-grained scopes.
Each file in these services has an id, which does not change while the file exists. In Dropbox and OneDrive, files can also be accessed by file path. In Google Drive only the id can be used, but that allows files to have the same name within the same directory and to have several "parents" (meaning the same file can be in different directories at the same time while being stored only once, like hard links).
Dropbox and Google Drive also have revisions with their own ids. Any file can be restored to one of its saved revisions, yet I'm not sure ScummVM would use this feature.
REST APIs are HTTP-based, so they use HTTP methods to specify which operation must be applied to the specified resource. The server uses HTTP status codes in the response to indicate whether the request was processed correctly. The response is a JSON object (some REST APIs provide an XML option, but I haven't seen one in the Dropbox, Google Drive or OneDrive APIs).
These storages use the ISO 8601 date format ("2015-05-12T15:50:38Z"). Google Drive uses the RFC 3339 format, which is a profile of ISO 8601 and allows decimal fractions of a second.
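Timestamps in this form can be parsed with a few lines of code. The sketch below is only an illustration: the function name is hypothetical, and it assumes fractional seconds, when present, have three or six digits (older Python versions accept nothing else in fromisoformat).

```python
from datetime import datetime, timezone


def parse_cloud_timestamp(text):
    """Parse an ISO 8601 / RFC 3339 timestamp into an aware UTC datetime."""
    # Python versions before 3.11 reject a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text).astimezone(timezone.utc)


modified = parse_cloud_timestamp("2015-05-12T15:50:38Z")
```

Converting everything to UTC makes modification times from different services directly comparable.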
Application Folder
All three storages have an application folder option. In this case the application can manage only files within that folder and cannot access any other file in the user's storage. In Dropbox and OneDrive this folder is visible to users, so they can navigate there and manage their files manually. In Google Drive the application folder is hidden, and the user can only see its quota usage in the application settings.
Asking the user for access to the application folder only looks preferable. In the Google Drive case, though, that leaves two options: either ask for access outside the application folder, or use the hidden folder, which the user cannot access. If we don't use the application folder, we can ask the user to specify the folder where ScummVM would keep its files; ScummVM would then also have access to the whole storage, so the user would be able to download games from any other directory.
I'd like to keep save files in a "Saves" directory within the application's root folder. ScummVM would then always know where to look for them, and only app folder access would be required. Of course, we can make the Saves folder path customizable, but I don't see why users would want to specify one folder for ScummVM and another for saves. Special application folders cannot be moved, though: they are created in a special place in the user's storage. That might not be comfortable for users, but this way it would be hard for them to forget where the folder is. The F.A.Q. could easily explain how to find the folder, because its location would be the same for everyone. For these reasons we might want to have a visible "ScummVM" folder in Google Drive without asking users where they want to put it (it's kind of evil though).
File IDs vs File Paths
Local storage doesn't have any ids for files; instead, it has unique file paths. When we download a file or sync files between storages, we have to know which path corresponds to which cloud storage file id.
The first problem is quite technical: only Dropbox's file metadata contains the full path (in lowercase and "display" representations). For Google Drive and OneDrive we would have to recreate full paths by listing ScummVM's folder, then listing all its subfolders, and so on. The path prefix for a file is its parent's prefix plus the parent's name (the full path is the parent's prefix "ScummVM/Games/" + the parent's name "Sam & Max/" + the file name "file.ext").
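The path reconstruction could look roughly like the sketch below. The item shape (dictionaries with "id", "name", "parent" and "is_folder" keys) is an assumption made for this example, not the format of any of the services; a real implementation would fill it from the folder listing calls.

```python
def build_paths(items, root_id, root_prefix="ScummVM/"):
    """Map item id -> full path, walking down from the root folder."""
    # Group the listed items by the id of their parent folder.
    children = {}
    for item in items:
        children.setdefault(item["parent"], []).append(item)

    paths = {}
    stack = [(root_id, root_prefix)]
    while stack:
        folder_id, prefix = stack.pop()
        for item in children.get(folder_id, []):
            if item["is_folder"]:
                # A folder's own path becomes the prefix of its children.
                path = prefix + item["name"] + "/"
                stack.append((item["id"], path))
            else:
                path = prefix + item["name"]
            paths[item["id"]] = path
    return paths


# Hypothetical listing result matching the example in the text.
items = [
    {"id": "d1", "name": "Games", "parent": "root", "is_folder": True},
    {"id": "d2", "name": "Sam & Max", "parent": "d1", "is_folder": True},
    {"id": "f1", "name": "file.ext", "parent": "d2", "is_folder": False},
]
paths = build_paths(items, "root")
```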
At the same time, there is a case-sensitivity problem: "file.txt" and "FiLe.txt" are the same file name on Windows file systems and different names on Unix. Names can contain uppercase characters, but as Dropbox operates on lowercase paths, it would, similarly to Windows, treat these files as the same. OneDrive paths are not case-sensitive either. Google Drive doesn't have any paths, and it allows files to have the same name within one directory.
The solution is to avoid naming files in a way that leads to such ambiguity. That means ScummVM should be case-insensitive too, and all files should have a unique lowercase path. If there are files with the same name within one folder (in Google Drive), we should either ask the user to select the one ScummVM should operate on or choose one ourselves: for example, the newest one or the one whose id we know.
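One way to apply the "choose the newest one" rule is sketched below. The entry shape ("path" and "modified" keys) is hypothetical, and the timestamps are assumed to be UTC strings in one common format so that plain string comparison orders them correctly.

```python
def pick_by_lowercase_path(entries):
    """Keep one entry per lowercase path, preferring the newest one."""
    chosen = {}
    for entry in entries:
        key = entry["path"].lower()
        current = chosen.get(key)
        # Same-format UTC timestamps compare correctly as plain strings.
        if current is None or entry["modified"] > current["modified"]:
            chosen[key] = entry
    return chosen


# Hypothetical entries that collide once case is ignored.
entries = [
    {"path": "Saves/file.txt", "modified": "2016-05-01T10:00:00Z"},
    {"path": "Saves/FiLe.txt", "modified": "2016-05-02T10:00:00Z"},
    {"path": "Saves/other.txt", "modified": "2016-04-30T09:00:00Z"},
]
chosen = pick_by_lowercase_path(entries)
```

Asking the user instead would only require collecting the colliding entries rather than dropping the older ones.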
I believe that paths (even lowercase ones) are clearer for both developers and users than ids. Suppose, for example, that we stored a special metadata file with the ids of all downloaded files, and the user moved or removed a file with a known id. ScummVM would make an API call using the saved id. If the file was moved, this might cause strange effects: the user thought that a file moved elsewhere would no longer be under ScummVM's control, but it still is. ScummVM might then overwrite save files the user thought wouldn't be touched anymore. If the file was removed, ScummVM would receive an error from the cloud storage. Even if there is a file at that path (a new one with a new id), the application would believe that there is no file (because the file with the old id is inaccessible).
So ScummVM should work in terms of file paths and use ids only when actually needed (to download/update/sync files), without caching them in any metadata file.
HTTP errors
HTTP status codes indicate whether the application's request was successful.
Services use 200 and 201 to indicate that everything is OK or that files were created successfully.
Status 429 means our application is making too many requests and should retry after the specified number of seconds.
Google recommends using the exponential backoff (https://developers.google.com/drive/v3/web/manage-uploads#exp-backoff) error handling strategy when the application receives HTTP 503 or similar status codes.
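The retry rules above might be combined as in the following sketch. It is not taken from any of the services' documentation: the set of retryable codes, the use of a "Retry-After" header for the 429 delay, and the send callback (returning a status, headers, body triple) are all assumptions made for illustration.

```python
import random
import time

# Assumption: which status codes count as "503 or similar".
RETRYABLE = {429, 500, 502, 503, 504}


def call_with_backoff(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it succeeds or the attempts run out.

    send is a hypothetical callback returning (status, headers, body).
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt == max_attempts - 1:
            break
        retry_after = headers.get("Retry-After")
        if status == 429 and retry_after is not None:
            # The server told us how many seconds to wait.
            delay = float(retry_after)
        else:
            # Exponential backoff with random jitter: 1-2 s, 2-3 s, 4-5 s, ...
            delay = 2 ** attempt + random.random()
        sleep(delay)
    return status, body
```

Passing sleep as a parameter keeps the function testable without real waiting.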
File Metadata Representation
Each service sends metadata about files. I tried to mark the fields we're probably interested in.
Dropbox
name - the last piece of display path
path_lower, path_display - paths
client_modified, server_modified - in ISO 8601
size - in bytes
//might be useful:
id - inner Dropbox file id
rev - inner Dropbox file revision id
Google Drive
name - might not be unique within a folder
id - inner Google Drive file id
modifiedTime - RFC 3339 date-time
size - in bytes
webContentLink - where to download
//might be useful:
originalFilename - probably might be used to determine unique filename
properties, appProperties - key/value maps to keep whatever we want
version - autoincreased every time file changes
parents[] - list of folders which contain this file
headRevisionId - inner Google Drive file revision id
OneDrive
name
id
lastModifiedDateTime
size
//might be useful:
File.mimeType
File.hashes
Folder.childCount
Custom Metadata
Google Drive and OneDrive can also store our own metadata with the files. In Google Drive there are two key/value maps for that: properties (a public one, accessible to all applications) and appProperties (a private one, accessible to our application only). OneDrive requires metadata to have a specified format: developers must register their "facet" with schema and property definitions. If we'd like to change those, we would have to do it "only in ways that can't break old apps", meaning we can't completely remove a field or change a field's boundaries.
As this feature is not common across these services and is not available in Dropbox, I believe we should not use it. I'm not sure we even need any custom metadata on files anyway.
Required API Methods
This is a list of API methods we would have to use.
Dropbox
/create_folder
/delete
/download
    may get "rev:abcdef" in order to download specific version of a file
/list_folder
    knows "recursive", returns "has_more" for /list_folder/continue call
/upload
    less than 150 MB
/upload_session/start
/upload_session/append
/upload_session/finish
    more than 150 MB (less than 150 MB per request)
/get_current_account
    user id, name, email, photo url
/get_space_usage
    used and allocated in bytes
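The size limits in the Dropbox list suggest a simple rule for choosing between a single /upload call and an upload session. The sketch below only plans the calls. The function name, the default chunk size, and the reading of "MB" as 1024 * 1024 bytes are assumptions, and no request is actually made.

```python
DROPBOX_SINGLE_LIMIT = 150 * 1024 * 1024  # assumption: "MB" means 1024 * 1024 bytes


def plan_dropbox_upload(size, chunk_size=8 * 1024 * 1024,
                        single_limit=DROPBOX_SINGLE_LIMIT):
    """Return a list of (endpoint, offset, length) steps for a file of `size` bytes."""
    if not 0 < chunk_size < single_limit:
        raise ValueError("chunk_size must be positive and below the per-request limit")
    if size < single_limit:
        return [("/upload", 0, size)]
    steps = []
    offset = 0
    while offset < size:
        length = min(chunk_size, size - offset)
        if offset == 0:
            endpoint = "/upload_session/start"
        elif offset + length >= size:
            endpoint = "/upload_session/finish"
        else:
            endpoint = "/upload_session/append"
        steps.append((endpoint, offset, length))
        offset += length
    return steps
```

Because the chunk size is always below the limit, a session plan has at least two steps, so it always ends with /upload_session/finish.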
Google Drive
/about
    user (id, name, photo link, email)
    quota (usage, limit)
    maxUploadSize, appInstalled
/files/<f>
    /create
        supports simple ("media"), multipart ("multipart") and resumable ("resumable") uploading
    /delete
    /get
        get metadata
    /list
        search (probably can list folder contents when using "path/to/directory/" prefix as query)
    /update
        change metadata or file contents
OneDrive
/drives/special
    id, quota, owner
/drives/special/approot
    folder's Item resource
/drive/special/approot/children
/drive/items/{id}/children
/drive/special/approot:/{path}/children
    list children (directory contents)
    @odata.nextLink is the request url for next page
/drive/special/approot/children
/drive/items/{parent-id}/children/{name}
/drive/root:/{parent-path}/{name}
    create item
/drive/special/approot:/{fileName}:/content
/drive/items/{parent-id}/children/{name}/content
/drive/root:/{parent-path}/{name}:/content
    upload contents (less than 100 MB in one piece) (supports multipart)
/drive/root:/{path_to_item}:/upload.createSession
/drive/items/{parent_item_id}:/{filename}:/upload.createSession
    resumable upload (less than 60 MB in one fragment, 10 MB is recommended size for a fragment)
    (this method is recommended to use for any file >= 10 MB)
/drive/items/{id}
/drive/root:/{path}
    update contents (HTTP method PATCH)
/drive/items/{id}
/drive/root:/{path}
    delete item (HTTP method DELETE)
/drive/items/{id}/content
/drive/root:/{path}:/content
    download
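Listing a OneDrive directory page by page through @odata.nextLink could be sketched as follows. The fetch_page callback is hypothetical (it stands for an authorized GET that returns the decoded JSON object), and the sketch assumes each page carries its items in a "value" array.

```python
def list_all_children(fetch_page, first_url):
    """Collect the children of a folder across all result pages."""
    items = []
    url = first_url
    while url:
        page = fetch_page(url)
        # Assumption: each page holds its items in a "value" array.
        items.extend(page.get("value", []))
        # The next page URL is absent on the last page.
        url = page.get("@odata.nextLink")
    return items


# Hypothetical two-page listing, keyed by request URL.
pages = {
    "first": {"value": [{"name": "a.sav"}], "@odata.nextLink": "second"},
    "second": {"value": [{"name": "b.sav"}]},
}
children = list_all_children(pages.__getitem__, "first")
```

Dropbox's /list_folder with "has_more" and /list_folder/continue would follow the same loop shape, with a cursor in place of the next-page URL.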
Other API Methods
Some API methods which might be useful.
Dropbox
/get_temporary_link
    link to stream contents
/list_folder/get_latest_cursor
    get cursor for files that were changed since last call
/restore
    rollback file to specified revision
/copy
/copy_reference
    to other user's folder
/get_metadata
/get_preview
/get_thumbnail
/list_folder/longpoll
    await for changes
/list_revisions
/move
/search
    indexed search, might not know about latest changes
<lots of sharing methods we won't use>
Google Drive
/files/<f>
    /copy
    /watch
    /emptyTrash
    /generateIds
/changes/*
    list/watch changes (what was removed/added in which time)
/channels/stop
    stop watching file
/files/<f>/comments/*
    work with users comments (totally not used by ScummVM)
/files/<f>/replies/*
    work with replies to comments
/files/<f>/permissions/*
    work with permissions (why would we do this? =)
OneDrive
/drive/root/view.search?q=
/drive/items/{item-id}/view.search?q=
    search
/drive/items/{item-id}/view.delta
/drive/root:/{item-path}:/view.delta
    see changes from previous time
<move, copy, view thumbnail, share>