GIT Migration

From ScummVM :: Wiki
Revision as of 19:42, 24 January 2011 by Wjp (talk | contribs) (→‎Conversions in progress: Add link to scummvm-2010-archive repo)

This page is a placeholder for gathering information on the potential migration of the ScummVM source code repositories from SVN to Git.

Please use this page to track any blockers that you may foresee (Windows users, I am looking at you ;-)) so the problems can be addressed and the work moved forward.

Windows users may want to take a look at http://code.google.com/p/gitextensions/ and http://code.google.com/p/tortoisegit/ --Fingolfin 12:47, 2 November 2010 (UTC)
There's also http://www.syntevo.com/smartgit/index.html (which is free for personal use). A good article on alternative clients for Windows is here: http://www.makeuseof.com/tag/5-windows-git-clients-git-job/ --Md5 10:30, 5 December 2010 (UTC)

Conversions in progress

The Tools repository is currently being migrated to git. The script can be found here and the work-in-progress repository is available on GitHub.

The pre-0.9.0 part of the main scummvm repository is currently being migrated to git. Due to the many cvs2svn artifacts, this has been a manual process without a script. The work-in-progress repository is available at https://github.com/wjp/scummvm-archive .

The remainder of the main scummvm repository is being converted using the scripts at http://www.usecode.org/scummvm/svn2git/ . The work-in-progress repository is at https://github.com/wjp/scummvm-2010-archive .

For historical purposes, a cleaned-up git conversion of the FreeSCI darcs repo is available at https://github.com/wjp/freesci-archive .

An introduction to the whole process

This page contains some ideas on how to perform the migration process.

Things to tweak in the repository conversion

Author names/emails: see thread.
Do we maybe want to use unknown@scummvm.org as email address for some of the unknown authors rather than username@users.sf.net? Wjp 13:27, 2 November 2010 (UTC)
Graft in FreeSCI history from vendor/, rewriting the commit messages
Cleaned up FreeSCI history: https://github.com/wjp/freesci-archive Wjp 14:23, 9 November 2010 (UTC)
Maybe: make merges of branches back into trunk actual merges Wjp 22:35, 3 November 2010 (UTC)
Maybe: broken endlines in SCI r50329, r50331 (breaking annotations) Wjp 13:09, 2 November 2010 (UTC)
Broken endlines in the merge r54051, fixed in r54052
If possible: restore author/date of SWORD25 import commits? Wjp 13:09, 2 November 2010 (UTC)
Are there other external histories we can and want to graft in? (Sarien?) Wjp 13:09, 2 November 2010 (UTC)
Well, there are several. All the engines that were developed outside the main tree are likely candidates, e.g.: lastexpress, M4, MADE, mohawk, Sarien (as you mentioned), tinsel ... to name a few --Md5 09:22, 3 January 2011 (UTC)
Groovie is one of them too: http://code.google.com/p/t7gre/ --jvprat 18:30, 7 January 2011 (UTC)
I'd rather not have the Mohawk history in there and I don't see it ever being useful (at least I've not had to dive into the old source to hunt for regressions since Mohawk was merged, and I highly doubt I ever will have to). -Clone2727 20:52, 7 January 2011 (UTC)
After thinking about this some more, we may just want to leave these out of the main repository, and leave it to the user to graft them in (with the .git/info/grafts file) on-demand. This way we can do a proper git conversion of any desired external history at any time. Wjp 10:57, 24 January 2011 (UTC)
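
The on-demand grafting mentioned above could look roughly like the following sketch. The helper name add_graft is hypothetical and not part of any ScummVM script. It relies only on the documented format of the grafts file: one line per grafted commit, consisting of the commit ID followed by the ID of each parent it should appear to have. Note that newer Git versions deprecate the grafts file in favour of git replace --graft.

```shell
#!/bin/sh
# Hypothetical helper (for illustration only): append a graft entry
# "<child> <parent>" to <git-dir>/info/grafts, so that <child> appears
# to have <parent> as its parent in this local clone.
add_graft() {
    git_dir=$1 child=$2 parent=$3
    # Both IDs must be full 40-character lowercase hexadecimal SHA-1 names.
    for id in "$child" "$parent"; do
        case $id in
            *[!0-9a-f]*) echo "not a hex commit id: $id" >&2; return 1 ;;
        esac
        [ "${#id}" -eq 40 ] || { echo "not a full commit id: $id" >&2; return 1; }
    done
    # Create the info directory if needed, then append the graft line.
    mkdir -p "$git_dir/info" &&
    printf '%s %s\n' "$child" "$parent" >> "$git_dir/info/grafts"
}
```

A user would first fetch the external history (for example from the freesci-archive repository) into their clone, then call add_graft .git with the ID of the first commit of the imported code and the ID of the last commit of the external history. The graft affects only that local clone.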

From Fingolfin's mail:

There was also some cleanup I performed after the CVS -> SVN switch, which in retrospect is not nice to have in the history. E.g. I cleaned up various branches, and many SVN properties. So commits like <http://scummvm.svn.sourceforge.net/viewvc/scummvm?view=revision&revision=20471> would be nice to suppress. Those SVN properties wouldn't be visible in git anyway, I guess.
Property change commits can be pruned with filter-branch --prune-empty
Also, what about fake branching commits created by cvs2svn, like <http://scummvm.svn.sourceforge.net/viewvc/scummvm?view=revision&revision=3488>, can we get rid of those, too?
Many of those actually change things (like updating version numbers or removing files), so probably not all of them.
Correction: yes, we can get rid of all of them, and I did in my scummvm-archive repo on github. Wjp 14:11, 3 January 2011 (UTC)
Also, I think when we made the CVS conversion back then, something went slightly wrong with the very old history. Consider e.g. <http://scummvm.svn.sourceforge.net/viewvc/scummvm/scummvm/trunk/?pathrev=3486>, where the code that should be in trunk is hidden away in a subdirectory "scummvm-old" of trunk. Not terrible, but certainly annoying. Especially since later on, at some point the two are mixed: Trunk is populated but "scummvm-old" is still there. This happened in revision 4785, see <http://scummvm.svn.sourceforge.net/viewvc/scummvm?view=revision&revision=4785>, and while it lists me as the committer with commit message "Initial revision", I am not quite sure why that is... probably because we switched to a new CVS repository back then, with a revised file structure... And things were never properly stitched together. The annoying part is that "scummvm-old" is still visible much, much later, see e.g. <http://scummvm.svn.sourceforge.net/viewvc/scummvm/scummvm/trunk/?pathrev=18000>. It was only removed manually in revision 20419, see <http://scummvm.svn.sourceforge.net/viewvc/scummvm?view=revision&revision=20419>.
Something built on the following filters should take care of scummvm-old: Wjp 22:20, 3 November 2010 (UTC)
 # Step 1: in the history up to r4784, move everything under scummvm-old/ to the top level.
 git checkout -b remove_old_branch `git svn find-rev r4784`
 git filter-branch --index-filter 'git ls-files -s |
   sed "s-\tscummvm\-old\/\"*-\t-" | GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
   git update-index --index-info && test -f $GIT_INDEX_FILE.new &&
   mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE || true' HEAD
 # Step 2: in the history up to r20419, re-parent onto the rewritten branch,
 # drop the remaining scummvm-old entries, and prune commits that end up empty.
 git checkout -b remove_old_branch_2 `git svn find-rev r20419`
 git filter-branch -f --prune-empty --parent-filter \
   'sed s/`git svn find-rev r4784`/`git rev-parse remove_old_branch`/' \
   --index-filter 'git ls-files -s | grep -v scummvm-old |
   GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info &&
   test -f $GIT_INDEX_FILE.new && mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE || true' HEAD
Work-in-progress git repository with clean-up of cvs2svn conversion: https://github.com/wjp/scummvm-archive
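
The path rewriting done by the first index filter above can be tried in isolation, without touching a repository. The following is a simplified sketch: strip_old_prefix is a hypothetical name, the object ID in the sample is made up, and the handling of quoted paths from the original sed expression is left out. It assumes input in the format printed by git ls-files -s, where a tab separates the mode, object ID and stage from the path.

```shell
#!/bin/sh
# Hypothetical helper (for illustration only): strip a leading
# "scummvm-old/" from the path column of `git ls-files -s` output.
# Input lines look like: "<mode> <sha1> <stage><TAB><path>".
strip_old_prefix() {
    # Use a literal tab so the expression does not depend on sed's \t support.
    tab=$(printf '\t')
    sed "s|${tab}scummvm-old/|${tab}|"
}

# Sample index entry; the object ID is invented for this demonstration.
printf '100644 %s 0\tscummvm-old/scumm.h\n' \
    1111111111111111111111111111111111111111 | strip_old_prefix
```

Entries outside scummvm-old/ pass through unchanged, which is why the same filter can be applied to every commit in the range.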
Alternatively, one could directly take content from the two (!) old CVS repositories, and convert these, then graft them together with the newer history of the SVN repository, ignoring the old revisions in it coming from the CVS->SVN conversion. Not sure if this would mean more or less work in the end... --Fingolfin 10:46, 31 December 2010 (UTC)

Configure/tools updates to consider

  • Rename SCUMMVM_SVN_REVISION to SCUMMVM_REVISION
  • Change "svn" for trunk builds to "devel" (DVCS name can be part of the revision number, since the latter is DVCS-specific)
  • Update configure to extract the revision number from a git repository (in addition to svn - and maybe hg/others also?)
  • Add branch name to revision number (if not trunk)
  • Update create_project to allow extraction of revision numbers from svn/git/hg repositories
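
As a rough sketch of the git extraction and branch-name items above, configure could compose the value along these lines. The helper name format_revision, the "branch-revision" layout, and the choice of main-line branch names are assumptions made for illustration, not decisions recorded on this page.

```shell
#!/bin/sh
# Hypothetical helper (for illustration only): compose a revision string
# from a revision id and a branch name. The branch name is prepended
# unless the branch is the main line. Treating "master", "trunk" and a
# detached "HEAD" as the main line is an assumption of this sketch.
format_revision() {
    rev=$1 branch=$2
    case $branch in
        ""|HEAD|master|trunk) printf '%s\n' "$rev" ;;
        *) printf '%s-%s\n' "$branch" "$rev" ;;
    esac
}

# Inside a git checkout, the value could be filled in like this.
# Outside a git checkout (or without git installed) nothing is set.
if git rev-parse --git-dir >/dev/null 2>&1; then
    SCUMMVM_REVISION=$(format_revision \
        "$(git describe --always 2>/dev/null)" \
        "$(git rev-parse --abbrev-ref HEAD 2>/dev/null)")
fi
```

Keeping the formatting in one function means the svn and hg code paths could reuse it and only differ in how they obtain the revision id and branch name.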

Where to host the GIT repository?

  • SF.Net Internal GIT hosting? - Not ideal. Features lacking.
  • From Fingolfin: One thing to settle is where to host the master repository: I for one would like to use GitHub, due to the many nice features it offers, like very easy code review. (FYI, I registered a scummvm project account there some time ago, in case we want to use it, and to prevent others from (ab)using it). ScummVM on GitHub.

Some of our devs already use GitHub for their own ScummVM trees, and since GitHub makes it super-easy for anybody to branch from another user's tree and then push changes back (or file a "pull request"), this seems ideal to me.

Client alternatives

How does that compare to TortoiseGit and Git Extensions ? The latter seems to be pretty good (though I have not yet tried it myself, will see if I can do this tonight or tomorrow). It supposedly even features Visual Studio integration. --Fingolfin 14:38, 2 November 2010 (UTC)
TortoiseHG on Windows runs very well against a HG repo. I've also pulled a Git repo with hg-git and it worked but I haven't done any major development in that configuration yet (eg, branch/merge, etc). --ScottT 20:58, 2 November 2010 (UTC)
I've also been using TortoiseHG for lastexpress/create_project/scummvm development (no branching/merging used either yet). I'm also using the subversion bridge to commit back to the current repo. The svn<=>mercurial<=>git conversion is not perfect (putting extra HG info in the git comments), but that will be a moot point once the transition to git is complete. I'm also using a modified version of Posh-hg for better command-line integration.
I'm seeing performance problems (apparently caused by dulwich) when trying to push a converted scummvm repo to github though. It works great for small projects but will abort (out of memory) on huge repositories (not to mention the slowness in converting revisions and packing for sending). Is there a public scummvm tree on github so I can try importing it and pushing back to a new repo? (note that you can convert using hg-git and then push back the results directly with git, but that's painful).
For Visual Studio integration with Mercurial, there is Visual HG (and a couple other projects), which work relatively well (although lacking features compared to AnkhSVN) --Julien 14:35, 3 November 2010 (UTC)
Folks, you all write in reply to what I wrote, but without addressing at all the content of what I wrote. That's really a pity. Before debating which hg<->git bridge to use, could you maybe explain to us why TortoiseGit and Git Extensions (which, by the way, also includes Visual Studio integration) are to be dismissed out of hand? Thanks! :) --Fingolfin 21:53, 3 November 2010 (UTC)
There is no reason to dismiss them. If one wants to use git on Windows, TortoiseGit is relatively nice (but not quite as good as TortoiseSVN in terms of stability). I haven't checked lately, but Git Extensions did not integrate into Visual Studio as well as other source control providers (that might have changed since then). Another tool for Visual Studio is Git SCC. On a personal note, I tried Git (both command line only and msysgit) and it didn't work for me (and the maintainers' attitude and complete lack of release testing sure didn't help). Nevertheless, alternatives are just that. The main recommendation for hacking on ScummVM should be git and the related tools; how to set up mercurial and a git bridge can be relegated to a subpage on the wiki (including that discussion). --Julien 00:15, 4 November 2010 (UTC)
In my instance, I used TortoiseGit about a year ago, and it was rather ordinary back then - to the point I actually jumped to using the command-line version instead. m_kiewitz's running commentary on IRC about TortoiseGit has also put me off attempting it again since. Git Extensions looks alright, though with some of the expressed interest in shell integration, the integration provided doesn't look as polished as that of TortoiseSVN or TortoiseHg. Choosing a particular client doesn't change what should happen on the server side - the choice of command-line msysgit, TortoiseGit, TortoiseHg (+ hg-git), or any others isn't a key problem. For me, I'll take good shell integration before a Visual Studio plugin, while others may want something that is similar to TortoiseSVN (if that is what they're used to) that works well :) --ScottT 03:27, 4 November 2010 (UTC)