Migrating Subversion to VSTS Git

I was asked to migrate an on-premise Subversion repository to cloud-hosted Git for quite a large project. There was a lot to find out, so I made many attempts and ran many experiments before I finally managed to migrate the complete repository, including all history, branches and tags. As this process took me weeks or even months to complete, I will not go into too much detail, but the steps necessary to complete the job are listed below for your, and my own, future reference.

Important to mention:

  • The entire process was hard enough already, so I decided to not support two-way synchronization, which means I communicated a specific date to the development-teams involved on which we made the final “switch”. From that moment on, all commits needed to go to Git and Svn was set to read-only.
  • Because the usernames in Svn and VSTS are not the same, an authors.txt file was needed in which I mapped each Svn username to its VSTS counterpart. With such a file, the git-svn process rewrites the commits as if they were made by the VSTS user. You could omit this part, but in that case the history of your commits will not be mapped to actual VSTS users. The commit author will just be handled as a string.

Example of authors file:
The contents of my authors.txt file are illustrated below. Of course these names are fictional.

johndoe = John Doe <john.doe@vstsexpert.nl>
robinpaardekam = Robin Paardekam <robin@vstsexpert.nl>

When cloning the Svn repository into a Git repo, this file is read to make the translation. Whenever the process came across a commit by a user who was not listed in the .txt file, it stopped because it could not assign the commit to a valid user. In that case, I took the missing name from the error message, added the corresponding line and reran the command. It simply continues from the point where it stopped, so that was not too bad. If you want to be better prepared, you can find all kinds of scripts online that extract the list of names for you. For me this worked pretty well.
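As an illustration of such a script, here is a minimal sketch that builds the list of names from the output of "svn log --quiet". The function names and the @example.invalid e-mail placeholder are my own assumptions, not part of any tool; the real names and addresses still have to be filled in by hand.

```shell
# Reads `svn log --quiet` output on stdin and prints each distinct SVN username.
svn_usernames() {
  awk -F'|' '/^r[0-9]+ \|/ { name = $2; gsub(/^ +| +$/, "", name); print name }' | sort -u
}

# Prints a skeleton authors.txt: one line per username, with a placeholder
# display name and e-mail address (assumption: edit these by hand afterwards).
authors_skeleton() {
  svn_usernames | awk '{ print $0 " = " $0 " <" $0 "@example.invalid>" }'
}

# Prints the usernames from stdin that have no entry yet in the authors file $1.
missing_authors() {
  svn_usernames | while read -r name; do
    cut -d= -f1 "$1" | sed 's/ *$//' | grep -qxF -- "$name" || printf '%s\n' "$name"
  done
}

# Usage:
#   svn log --quiet svn://svn.linux.vstsexpert.nl:1234/project | authors_skeleton > authors.txt
#   svn log --quiet svn://svn.linux.vstsexpert.nl:1234/project | missing_authors authors.txt
```

Running the second command before the clone shows which names would make git svn stop halfway.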

The commands used:
I needed a clone in Git format, so I used the svn subcommand of Git (git svn) to create a local Git clone from the Subversion repository, passing the location of the authors file. I also supplied the --stdlayout parameter, which tells git svn that the Subversion repository follows the standard layout: a trunk folder with a branches folder and a tags folder next to it (it is shorthand for --trunk=trunk --branches=branches --tags=tags). The --ignore-paths parameter takes a regular expression and excludes the matching paths from the synchronization. In my case, I did not want to include a folder that was actually an external.

git svn clone svn://svn.linux.vstsexpert.nl:1234/project --authors-file=../authors.txt --stdlayout --ignore-paths="/obsoleteexternal/"

Running this command more or less replays all Subversion commits, starting at revision 1. If something fails (e.g. an author cannot be found), the clone process stops. If you’re lucky, you can fix the cause of the issue and run “git svn fetch” inside the clone to continue. If something more serious needs to be fixed, you will need to rerun “git svn clone” into an empty directory (so either remove all contents of the previous folder, or use another folder) and the entire process will start all over again.

The clone I made took approximately 23 hours to complete. Many factors influence the time you will need: the number of commits, of course, but also the size of the project. In my case (yes, I know this is not desirable, but that is what I had to work with) there were quite a few assemblies in the repository. After many attempts, the complete clone finally succeeded. It was almost too good to be true, but it was a great experience to finally have that clone on my hard disk.

After this, the clone was converted into a so-called “bare clone”. This means you have the complete repository on disk, but without a working tree, so there are no checked-out files to look at. This makes it easier to work with. Converting it to a bare clone was done as follows:

cp -a project/.git newprojectfolder
cd newprojectfolder
git config --bool core.bare true

As you might have guessed: this is nothing more than copying the .git folder from the local clone to a new location, changing the working directory to that new folder and calling the git config subcommand to do the conversion. The “newprojectfolder” is now a bare repository that can easily be worked with.
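To double-check the result, Git itself can report whether a repository is bare. The sketch below wraps the three steps above in a small function (the name to_bare is my own) and shows the check; it is an illustration of the same steps, not a different method.

```shell
# Copies the .git folder of clone $1 to $2 and marks the copy as bare.
to_bare() {
  cp -a "$1/.git" "$2" &&
    git --git-dir="$2" config --bool core.bare true
}

# Usage:
#   to_bare project newprojectfolder
#   git --git-dir=newprojectfolder rev-parse --is-bare-repository   # should print "true"
```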

Fine-tuning your freshly created local repository
Now that the clone was on disk, it was time to check the branches that needed to migrate as well. Not all historical branches were needed, in fact we just needed about 5 of them, so that could easily be done manually. I created a local branch for each of the Svn-remote branches that were still relevant. You can easily view which branches exist, both local (white, green) and remote (red) by running

git branch -av

For each of the branches you want to migrate, perform the command as illustrated here:

git branch newgitbranch origin/aremotesvnbranch

This will create a local branch called “newgitbranch” which is taken from the remote repository, known as “origin/aremotesvnbranch”. This could also be a convenient moment for renaming branches / housekeeping, which I did here as well. Just repeat the commands one after the other, so your local repository will have all branches set up properly. This will be done instantly.
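If you have more than a handful of branches, the same command can be repeated from a small list. This is only a sketch: the function name and the branch names in the usage comment are made up, and it assumes the Svn branches are visible as origin/<name>, as in the example above.

```shell
# Reads "<new-git-branch> <svn-branch>" pairs from stdin and creates each
# local branch from the matching origin/<svn-branch> ref.
# Run it inside the repository. --no-track skips upstream configuration,
# which is not needed for the read-only Svn refs.
create_branches() {
  while read -r new old; do
    [ -n "$new" ] || continue            # skip blank lines
    git branch --no-track "$new" "origin/$old" || return 1
  done
}

# Usage (branch names are hypothetical):
#   create_branches <<'EOF'
#   release-2016 releases_2016
#   hotfix hotfix_branch
#   EOF
```

Keeping the list in a text file also documents any renames done during the migration.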

The principle of “tags” in Git differs slightly from what you might be used to in Svn. While a tag should be a “label on a certain state at a certain moment in time”, Svn tends to treat tags like branches, each having a complete set of files, etc. In Git, a tag is simply a label added to an existing commit. Therefore we need to figure out which commits correspond to the tags that were created in Svn: for each relevant tag we need to run something similar to the statement below. This adds a tag “release_2016.1” with a specific message to the commit whose hash is given at the end of the command. You will need to find this hash in the Git log. You can find scripts online to do that as well, but I did it manually for the few tags I wanted to migrate.

git tag -a release_2016.1 -m "Create tag for Application v2017.1 final deployment" 95f140b3c5dc58a45165a4530c14f08ec9157c0c

Again, repeat this command for every tag you want to create in your repository. Once all branches and tags are created, they are ready to be migrated into the remote Git repository.
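To help find the hashes, the sketch below lists the Svn tags that git svn imported, together with the commit each one points at. It assumes the clone was made with --stdlayout and the default origin/ prefix, so that the tags are stored as refs under refs/remotes/origin/tags/; if your clone uses a different prefix, adjust the path. The function name is my own.

```shell
# Prints "<tag-name> <commit-hash>" for every Svn tag that git svn imported.
# Assumption: the refs live under refs/remotes/origin/tags/.
list_svn_tags() {
  git for-each-ref --format='%(refname) %(objectname)' refs/remotes/origin/tags |
    sed 's|^refs/remotes/origin/tags/||'
}

# Usage: run inside the repository, then pass the hash of your choice to
# `git tag -a <name> -m "<message>" <hash>`.
#   list_svn_tags
```

The listed commit is typically the extra commit Subversion made when the tag was created, so it is worth checking with git log whether you want that commit or the one before it.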

Pushing to the remote
It is now time to add a so-called remote to the repo. This way, we can push the local repository to VSTS. Note that in the example below we name our remote “sandbox” so we can easily distinguish the Svn remote (the one that is already available in your local repository) from the VSTS remote (which will now be added).

git remote add sandbox https://abc.visualstudio.com/DefaultCollection/fgh/_git/xyz

Our local repo is now ready to be pushed to VSTS. To do this, execute the following command. You should use the name of your remote, in this example we continue with “sandbox”.

git push -u sandbox --all

Master and all other branches are now pushed to the Git repository in VSTS. Although we specified --all, this did not push the tags we created earlier; that needs a separate command:

git push -u sandbox --tags
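To verify that nothing was left behind, you can compare the local branches and tags with what the remote reports. The function below is a minimal sketch (its name is an assumption of mine) that prints every local branch or tag the remote does not have; no output means everything arrived.

```shell
# Prints each local branch/tag ref that is missing on remote $1.
missing_on_remote() {
  remote_refs=$(git ls-remote --heads --tags "$1" | awk '{ print $2 }')
  git for-each-ref --format='%(refname)' refs/heads refs/tags |
    while read -r ref; do
      printf '%s\n' "$remote_refs" | grep -qxF -- "$ref" || printf '%s\n' "$ref"
    done
}

# Usage: run inside the local repository after both pushes.
#   missing_on_remote sandbox
```

Note that this only compares names, not the commits the names point at.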

That is all you need: your Svn sources, branches, tags and all history and authors have been migrated into the VSTS Git repository. For future updates from Svn into the new Git repository, it is convenient to keep your local repository in a safe place. This article does not cover changes that were made in the Git repository in the meantime: in my project we decided on a big-bang implementation, so once all changes were in Git, we set the Svn repository to read-only. As mentioned before, this is a one-way approach and no changes to Git are allowed in the meantime. If you need to push more updates to the VSTS repository, e.g. when new changes were committed to Svn after you created the Git clone, you can easily update your existing bare clone:

git svn fetch --fetch-all

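Now your bare clone is updated with all recent changes as well. Make sure to repeat the branch subcommands too, if something changed of course. Because these branches already exist locally by now, git branch needs the -f option to move them to the newly fetched commits. Example (again):

git branch -f newgitbranch origin/aremotesvnbranch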
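A small helper that works both the first time and on every later update could look like the sketch below. The function name is hypothetical; -f lets git branch move a branch that already exists, and without it the command fails because the name is already taken.

```shell
# Creates local branch $1 at origin/$2, or moves it there if it already exists.
sync_branch() {
  git branch -f --no-track "$1" "origin/$2"
}

# Usage: run inside the repository after `git svn fetch --fetch-all`.
#   sync_branch newgitbranch aremotesvnbranch
```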

When everything is synchronized locally again, simply push it to the remote (sandbox in my case).

git push -u sandbox --all

Final note
Let me be clear: I was not an expert on Svn or Git when I started on this project, and I still don’t consider myself an expert on either. I came a long way myself, but I also had help from co-workers who had the necessary experience to come up with the complete walkthrough above. I know there are more tools available. Also, many things can be “scripted” or automated in some convenient way. I personally liked the fact that it was all done by hand, so it was clear what I did and I could repeat it quite easily to discover the challenges without having to go through a (complex) script or application. So as a disclaimer: the above does not come with any guarantees; it is just intended to explain what I did, hoping it will help others as well.

That’s all there is to it! Happy migrations! 🙂
