Converting CVS and SVN repositories to Git
I had the wonderful opportunity to convert several CVS and SVN repositories to Git. This is, of course, a very enjoyable process that is both quick and easy, especially when the repositories are non-standard and use a few different code pages 🙃
CVS
I initially tried git-cvsimport. Although it was easy to set up, it was incredibly slow and couldn't handle the various text encodings.
I wound up using cvs2git along with git fast-import. This was much faster, and cvs2git lets you list several candidate text encodings, which it tries in order, so you can prioritize them.
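The "try the encodings in order" idea can be sketched in a few lines of Python. This is only an illustration of the fallback technique, not cvs2git's actual code; the function name decode_with_fallback and the sample encodings are made up for the example.

```python
from typing import Iterable, Tuple


def decode_with_fallback(raw: bytes, encodings: Iterable[str]) -> Tuple[str, str]:
    """Return (text, encoding) for the first encoding that decodes `raw` cleanly."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            # This candidate cannot represent the bytes; try the next one.
            continue
    raise ValueError("none of the candidate encodings could decode the input")


# Order matters: strict encodings such as UTF-8 go first, permissive ones last.
text, used = decode_with_fallback(b"caf\xe9", ["utf-8", "cp1252"])
print(text, used)
```

Putting a permissive single-byte encoding such as latin-1 first would hide every other candidate, because it accepts any byte sequence.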
SVN
I first tried git-svn, with help from Atlassian's svn-migration-scripts.jar to get a list of the authors in the SVN repo. This worked OK with standard SVN repositories, but any that were weird did not play well, and of course a number of the SVN repositories were non-standard.
Instead, I used svn-all-fast-export to do the actual SVN-to-Git migration, and svneverever to figure out the structure of each SVN repository so I could write the rules for svn-all-fast-export. I still used svn-migration-scripts.jar to create the identity-map for svn-all-fast-export. This gave much better results but was much more tedious.
To write the rules, I ran svneverever a few times with different options to make sure I had all the information I needed about the repository's structure.
svneverever example using --depth 1 on a non-standard SVN repo
( 1; 785) /branches
[..]
(510; 622) /random1
[..]
(511; 583) /random2
[..]
(548; 785) /random_branches
[..]
(555; 785) /random_tags
[..]
(508; 785) /random_trunk
[..]
( 1; 785) /tags
[..]
( 1; 785) /trunk
[..]
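With a listing like this, a small script can pick out the top-level directories that fall outside the usual trunk/branches/tags layout. The following is a rough sketch that assumes the "(first; last) /path" line format shown above; the helper names are hypothetical and not part of svneverever.

```python
import re

# Matches lines such as "(510; 622) /random1"; "[..]" lines are skipped.
LINE_RE = re.compile(r"^\(\s*(\d+);\s*(\d+)\)\s+(/\S*)$")
STANDARD_DIRS = {"/trunk", "/branches", "/tags"}


def parse_listing(text):
    """Return a list of (path, first_revision, last_revision) tuples."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append((match.group(3), int(match.group(1)), int(match.group(2))))
    return entries


def non_standard(entries):
    """Keep only the directories that need hand-written rules."""
    return [entry for entry in entries if entry[0] not in STANDARD_DIRS]


sample = "( 1; 785) /branches\n[..]\n(510; 622) /random1\n[..]\n"
for path, first, last in non_standard(parse_listing(sample)):
    print(f"{path}: r{first} to r{last}")
```

The revision ranges are handy when writing rules, since they show which odd directories were short-lived and which are still in use.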
Example rules for svn-all-fast-export
create repository git_repo_name
end repository

match /trunk/
    repository git_repo_name
    branch master
end match

match /(random1|random2)/
    repository git_repo_name
    branch \1
end match

match /branches/([^/]+)/
    repository git_repo_name
    branch \1
end match

match /tags/([^/]+)/
    repository git_repo_name
    branch tags/\1
end match

match /random_trunk/
    repository git_repo_name
    branch random
end match

match /random_branches/([^/]+)/
    repository git_repo_name
    branch \1
end match

match /random_tags/([^/]+)/
    repository git_repo_name
    branch tags/\1
end match

match /([^/]+)$
    repository git_repo_name
    branch master
    prefix \1
end match
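Because a full export is slow, it can help to dry-run the match patterns against a few sample paths first. The sketch below mirrors the rules above in Python, under two assumptions that should be checked against the svn-all-fast-export documentation: rules are tried in file order with the first match winning, and each pattern is anchored at the start of the path. Python's re module is not the regex engine the tool uses, and the sample paths are invented.

```python
import re

# (pattern, branch template) pairs in the same order as the rules file.
RULES = [
    (r"/trunk/", "master"),
    (r"/(random1|random2)/", r"\1"),
    (r"/branches/([^/]+)/", r"\1"),
    (r"/tags/([^/]+)/", r"tags/\1"),
    (r"/random_trunk/", "random"),
    (r"/random_branches/([^/]+)/", r"\1"),
    (r"/random_tags/([^/]+)/", r"tags/\1"),
    (r"/([^/]+)$", "master"),
]


def branch_for(path):
    """Return the Git branch the first matching rule would assign, or None."""
    for pattern, template in RULES:
        match = re.match(pattern, path)  # re.match anchors at the start
        if match:
            return match.expand(template)
    return None


for sample in ["/trunk/src/main.c", "/random_tags/v1.0/a.txt", "/unknown/dir/a.txt"]:
    print(sample, "->", branch_for(sample))
```

Any path that comes back as None is not covered by a rule, so it needs either a new match block or a deliberate decision to drop it.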
One issue I ran into: when an SVN commit had no author, svn-all-fast-export would throw an error and quit. A workaround is to add the following line to the identity-map for svn-all-fast-export:
(no author) = no_author <no_author@no_author>
Also, when dealing with character encodings, Linux is your friend. Handling character encoding issues on Windows is much more difficult, if not effectively impossible.
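When converting many repositories, the (no author) workaround is easy to forget, so it may be worth scripting. Here is a minimal sketch that appends the entry to an identity-map's text only if it is missing. It assumes the map uses one "login = Name <email>" entry per line; ensure_no_author is a hypothetical helper, not part of any of the tools mentioned.

```python
NO_AUTHOR_KEY = "(no author)"
NO_AUTHOR_LINE = "(no author) = no_author <no_author@no_author>"


def ensure_no_author(identity_map_text):
    """Return the identity-map text with a "(no author)" entry present."""
    lines = identity_map_text.splitlines()
    for line in lines:
        # The key is everything before the first "=".
        if line.split("=", 1)[0].strip() == NO_AUTHOR_KEY:
            return identity_map_text  # already present, leave untouched
    lines.append(NO_AUTHOR_LINE)
    return "\n".join(lines) + "\n"


print(ensure_no_author("jdoe = John Doe <jdoe@example.com>\n"), end="")
```

The function works on text rather than a file path so it can be tested without touching the real identity-map.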