SVN to Git Conversion
The Das2 Java sources were converted from two different SVN repositories into a single git repository while retaining history back to 2003. This was possible thanks to the svn-all-fast-export tool provided by the KDE project, and to the fact that git identifies commits by globally unique hashes.
Here's the recipe that was followed for the conversion process.
Get the svn-all-fast-export tool and read about it:
apt install svn-all-fast-export
man svn-all-fast-export
If the package is not available from your distribution, clone the svn-all-fast-export git repository and build it from source.
Get a copy of the server files using rsync. Don't use the same name for the output directory that any of the new repos may use. The --delete option is used below in case you've run rsync before and want to make sure the local copy only contains files that are available at the remote source. It will not affect the remote source location.
rsync -avP --delete USER@saturn:/PATH/TO/REPO/das2/ ./das2_svn_repo/
ssh -t [email protected] create
rsync -avP --delete [email protected]:/home/svn/p/autoplot/code/ ./ap_svn_repo/
Make an authors file for all commits to a given branch. NOTE: All paths given after /das2_svn_repo/ below are repository paths and have nothing to do with the file layout underneath the repository directory. So:
file://$(pwd)/das2_svn_repo/ <-- Path to local copy of the repository
dasCore/community/autoplot2011/trunk <-- Virtual path within the repository
Here's an example. You'll probably have to run this multiple times for various branches in your repository:
svn log --quiet file://$(pwd)/das2_svn_repo/dasCore/community/autoplot2011/trunk | \
grep -E "r[0-9]+ \| .+ \|" | cut -d'|' -f2 | sed 's/ //g' | sort | uniq > \
authors.txt
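To see what the pipeline extracts, it can be run against canned `svn log --quiet` output. The log text below is made up for illustration (the revision numbers and dates are not real history), and the function name `svn_authors` is hypothetical. Only the `rNNN | user | date` line shape matters.

```shell
# Extract the unique committer names from `svn log --quiet` output on stdin.
# Same filter chain as above, wrapped in a function so it can be reused
# for several branches.
svn_authors() {
    grep -E "r[0-9]+ \| .+ \|" | cut -d'|' -f2 | sed 's/ //g' | sort | uniq
}

# Hypothetical sample of `svn log --quiet` output (not real history)
sample_log='------------------------------------------------------------------------
r12 | jbfaden | 2003-05-02 10:11:12 -0500 (Fri, 02 May 2003)
------------------------------------------------------------------------
r11 | eewest | 2003-05-01 09:00:00 -0500 (Thu, 01 May 2003)
------------------------------------------------------------------------
r10 | jbfaden | 2003-04-30 16:45:30 -0500 (Wed, 30 Apr 2003)
------------------------------------------------------------------------'

# Prints each login once: eewest, then jbfaden
printf '%s\n' "$sample_log" | svn_authors
```

When several branches are scanned, appending each result to one file and passing that file through `sort -u` once more gives a single list of logins.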
Note that if a branch is missing in the HEAD version of your SVN repository, the path in a particular version of the repository may be specified using the @ symbol after the path. For example:
file://$(pwd)/das2_svn_repo/dasCore/netbeans_trunk@9324
All git commits require an email address. Each line of the authors.txt file used by the conversion rules below looks similar to:
eewest Edward West <[email protected]>
jbfaden Jeremy Faden <[email protected]>
cwp Chris Piker <[email protected]>
and so on.
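Since a malformed line is easy to miss in a long hand-edited file, a quick format check may help before starting the conversion. This is a minimal sketch: the function name `check_authors`, the sample logins, and the placeholder addresses are all hypothetical, and the pattern is a loose approximation of the `login Full Name <address>` shape shown above, not the parser that svn-all-fast-export itself uses.

```shell
# Print lines of an identity map that do NOT look like
#   login Full Name <address>
# The pattern is a hand-written approximation, not the tool's own parser.
check_authors() {
    grep -n -v -E '^[^ ]+ .+ <[^<> ]+@[^<> ]+>$' "$1"
}

# Hypothetical map with placeholder addresses; the last line lacks an email
tmp_map=$(mktemp)
cat > "$tmp_map" <<'EOF'
alice Alice Example <alice@example.org>
bob Bob Example <bob@example.org>
carol Carol Example
EOF

check_authors "$tmp_map"    # reports line 3
```

No output means every line matched the expected shape.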
Follow the source paths as they move around the repository back to the original location at revision 8. This involves many commands like:
svn log -v --stop-on-copy \
file://$(pwd)/das2_svn_repo/core/stable/dasCore/src/main/java/org/das2/graph/DasAxis.java
It's probably easiest to pick an old representative file that was committed very early in the project, such as DasAxis.java or QDataSet.java. Each time the log output stops, amend the path, set it to the next earlier revision number, and start again:
svn log -v --stop-on-copy \
file://$(pwd)/das2_svn_repo/dasCore/netbeans_trunk/src/org/das2/graph/DasAxis.java@9324
Keep playing out the logs, jumping to earlier and earlier locations going back to the original commit. The resulting rules files manually created via this process are provided at the end of this page.
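Each hop can be read off the `(from PATH:REV)` annotation that verbose log output attaches to a copied path. The sketch below pulls those annotations out and rewrites them as `PATH@REV`, ready to append to the repository URL for the next `svn log` call. The function name `copy_sources` and the log excerpt are hypothetical; the excerpt only mimics the shape of real output.

```shell
# Turn the "(from PATH:REV)" annotations in `svn log -v` output into
# PATH@REV peg-revision suffixes for the next `svn log` call.
copy_sources() {
    sed -n 's/^ *[AR] .* (from \(.*\):\([0-9][0-9]*\))$/\1@\2/p'
}

# Hypothetical excerpt of `svn log -v --stop-on-copy` output
sample='r6505 | someuser | 2009-01-01 12:00:00 -0600 (Thu, 01 Jan 2009) | 1 line
Changed paths:
   A /dasCore/netbeans_trunk (from /dasCore/trunk:6504)
   M /dasCore/netbeans_trunk/build.xml
'

printf '%s' "$sample" | copy_sources
```

Appending the printed value to `file://$(pwd)/das2_svn_repo` gives the URL for the next, earlier stretch of history.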
Run the rules to create a bare git repository, i.e. one with no working files:
svn-all-fast-export --identity-map=authors.txt --rules=dasCore.rules \
--stats --add-metadata --dry-run das2_svn_repo
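Based on the rules file below, this creates a bare das2java git repository. As written, the --dry-run flag makes the tool report what it would do without writing anything; omit it for the actual conversion (here and in the Autoplot run below). To convert the result to a standard git repository, and to repack the objects: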
cd das2java
mkdir .git
mv * .git
git config --local --bool core.bare false
git reset --hard
git repack -a -d -f --window=250 --depth=250
Repacking git repositories created by svn-all-fast-export is essential for better performance down the road. The size of the repository can shrink by half or more.
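To see how much a repack actually saves, the object statistics can be compared before and after. The sketch below uses a throwaway repository so it runs anywhere git is installed. The function name `repo_size` and the demo repository are hypothetical; on a real conversion the function would be pointed at the das2java directory.

```shell
# Report loose-object count and pack size (KiB) for the repository in $1.
repo_size() {
    git -C "$1" count-objects -v | grep -E '^(count|size-pack):'
}

# Throwaway repository, only to show the before/after comparison
demo=$(mktemp -d)
git init -q "$demo"
printf 'hello\n' > "$demo/file.txt"
git -C "$demo" add file.txt
git -C "$demo" -c user.name=demo -c user.email=demo@example.org \
    commit -q -m 'first commit'

repo_size "$demo"     # loose objects only, nothing packed yet
git -C "$demo" repack -q -a -d -f --window=250 --depth=250
repo_size "$demo"     # loose objects gone, everything lives in one pack
```

On a freshly converted repository the interesting number is `size-pack` before and after the repack, since the converter has already written packed data.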
Next run the rules file on the Autoplot SVN repository. Again, this is the actual repository as rsync'ed from SourceForge, not a working copy. The rules file referenced in this command is provided at the bottom of this page.
svn-all-fast-export --identity-map=authors.txt --rules=QDataSet.rules \
--stats --add-metadata --dry-run ap_svn_repo
Now turn it into a standard git repository and repack the deltas:
cd qdataset
mkdir .git
mv * .git
git config --local --bool core.bare false
git reset --hard
git repack -a -d -f --window=250 --depth=250
Git commits are identified by globally unique hash values. This means that two different git repositories may be merged while maintaining history, but there is a catch: the paths within each repository must be different. We are able to merge the das2java and qdataset repositories because the rules files have been carefully crafted to avoid using the same top-level paths at any revision.
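Before merging, the no-shared-paths condition can be checked mechanically. The following is a minimal sketch, not part of the original recipe: it lists every top-level name that ever appeared in each repository's history and prints the ones they have in common. The function names `top_level_paths` and `shared_paths` are hypothetical.

```shell
# List every top-level path name that ever appeared on any branch of
# the repository in $1.
top_level_paths() {
    git -C "$1" log --all --name-only --pretty=format: |
        grep -v '^$' | cut -d/ -f1 | sort -u
}

# Print the names two repositories have in common; no output means
# the two histories never used the same top-level path.
shared_paths() {
    a=$(mktemp) && b=$(mktemp)
    top_level_paths "$1" > "$a"
    top_level_paths "$2" > "$b"
    comm -12 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage, assuming both converted repositories sit in the current directory:
#   shared_paths das2java qdataset
```

An empty result is the expected outcome for rules files written with distinct `prefix` values, as the ones on this page are.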
To combine git repositories, start with the oldest one (by date) and add the newer one:
cd das2java
git checkout main
git remote add -f qdataset /path/to/new/repo/qdataset
git merge --allow-unrelated-histories qdataset/main # other repo/branch
Now test that SVN numbers are findable in both:
git log --follow QDataSet/src/org/das2/qds/QDataSet.java
git log --follow dasCore/src/org/das2/graph/DasAxis.java
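A single SVN revision can also be looked up directly by searching the commit messages. This sketch assumes that `--add-metadata` appended a line of the form `svn path=...; revision=N` to each commit message; it is worth confirming the exact wording with `git log -1` on your own conversion first. The function name `find_svn_rev` is hypothetical.

```shell
# Find the git commit(s) converted from SVN revision $2 in repository $1.
# Assumes commit messages contain a line ending in
#   revision=<number>
# as written by the converter's metadata option.
find_svn_rev() {
    git -C "$1" log --all --oneline --grep="revision=$2\$"
}

# Usage, assuming the merged repository is in ./das2java:
#   find_svn_rev das2java 9324
```

The trailing `$` in the pattern keeps a search for one revision from also matching longer numbers that start with the same digits.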
Remove the remote tracking branch that leads to the local qdataset repository:
git branch -r -d qdataset/main
Create an empty repository on github.com. Don't add LICENSE or README.md! Since the rules files only defined two branches, the following commands push everything to github.
git remote add origin [email protected]:das-developers/das2java.git
git branch -M main
git push -u origin main
git checkout original
git branch -M original
git push -u origin original
The following files provided the input and output rules for svn-all-fast-export.
# Rules to convert the dasCore SVN repository to git, using the
# svn-all-fast-export tool from the KDE project.
#
#
# dasCore dependencies tree
#
# dasCore (main) dasCore (original)
# | |
# |- QDataSet |- (none)
# | |
# | |- dasCoreDatum
# | |- dasCoreUtil
# |
# |- dasCoreDatum
# |- dasCoreUtil
#
#
# The overall dasCore project went through three major changes
#
# 1. Renaming the source packages: edu.uiowa.physics.pw.das -> org.das2
#
# 2. Branching to add a dependency on the QDataSet project
#
# 3. Breaking off the util and datum components.
#
# The sources that are today (r12058) part of dasCore have moved around
# the larger SVN repository roughly as follows:
#
# Revision Range Repo Path Branch
# --------------- -------------------------------------------- ------
# r1 - r4347 dasCore/trunk main
# (big reorg to src/org/das2 ~r4187)
#
# r4348 - r6504 dasCore/trunk main
# r6505 - r9324 dasCore/netbeans_trunk main
#
# (Maybe add maven main here?)
#
# r9325 - r12056 core/stable/dasCore main
#
# r4348 - r4387 dasCore/branches/community/autoplot/trunk autoplot
# r4388 - r5032 dasCore/community/autoplot/trunk autoplot
# r5033 - r5215 dasCore/community/autoplot2010/trunk/dasCore autoplot
# r5216 - r12058 dasCore/community/autoplot2011/trunk/dasCore autoplot
#
#
# Since the autoplot branch has the most support, it will be displayed as the
# default branch at github.com
#
# The rule-set below thus tracks different paths for different revision
# ranges.
create repository das2java
end repository
# The main branch (which maybe should be called the classic branch?)
match /dasCore/trunk/
min revision 1
max revision 4348
branch main
prefix dasCore/
repository das2java
end match
# The autoplot branch (which receives more testing than the main branch)
match /dasCore/branches/community/autoplot/trunk/
min revision 4348
max revision 4387
branch main
prefix dasCore/
repository das2java
end match
match /dasCore/community/autoplot/trunk/
min revision 4388
max revision 5032
branch main
prefix dasCore/
repository das2java
end match
# Starting with the 2010 branches we no longer need a sub directory
# for dasCore since it's separated out in the repository
match /dasCore/community/autoplot2010/trunk/
min revision 5033
max revision 5215
branch main
repository das2java
end match
match /dasCore/community/autoplot2011/trunk/
min revision 5216
branch main
repository das2java
end match
# And the continuation of the original branch
match /dasCore/trunk/
min revision 4348
max revision 6504
branch original
prefix dasCore/
repository das2java
end match
match /dasCore/netbeans_trunk/
min revision 6505
max revision 9324
branch original
prefix dasCore/
repository das2java
end match
match /core/stable/dasCore/
min revision 9325
branch original
prefix dasCore/
repository das2java
end match
# Ignore all the rest of the repo
match /
end match
# Rules to extract the qdataset project from the autoplot repository and convert
# it to a git repository using the svn-all-fast-export tool from the KDE project
create repository qdataset
end repository
# Only one branch retained in this ruleset, since the focus is on
# retaining SVN revision numbers for later tagging
match /autoplot/trunk/QDataSet/
repository qdataset
branch main
prefix QDataSet/
min revision 5779
end match
match /autoplot/trunk/QStream/
repository qdataset
branch main
prefix QStream/
min revision 5779
end match
match /autoplot/branches/autoplot2010/QDataSet/
repository qdataset
branch main
prefix QDataSet/
max revision 5778
min revision 5287
end match
match /autoplot/branches/autoplot2010/QStream/
repository qdataset
branch main
prefix QStream/
max revision 5778
min revision 5287
end match
match /autoplot/branches/agu2009/QDataSet/
repository qdataset
branch main
prefix QDataSet/
max revision 5286
min revision 3784
end match
match /autoplot/branches/agu2009/QStream/
repository qdataset
branch main
prefix QStream/
max revision 5286
min revision 3784
end match
match /autoplot/trunk/QDataSet/
repository qdataset
branch main
prefix QDataSet/
max revision 3783
min revision 1
end match
match /autoplot/trunk/QStream/
repository qdataset
branch main
prefix QStream/
max revision 3783
min revision 1
end match
# Ignore all the rest of the repo
match /
end match
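Because the revision ranges are spread over many match blocks, it can help to flatten a rules file to one line per rule when reviewing it. The sketch below is a small awk filter written for this page, not part of svn-all-fast-export. The function name `summarize_rules`, the `HEAD` placeholder for a missing `max revision`, and the two-rule sample file are all hypothetical.

```shell
# Print one line per "match" block of a rules file:
#   min-max  branch  svn-path
# Blocks without a branch (the catch-all ignore rule) are skipped.
# A missing "max revision" is shown as HEAD, a missing "min revision" as 1.
summarize_rules() {
    awk '
        $1 == "match"                    { path = $2; min = "1"; max = "HEAD"; branch = "" }
        $1 == "min" && $2 == "revision"  { min = $3 }
        $1 == "max" && $2 == "revision"  { max = $3 }
        $1 == "branch"                   { branch = $2 }
        $1 == "end" && $2 == "match"     { if (branch != "") print min "-" max, branch, path }
    ' "$1"
}

# Hypothetical two-rule file in the same syntax as the rules above
rules=$(mktemp)
cat > "$rules" <<'EOF'
match /dasCore/trunk/
  min revision 1
  max revision 4348
  branch main
end match
match /
end match
EOF

summarize_rules "$rules"
```

Piping the result through `sort -n` orders the rules by starting revision, which makes gaps or overlaps between ranges easier to spot.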