Subversion vs. Git: Debunking Myths About Debunking Myths

The site "Subversion vs. Git: Myths and Facts" claims to dispel some myths about version control systems. I doubted its “facts” and checked a few of them. The check undermined the site's credibility for me and left me skeptical of its other statements.

A little about why I got interested
I changed jobs fairly recently and joined a company that uses svn, where switching to git often comes up in discussions.

Once I witnessed a discussion of this topic. Colleagues were discussing this very site and concluded that switching would be "six of one, half a dozen of the other."
I only overheard the conversation, but I became curious about the site and its arguments, so I went to look into it.

Let's start with the first statement


Git repositories are significantly smaller than equivalent Subversion ones
False. A myth.

The particular delta compression algorithms used in both version control systems differ in many details, but in general Subversion and Git store data in the same way. This results in the fact that Subversion and Git repositories with equivalent data will have approximately the same size. Except for the case of storing a lot of binary files, when Subversion repositories could be significantly smaller than Git ones (because Subversion's xdelta delta compression algorithm works both for binary and text files).

The site then gives an example comparing repository sizes. Its conclusion: the difference is not significant.

I was bothered by the differing commit counts and by the fact that the two repositories actually come from different primary sources (who knows how they are kept in sync). I was also not satisfied with how little detail was given about how the numbers were obtained.

So, let's start our experiment!

Get the svn repository


svnrdump.exe dump https://core.svn.wordpress.org/ > svndump
svnadmin create svn
svnadmin load svn < svndump 

We now have a local copy of the entire repository in svn format: a 213 MB folder containing 79,758 files and 88 folders.

At this point the repository has 39,864 commits. A working copy of the project consists of 1,701 files and 160 folders.
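
How the folder sizes and file counts were measured is not stated here. One way to obtain comparable numbers is sketched below in Python; the function name folder_stats is hypothetical, and the result is in bytes, so it may differ slightly from what a file manager reports.

```python
import os


def folder_stats(root):
    """Return (total_bytes, file_count, folder_count) for everything under root."""
    total_bytes = 0
    file_count = 0
    folder_count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Every subdirectory is counted once, when its parent is visited.
        folder_count += len(dirnames)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            total_bytes += os.path.getsize(path)
            file_count += 1
    return total_bytes, file_count, folder_count
```

Calling folder_stats on the svn folder, and later on the .git folder, would give figures that can be compared directly.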

We start the migration to git


git svn clone -s --prefix "svn/" file:///%path%/svn git_from_svn 

This stage was the longest: it dragged on for more than a day (about 32 hours). The result is a git repository that is a copy of the original svn repository (not an exact copy, but the differences do not matter for our purposes).

I made a small mistake here: it would have been better to create the repository without a working copy, but I was not prepared to wait more than a day all over again.

So, the git_from_svn/.git folder is 208 MB and contains 7,841 files and 509 folders. At this stage we can indeed talk only about a slight advantage for git; apparently this is exactly where the site's authors stopped. But git has two storage formats: "loose" objects and "packfiles".

Let's see what we have:

git count-objects -v -H

Result:

count: 6826
size: 30.21 MiB
in-pack: 249852
packs: 25
size-pack: 145.51 MiB
prune-packable: 73
garbage: 0
size-garbage: 0 bytes

That is, we have many packs (25) and 6,826 loose objects (30.21 MiB) that have not been packed yet.
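
To make these numbers easier to reason about, here is a minimal Python sketch that parses the output shown above. The helper name parse_count_objects is hypothetical; the sample text is copied from the result above rather than read from a live repository.

```python
def parse_count_objects(text):
    """Parse `git count-objects -v -H` output into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        stats[key.strip()] = value.strip()
    return stats


SAMPLE = """\
count: 6826
size: 30.21 MiB
in-pack: 249852
packs: 25
size-pack: 145.51 MiB
prune-packable: 73
garbage: 0
size-garbage: 0 bytes
"""

stats = parse_count_objects(SAMPLE)
loose = int(stats["count"])
packed = int(stats["in-pack"])
# prune-packable objects exist both loose and in a pack, so subtract them once.
unique_objects = loose + packed - int(stats["prune-packable"])
loose_share = loose / (loose + packed)

print(f"{loose} loose objects, {packed} packed in {stats['packs']} packs")
print(f"loose share of stored objects: {loose_share:.1%}")
print(f"unique objects: {unique_objects}")
```

The unique object count computed this way is 256,605, which matches the in-pack figure reported after git gc.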

We optimize storage


git gc
git count-objects -v -H

Result:

count: 0
size: 0 bytes
in-pack: 256605
packs: 1
size-pack: 104.54 MiB
prune-packable: 0
garbage: 0
size-garbage: 0 bytes

The size of the ".git" folder itself is now 136 MB (945 files, 255 folders). It seems to me that this advantage can hardly be called insignificant.

We also clean it


But that is not all. Getting rid of the svn legacy by pushing everything into a bare repository gives an even more interesting picture: 106 MB, 19 files, 8 folders.

Putting it all together


svn - 213 MB (79,758 files, 88 folders)
svn (after pack) - 214 MB (4,644 files, 89 folders) (added after publication)
git svn - 208 MB (7,841 files, 509 folders)
git svn (pack) - 136 MB (945 files, 255 folders)
git bare (pack) - 106 MB (19 files, 8 folders)

It seems to me that at this point the site's debunking of the myth can itself be considered debunked: its statement is disproved.
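
To put the summary into relative terms, here is a small Python sketch that computes how much smaller each variant is than the original 213 MB svn repository. The figures are the ones listed above; the function name saving_vs_svn is hypothetical.

```python
SVN_MB = 213  # original svn repository, as measured above

sizes_mb = {
    "svn (after pack)": 214,
    "git svn": 208,
    "git svn (pack)": 136,
    "git bare (pack)": 106,
}


def saving_vs_svn(size_mb, baseline_mb=SVN_MB):
    """Fraction by which size_mb is smaller than the svn baseline."""
    return (baseline_mb - size_mb) / baseline_mb


for name, size in sizes_mb.items():
    print(f"{name:18} {size:4d} MB  {saving_vs_svn(size):+.1%} vs svn")
```

By this measure the packed git repository is roughly a third smaller than svn, and the bare one about half the size.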

It is also worth mentioning that loose objects will keep appearing: git writes every new object in this format first, because that is cheap, and packs them later. Typically this format holds files from branches in which work is currently under way. How many of them may accumulate before git repacks automatically can be adjusted in the configuration (the gc.auto setting).

Moving on


Branches are expensive in Subversion
False. A myth.

Branches in Subversion are implemented with Copy-On-Write strategy (referred to as 'Cheap Copies' in the svnbook). No matter how large a repository or project is, it takes a constant amount of time and space to make a branch. In fact, Subversion branches are extremely cheap beginning with version 1.0 and you can branch even for small bugfixes in a very busy and large project.

The site follows this with a "confirming" experiment.

Its conclusion: creating a branch is faster than you can blink. I agree that if the whole operation takes less than 0.01 seconds, there is nothing to compare. But for some reason, when addressing the claim that branches are expensive in svn, the site checked only branch creation. There are other operations, such as cloning (or svn checkout). In the experiments that follow, everything happens locally, so any network effect is ruled out.
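
The timings in the experiments are reported as TotalSeconds; the tool used to take them is not stated. A minimal Python sketch of one way to measure a command's wall-clock time follows. The helper name time_command is hypothetical, and a harmless stand-in command is timed so the sketch runs without svn or git installed.

```python
import subprocess
import sys
import time


def time_command(args, cwd=None):
    """Run a command once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(
        args,
        cwd=cwd,
        check=True,  # raise if the command fails, so a failure is not timed as a success
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


# Stand-in command; replace with the svn or git invocation to be measured.
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"TotalSeconds: {elapsed:.7f}")
```

A single run like this is sensitive to disk caches, so repeating the measurement a few times would give a fairer comparison.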

The first experiment is cloning


svn checkout %local_path%/trunk

TotalSeconds: 14.0737539

git clone %local_path%

TotalSeconds: 21.8173709

Git lost. But keep in mind that git fetched the entire history, while svn fetched a single revision.

The second experiment is switching branches


svn switch /branches/4.7

TotalSeconds: 4.3741352

git checkout -B "4.7" "origin/4.7"

TotalSeconds: 1.2700857

Here the picture is reversed: with a single switch, git wins back about 40% of the time it lost on cloning.
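
The trade-off between the two experiments can be checked with a little arithmetic on the timings reported above. This Python sketch uses only those four numbers; the variable names are mine.

```python
import math

# Timings in seconds, as reported in the two experiments
svn_checkout, git_clone = 14.0737539, 21.8173709
svn_switch, git_switch = 4.3741352, 1.2700857

clone_loss = git_clone - svn_checkout   # extra time git spends on the initial clone
switch_gain = svn_switch - git_switch   # time git saves on each branch switch

recovered = switch_gain / clone_loss
break_even = math.ceil(clone_loss / switch_gain)

print(f"git loses {clone_loss:.2f} s on clone, gains {switch_gain:.2f} s per switch")
print(f"one switch recovers {recovered:.0%} of the loss")
print(f"break-even after {break_even} switches")
```

On these numbers, the third branch switch already puts git ahead overall.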

I will probably stop here and go discuss these results with my colleagues.

PS: The message of this note is the banal "do not trust unverified sources on the Internet", yet it turns out there are still people who have not developed a healthy level of skepticism.

Thanks for your attention!

Update:
In the comments, VBKesha suggested that it would be worth running svnadmin pack.
svnadmin.exe pack svn
214 MB (4,644 files, 89 folders)
It helped only with the number of files; the size increased slightly.
