Version Control with Subversion - Chapter 5. Repository Administration

Repository Data Stores

As of Subversion 1.1, there are two options for storing data in a Subversion repository. One type of repository stores everything in a Berkeley DB database; the other kind stores data in ordinary flat files, using a custom format. Because Subversion developers often refer to a repository as “the (versioned) filesystem”, they have adopted the habit of referring to the latter type of repository as FSFS ^[14], a versioned filesystem implementation that uses the native OS filesystem to store data.
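
An administrator sometimes needs to find out which of the two stores an existing repository uses. The following is a minimal sketch, assuming the repository records its back-end in a small text file at db/fs-type containing either "fsfs" or "bdb"; that file location is an assumption not stated in this chapter, and the function name detect_backend is hypothetical.

```python
import tempfile
from pathlib import Path


def detect_backend(repo_path):
    """Return the back-end name recorded in the repository, or None.

    Assumption: the back-end name ("fsfs" or "bdb") is stored in the
    text file <repo>/db/fs-type. If that file is missing, this sketch
    makes no guess and returns None.
    """
    marker = Path(repo_path) / "db" / "fs-type"
    if not marker.is_file():
        return None
    return marker.read_text().strip()


if __name__ == "__main__":
    # Demonstrate on a throwaway directory that mimics the assumed layout.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "db").mkdir()
        (Path(tmp) / "db" / "fs-type").write_text("fsfs\n")
        print(detect_backend(tmp))
```

Returning None rather than a default keeps the sketch honest about repositories whose layout does not match the assumption.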

When a repository is created, an administrator must decide whether it will use Berkeley DB or FSFS. There are advantages and disadvantages to each, which we'll describe in a bit. Neither back-end is more “official” than the other, and programs that access the repository are insulated from this implementation detail. Programs have no idea how a repository is storing data; they only see revision and transaction trees through the repository API.
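
The choice of back-end is made once, at creation time. As a rough sketch of how that decision might be scripted, the helper below builds an svnadmin create command with the --fs-type option and only runs it when explicitly asked. The helper names create_command and create_repository and the example path are hypothetical, and the sketch assumes svnadmin is on the PATH.

```python
import subprocess

# The two back-ends described in this section.
VALID_BACKENDS = ("fsfs", "bdb")


def create_command(repo_path, backend):
    """Build the argument list for creating a repository with a given back-end."""
    if backend not in VALID_BACKENDS:
        raise ValueError(f"unknown back-end: {backend!r}")
    return ["svnadmin", "create", "--fs-type", backend, repo_path]


def create_repository(repo_path, backend):
    """Run svnadmin; raises CalledProcessError if the command fails."""
    subprocess.run(create_command(repo_path, backend), check=True)


if __name__ == "__main__":
    # Only print the command here; call create_repository() to execute it.
    print(" ".join(create_command("/path/to/repos", "fsfs")))
```

Separating command construction from execution makes the back-end choice easy to review or log before anything is written to disk.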

Table 5.1, “Repository Data Store Comparison”, gives a comparative overview of Berkeley DB and FSFS repositories. The next sections go into detail.

Table 5.1. Repository Data Store Comparison

Feature	Berkeley DB	FSFS
Sensitivity to interruptions	very; crashes and permission problems can leave the database “wedged”, requiring journaled recovery procedures.	quite insensitive.
Usable from a read-only mount	no	yes
Platform-independent storage	no	yes
Usable over network filesystems	no	yes
Repository size	slightly larger	slightly smaller
Scalability: number of revision trees	database; no problems	some older native filesystems don't scale well with thousands of entries in a single directory.
Scalability: directories with many files	slower	faster
Speed: checking out latest code	faster	slower
Speed: large commits	slower, but work is spread throughout commit	faster, but finalization delay may cause client timeouts
Group permissions handling	sensitive to user umask problems; best if accessed by only one user.	works around umask problems
Code maturity	in use since 2001	in use since 2004
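
The yes/no rows of the table can be read as hard constraints on the choice of back-end. The sketch below encodes just those three rows and filters the back-ends by the features a deployment needs. The dictionary keys and the function name backends_supporting are illustrative names, not part of Subversion, and the remaining rows (speed, size, maturity) are trade-offs that this sketch deliberately leaves out.

```python
# Yes/no rows from Table 5.1, keyed by hypothetical feature names.
TABLE_5_1 = {
    "read_only_mount": {"bdb": False, "fsfs": True},
    "platform_independent_storage": {"bdb": False, "fsfs": True},
    "network_filesystem": {"bdb": False, "fsfs": True},
}

BACKENDS = ("bdb", "fsfs")


def backends_supporting(*features):
    """Return the back-ends that satisfy every requested feature."""
    return [
        backend
        for backend in BACKENDS
        if all(TABLE_5_1[feature][backend] for feature in features)
    ]


if __name__ == "__main__":
    # A repository that must live on a network filesystem.
    print(backends_supporting("network_filesystem"))
    # No hard constraints: both back-ends remain candidates.
    print(backends_supporting())
```

When no hard constraint applies, the decision falls back to the qualitative rows of the table, such as checkout speed versus commit behavior.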

Published under the terms of the Creative Commons License
