Distributed revision control was a breakthrough

I strongly believe in revision control. Back in the days when there wasn’t any other choice than CVS, I spent countless hours setting up CVS servers and refreshing my memory of its obscure commands and arguments, which was almost all of them. If you drop my copy of Professional Linux Programming it’ll open on the CVS chapter by itself (much like my copy of The C Programming Language will open itself on the pointers chapter, somewhere in the pointers to pointers section).

Then came Subversion and it was somewhat better. After that we got the explosion of revision control systems with the introduction of distributed ones. We’ve got Darcs, Git, Mercurial, Arch, Bazaar, etc. My choice right now is Git. It used to be Darcs, but unfortunately it stagnated for a long time and I moved on to the next new thing. I’ve used Mercurial as well. From my perspective Mercurial and Git are almost equivalent.

For me, distributed revision control was a breakthrough. There’s one small feature that makes a huge difference for me. In the old days I used to set up CVS or Subversion servers/repositories (in those, a repository is essentially a server-side thing). I had to decide where that repository was going to reside, how it was going to be named, how it was going to be accessed (ssh, http, custom protocol?), by whom, etc.

Today with Git I just do this:

git init

That’s it. I’m done. The directory where I am is now a repository. I can start committing, branching, rolling back, looking at the history, generating patches, stashing, etc. It’s that simple. The fact that it’s so simple goes from a quantitative advantage to a qualitative advantage. What I mean is that I not only revision-control things faster but that I revision-control things I wouldn’t have otherwise. For example the configuration directories on my servers.
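
As a rough illustration of how little stands between `git init` and a usable history, here is a minimal sketch run in a throwaway directory. The file name, commit messages and user identity are placeholders invented for the example, not anything from a real setup.

```shell
#!/bin/sh
set -e

# Throwaway directory, so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

git init -q

# Identity for this repository only; name and address are placeholders.
git config user.name "Example User"
git config user.email "user@example.com"

echo "first version" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "second version" > notes.txt
git commit -q -am "Update notes"

# Look at the history.
git --no-pager log --oneline

# Turn the latest commit into a patch that could be mailed to someone.
git format-patch -1 --stdout
```

No server, account or naming decision is involved at any point; the whole repository lives in the `.git` directory inside the working directory.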

I know many people will complain: “But with no centralized repository chaos will ensue”. The fact that it’s distributed doesn’t mean there can’t be a central repository. When I want to collaborate with someone using Git I create one repo somewhere, on my servers or on GitHub, and we both push to and pull from there. The difference is that I commit a lot locally, and when the changes are ready I push them. That means that I can commit broken code without worrying.
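
The "one repo somewhere" arrangement can be sketched entirely on one machine, with a bare repository standing in for the server or GitHub. The directory names `central.git`, `mine` and `theirs` are made up for the example, and so is the identity; in real use the clone URL would point at a remote host.

```shell
#!/bin/sh
set -e

base=$(mktemp -d)
cd "$base"

# The shared repository: bare, i.e. history only, no working files.
git init -q --bare central.git

# My clone: commit locally as often as I like, push when ready.
git clone -q central.git mine
cd mine
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"
echo "work in progress" > feature.txt
git add feature.txt
git commit -q -m "WIP: not ready yet"
echo "finished" > feature.txt
git commit -q -am "Finish feature"
git push -q origin HEAD
cd ..

# A collaborator's clone receives both commits in one go.
git clone -q central.git theirs
git -C theirs --no-pager log --oneline
```

Nobody else sees the work-in-progress commit until the push, which is the point: the central repository only ever receives what is ready.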

At work we don’t use a distributed revision control system; we use a centralized one and we have a very strict peer review policy. My current tasks involve touching code in many different files in many different systems, never getting familiar with any of them. That means that it’s common for my peer reviews to say things like “all these methods don’t belong here, these two should go there, that one should be broken into three different ones going here, there and somewhere else”.

Now I have a problem. I can’t commit because my peers don’t consider my code ready. My code works but it has to be refactored in a very destructive way. What happens if it stops working during the refactoring, for example because I lose a piece of code while copying and pasting? I can’t roll back to my working state and start over. If we were using a distributed revision control system I could.

So, being able to commit unfinished code locally while collaborating with other people is another crucial feature of DVCS for me.
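
The rescue described above might look like the following sketch. The file `job.txt` and its contents are invented stand-ins for the real code, and `git reset --hard` is only one of several ways to get back to the last commit; it discards every uncommitted change, so it suits exactly this "start over" case.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

# The code works, but the reviewers want it restructured.
printf 'step_one\nstep_two\n' > job.txt
git add job.txt
git commit -q -m "Working state before the refactoring"

# The refactoring goes wrong: a line is lost while copying and pasting.
printf 'step_one\n' > job.txt
git --no-pager diff

# Throw away the uncommitted mess and return to the last commit.
git reset -q --hard HEAD
cat job.txt
```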

The third one is being able to branch locally, in a similar vein to the last example. When I find myself thinking about a very destructive refactoring that I’m not sure is going to get me anywhere and, worse than that, is going to take me three days to do, I just create a local branch. I experiment in that branch. If at any time I get tired or I need to do anything else I go back to the main one. That is what branching is about.

Why is branching locally better than branching globally, in a centralized repository? Well, one reason is that a local branch doesn’t have to make sense to anyone else. I don’t have to pick a name that’s descriptive for anyone other than me. I don’t have to justify creating a branch to anyone else. Let’s suppose I had an argument with a co-worker where I believe something is doable and (s)he believes it is not. Do I want him/her to see that I created a branch to prove him/her wrong? I don’t. And if I prove myself wrong I want to quietly delete that branch and never ever talk about it.
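
A private experiment of that kind could go roughly like this. The branch name `prove-it` and the file `code.txt` are hypothetical; the name of the main branch is read from the repository rather than assumed, since it varies between setups.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "stable" > code.txt
git add code.txt
git commit -q -m "Working state"

# Remember whatever the main branch is called here (master, main, ...).
main_branch=$(git symbolic-ref --short HEAD)

# Experiment on a branch whose name only has to make sense to me.
git checkout -q -b prove-it
echo "risky rewrite" > code.txt
git commit -q -am "Try the destructive refactoring"

# Back to the main line; the experiment is out of the way.
git checkout -q "$main_branch"
cat code.txt

# It went nowhere: delete it. -D forces deletion of an unmerged branch.
git branch -q -D prove-it
git branch --list
```

Since the branch was never pushed, deleting it leaves no trace in any repository other people can see.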

But I am starting to go into the very hypothetical realm. In short, for me, DVCS is about this:

git init

and get to code.

Revisioning /etc with Git

First and foremost I’m a coder, a coder who strongly believes in revision control. Second, I am also a sysadmin, but only by accident. I have some servers and someone has to take care of them. I’m not a good sysadmin because I don’t have any interest in it, so I learn as little as possible, and also because I don’t want to be good at it and get stuck doing that all the time. What I love is to code.

I’m always scared of touching a file in /etc. What if I break something? I decided to treat /etc the same way I treat my software (where I am generally not afraid of touching anything). I decided to use revision control. So far I’ve never had to roll back a change in /etc, but it gives me great peace of mind knowing that I can.

In the days of CVS and Subversion I thought about it, but I never did it. It was just too cumbersome. But DVCS changed that; it made it trivial. That’s why I believe DVCS is a breakthrough. Essentially, all you need to do to revision-control your /etc with Git is go to /etc and turn it into a repository:

cd /etc
git init

Done. It’s now a repo. Then, commit everything in there:

git add .
git commit -am "Initial commit. Before today it's prehistory, after today it's fun!"

From now on, every time you modify a file you can do:

git add <filename>
git commit -m "Description of the change to <filename>"

where <filename> is one or many files. Or if you are lazy you can also do:

git commit -am "Description to all the changes."

which will commit all pending changes to files Git already tracks. To see which files have pending changes, do:

git status

and to see the changes themselves:

git diff

When there are new files, you add them all with

git add .

and if some files were removed, you have to run

git add -u .

to be sure that they are removed in the repo as well (otherwise they’ll stay as pending changes forever). Since Git 2.0, git add . stages removals too, so the extra step only matters on older versions.
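
For what it's worth, `git add -A` stages new, modified and removed files in a single step, which covers both cases above. A minimal sketch in a scratch directory standing in for /etc, with made-up file names and a placeholder identity:

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "old setting" > old.conf
git add old.conf
git commit -q -m "Initial commit"

# One file disappears and another one shows up.
rm old.conf
echo "something else entirely" > new.conf

# -A stages additions, modifications and removals together.
git add -A .
git status --short
git commit -q -m "Replace old.conf with new.conf"
```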

Those are essentially all the commands I use when doing sysadmin work with Git. If you ever have to do a rollback, merge, branch, etc., you’ll need to learn Git for real.

One last thing. I often forget to commit my changes in /etc, so I created this script:

#!/usr/bin/env sh

cd /etc
git status | grep -v "On branch master" | grep -v "nothing to commit"
true # Always exit 0: the last grep exits 1 when it filters out every line, i.e. when there's nothing to commit.

in /etc/cron.daily/git-status, which reminds me, daily, of my pending changes.
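
A variation on the same idea, offered as a sketch rather than a drop-in replacement: `git status --porcelain` prints one short line per changed or untracked file and nothing at all when the tree is clean, so the reminder no longer depends on the branch name or on the exact wording of Git's messages. The function name `report_pending` is made up for the example.

```shell
#!/bin/sh
# Print pending changes for a repository; print nothing when it is clean.
report_pending() {
    repo=$1
    # --porcelain output is meant for scripts: one line per changed or
    # untracked file, and no output for a clean tree, so no grep is needed.
    changes=$(git -C "$repo" status --porcelain) || return 0
    if [ -n "$changes" ]; then
        echo "Uncommitted changes in $repo:"
        echo "$changes"
    fi
    return 0
}

# Defaults to /etc; another directory can be given as the first argument.
report_pending "${1:-/etc}"
```

Because cron usually mails only when a job prints something, staying silent on a clean tree keeps the daily reminder limited to the days when there is something to commit.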

Hope it helps!