Tuesday, March 8, 2011

Playing with Git

So I finally got fed up with Subversion for managing my personal research projects, documents, etc., and have been playing around with git. So far, I really enjoy it for a few reasons:
  • I commit more
  • I branch more
  • I divide projects logically


Committing changes really is easier, so you do it more. In Subversion, every commit pushes changes out to a remote repository and may force you to update first, which was just enough overhead that I ended up not committing as often as I should. With git, everything's local, so it's faster and easier. That said, it means that I'm really only committing for myself. To get my changes to other people, or to bring in theirs, I have to push (to a bare repository!) or pull. However, for files that are mainly personal, I tend to work for a long stretch on a single machine and only occasionally shift to a different machine, either to run large jobs or to work away from my desk. So for me, a major point of (much) version control is just versioning stuff for personal consumption, not coordinating with others.
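As a rough sketch of the commit-locally, push-occasionally workflow described above: the temp directory and the names hub.git, desk and laptop are made up for illustration, and the identity set with git config is a placeholder.

```shell
# Sketch only: all directory names and the commit identity are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# A bare repository holds history only (no working tree), so it is safe to push to.
git init --bare hub.git

# The machine I normally work on: commits here stay local.
git init desk
cd desk
git config user.name "Example User"
git config user.email "example@example.com"
echo "notes" > notes.txt
git add notes.txt
git commit -m "Local commit, nothing has left this machine yet"

# Publishing is a separate, explicit step: push the current branch to the bare repo.
git push ../hub.git HEAD
cd "$work"

# On the other machine (simulated here by a second directory), clone from the bare repo.
git clone hub.git laptop
```

The point is that the commit and the push are two separate steps, so the push only has to happen when I actually change machines.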

Branching/Merging is easier, so you do it more. In git, there's no real distinction between pulling from a remote repository (e.g. for collaboration) and pulling in from another branch.

Using examples from the excellent gittutorial, pulling from a remote repository, à la svn update, is:
git pull /home/bob/myrepo
Or for the cautious who want to put the update from Bob's master into a new bob-incoming branch:
git fetch /home/bob/myrepo master:bob-incoming

In contrast, merging the experimental branch into the current one:
git pull . experimental
This means that branching and merging use utilities and skills that you already need to know if you're using git at all! Since you're already familiar with the tools, branching and merging in the traditional sense become much less intimidating.
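To make that concrete, here is a small throwaway sketch of a full branch-and-merge round trip. The repository location, file names and commit identity are invented for the example; only the experimental branch name comes from the tutorial.

```shell
# Sketch only: the temp repo, file names and identity are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init
git config user.name "Example User"
git config user.email "example@example.com"

echo "base" > file.txt
git add file.txt
git commit -m "Initial commit"

# Remember the starting branch (master or main, depending on git's configuration).
main_branch=$(git symbolic-ref --short HEAD)

# Branch, then commit on the branch.
git checkout -b experimental
echo "idea" > idea.txt
git add idea.txt
git commit -m "Try an idea"

# Go back and merge with the same pull command used for remote repositories.
# "git merge experimental" is the more common spelling of this step.
git checkout "$main_branch"
git pull . experimental
```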


My repositories now make sense. Because I'm lazy, I'll tend to set things up in the way that seems easiest, not in the way that makes sense. Since svn makes it easy to check out just part of a repository, I had basically everything I've done in the past 5 years in one repository. This meant coursework, papers (that I wrote), papers (that I read), code, scripts, etc. all went to one svn repository. This was not a good decision, and was ultimately part of why I left Subversion. I had munged up something in part of the repository and didn't really know how to fix it other than by exporting everything to a new repository.

In contrast, git does not allow you to check out part of a repository, so you are required to split it up into the subcomponents that make sense. But if that were the end of the story, I'd be complaining about git instead of lauding it, because it'd be a pain to have to manually check out each repository for large projects. However, as pointed out in the git community book, submodules make this unnecessary.

Given some modules a, b, c and d, each in its own repository:

$ mkdir ~/git
$ cd ~/git
$ for i in a b c d
do
    mkdir "$i"
    cd "$i"
    git init
    echo "module $i" > "$i.txt"
    git add "$i.txt"
    git commit -m "Initial commit, submodule $i"
    cd ..
done


You just add them as submodules to the logical container:

$ mkdir super
$ cd super
$ git init
$ for i in a b c d
do
    git submodule add ~/git/"$i" "$i"
done
Unfortunately, cloning, updating and committing submodules get a bit more complicated. Rather than duplicate information that's better described (and maintained) elsewhere, I'll just say to look at the git community book.
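For a taste of the cloning part only, here is a minimal sketch that assumes a single module a and a throwaway directory layout (both made up for this example). The protocol.file.allow setting is my addition: newer git releases refuse local-path submodule clones without it, and older ones should simply ignore it.

```shell
# Sketch only: the temp layout, the single module "a" and the identity are hypothetical.
set -e
base=$(mktemp -d)
cd "$base"

# A stand-in module repository with one commit.
git init a
cd a
git config user.name "Example User"
git config user.email "example@example.com"
echo "module a" > a.txt
git add a.txt
git commit -m "Initial commit, submodule a"
cd "$base"

# The container: add the module, then commit the addition so clones can see it.
git init super
cd super
git config user.name "Example User"
git config user.email "example@example.com"
git -c protocol.file.allow=always submodule add "$base/a" a
git commit -m "Add submodule a"
cd "$base"

# A fresh clone has an empty a/ directory until the submodule is
# registered (init) and fetched (update).
git clone super copy
cd copy
git submodule init
git -c protocol.file.allow=always submodule update
cd "$base"
```

Updating and committing inside submodules have their own wrinkles that this does not cover.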

Anyway, I'm pretty pleased with git so far, at least for personal stuff.
