Git, $Id$ and file names

October 26, 2010


It is already possible to use filters in Git. But embedding the current file name in the expanded string is somewhat harder.
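
For context, a filter pair is registered roughly as in the sketch below. The filter name "puppet", the *.conf pattern and the location of git.smudge are assumptions made for illustration; only the git.clean path appears later in this post.

```shell
#!/bin/sh
# Sketch: register a clean/smudge filter pair in a throwaway repository.
# The filter name "puppet" and the *.conf pattern are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Tell Git which commands implement the filter
# (the git.smudge path is assumed to sit next to git.clean).
git config filter.puppet.clean  /etc/puppet/bin/git.clean
git config filter.puppet.smudge /etc/puppet/bin/git.smudge

# Attach the filter to matching paths.
printf '*.conf filter=puppet\n' > .gitattributes

# Ask Git which filter applies to a given path.
git check-attr filter -- httpd-blah.conf
```

The last command should report that httpd-blah.conf has its filter attribute set to puppet. Git then runs the clean command when the file is staged and the smudge command when it is checked out.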

You can do this with some edit wrapper that inserts the file name “at the right time”, but I think it is much cleaner to do this in the git-filter script.

For Puppet I want system administrators to see which directory in the Git repository a file came from, like so:

# $puppet: .../blah/files/conf/httpd-blah.conf - 08e39ef # @1288078656 Miek Gieben$

Where ... stands for /etc/puppet/modules. Now someone knows the file came from Puppet and where to edit it in the Puppet repository.

If you want to write a git-filter that does the same, you’ll need something like the following algorithm:

  • get the last commit hash: hash=$(git show -s):

    commit a2c6f6aa30b309c47d048ddcb7284affde5ce672
    Author: Miek Gieben
    Date:   Sun Oct 24 21:58:03 2010 +0200

    more last minute tweaks;

  • get the git hash from the file you are currently editing:

    fhash=$(cat file | git hash-object --stdin)
    But if you do it like this, you have a chance of getting the hash of the file with the $Id$ string expanded, so apply the clean filter (which is just another shell script) first:
    fhash=$(git.clean < file | git hash-object --stdin)
    % cat go-setup.tex | git hash-object --stdin

  • Recurse through your entire Git tree looking for this file hash: git ls-tree -r $hash | grep $fhash, which should give the actual file name (with some added awk foo).

    % git ls-tree a2c6f6aa30b309c47d048ddcb7284affde5ce672 |
    grep ed77d6eb5de5ebedd7d4423a8b8f120323141a60
    100644 blob ed77d6eb5de5ebedd7d4423a8b8f120323141a60 go-setup.tex

And there we have the file name.
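
The lookup step can be tried in isolation with the sketch below. It builds a throwaway repository, so the directory layout, the file content and the helper name blob_path are all made up for the demonstration. Like the awk step described above, it assumes file names without spaces.

```shell
#!/bin/sh
# Sketch: find the path of a blob in a commit by its object hash.
# blob_path is a hypothetical helper name, not a Git command.
blob_path() {
    # $1 = commit to search, $2 = blob hash to look for
    # ls-tree prints: <mode> <type> <hash> <path>
    git ls-tree -r "$1" | awk -v h="$2" '$3 == h { print $4 }'
}

# Throwaway repository with one file in a Puppet-like layout.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
mkdir -p modules/blah/files/conf
echo 'ServerName blah' > modules/blah/files/conf/httpd-blah.conf
git add .
git -c user.name=demo -c user.email=demo@example.org commit -q -m 'add config'

hash=$(git show -s --pretty=format:%h)
fhash=$(git hash-object --stdin < modules/blah/files/conf/httpd-blah.conf)

# Prints the path of the file whose content hashes to $fhash.
blob_path "$hash" "$fhash"
```

Comparing the hash column exactly, instead of grepping the whole line, avoids accidental matches on a path that happens to contain the hash string.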

Putting this all together leads to the following script (called git.smudge):

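# abbreviated hash of the last commit
hash=$(git show -s --pretty=format:%h%n)
# scratch file for the incoming content (using mktemp here is an assumption)
ftmp=$(mktemp)
# save the file
cat > "$ftmp"
# hash the cleaned content, look it up in the tree and shorten the path
fname=$(git ls-tree -r "$hash" | grep "$(/etc/puppet/bin/git.clean < "$ftmp" | \
git hash-object --stdin)" | awk '{print $4}' | sed 's/^modules/\.\.\./')
id=$(git show -s --pretty=format:%h\ @%ct\ %aN%n)
# expand $puppet$ and remove the scratch file
sed -e 's!\([[:space:]]*\$[Pp]uppet\)\$!\1: '"$fname - ${id}"'\$!' < "$ftmp" &&
rm "$ftmp"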

And for reference, this is git.clean:

sed -e 's!\([[:space:]]*\$[Pp]uppet\):.*\$!\1\$!'
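
To see that the two sed expressions undo each other, they can be run on a sample line without involving Git at all. In this sketch fname and id are hard-coded sample values instead of being derived from the repository, and the wrapper functions clean and smudge are names chosen for the example.

```shell
#!/bin/sh
# Sketch: round-trip one line through the smudge and clean expressions.
# clean and smudge are hypothetical wrapper names around the sed commands.
clean()  { sed -e 's!\([[:space:]]*\$[Pp]uppet\):.*\$!\1\$!'; }
smudge() { sed -e 's!\([[:space:]]*\$[Pp]uppet\)\$!\1: '"$fname - ${id}"'\$!'; }

# Sample values; the real script computes these with git ls-tree and git show.
fname='.../blah/files/conf/httpd-blah.conf'
id='08e39ef @1288078656 Miek Gieben'

# Expand the keyword, then collapse it again.
smudged=$(printf '%s\n' '# $puppet$' | smudge)
cleaned=$(printf '%s\n' "$smudged" | clean)

printf '%s\n%s\n' "$smudged" "$cleaned"
```

The first line printed is the expanded form, the second is the bare # $puppet$ again. Note that the smudge expression would break if fname or id contained a !, & or backslash, since those are special inside the sed replacement.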