Every now and then, I encounter some complex legacy code that has been moved around and changed so much that git blame doesn’t help in finding the original commit. Git bisect is the solution here.

Why the Original Commit Can Help

Sometimes, I read code and (after a while) it becomes clear what it does. But I’m often left wondering why the code is there, what the context was.

Finding the original commit lets me read its commit message, and maybe several commits before and after it.

Of course, working in extreme legacy code often means the commit messages are of very low quality. But to know that, I first need to find that first commit. The commit that introduced the puzzling code.

Git Blame Won’t Do

Doing a git blame sometimes helps. But a git blame will only show me the last edit of that line. And sure, I can keep going back and back in the history to find when the code was introduced.

But when I’m working in a very old project, this could be thousands of commits. And I don’t want to go through them one by one manually.
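To show what that manual walk looks like, here is a small sketch in a throwaway repository. The file name, the field name and the commit messages are all made up for the demo. It blames a line, sees only the most recent edit, and then has to blame the parent of that commit to get one step further back.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file.py and legacy_field are invented for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Commit 1 introduces the line, commit 2 only re-indents it.
echo "legacy_field = 1" > file.py
git add file.py
git commit -q -m "introduce legacy_field"
echo "    legacy_field = 1" > file.py
git commit -q -am "re-indent"

# Blame on the current checkout credits the re-indent, not the introduction.
last_edit=$(git blame --porcelain -L 1,1 -- file.py | head -n 1 | cut -d' ' -f1)
last_subject=$(git log -1 --format=%s "$last_edit")

# One manual step back: blame the same line as of the parent of that commit.
# (In a real project the line number may have shifted by then.)
earlier=$(git blame --porcelain -L 1,1 "$last_edit^" -- file.py | head -n 1 | cut -d' ' -f1)
earlier_subject=$(git log -1 --format=%s "$earlier")

echo "blame on HEAD:   $last_subject"
echo "blame on parent: $earlier_subject"
```

With two commits this is one step. With thousands, every step means picking the next commit and the right line by hand.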

Git Bisect!

Git Bisect is a very powerful feature of Git, but it is underused, poorly understood and not even widely known. I don’t use it all that often myself.

But here’s the thing. With the correct script, Git Bisect will search those thousands of commits automatically and report when it finds the correct commit. Because it does a binary search, it only has to check out and test a handful of them.

So first, I need a script. In my case, I knew the code could only be in a single, specific file, so I wrote a script that returns 1 if it finds the complex code (a bad commit) and 0 if it doesn’t (a good commit):

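#!/bin/sh
# Exit 1 ("bad") if the text is present in this checkout, 0 ("good") if not.
# -q keeps grep quiet; only its exit status matters here.
if grep -q text_i_am_looking_for folder/file_to_search.py; then
    exit 1
else
    exit 0
fi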

Disclaimer: I’m no Bash expert, so there’s probably a better way to do this, but it works.
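Before handing a script like this to Git, it can help to try the check by hand. The sketch below wraps the same check in a function (check_for_text is a name I made up) and runs it against a temporary copy of the file in three states: with the text, without the text, and missing altogether.

```shell
#!/bin/sh
# check_for_text is a hypothetical helper: $1 is the text, $2 is the file.
# Returns 1 ("bad") when the text is present, 0 ("good") otherwise.
check_for_text() {
    if grep -q -- "$1" "$2" 2>/dev/null; then
        return 1
    fi
    return 0
}

# Temporary stand-in for folder/file_to_search.py.
workdir=$(mktemp -d)
mkdir -p "$workdir/folder"
target="$workdir/folder/file_to_search.py"

# Case 1: the text is there.
echo "text_i_am_looking_for = 1" > "$target"
check_for_text text_i_am_looking_for "$target" && with_text=0 || with_text=$?

# Case 2: the file exists but the text is not in it.
echo "something_else = 1" > "$target"
check_for_text text_i_am_looking_for "$target" && without_text=0 || without_text=$?

# Case 3: the file does not exist at all (e.g. a very old commit).
rm "$target"
check_for_text text_i_am_looking_for "$target" && missing_file=0 || missing_file=$?

echo "with text: $with_text, without text: $without_text, missing file: $missing_file"
rm -r "$workdir"
```

The missing-file case matters: a commit from before the file existed counts as good here, which is what I want when looking for the commit that introduced the text.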

Now we can start bisecting:

git bisect start
git bisect good <SHA of a good commit, i.e. without the text>
git bisect bad <SHA of a bad commit, i.e. with the text>
git bisect run ./bisect.sh

bisect.sh is the filename of my script. It needs to be executable (chmod +x bisect.sh).

Let that run, and git will show you the commit in which the code you’re looking for was first introduced! When you’re done, git bisect reset takes you back to where you started.
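If you want to see the whole flow without touching a real project, the following sketch runs it end to end in a throwaway repository. Everything in it is made up for the demo: the file name, the text "needle", and the six commits, of which the fourth introduces the text.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file.py and "needle" are invented for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Six commits; the text first appears in commit 4.
for i in 1 2 3 4 5 6; do
    echo "line $i" >> file.py
    if [ "$i" -eq 4 ]; then
        echo "needle = True" >> file.py
    fi
    git add file.py
    git commit -q -m "commit $i"
done

# The bisect script stays untracked, so checkouts leave it alone.
cat > bisect.sh <<'EOF'
#!/bin/sh
if grep -q needle file.py; then
    exit 1
else
    exit 0
fi
EOF
chmod +x bisect.sh

first=$(git rev-list --max-parents=0 HEAD)   # oldest commit: known good

git bisect start >/dev/null
git bisect good "$first" >/dev/null
git bisect bad HEAD >/dev/null
git bisect run ./bisect.sh >/dev/null

# Once the search finishes, refs/bisect/bad points at the first bad commit.
found=$(git rev-parse refs/bisect/bad)
subject=$(git log -1 --format=%s "$found")
echo "first bad commit: $subject"

git bisect reset >/dev/null 2>&1
```

Run as written, this should report commit 4, the one that added the text. For a check this small, the script file is optional: git bisect run also accepts a command with arguments, so something like git bisect run sh -c '! grep -q needle file.py' would do the same job.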

Of course, this doesn’t work for blocks of code that could have been refactored. In my case, I was looking for a specific field name in the database and I knew it hadn’t been renamed.

I found the commit, was disappointed in the commit message and could now dig further to find why this code was introduced!

Happy refactoring!
