Colourize SVN Logs (or any Text File)

To forestall the likely "you should use git," please just assume I'm dealing with a large legacy code base.

Today I started thinking about how nice it would be if my SVN logs were colourized. Nothing complex: I just wanted the revision number and the username highlighted so I could spot them faster. This turned out to be somewhat challenging to code, but I think it really improves readability. The same method could highlight almost any type of text file that has repetitive patterns. That said, don't use this method if any other method of colourization is available, and you'll need a good handle on sed to make it work.

Here's the core code:

userColour="$(echo -e "\033[1;33m")"
revColour="$(echo -e "\033[1;31m")"
noColour="$(echo -e "\033[0m")"
entryCount=100

svn log --verbose --limit "${entryCount}" |\
    sed -e 's/\(^r[1-9][0-9]* \)[|]\( [a-zA-Z][a-zA-Z]* \)[|]/'"${revColour}"'\1'"${noColour}"'|'"${userColour}"'\2'"${noColour}"'|/' \
    | less -R

This has no safeguards, but it shows both the simplicity of the idea and the ugliness of the sed code involved. I've wrapped it with options and some safety features, and I'll include that version below.

Here's how it looks if you just run svn log:

Plain SVN log without colourization

And here's what it looks like with this code:

SVN log with username and revision number highlighted in yellow and red respectively.

sed Breakdown

Let's look at the sed code again:

sed -e 's/\(^r[1-9][0-9]* \)[|]\( [a-zA-Z][a-zA-Z]* \)[|]/'"${revColour}"'\1'"${noColour}"'|'"${userColour}"'\2'"${noColour}"'|/'

sed is the Stream EDitor: it reads input and writes modified output based on the commands you give it. -e introduces the command to execute, and s/ starts a substitution. \( starts a capture group and \) ends it; since this is the first of our two capture groups, \1 recalls its contents later.

The expression ^r[1-9][0-9]* finds text in the form of SVN's revision numbers: the ^ caret matches only at the beginning of the line, and it is followed by an r and a number with no leading zero. Pipes are a general pain-in-the-ass: a bare | is literal in sed's default basic regex but means alternation in extended regex (sed -E), so I cheat slightly and make each pipe into a bracket expression, [|], which matches a literal pipe either way. The next set of characters (also in a capture group) is [a-zA-Z][a-zA-Z]*, which is a quick way to find the committer name. If you know regex, you may realize that anyone whose user name includes a number or special character (the latter is unlikely) won't be matched, so those header lines stay uncoloured.

After the middle /, we're doing playback: rewriting the blocks of text we captured. We surround the rev number with '"${revColour}"' (the bizarre quoting is necessary because we close the single-quoted sed expression, insert a double-quoted Bash variable, and reopen the single quotes) and '"${noColour}"', which turns colour highlighting off again. Similarly, we surround the second capture group, the username, with '"${userColour}"' and '"${noColour}"'.
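
To try the substitution without an SVN checkout, here is a minimal sketch that feeds made-up log headers through the same expression. The sample lines, the colourize function name, and the colourizeAnyUser variant (which swaps the letters-only username pattern for [^|]*, meaning "anything up to the next pipe") are assumptions for illustration and are not part of the script below.

```shell
#!/usr/bin/env bash
# Sketch only: the sample headers below are invented, not real SVN output.

userColour="$(printf '\033[1;33m')"   # bold yellow
revColour="$(printf '\033[1;31m')"    # bold red
noColour="$(printf '\033[0m')"        # reset

# The expression from the article: usernames made of letters only.
colourize() {
    sed -e 's/\(^r[1-9][0-9]* \)[|]\( [a-zA-Z][a-zA-Z]* \)[|]/'"${revColour}"'\1'"${noColour}"'|'"${userColour}"'\2'"${noColour}"'|/'
}

# Hypothetical variant: accept any username that contains no pipe.
# The ^ anchor is placed before the group rather than inside it.
colourizeAnyUser() {
    sed -e 's/^\(r[1-9][0-9]* \)[|]\( [^|]* \)[|]/'"${revColour}"'\1'"${noColour}"'|'"${userColour}"'\2'"${noColour}"'|/'
}

sample='r1234 | jsmith | 2017-08-03 10:15:00 -0400 (Thu, 03 Aug 2017) | 1 line'
sampleDigits='r1235 | jsmith2 | 2017-08-03 10:20:00 -0400 (Thu, 03 Aug 2017) | 1 line'

printf '%s\n' "${sample}" | colourize                # rev and user coloured
printf '%s\n' "${sampleDigits}" | colourize          # left as-is: digit in name
printf '%s\n' "${sampleDigits}" | colourizeAnyUser   # rev and user coloured
```

Whether the wider pattern is appropriate depends on what your usernames look like; it trades precision for coverage.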

Finally, here's the code with commenting, safeguards, options, and a help system.

#!/usr/bin/env bash
#
#   filename: svnl
#   Purpose: Apply (limited) colourization to a shortened SVN log.
#       Highlights rev number in $revColour, username in $userColour.
#       Limits output to $entryCount log entries.  The count can be
#       changed with the -n command line option.

userColour="$(echo -e "\033[1;33m")"
revColour="$(echo -e "\033[1;31m")"
noColour="$(echo -e "\033[0m")"
entryCount=100

###############################################################################
#                            Help
###############################################################################

help() {
    # Unlike Python's "argparse," Bash's "getopts" doesn't auto-generate
    # the help or keep this "help" output up-to-date: you have to do that.
    echo "Usage:"
    echo "    $(basename "${0}") [-h] [-n N]"
    echo ""
    echo "Show the local folder SVN log."
    echo ""
    echo "-h    show this help and exit"
    echo "-n    show N log entries (default is ${entryCount})"
}

showLog() {
    svn log --verbose --limit "${1}" |\
        sed -e 's/\(^r[1-9][0-9]* \)[|]\( [a-zA-Z][a-zA-Z]* \)[|]/'"${revColour}"'\1'"${noColour}"'|'"${userColour}"'\2'"${noColour}"'|/' \
        | less -R
}


###############################################################################
#                    Process the command line
###############################################################################

if [ $# -lt 1 ]
then
    # No arguments: show the default number of entries.
    showLog "${entryCount}"
    exit 0
fi

# http://wiki.bash-hackers.org/howto/getopts_tutorial
while getopts ":hn:" opt
do
    case ${opt} in
        h)
            help
            exit 0
            ;;
        n)
            count="$OPTARG"
            # svn's --limit needs a positive integer: reject signs and zero.
            if ! [[ ${count} =~ ^[1-9][0-9]*$ ]]
            then
                echo "Parameter must be a positive integer." >&2
                help
                exit 2
            fi
            showLog "${count}"
            exit 0
            ;;

        \?)
            echo "invalid option: -${OPTARG}" >&2
            help
            exit 1
            ;;
        :)
            # Reached when an option that takes a value (currently only -n)
            # is given without one.
            echo "option -${OPTARG} requires an argument." >&2
            help
            exit 1
            ;;
    esac
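done

# Arguments were given but none of them was an option (e.g. a stray word):
# fall back to the default listing instead of printing nothing.
showLog "${entryCount}"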
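
The title claims this works for any text file with repetitive patterns. As a hedged sketch of that claim, the same capture-and-wrap approach can highlight severity words in an application log. The log format, the highlightLevels function name, and the colour choices are all assumptions made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: same technique, different input. The log format is invented.

errColour="$(printf '\033[1;31m')"    # bold red
warnColour="$(printf '\033[1;33m')"   # bold yellow
noColour="$(printf '\033[0m')"        # reset

# Capture the first two space-separated fields (date and time) as \1 and
# the severity word as \2, then wrap only \2 in colour codes.
highlightLevels() {
    sed -e 's/^\([^ ]* [^ ]* \)\(ERROR\)/\1'"${errColour}"'\2'"${noColour}"'/' \
        -e 's/^\([^ ]* [^ ]* \)\(WARN\)/\1'"${warnColour}"'\2'"${noColour}"'/'
}

printf '%s\n' \
    '2017-08-03 10:15:00 INFO service started' \
    '2017-08-03 10:15:02 WARN disk usage at 91%' \
    '2017-08-03 10:15:09 ERROR cannot open config' \
    | highlightLevels
```

On a real file (app.log is a placeholder name) this would be used as highlightLevels < app.log | less -R, with -R telling less to pass the colour codes through to the terminal.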