13 Aug, 2020

Parsing Optional Arguments in Bash

Bash scripts tend to grow over time to cover multiple scenarios. For example, maptiles is a script I use to convert an image into tiles to use with Leaflet. The input image isn't always the same shape, and I might want a different image format for the map tiles.

Here's the command I used to have:

# ./maptiles <input> <output> [<format>]
./maptiles image.png ./tiles jpg

Very simple. Then I wanted to add the option to "square" the <input>, since map tiles need to be square images. Previously, I squared the image with a separate command before passing it through, but it was a common use case and I'd rather have it in one script. How would I go about it?

# ./maptiles <input> <output> [<format>] [square]
./maptiles image.png ./tiles jpg square

This approach is okay, but what happens if I don't want to set a <format>? Well, square would then sit in the <format> position, so my script can't easily tell whether it was meant as a format or as the square option.

There's a common solution to this: flags.

# ./maptiles <input> [--format <format>] [--square] <output>
./maptiles image.png --format jpg --square ./tiles

Since each argument here is flagged, there's no ambiguity. Remove --format <format> and it's still obvious --square is its own flag.

Most languages provide libraries to parse these flags into key-value maps. In Bash, however, it's not as simple. Since one of the appeals of Bash is its portability, introducing an external dependency has a higher bar and should be avoided where possible.

There is a standard getopts builtin that can parse single-letter flags, but I'm not a huge fan of those, as they hurt readability in long-lived scripts.
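For comparison, here's a rough sketch of what the getopts route could look like for the same two options. The parse_short wrapper is a name I made up so the example can be called directly; it is not part of maptiles. Note that getopts stops at the first non-option argument, so in this version the flags have to come before <input> and <output>.

```bash
#!/usr/bin/env bash
# Sketch only: parse -f <format> and -s with the getopts builtin.
# parse_short is a hypothetical wrapper, not part of the original script.
parse_short() {
  local OPTIND=1 opt            # reset getopts state for every call
  format='png'                  # default format
  square=''
  while getopts ':f:s' opt; do  # leading ':' = report errors via $opt
    case "$opt" in
      f) format="$OPTARG" ;;    # -f takes a value
      s) square='true' ;;       # -s is a plain switch
      :) echo "option -$OPTARG needs a value" >&2; return 1 ;;
      \?) echo "unknown option -$OPTARG" >&2; return 1 ;;
    esac
  done
  shift $((OPTIND - 1))         # drop the options that were parsed
  input="$1"
  output="$2"
}

parse_short -f jpg -s image.png ./tiles
echo "$format $square $input $output"
```

It works, but someone reading `-f jpg -s` in a script six months later has to look up what each letter means.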

So I looked around and found a StackOverflow answer which is almost perfect. Essentially, the solution is to iterate through all of your arguments and handle them accordingly until there are none left. Here's what I got:

POSITIONAL=()
while [[ $# -gt 0 ]]; do
  key="$1"
  case $key in
    -f|--format)
      format="$2"
      shift
      shift
      ;;
    -s|--square)
      square="true"
      shift
      ;;
    -*)
      echo "unknown argument $1" >&2 # report the bad flag on stderr
      exit 1
      ;;
    *)
      POSITIONAL+=("$1")
      shift
      ;;
  esac
done
set -- "${POSITIONAL[@]}"

input="${1}"
output="${2}"

if [ -z "${format}" ]; then # if format is not set
  format='png' # default to 'png'
fi

if [ -n "${square}" ]; then # if square is set
  : # run squaring command here (':' is a no-op so the empty block still parses)
fi

Let's go through it.

POSITIONAL is used to store any arguments that aren't flagged. In my case, that's the input and output. Their order is maintained so they can be used later in the script.

The flags are used for conditional logic. All flags are assumed to start with at least one -, so if any unknown flags are used, they're caught by the -* matcher and the script fails.
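The order of the case patterns matters here: Bash tries them top to bottom and runs the first match, so the known flags come first, then -*, then the catch-all *. A small sketch to show which branch each kind of argument ends up in (classify is a hypothetical helper that only reports the branch):

```bash
#!/usr/bin/env bash
# classify is a hypothetical helper: it prints which branch of the
# case statement an argument would hit, without parsing anything.
classify() {
  case "$1" in
    -f|--format) echo "format flag" ;;
    -s|--square) echo "square flag" ;;
    -*)          echo "unknown flag" ;;   # anything else starting with '-'
    *)           echo "positional" ;;     # everything that is left
  esac
}

classify --square   # square flag
classify --sqaure   # unknown flag (the typo is caught)
classify ./tiles    # positional
```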

After an argument is handled, it's removed using shift, which drops the first element of the argument list. If the flag also takes a value, such as jpg in --format jpg, that value is removed with a second shift. This way, on each iteration $1 will always point to the next argument and $2 will be its value, if any.
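To see shift in isolation, here's a small sketch that fakes the arguments with set -- and then drops them step by step. The remaining variable is only there to capture the end state.

```bash
#!/usr/bin/env bash
# Simulate the arguments from the example invocation.
set -- image.png --format jpg --square ./tiles

echo "$# args, first: $1"               # 5 args, first: image.png
shift                                   # drop image.png
echo "$# args, first: $1, second: $2"   # 4 args, first: --format, second: jpg
shift 2                                 # drop --format and jpg in one step
remaining="$#:$1"
echo "$remaining"                       # 2:--square
```

shift 2 behaves like two shifts as long as at least two arguments remain. If fewer remain, it shifts nothing at all, so swapping it into the loop without a check could leave the loop stuck on a trailing --format.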

$# is the number of arguments still left to process, so once all the arguments are handled and removed, it reaches 0 and the parsing stops.

Finally, set -- replaces the now empty arguments array with the contents of the POSITIONAL array so that those values can be used as normal ($1, $2, etc.).
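Here's that last step on its own, as a sketch with made-up values. The quotes around "${POSITIONAL[@]}" are what keep an argument containing spaces in one piece.

```bash
#!/usr/bin/env bash
# Pretend the loop has finished: "$@" is empty and POSITIONAL holds
# the leftovers. The values are invented for illustration.
set --
POSITIONAL=("my image.png" "./tiles")

set -- "${POSITIONAL[@]}"   # restore them as $1, $2, ...

count="$#"
input="$1"
output="$2"
echo "$count: $input -> $output"   # 2: my image.png -> ./tiles
```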

Conclusion

This is pretty much a perfect solution. It's easy to understand, the process is easy to remember and it's all using simple Bash.

Sure, it's a bit error-prone: having to call shift the right number of times for each flag is easy to get wrong. But that can easily be fixed with some small functions tailored to specific use cases.
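As one possible shape for such a function, here's a minimal sketch. Both require_value and parse_args are names I made up for this example, and the parsing is wrapped in a function only so it can be called more than once. The helper checks that a flag is actually followed by a value before the script reads $2 and shifts twice.

```bash
#!/usr/bin/env bash
# require_value is a hypothetical helper: it fails when a flag that
# needs a value is the last argument on the command line.
require_value() {
  # $1 = flag name, $2 = number of arguments left, including the flag
  if [ "$2" -lt 2 ]; then
    echo "missing value for $1" >&2
    return 1
  fi
}

# parse_args is also hypothetical; it mirrors the loop from this post.
parse_args() {
  format='png'
  square=''
  POSITIONAL=()
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -f|--format)
        require_value "$1" "$#" || return 1
        format="$2"
        shift 2               # safe: at least two arguments remain
        ;;
      -s|--square)
        square='true'
        shift
        ;;
      -*)
        echo "unknown argument $1" >&2
        return 1
        ;;
      *)
        POSITIONAL+=("$1")
        shift
        ;;
    esac
  done
}

parse_args image.png --format jpg --square ./tiles
echo "$format $square ${POSITIONAL[*]}"   # jpg true image.png ./tiles
```

With this in place, `parse_args image.png --format` fails with a message instead of silently leaving format empty.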

Thanks for reading.