This file attempts to provide the basics of bash scripting as relevant to searchfox's automation. Note that all scripts use bash, not sh, which means we have additional tricks available.
Nothing here is authoritative or exhaustive. If you want those, check out these links:
- Bash Sheet - A quick one page reference.
- Bash Guide - Detailed (useful) documentation.
- Bash FAQ and specific pages that focus on best practices, edge cases, and common mistakes. There's also Practices from the Bash Guide if you're not already dealing with something that's broken.
It's probably a good idea to read the Quotes and Arguments pages if you're touching anything related to variables.
We use the following initialization stanza at the top of all bash scripts:
set -x # Show commands
set -eu # Errors/undefined vars are fatal
set -o pipefail # Check all commands in a pipeline
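To make the effect of each option concrete, here is a minimal sketch that probes them one at a time. Each probe runs in a child bash so that a failure cannot abort the demo itself; the variable name demo_never_defined is hypothetical and assumed to be unset.

```shell
#!/usr/bin/env bash
# Each check runs in a child bash so a failure cannot abort this script.

# pipefail: without it, a pipeline reports only the last command's status.
status_without=$(bash -c 'false | true; echo $?')
status_with=$(bash -c 'set -o pipefail; false | true; echo $?')

# -e: the script stops at the first failing command, so "reached" is never printed.
errexit_output=$(bash -c 'set -e; false; echo reached' || true)

# -u: expanding an undefined variable is an error instead of an empty string.
# (demo_never_defined is a hypothetical name assumed to be unset.)
if bash -c 'set -u; echo "$demo_never_defined"' >/dev/null 2>&1; then
  nounset_result=ignored
else
  nounset_result=fatal
fi

echo "pipefail off/on: $status_without/$status_with"
echo "errexit output: '$errexit_output'"
echo "nounset: $nounset_result"
```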
Because we make undefined variables fatal, it's not okay to reference a positional argument like $4 unless it's a mandatory argument, and ideally only after checking the number of arguments.
Exact match:
if [[ $# -ne 1 ]]; then
  echo "Usage: $0 <ARG>"
  echo " e.g.: $0 example-arg"
  exit 1
fi
Minimum count:
if [[ $# -lt 2 ]]; then
  echo "Usage: $0 arg-1 arg-2 [optional-arg]"
  exit 1
fi
If [[ is used instead of [, there's no need to quote the variable. Note that because of set -eu, this will still error if the variable is not defined. See the next section for how to handle that.
if [[ $defined_var_that_may_be_empty ]]; then
  # logic to run if the variable was non-empty
fi
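As a rough illustration of why the quoting rule differs, the sketch below runs the same unquoted expansion through both test forms. The variable name label and its value are made up for the demo.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical variable whose value contains a space.
label="two words"

# Inside [[ ]] the unquoted expansion is not word-split, so this is safe.
if [[ $label ]]; then
  nonempty=yes
else
  nonempty=no
fi

# Inside [ ] the same unquoted expansion splits into two words, and the test
# fails with an error (non-zero status) instead of evaluating to true.
if [ $label ] 2>/dev/null; then
  single_bracket=ok
else
  single_bracket=broken
fi

echo "$nonempty $single_bracket"
```

With [ you would have to write [ "$label" ] to get the intended result.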
Default an argument to something if unset or empty (the : makes it handle empty in addition to unset):
NAME=${4:-default_value}
This also works if you want to normalize an omitted argument to being empty:
NAME=${4:-}
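Putting the count check and the defaulting together, a sketch might look like the following. The function describe_args and the value fallback are hypothetical; the function form is used only so the example can be run without real script arguments.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: two mandatory arguments, one optional.
describe_args() {
  if [[ $# -lt 2 ]]; then
    echo "Usage: describe_args arg-1 arg-2 [optional-arg]" >&2
    return 1
  fi
  local first="$1"
  local second="$2"
  local third="${3:-fallback}"   # unset OR empty becomes "fallback"
  local raw="${3-fallback}"      # only unset becomes "fallback"; empty stays empty
  echo "$first $second $third [$raw]"
}

describe_args a b        # prints: a b fallback [fallback]
describe_args a b ""     # prints: a b fallback []
describe_args a b c      # prints: a b c [c]
```

The second call shows the difference the : makes: without it, an explicitly empty argument is kept as empty.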
Check whether a file exists and is a regular file.
if [[ -f $file_path ]]; then
  # logic to run if the file existed
fi
Check if it doesn't exist or isn't a regular file.
if [[ ! -f $file_path ]]; then
  # logic to run if the file didn't exist / wasn't a file
fi
Other related tests:
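- -f is a regular file (not a directory or something weird)
- -x exists and is executable (this is also true for a directory you can search)
- -d is a directory
- -e exists, whatever kind of file it is
- -h is a symlink
Check whether a directory exists and is a directory.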
if [[ -d $dir_path ]]; then
  # logic to run if the dir existed and was a dir
fi
Check if it doesn't exist or isn't a directory.
if [[ ! -d $dir_path ]]; then
  # logic to run if the dir didn't exist or wasn't a directory
fi
If you don't care if something is a directory or weird thing, use -e.
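The tests above can be combined into a small classifier. This is only a sketch: classify_path is a hypothetical helper, and the files it inspects are created in a throwaway directory from mktemp. Note that -h is checked first, because -f and -d follow symlinks and would otherwise report the link's target.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper that classifies a path using the file tests.
classify_path() {
  local target="$1"
  if [[ -h $target ]]; then
    echo symlink
  elif [[ -f $target ]]; then
    echo file
  elif [[ -d $target ]]; then
    echo directory
  elif [[ -e $target ]]; then
    echo other
  else
    echo missing
  fi
}

scratch=$(mktemp -d)
touch "$scratch/plain.txt"
mkdir "$scratch/subdir"
ln -s "$scratch/plain.txt" "$scratch/link"

classify_path "$scratch/plain.txt"   # prints: file
classify_path "$scratch/subdir"      # prints: directory
classify_path "$scratch/link"        # prints: symlink
classify_path "$scratch/nope"        # prints: missing

rm -rf "$scratch"
```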
It's still probably a good idea to read the Quotes and Arguments pages if you're touching anything related to this. But here are important highlights.
As documented at Quoting Happens Before PE, if you put single quotes inside a double-quoted string to try to escape something that you know will be passed to another shell invocation, the single quotes become part of the content, which is probably not what you were trying to do. Example:
$ testfoo='bar' # the use of single-quotes doesn't matter here
$ set -x # Show commands
$ echo "'$testfoo'"
+ echo ''\''bar'\'''
'bar'
If you're thinking about doing this because you're using parallel, see the section on parallel.
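If a value really does have to survive a second round of shell parsing, one option that avoids hand-embedded quotes is bash's printf %q. This is a suggested alternative rather than something the scripts here are known to use; the variable names and the sample value are made up.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical value containing characters the inner shell would interpret.
value="it's a; test"

# %q emits the value escaped so that a shell parses it back to the same string.
printf -v escaped '%q' "$value"

# The inner bash re-parses the command string; the escaped form survives intact.
round_trip=$(bash -c "printf '%s' $escaped")

echo "$round_trip"   # prints: it's a; test
```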
Using a wildcard that you don't want globbed because you're passing it to find? Wrap it in double-quotes, and you can still use variables!
"*.json foo-${VAR}-*.json"
See https://mywiki.wooledge.org/BashFAQ/001, but the basic idea is that instead of doing:
# THIS IS THE BAD EXAMPLE DON'T DO THIS BECAUSE IF THERE ARE SPACES IN THE FILE
# NAME IT WILL BE PARSED AS TWO SEPARATE TOKENS, NOT ONE, AND THEN YOU WILL HAVE
# A BAD TIME.
for file in $(find . -type f | sort -r); do
  gzip -f "$file"
  touch -r "$file".gz "$file"
done
you want to do:
find . -type f | sort -r | while IFS= read -r file; do
  gzip -f "$file"
  touch -r "$file".gz "$file"
done
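because the for loop word-splits the output of find on whitespace (and then glob-expands each word), so a file name containing a space is tokenized incorrectly.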
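The difference is easy to see by counting iterations. The sketch below assumes the temporary directory path itself contains no whitespace; the file names are made up. It also feeds the loop through process substitution rather than a pipe: a while loop on the receiving end of a pipe normally runs in a subshell, so a counter incremented inside it would be lost once the loop finishes.

```shell
#!/usr/bin/env bash
set -eu

scratch=$(mktemp -d)
touch "$scratch/plain.txt" "$scratch/has space.txt"

# Bad: the command substitution is word-split, so "has space.txt" becomes two tokens.
for_count=0
for file in $(find "$scratch" -type f); do
  for_count=$((for_count + 1))
done

# Good: read consumes one whole line per file. IFS= keeps leading and trailing
# whitespace, and -r keeps backslashes literal.
read_count=0
while IFS= read -r file; do
  read_count=$((read_count + 1))
done < <(find "$scratch" -type f | sort -r)

echo "for: $for_count, while read: $read_count"   # prints: for: 3, while read: 2
rm -rf "$scratch"
```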
GNU parallel runs each job through a shell, so shell parsing happens both when parallel itself is invoked and again in each job it launches.
Passing -q to parallel will cause it to escape everything it passes to the shell. This is necessary in cases where arguments contain characters like ; which the shell will interpret and aren't automatically escaped by parallel.
The -q option should be used instead of attempting to embed quotes within quotes, which https://mywiki.wooledge.org/Arguments#Quoting_Happens_Before_PE tells us will end badly.
Parallel has a --shellquote argument that can be used to generate a quoted version of a parallel command so that -q doesn't need to be used (which could preclude some shell magic).
See https://www.gnu.org/software/parallel/parallel_tutorial.html#Quoting for more info.
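A small sketch of -q in action, guarded so that it only runs when GNU parallel is installed. The command and its argument are made up for illustration; the expectation is that without -q the job's shell would treat the ; as a command separator.

```shell
#!/usr/bin/env bash
set -eu

# Only run the demo when GNU parallel is actually installed.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  # -q quotes the command, so the ';' should stay part of echo's argument
  # when the job's shell parses it.
  quoted_output=$(parallel -q echo "left;right {}" ::: x)
else
  quoted_output="skipped: GNU parallel not found"
fi

echo "$quoted_output"
```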
If we want something like an exported RUST_BACKTRACE to be propagated into the command run by parallel, we need to pass --env RUST_BACKTRACE or use the env_parallel helper to propagate the entire environment.