Interesting Bash Programming Techniques
These tricks were originally described in Google's internal publication “Testing on the Toilet” (flyers posted in restrooms to remind developers about testing).
In this article, they have been revised and supplemented.
Safety
I start each script with the following lines:
#!/bin/bash
set -o nounset
set -o errexit
This protects against two common errors:
1) Attempts to use undeclared variables
2) Ignoring abnormal termination of commands
If a command is allowed to fail, handle the failure explicitly:
if ! possibly_failing_command; then  # placeholder for any command that may fail
  echo "failure ignored"
fi
Remember that some options suppress failures as a side effect: “mkdir -p” does not fail if the directory already exists, and “rm -f” does not fail if the file is missing.
errexit does not catch every failure. In a pipeline (command1 | command2 | command3), for example, only the exit status of the last command is checked. When a command must run only if the previous one succeeded, chain them explicitly:
(./failing_command && echo A)
Here the '&&' operator prevents the next command from running if the first one fails. For more details, see http://fvue.nl/wiki/Bash:_Error_handling
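The pipeline problem described above can be made visible with two Bash features the text does not show: the PIPESTATUS array and the pipefail option. The following is only a minimal sketch; the variable names are chosen for illustration, and `false | true` stands in for a real pipeline whose first stage fails.

```shell
#!/bin/bash
# Sketch: detecting a failure in the middle of a pipeline.

# By default, a pipeline's exit status is that of its last command.
false | true
status_default=$?            # the failure of 'false' is hidden

# PIPESTATUS holds the exit status of every stage of the last pipeline.
false | true
stages=("${PIPESTATUS[@]}")  # copy it at once; the next command overwrites it

# With pipefail, the pipeline fails if any stage fails.
set -o pipefail
if false | true; then
  status_pipefail=0
else
  status_pipefail=$?
fi
set +o pipefail

echo "default=${status_default} stages=${stages[*]} pipefail=${status_pipefail}"
# prints: default=0 stages=1 0 pipefail=1
```

Combined with errexit, pipefail makes a script stop when any stage of a pipeline fails, which may or may not be what you want for commands such as grep that return non-zero when nothing matches.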
Functions
Bash lets you use functions like regular commands, which greatly improves the readability of your code:
Example 1:
ExtractBashComments() {
  grep -E "^#"
}
cat myscript.sh | ExtractBashComments | wc
comments=$(ExtractBashComments < myscript.sh)
Example 2:
SumLines() { # iterating over stdin - similar to awk
  local sum=0
  local line=""
  while read -r line; do
    sum=$((sum + line))
  done
  echo "${sum}"
}
SumLines < data_one_number_per_line.txt
Example 3:
log() { # classic logger
  local prefix="[$(date +%Y/%m/%d\ %H:%M:%S)]: "
  echo "${prefix}$*" >&2
}
log "INFO" "a message"
Try to move all of your code into functions, leaving at the top level only global variables and constants plus a call to a "main" function that contains all the high-level logic.
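A script organized this way might look like the following sketch. The constant GREETING and the function greet are hypothetical names used only to show the layout: constants at the top, small functions in the middle, and a single call to main at the end.

```shell
#!/bin/bash
set -o nounset
set -o errexit

readonly GREETING="Hello"   # hypothetical global constant

# Print a greeting for one name.
greet() {
  local name="$1"
  echo "${GREETING}, ${name}!"
}

# High-level logic: greet every argument passed to the script.
main() {
  local name
  for name in "$@"; do
    greet "${name}"
  done
}

main "$@"
```

Running the script as `bash greet.sh World Bash` would print one greeting per argument; with no arguments it prints nothing.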
Variable declaration
Bash lets you declare several kinds of variables. The most important are:
local (for variables used only inside functions)
readonly (for variables that must not be reassigned; any attempt causes an error)
# If DEFAULT_VAL is already set, use its value; otherwise use 7
readonly DEFAULT_VAL=${DEFAULT_VAL:-7}
myfunc() {
  # A local variable initialized from the global value
  local some_var=${DEFAULT_VAL}
  ...
}
An already declared variable can also be made readonly:
x=5
x=6
readonly x
x=7 # failure
Try to make every variable either local or readonly. This improves readability and reduces the number of errors.
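The effect of both declarations can be checked with a short experiment. This is a sketch with made-up names (MAX_RETRIES, counter, show_local); the readonly assignment is attempted inside a subshell so that the expected failure does not interrupt the rest of the script.

```shell
#!/bin/bash
readonly MAX_RETRIES=3      # hypothetical constant
counter=0                   # global, left mutable on purpose for this demo

show_local() {
  local counter=100         # shadows the global inside this function only
  counter=$((counter + 1))
  echo "inside: ${counter}"
}

show_local                   # prints "inside: 101"
echo "outside: ${counter}"   # prints "outside: 0"

# Reassigning a readonly variable fails with a non-zero status.
if (MAX_RETRIES=5) 2>/dev/null; then
  reassigned="yes"
else
  reassigned="no"
fi
echo "reassigned: ${reassigned}"   # prints "reassigned: no"
```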
Use $() instead of backticks (``)
Backticks are hard to read and, in some fonts, are easily confused with single quotes.
The $() construct also lets you nest calls without escaping headaches:
# both commands print: A-B-C-D
echo "A-`echo B-\`echo C-\\\`echo D\\\`\``"
echo "A-$(echo B-$(echo C-$(echo D)))"
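A more practical case of nesting is combining dirname and basename. In the sketch below the path is an arbitrary example; the point is that double quotes inside each $() start a fresh quoting context, so they nest without any escaping.

```shell
#!/bin/bash
path="/var/log/app/server.log"   # arbitrary example path

# Name of the directory that contains the file.
parent="$(basename "$(dirname "${path}")")"
echo "${parent}"   # prints "app"
```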
Use double square brackets [[ ]] instead of single brackets [ ]
Inside double square brackets, word splitting and pathname expansion are not performed, and operators such as '<' are not treated as redirections:
$ [ a < b ]
-bash: b: No such file or directory
$ [[ a < b ]]
In some cases, the syntax is simpler:
[ "${name}" \> "a" -a "${name}" \< "m" ]
[[ "${name}" > "a" && "${name}" < "m" ]]
They also provide additional functionality:
New operators:
- || logical OR (double brackets only)
- && logical AND (double brackets only)
- < string comparison (no escaping needed inside double brackets)
- == string matching with globbing (double brackets only)
- =~ string matching with regular expressions (double brackets only)
Operators that are also available in single brackets:
- -lt numerical comparison
- -n string is non-empty
- -z string is empty
- -eq numerical equality
- -ne numerical inequality
Examples:
t="abc123"
[[ "$t" == abc* ]] # true (globbing)
[[ "$t" == "abc*" ]] # false (literal matching)
[[ "$t" =~ [abc]+[123]+ ]] # true (regular expression)
[[ "$t" =~ "abc*" ]] # false (literal matching)
Starting with bash 3.2, regular expressions and glob patterns must not be quoted. If your expression contains spaces, store it in a variable:
r="a b+"
[[ "a bbb" =~ $r ]] # true
String matching with globbing is also available in the case statement:
case $t in
abc*) ;;
esac
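Both matching styles can be wrapped in small functions. The sketch below uses hypothetical names (classify, parse_version, VERSION_RE) and also relies on BASH_REMATCH, the array in which Bash stores the capture groups of the last successful =~ match; it is not mentioned in the text above.

```shell
#!/bin/bash
# Classify a file name with glob patterns in a case statement.
classify() {
  local name="$1"
  case "${name}" in
    *.tar.gz|*.tgz) echo "archive" ;;
    *.sh)           echo "script" ;;
    *)              echo "other" ;;
  esac
}

# The regular expression lives in a variable and is used unquoted.
readonly VERSION_RE='^v([0-9]+)\.([0-9]+)$'

# Split a version string such as "v3.2" into its parts.
parse_version() {
  local version="$1"
  if [[ "${version}" =~ ${VERSION_RE} ]]; then
    echo "major=${BASH_REMATCH[1]} minor=${BASH_REMATCH[2]}"
  else
    echo "no match"
  fi
}

classify "backup.tar.gz"   # prints "archive"
classify "deploy.sh"       # prints "script"
parse_version "v3.2"       # prints "major=3 minor=2"
parse_version "3.2"        # prints "no match"
```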
Working with strings
Bash has several (underrated) features for working with strings:
Basic:
f="path1/path2/file.ext"
len="${#f}" # = 20 (string length)
# slicing: ${<variable>:<start>} or ${<variable>:<start>:<length>}
slice1="${f:6}" # = "path2/file.ext"
slice2="${f:6:5}" # = "path2"
slice3="${f: -8}" # = "file.ext" (note the space before the '-')
pos=6
len=5
slice4="${f:${pos}:${len}}" # = "path2"
Substitution (with globbing):
f="path1/path2/file.ext"
single_subst="${f/path?/x}" # = "x/path2/file.ext" (replaces the first match)
global_subst="${f//path?/x}" # = "x/x/file.ext" (replaces all matches)
Splitting:
f="path1/path2/file.ext"
readonly DIR_SEP="/"
array=(${f//${DIR_SEP}/ })
second_dir="${array[1]}" # = path2
Deletion at the beginning or end (with globbing):
# Delete the shortest match from the beginning of the string
f="path1/path2/file.ext"
extension="${f#*.}" # = "ext"
# Delete the longest match from the beginning of the string
f="path1/path2/file.ext"
filename="${f##*/}" # = "file.ext"
# Delete the shortest match from the end of the string
f="path1/path2/file.ext"
dirname="${f%/*}" # = "path1/path2"
# Delete the longest match from the end of the string
f="path1/path2/file.ext"
root="${f%%/*}" # = "path1"
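These deletions combine naturally into a helper that takes a path apart without calling external programs. The function name split_path is hypothetical, and the sketch assumes the path contains at least one '/' and the file name at least one '.'; otherwise the patterns match nothing and the string is returned unchanged.

```shell
#!/bin/bash
# Split a path into directory, file name without the last extension, and extension.
split_path() {
  local path="$1"
  local dir="${path%/*}"     # drop the shortest "/*" from the end
  local base="${path##*/}"   # drop the longest "*/" from the beginning
  local stem="${base%.*}"    # drop the shortest ".*" from the end
  local ext="${base##*.}"    # drop the longest "*." from the beginning
  echo "dir=${dir} stem=${stem} ext=${ext}"
}

split_path "path1/path2/archive.tar.gz"
# prints: dir=path1/path2 stem=archive.tar ext=gz
```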
Get rid of temporary files
Some commands expect a file name as an argument. The '<()' operator (process substitution) helps here: it takes a command and turns its output into something that can be used as a file name:
# download two URLs and pass them to diff
diff <(wget -O - url1) <(wget -O - url2)
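The same technique works without network access. As a sketch, the following uses comm, which requires two sorted input files, to find the lines two unsorted lists have in common; the fruit names are placeholder data and no temporary files are created.

```shell
#!/bin/bash
# comm -12 prints only the lines present in both (sorted) inputs.
common="$(comm -12 \
  <(printf '%s\n' pear apple fig | sort) \
  <(printf '%s\n' fig apple kiwi | sort))"

echo "${common}"
# prints:
# apple
# fig
```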
A here-document passes multi-line text to a command's standard input, using a marker to delimit it:
# MARKER can be any word.
command << MARKER
...
${var}
$(cmd)
...
MARKER
If you need to avoid substitution, quote the marker:
# this prints the literal text '$var', not the value of the variable
var="text"
cat << 'MARKER'
...
$var
...
MARKER
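The difference between the two forms is easy to see side by side. In this sketch, the variable names and the EOF marker are arbitrary; the output of cat is captured in variables only so that both results can be printed together.

```shell
#!/bin/bash
name="world"

# Unquoted marker: variables and command substitutions are expanded.
expanded="$(cat << EOF
Hello, ${name}!
Shell: $(echo bash)
EOF
)"

# Quoted marker: the text is passed through literally.
literal="$(cat << 'EOF'
Hello, ${name}!
EOF
)"

echo "${expanded}"   # prints "Hello, world!" and "Shell: bash"
echo "${literal}"    # prints "Hello, ${name}!"
```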
Built-in variables
- $0 name of the script
- $1 $2 ... $n positional parameters passed to the script / function
- $$ PID of the script
- $! PID of the last command run in the background
- $? exit status of the last command (${PIPESTATUS} for pipelined commands)
- $# number of parameters passed to the script / function
- $@ all parameters passed to the script / function, each seen as a separate word
- $* all parameters passed to the script / function, seen as a single word
As a rule:
- $* is rarely useful
- $@ handles empty parameters and parameters with spaces correctly
- $@ should usually be enclosed in double quotes: "$@"
Example:
for i in "$@"; do echo '$@ param:' "$i"; done
for i in "$*"; do echo '$* param:' "$i"; done
Output:
bash ./parameters.sh arg1 arg2
$@ param: arg1
$@ param: arg2
$* param: arg1 arg2
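The advice about quoting "$@" matters most when arguments are empty or contain spaces. The sketch below counts how many words each form produces; count_args and demo are hypothetical helper names, and the default IFS is assumed.

```shell
#!/bin/bash
# Print the number of arguments received.
count_args() {
  echo "$#"
}

demo() {
  echo "quoted \$@: $(count_args "$@")"
  echo "quoted \$*: $(count_args "$*")"
  # Unquoted on purpose, to show the word splitting that quoting prevents.
  echo "unquoted \$@: $(count_args $@)"
}

demo "a b" "" "c d"
# prints:
# quoted $@: 3
# quoted $*: 1
# unquoted $@: 4
```

Only the quoted "$@" preserves the three original arguments: "$*" joins them into one word, and the unquoted form drops the empty argument and splits the other two at their spaces.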
Debugging
Syntax checking (saves time if the script runs longer than 15 seconds):
bash -n myscript.sh
Print each line of the script as it is read:
bash -v myscript.sh
Trace each command after expansion:
bash -x myscript.sh
The -v and -x options can also be set inside the script. This is useful if your script runs on one machine and its output is logged on another:
set -o verbose
set -o xtrace
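Tracing can also be limited to the part of a script under investigation, because `set +o xtrace` switches it off again. A minimal sketch, with a hypothetical compute function standing in for the code being debugged:

```shell
#!/bin/bash
compute() {
  local a="$1" b="$2"
  echo $((a * b))
}

set -o xtrace             # start tracing (same as 'set -x')
result="$(compute 6 7)"
set +o xtrace             # stop tracing

echo "result=${result}"   # prints "result=42"
```

The trace lines are written to standard error, so they do not mix with the script's regular output when only standard output is captured.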
Signs that you should not use shell scripts:
- Your script contains more than a few hundred lines.
- You need data structures more complex than regular arrays.
- You are tired of wrestling with quoting and escaping.
- You need to process or modify many strings.
- You rarely need to call external programs or connect them with pipes.
- Speed / performance is important to you.
If your project matches items on this list, consider Python or Ruby instead.
Links:
Advanced Bash-Scripting Guide: tldp.org/LDP/abs/html
Bash Reference Manual: www.gnu.org/software/bash/manual/bashref.html
Original article: robertmuth.blogspot.ru/2012/08/better-bash-scripting-in-15-minutes.html