continue
These are some coding guidelines that helped me to read and understand my own code over the years. They also will help to produce code that will be a bit more robust than “if something breaks, I know how to fix it”.
This is not a bible, of course. But I have seen so much ugly and terrible code (not only in shell) during all the years, that I'm 100% convinced there needs to be some code optics and style. No matter which one you use, use it through all your code (at least don't change it within the same shellscript file). Don't change your code optics with your mood.
Some good code optics help you to read your own code after a while. And of course it helps others to read the code.
Indention is nothing that technically influences a script, it's only for us humans.
I'm used to use the indention of two space characters:
Speaking of hard-tabs: Avoid them if possible. They only make trouble. I can imagine one case where they're useful: Indenting here-documents.
Whenever you need to break lines of long code, you should follow one of these two ways:
Indention using command width:
activate some_very_long_option \
some_other_option
Indention using two spaces:
activate some_very_long_option \ some_other_option
Personally, with some exceptions, I prefer the first form because the optic impression of “this belongs together” is better.
Compound commands form the structures that make a shell script different from a stupid enumeration of commands. Usually they contain a kind of “head” and a “body” that contains command lists. This kind of compound command is relatively easy to indent.
I'm used to (not all points apply to all compound commands, just pick the basic idea):
What?! Well, here again:
HEAD_KEYWORD parameters; BODY_BEGIN BODY_COMMANDS BODY_END
This construct is a bit special, because it has keywords (elif, else) “in the middle”. The optical nice way is to indent them like the if:
if ...; then ... elif ...; then ... else ... fi
for f in /etc/*; do ... done
while [[ $answer != [YyNn] ]]; do ... done
The case construct might need a bit more discussion here, since the structure is a bit more complex.
In general it's the same: Every new “layer” gets a new indention level:
case $input in
hello)
echo "You said hello"
;;
bye)
echo "You said bye"
if foo; then
bar
fi
;;
*)
echo "You said something weird..."
;;
esac
Some notes:
hello)) and the corresponding action terminator (;;) are indented at the same levelCryptic constructs, we all know them, we all love them. If they are not 100% needed, avoid them, since nobody except you may be able to decipher them.
It's - just like in C - the middle between smartness, efficiency and readablity.
If you need to use a cryptic construct, place a small comment that actually tells what your monster is for.
Since all reserved variables are UPPERCASE, the safest way is to only use lowercase variable names. This is true for reading user input, loop counting variables, etc., ... (in the example: file)
If you use UPPERCASE variable names for own variables, do not use reserved variable names (see SUS for an incomplete list).
Usually I use UPPERCASE variable names for variables that are meant to be used like constants, e.g. some configuration the user can make at the top of the script (in the example: MY_LOG_DIRECTORY). In some cases it also might be wise to prepend an identifier separated by an underscore, to get a kind of “namespace” (in the example: MY_).
#!/bin/bash MY_LOG_DIRECTORY=/var/adm/ for file in "$MY_LOG_DIRECTORY"/*; do echo "Found Logfile: $file" done
As in C, it's always a good idea to initialize your variables, though, the shell will initialize fresh variables itself (better: Unset variables will generally behave like variables containing a nullstring).
It's no problem to pass a variable you use as environment to the script. If you blindly assume that all variables you use are empty for the first time, somebody can inject a variable content by just passing it in the environment.
The solution is simple and effective: Initialize them
MY_INPUT="" MY_ARRAY=() MY_NUMBER=0
If you do that for every variable you use, then you also have a kind of documentation for them.
Unless you are really sure what you're doing, quote every parameter expansion.
There are some cases where this isn't needed, e.g.
[[ ... ]]WORD) in case $WORD in ....VAR=$WORDBut quoting these is never a mistake. If you get used to quote every parameter expansion, you're safe.
If you need to parse a parameter as a list of words, you can't quote, of course, like
LIST="one two three" # you MUST NOT quote $LIST here for word in $LIST; do ... done
Function names should be all lowercase and have a good name. The function names should be human readable ones. A function named f1 may be easy and quick to write down, but for debugging and especially for other people, it will tell nothing. Good names help to document the code without using extra comments.
A more or less funny one: If not intended to do so, do not name your functions like common commands, typically new users tend to name their scripts or functions test, which collides with the UNIX test command!
As noted in the article about command substitution you should use the $( ... ) form.
Though, if portability is a concern, you might have to use the backquoted form ` ... `.
In any case, if other expansions and word splitting are not wanted, you should quote the command substitution!
Well, like Greg says: “If eval is the answer, surely you are asking the wrong question.”
Avoid if, unless absolutely neccesary:
eval can be your neckshoteval with your current wayeval is not evil at all)The basic structure of a script simply reads:
#!SHEBANG CONFIGURATION_VARIABLES FUNCTION_DEFINITIONS MAIN_CODE
If possible (I know it's not always possible!), use a shebang.
The shebang serves two purposes for me:
bash!bash when you write a Bash-script, use sh when you write a general Bourne/POSIX script, ...)/bin/sh is a Bash” is a lieI call variables that are meant to be changed by the user “configuration variables” here.
Make them easy to find (directly at the top of the script), give them useful names and maybe a short comment. As noted above, use UPPERCASE for them.
Unless the code has reasons to not do, all needed function definitions should be declared before the main script code is ran. This gives a far better overview and ensures that all function names are known before they are used.
Since a function isn't parsed before it is executed, you usually don't even have to ensure a specific order.
The portable form of the function definition should be used, without the function keyword (here using the grouping compound command):
getargs() {
...
}
Speaking about the command grouping in function definitions using { ...; }: If you don't have a good reason to use another compound command directly, you should always use this one.
If you use commands that might not be installed on the system, check for their availability and tell the user what's missing.
Example:
NEEDED_COMMANDS="sed awk lsof who"
missing_counter=0
for needed_command in $NEEDED_COMMANDS; do
if ! hash "$needed_command" >/dev/null 2>&1; then
printf "Command not found in PATH: %s\n" "$needed_command" >&2
((missing_counter++))
fi
done
if ((missing_counter > 0)); then
printf "Minimum %d commands are missing in PATH, aborting" "$missing_counter" >&2
exit 1
fi
The exit code is your only way to directly communicate with the calling process without any special things to do.
If your script exits, provide a meaningful exit code. That minimally means:
exit 0 (zero) if everything is okayexit 1 - in general non-zero - if there was an errorThis, and only this, will enable the calling component to check the operation status of your script.
You know: “One of the main causes of the fall of the Roman Empire was that, lacking zero, they had no way to indicate successful termination of their C programs.” – Robert Firth