Wednesday, October 28, 2015

Linux Shell Scripting Misc




bash -x CMC -O
https://ryanstutorials.net/bash-scripting-tutorial/bash-input.php
read var1
Two commonly used options, however, are -p, which lets you specify a prompt, and -s, which makes the input silent.
read -p 'Username: ' uservar
read -sp 'Password: ' passvar

[Atom ide-bash](https://atom.io/packages/ide-bash)
[IDEA bashsupport](https://www.plugin-dev.com/project/bashsupport/)
read car1 car2 car3

Reading from STDIN
It's common in Linux to pipe a series of simple, single-purpose commands together to create a larger solution tailored to our exact needs.

Bash accommodates piping and redirection by way of special files. Each process gets its own set of files (one each for STDIN, STDOUT, and STDERR), and they are linked when piping or redirection is invoked. Each process gets the following files:
  • STDIN - /proc/<processID>/fd/0
  • STDOUT - /proc/<processID>/fd/1
  • STDERR - /proc/<processID>/fd/2
To make life more convenient the system creates some shortcuts for us:
  • STDIN - /dev/stdin or /proc/self/fd/0
  • STDOUT - /dev/stdout or /proc/self/fd/1
  • STDERR - /dev/stderr or /proc/self/fd/2

cat /dev/stdin | cut -d' ' -f 2,3 | sort
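The shortcut files can be read like any ordinary file. As a sketch, here is a hypothetical helper (the function name is illustrative) that pulls the second field of each line from standard input via /dev/stdin:

```shell
#!/bin/bash
# second_field: read whitespace-separated lines from standard input
# through the /dev/stdin shortcut file and print each line's second field.
second_field() {
  cut -d' ' -f2 < /dev/stdin
}
```

For example, `printf 'a b c\n' | second_field` prints `b`.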



if [ "$USER" == 'bob' ] || [ "$USER" == 'andy' ]
then
  ls -alh
else
  ls
fi

if [ -r "$1" ] && [ -s "$1" ]
then
  echo "This file is useful."
fi
var=$((var+1))

  • -eq (is equal to): if [ "$a" -eq "$b" ]
  • -ne (is not equal to): if [ "$a" -ne "$b" ]
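The arithmetic increment and the numeric tests combine naturally; a minimal sketch (the variable names are illustrative):

```shell
#!/bin/bash
# Increment with $((...)) three times, then compare numerically
# with -eq; -ne would test the opposite condition.
count=0
for i in 1 2 3; do
  count=$((count+1))
done
if [ "$count" -eq 3 ]; then
  echo "count is three"
fi
```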


https://stackoverflow.com/questions/6980090/how-to-read-from-a-file-or-stdin-in-bash
The following solution reads from a file if the script is called with a file name as the first parameter $1, and otherwise from standard input.
while read line
do
  echo "$line"
done < "${1:-/dev/stdin}"
The substitution ${1:-...} expands to $1 if it is defined; otherwise the file name of the process's own standard input (/dev/stdin) is used.
Quote the expansion, because the filename you supply on the command line could contain blanks.


To parse each line from the standard input, try the following script:
#!/bin/bash
while IFS= read -r line; do
  printf '%s\n' "$line"
done

https://stackoverflow.com/questions/7023025/how-do-i-declare-a-constant-variable-in-shell-script
readonly DATA=/usr/home/data/file.dat
You can also do:
declare -r var=123
https://kimballhawkins.wordpress.com/2010/07/06/boolean-variables-in-shell-scripts/
Technically, there is no variable typing in shell scripts.  In most shells, variables are just text strings unless and until evaluated in a different context, most often a numeric context.
Some shells support limited variable typing via their “typeset” command, but that capability is far from universal.
That’s actually just fine, but, as with other languages, I prefer to use boolean variables for cleaner code.
Even though there is, technically, no such thing, it is quite easy to implement.  It looks like this:

FILE_FOUND=false
# ... some processing to locate the file
    # If found, set the flag
    FILE_FOUND=true
    # ...
if $FILE_FOUND
then
    # process the file...
fi

This works because “true” and “false” are actually commands, so $FILE_FOUND executes the “true” or “false” command. “true” always returns 0 and “false” always returns 1. This technique works in all Bourne-style shells.
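A runnable sketch of the pattern (the temporary file is only there to give the check something to find):

```shell
#!/bin/bash
# FILE_FOUND holds the name of the "true" or "false" command, so
# expanding it unquoted in the if runs that command and its exit
# status selects the branch.
FILE_FOUND=false
tmpfile=$(mktemp)        # a guaranteed-to-exist file for the demo
if [ -e "$tmpfile" ]; then
  FILE_FOUND=true
fi
if $FILE_FOUND; then
  echo "file found"
fi
rm -f "$tmpfile"
```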
http://www.bahmanm.com/blogs/command-line-options-how-to-parse-in-bash-using-getopt
http://tuxtweaks.com/2014/05/bash-getopts/
while getopts :a:b:c:d:h FLAG; do
  case $FLAG in
    a)  #set option "a"
      OPT_A=$OPTARG
      echo "-a used: $OPTARG"
      echo "OPT_A = $OPT_A"
      ;;
    b)  #set option "b"
      OPT_B=$OPTARG
      echo "-b used: $OPTARG"
      echo "OPT_B = $OPT_B"
      ;;
    c)  #set option "c"
      OPT_C=$OPTARG
      echo "-c used: $OPTARG"
      echo "OPT_C = $OPT_C"
      ;;
    d)  #set option "d"
      OPT_D=$OPTARG
      echo "-d used: $OPTARG"
      echo "OPT_D = $OPT_D"
      ;;
    h)  #show help
      HELP
      ;;
    \?) #unrecognized option - show help
      echo -e \\n"Option -${BOLD}$OPTARG${NORM} not allowed."
      HELP
      #If you just want to display a simple error message instead of the full
      #help, remove the 2 lines above and uncomment the 2 lines below.
      #echo -e "Use ${BOLD}$SCRIPT -h${NORM} to see the help documentation."\\n
      #exit 2
      ;;
  esac
done

getopts optstring varname [arg ...]
where optstring is a list of the valid option letters, varname is the variable that receives the options one at a time, and arg is the optional list of parameters to be processed. If arg is not present, getopts processes the command-line arguments. If optstring starts with a colon (:), the script must take care of generating error messages; otherwise, getopts generates error messages.
The getopts builtin uses the OPTIND (option index) and OPTARG (option argument) variables to track and store option-related values. When a shell script starts, the value of OPTIND is 1. Each time getopts is called and locates an argument, it increments OPTIND to the index of the next option to be processed. If the option takes an argument, bash assigns the value of the argument to OPTARG.

To indicate that an option takes an argument, follow the corresponding letter in optstring with a colon (:). For example, the optstring dxo:lt:r instructs getopts to search for the -d, -x, -o, -l, -t, and -r options and tells it the -o and -t options take arguments.
while getopts :bt:u arg
do
    case $arg in
        b)     SKIPBLANKS=TRUE ;;
        t)     if [ -d "$OPTARG" ]
                   then
                   TMPDIR=$OPTARG
               else
                   echo "$0: $OPTARG is not a directory." >&2
                   exit 1
               fi ;;
        u)     CASE=upper ;;
        :)     echo "$0: Must supply an argument to -$OPTARG." >&2
               exit 1 ;;
        \?)    echo "Invalid option -$OPTARG ignored." >&2 ;;
        esac
done
In this version of the code, the while structure evaluates the getopts builtin each time control transfers to the top of the loop. The getopts builtin uses the OPTIND variable to keep track of the index of the argument it is to process the next time it is called. There is no need to call shift in this example.

while getopts 'i:b:q?' argv
do
  case $argv in
        i) INFILE=$OPTARG       ;;
        b) bs=$OPTARG           ;;
        q) quiet=1              ;;
        \?) usage               ;;
  esac
done
https://sookocheff.com/post/bash/parsing-bash-script-arguments-with-shopts/
The getopts function takes three parameters. The first is a specification of which options are valid, listed as a sequence of letters. For example, the string 'ht' signifies that the options -h and -t are valid.
The second argument to getopts is a variable that will be populated with the option or argument to be processed next. In the following loop, opt will hold the value of the current option that has been parsed by getopts.
This example shows a few additional features of getopts. First, if an invalid option is provided, the option variable is assigned the value ?. You can catch this case and provide an appropriate usage message to the user. Second, this behaviour is only true when you prepend the list of valid options with : to disable the default error handling of invalid options. It is recommended to always disable the default error handling in your scripts.
The third argument to getopts is the list of arguments and options to be processed. When not provided, this defaults to the arguments and options provided to the application ($@). You can provide this third argument to use getopts to parse any list of arguments and options you provide.
The variable OPTIND holds the index of the next argument to be processed by getopts. It is common practice to call the shift command at the end of your processing loop to remove options that have already been handled from $@.
shift $((OPTIND -1))
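Putting getopts, OPTARG, and the final shift together; a sketch in which the function name and the option letters -v and -o are illustrative:

```shell
#!/bin/bash
# parse_args: consume -v (a flag) and -o <file> (takes an argument),
# then shift the processed options away and report what remains.
parse_args() {
  local verbose='' outfile='' opt
  OPTIND=1                        # reset so the function can be re-run
  while getopts ':vo:' opt "$@"; do
    case $opt in
      v) verbose='true' ;;
      o) outfile=$OPTARG ;;
      \?) echo "invalid: -$OPTARG" >&2 ;;
    esac
  done
  shift $((OPTIND - 1))           # drop the options getopts consumed
  echo "remaining: $*"
}
```

For example, `parse_args -v -o out.txt a b` prints `remaining: a b`.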

install also takes an option, -t. -t takes as an argument the location, relative to the current directory, to install the package to.
> pip install urllib3 -t ./src/lib

package=""  # Default to empty package
target=""   # Default to empty target

# Parse options to the `pip` command
while getopts ":h" opt; do
  case ${opt} in
    h )
      echo "Usage:"
      echo "    pip -h                      Display this help message."
      echo "    pip install <package>       Install <package>."
      exit 0
      ;;
    \? )
      echo "Invalid Option: -$OPTARG" 1>&2
      exit 1
      ;;
  esac
done
shift $((OPTIND -1))

subcommand=$1; shift  # Remove 'pip' from the argument list
case "$subcommand" in
  # Parse options to the install sub command
  install)
    package=$1; shift  # Remove 'install' from the argument list

    # Process package options
    while getopts ":t:" opt; do
      case ${opt} in
        t )
          target=$OPTARG
          ;;
        \? )
          echo "Invalid Option: -$OPTARG" 1>&2
          exit 1
          ;;
        : )
          echo "Invalid Option: -$OPTARG requires an argument" 1>&2
          exit 1
          ;;
      esac
    done
    shift $((OPTIND -1))
    ;;
esac
https://www.cyberciti.biz/faq/bash-infinite-loop/
while :
do
 echo "Press [CTRL+C] to stop.."
 sleep 1
done
This is a loop that will forever print “Press [CTRL+C] to stop..”. Please note that : is the null command. The null command does nothing and its exit status is always set to true. You can modify the above as follows to improve the readability:
while true
do
 echo "Press [CTRL+C] to stop.."
 sleep 1
done
A single-line bash infinite while loop syntax is as follows:
 while :; do echo 'Hit CTRL+C'; sleep 1; done
OR
 while true; do echo 'Hit CTRL+C'; sleep 1; done

Bash for infinite loop example

for (( ; ; ))
do
   echo "Press CTRL+C to stop..."
   sleep 1
done


https://www.mylinuxplace.com/bash-special-variables/
  • $#: Number of command-line arguments.
  • $_: Set at shell startup to the absolute file name of the shell or script being executed, as passed in the argument list. Subsequently, it expands to the last argument to the previous command, after expansion. It is also set to the full pathname of each command executed and placed in the environment exported to that command. When checking mail, this parameter holds the name of the mail file.
  • $-: Expands to the current option flags as specified upon invocation, by the set builtin command, or those set by the shell itself (such as -i).
  • $?: Exit value of the last executed command.
  • $$: Process number of the shell.
  • $!: Process number of the last background command.
  • $0: First word; that is, the command name. This will have the full pathname if the command was found via a PATH search.
  • $n: Individual arguments on the command line (positional parameters). The Bourne shell allows only nine parameters to be referenced directly (n = 1-9); Bash allows n to be greater than 9 if written as ${n}.
  • $*, $@: All arguments on the command line ($1 $2 ...).
  • "$*": All arguments on the command line as one string ("$1 $2 ..."). The values are separated by the first character in $IFS.
  • "$@": All arguments on the command line, individually quoted ("$1" "$2" ...).
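A few of these can be demonstrated with a small sketch (the function name is illustrative):

```shell
#!/bin/bash
# show_params: print the argument count ($#), then each argument on
# its own line; "$@" keeps arguments containing spaces intact.
show_params() {
  echo "count=$#"
  printf 'arg=%s\n' "$@"
}
```

For example, `show_params one "two words"` reports a count of 2 and prints each argument, spaces and all, on its own line.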

http://stackoverflow.com/questions/13296863/difference-between-wait-and-sleep
wait waits for a process to finish; sleep sleeps for a certain amount of time.
sleep is not a BASH built-in command. It is a utility that delays for a specified amount of time.
wait makes the shell wait for the given subprocess. e.g.:
workhard &
[1] 27408
workharder &
[2] 27409
wait %1 %2
delays the shell until both of the subprocesses have finished
http://unix.stackexchange.com/questions/122460/bash-how-to-let-some-background-processes-run-but-wait-for-others
( somethingElse InputA >OutputA; echo $? >"$tmp1" ) &
proc1=$!

( somethingElse InputB >OutputB; echo $? >"$tmp2" ) &
proc2=$!

wait "$proc1" "$proc2"

wait PID #wait for command1, in background, to end
http://www.onkarjoshi.com/blog/191/device-dev-random-vs-urandom/
If you want random data in a Linux/Unix type OS, the standard way to do so is to use /dev/random or /dev/urandom. These devices are special files. They can be read like normal files and the read data is generated via multiple sources of entropy in the system which provide the randomness.

/dev/random will block after the entropy pool is exhausted. It will remain blocked until additional data has been collected from the sources of entropy that are available. This can slow down random data generation.

/dev/urandom will not block. Instead it will reuse the internal pool to produce more pseudo-random bits.

/dev/urandom is best used when:
You just want a large file with random data for some kind of testing.
You are using the dd command to wipe data off a disk by replacing it with random data.
Almost everywhere else where you don’t have a really good reason to use /dev/random instead.

/dev/random is likely to be the better choice when:
Randomness is critical to the security of cryptography in your application – one-time pads, key generation.

http://unix.stackexchange.com/questions/92384/how-to-clean-log-file
> logfile
or
cat /dev/null > logfile
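Both forms truncate the file to zero bytes while keeping the file itself (and its permissions) in place; a sketch using a temporary file:

```shell
#!/bin/bash
# Truncate a log file in place: the file keeps its inode and
# permissions; only its contents are discarded.
logfile=$(mktemp)
echo "old entries" > "$logfile"
: > "$logfile"                 # truncate via empty redirection
size=$(wc -c < "$logfile")     # 0 after truncation
rm -f "$logfile"
```

The `: >` form is equivalent to the bare `>` redirection shown above, but also works in shells that reject a redirection with no command.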

http://wiki.bash-hackers.org/commands/builtin/printf
printf FORMAT [ARGUMENT]...
$ printf "%s\n" "1" "2" "\n3"

http://stackoverflow.com/questions/613572/capturing-multiple-line-output-to-a-bash-variable
Actually, RESULT contains what you want — to demonstrate:
echo "$RESULT"
What you show is what you get from:
echo $RESULT

As noted in the comments, the difference is that (1) the double-quoted version of the variable (echo "$RESULT") preserves internal spacing of the value exactly as it is represented in the variable — newlines, tabs, multiple blanks and all — whereas (2) the unquoted version (echo $RESULT) replaces each sequence of one or more blanks, tabs and newlines with a single space. Thus (1) preserves the shape of the input variable, whereas (2) creates a potentially very long single line of output with 'words' separated by single spaces (where a 'word' is a sequence of non-whitespace characters; there needn't be any alphanumerics in any of the words).
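A sketch demonstrating the difference (the variable content is illustrative):

```shell
#!/bin/bash
# RESULT holds two lines. Quoted expansion preserves the newline;
# unquoted expansion word-splits and echo rejoins with single spaces.
RESULT=$(printf 'line one\nline two')
quoted="$(echo "$RESULT")"     # two lines, as stored
unquoted="$(echo $RESULT)"     # one line: "line one line two"
```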
http://www.cyberciti.biz/faq/howto-linux-unix-bash-append-textto-variables/
x="Mango"
y="Pickle"
x="$x $y"
echo "$x"

x="Master"
# print 'Master' without a whitespace i.e. print Mastercard as a one word #
echo "${x}card"

Appending with x="$x $y" simply redefines the variable. One caveat: appends often happen inside a loop, and if that loop runs in a subshell (for example, when a command is piped into while), the updated value is lost once the loop ends. In the parent shell, assignments made inside loops and "if" blocks remain visible afterwards.
Use the $( ... ) construct:
hash=$(genhash --use-ssl -s $IP -p 443 --url $URL | grep MD5 | grep -c $MD5)
http://stackoverflow.com/questions/8467424/echo-newline-in-bash-prints-literal-n
You could use printf instead:
printf "hello\nworld\n"
printf has more consistent behavior than echo. The behavior of echo varies greatly between different versions.
http://www.cyberciti.biz/faq/bash-shell-script-generating-random-numbers/
Bash Shell Generate Random Numbers
Each time this is referenced, a random integer between 0 and 32767 is generated. The sequence of random numbers may be initialized by assigning a value to RANDOM. If RANDOM is unset, it loses its special properties, even if it is subsequently reset.
for i in {1..5}; do echo $RANDOM; done
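To constrain $RANDOM to a range, combine it with the modulo operator; a sketch (the function name is illustrative, and this is not suitable for anything cryptographic):

```shell
#!/bin/bash
# rand_range MIN MAX: print a pseudo-random integer N with
# MIN <= N <= MAX, derived from bash's $RANDOM.
rand_range() {
  local min=$1 max=$2
  echo $(( min + RANDOM % (max - min + 1) ))
}
```

For example, `rand_range 5 10` prints an integer between 5 and 10 inclusive.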
# Find out random unused TCP port 
findRandomTcpPort(){
 port=$(( 100+( $(od -An -N2 -i /dev/random) )%(1023+1) ))
 while :
 do
  (echo >/dev/tcp/localhost/$port) &>/dev/null &&  port=$(( 100+( $(od -An -N2 -i /dev/random) )%(1023+1) )) || break
 done
 echo "$port"
}

Using /dev/urandom or /dev/random

The character special files /dev/random and /dev/urandom provide an interface to the kernel’s random number generator.
http://unix.stackexchange.com/questions/140750/generate-random-numbers-in-specific-range
The $RANDOM variable is normally not a good way to generate good random values. The output of /dev/[u]random also needs to be converted first.
An easier way is to use higher level languages, like e.g. python:
To generate a random integer N between 5 and 10 (5<=N<=10), use:
python3 -c "import random; print(random.randint(5,10))"
Do not use this for cryptographic applications.
https://www.eduonix.com/blog/shell-scripting/generating-random-numbers-in-linux-shell-scripting/
shuf -i MIN-MAX -n COUNT
Where:
MIN and MAX are the lower and upper limits of the range of numbers, respectively.
COUNT is the number of lines (random numbers) to display.
For example, to print 10 random numbers between 0 and 1000:
shuf -i 0-1000 -n 10

od -An -N1 -i /dev/random

# for i in `seq 5`
> do
> od -An -N1 -i /dev/random
> done
Similarly, to get 2 bytes (0 – 65535):
od -An -N2 -i /dev/random
To get a larger number, increase the number of bytes to convert (the number following the -N option).
http://www.thegeekstuff.com/2012/08/od-command/
od command in Linux is used to output the contents of a file in different formats with the octal format being the default.
This command is especially useful when debugging Linux scripts for unwanted changes or characters.
1. Display contents of file in octal format using -b option
od -b input
2. Display contents of file in character format using -c option
$ od -c input
3. Display the byte offsets in different formats using -A option

The byte offset can be displayed in any of the following formats :

Hexadecimal (using -x along with -A)
Octal (using -o along with -A)
Decimal (using -d along with -A)

4. Display no offset information using ‘-An’ option
$ od -An -c input
12. Accept input from the command line using -
$ od -c -
13. Display hidden characters using od command
$ od -c input

http://google.github.io/styleguide/shell.xml
Bash is the only shell scripting language permitted for executables.

Executables must start with #!/bin/bash and a minimum number of flags. Use set to set shell options so that calling your script as bash <script_name> does not break its functionality.
Restricting all executable shell scripts to bash gives us a consistent shell language that's installed on all our machines.

Shell should only be used for small utilities or simple wrapper scripts.

While shell scripting isn't a development language, it is used for writing various utility scripts throughout Google. This style guide is more a recognition of its use rather than a suggestion that it be used for widespread deployment.
Some guidelines:
  • If you're mostly calling other utilities and are doing relatively little data manipulation, shell is an acceptable choice for the task.
  • If performance matters, use something other than shell.
  • If you find you need to use arrays for anything more than assignment of ${PIPESTATUS}, you should use Python.
  • If you are writing a script that is more than 100 lines long, you should probably be writing it in Python instead. Bear in mind that scripts grow. Rewrite your script in another language early to avoid a time-consuming rewrite at a later date.
SUID and SGID are forbidden on shell scripts.

There are too many security issues with shell that make it nearly impossible to secure sufficiently to allow SUID/SGID.
Use sudo to provide elevated access if you need it.

All error messages should go to STDERR.
This makes it easier to separate normal status from actual issues.
A function to print out error messages along with other status information is recommended.
err() {
  echo "[$(date +'%Y-%m-%dT%H:%M:%S%z')]: $@" >&2
}

if ! do_something; then
  err "Unable to do_something"
  exit "${E_DID_NOTHING}"
fi


If a pipeline all fits on one line, it should be on one line.
If not, it should be split at one pipe segment per line with the pipe on the newline and a 2 space indent for the next section of the pipe. This applies to a chain of commands combined using '|' as well as to logical compounds using '||' and '&&'.
# All fits on one line
command1 | command2

# Long commands
command1 \
  | command2 \
  | command3 \
  | command4

Put ; do and ; then on the same line as the while, for, or if.

Loops in shell are a bit different, but we follow the same principles as with braces when declaring functions. That is: ; then and ; do should be on the same line as the if/for/while. else should be on its own line and closing statements should be on their own line vertically aligned with the opening statement.
Example:
for dir in ${dirs_to_cleanup}; do
  if [[ -d "${dir}/${ORACLE_SID}" ]]; then
    log_date "Cleaning up old files in ${dir}/${ORACLE_SID}"
    rm "${dir}/${ORACLE_SID}/"*
    if [[ "$?" -ne 0 ]]; then
      error_message
    fi
  else
    mkdir -p "${dir}/${ORACLE_SID}"
    if [[ "$?" -ne 0 ]]; then
      error_message
    fi
  fi
done

  • Indent alternatives by 2 spaces.
  • A one-line alternative needs a space after the close parenthesis of the pattern and before the ;;.
  • Long or multi-command alternatives should be split over multiple lines with the pattern, actions, and ;; on separate lines.

The matching expressions are indented one level from the 'case' and 'esac'. Multiline actions are indented another level. In general, there is no need to quote match expressions. Pattern expressions should not be preceded by an open parenthesis. Avoid the ;& and ;;& notations.
case "${expression}" in
  a)
    variable="..."
    some_command "${variable}" "${other_expr}" ...
    ;;
  absolute)
    actions="relative"
    another_command "${actions}" "${other_expr}" ...
    ;;
  *)
    error "Unexpected expression '${expression}'"
    ;;
esac
Simple commands may be put on the same line as the pattern and ;; as long as the expression remains readable. This is often appropriate for single-letter option processing. When the actions don't fit on a single line, put the pattern on a line on its own, then the actions, then ;; also on a line of its own. When on the same line as the actions, use a space after the close parenthesis of the pattern and another before the ;;.
verbose='false'
aflag=''
bflag=''
files=''
while getopts 'abf:v' flag; do
  case "${flag}" in
    a) aflag='true' ;;
    b) bflag='true' ;;
    f) files="${OPTARG}" ;;
    v) verbose='true' ;;
    *) error "Unexpected option ${flag}" ;;
  esac
done

In order of precedence: Stay consistent with what you find; quote your variables; prefer "${var}" over "$var", but see details.
Don't brace-quote single character shell specials / positional parameters, unless strictly necessary or avoiding deep confusion.
Prefer brace-quoting all other variables.
# Section of recommended cases.

# Preferred style for 'special' variables:
echo "Positional: $1" "$5" "$3"
echo "Specials: !=$!, -=$-, _=$_. ?=$?, #=$# *=$* @=$@ \$=$$ ..."

# Braces necessary:
echo "many parameters: ${10}"

# Braces avoiding confusion:
# Output is "a0b0c0"
set -- a b c
echo "${1}0${2}0${3}0"

# Preferred style for other variables:
echo "PATH=${PATH}, PWD=${PWD}, mine=${some_var}"
while read f; do
  echo "file=${f}"
done < <(ls -l /tmp)

# Section of discouraged cases

# Unquoted vars, unbraced vars, brace-quoted single letter
# shell specials.
echo a=$avar "b=$bvar" "PID=${$}" "${1}"

# Confusing use: this is expanded as "${1}0${2}0${3}0",
# not "${10}${20}${30}".
set -- a b c
echo "$10$20$30"


  • Always quote strings containing variables, command substitutions, spaces or shell meta characters, unless careful unquoted expansion is required.
  • Prefer quoting strings that are "words" (as opposed to command options or path names).
  • Never quote literal integers.
  • Be aware of the quoting rules for pattern matches in [[.
  • Use "$@" unless you have a specific reason to use $*.

# 'Single' quotes indicate that no substitution is desired.
# "Double" quotes indicate that substitution is required/tolerated.
# Simple examples
# "quote command substitutions"
flag="$(some_command and its args "$@" 'quoted separately')"

# "quote variables"
echo "${flag}"

# "never quote literal integers"
value=32
# "quote command substitutions", even when you expect integers
number="$(generate_number)"

# "prefer quoting words", not compulsory
readonly USE_INTEGER='true'

# "quote shell meta characters"
echo 'Hello stranger, and well met. Earn lots of $$$'
echo "Process $$: Done making \$\$\$."

# "command options or path names"
# ($1 is assumed to contain a value here)
grep -li Hugo /dev/null "$1"

# Less simple examples
# "quote variables, unless proven false": ccs might be empty
git send-email --to "${reviewers}" ${ccs:+"--cc" "${ccs}"}

# Positional parameter precautions: $1 might be unset
# Single quotes leave regex as-is.
grep -cP '([Ss]pecial|\|?characters*)$' ${1:+"$1"}

# For passing on arguments,
# "$@" is right almost everytime, and
# $* is wrong almost everytime:
#
# * $* and $@ will split on spaces, clobbering up arguments
#   that contain spaces and dropping empty strings;
# * "$@" will retain arguments as-is, so no args
#   provided will result in no args being passed on;
#   This is in most cases what you want to use for passing
#   on arguments.
# * "$*" expands to one argument, with all args joined
#   by (usually) spaces,
#   so no args provided will result in one empty string
#   being passed on.
# (Consult 'man bash' for the nit-grits ;-)

(set -- 1 "2 two" "3 three tres"; echo $#; set -- "$*"; echo "$#, $@")
(set -- 1 "2 two" "3 three tres"; echo $#; set -- "$@"; echo "$#, $@")

Use $(command) instead of backticks.
Nested backticks require escaping the inner ones with \. The $(command) format doesn't change when nested and is easier to read.
Example:
# This is preferred:
var="$(command "$(command1)")"

# This is not:
var="`command \`command1\``"
[[ ... ]] is preferred over [, test, and /usr/bin/[.

[[ ... ]] reduces errors as no pathname expansion or word splitting takes place between [[ and ]] and [[ ... ]] allows for regular expression matching where [ ... ] does not.
# This ensures the string on the left is made up of characters in the
# alnum character class followed by the string name.
# Note that the RHS should not be quoted here.
# For the gory details, see
# E14 at https://tiswww.case.edu/php/chet/bash/FAQ
if [[ "filename" =~ ^[[:alnum:]]+name ]]; then
  echo "Match"
fi
# This matches the exact pattern "f*" (Does not match in this case)
if [[ "filename" == "f*" ]]; then
  echo "Match"
fi

# This gives a "too many arguments" error as f* is expanded to the
# contents of the current directory
if [ "filename" == f* ]; then
  echo "Match"
fi
Use quotes rather than filler characters where possible.

Bash is smart enough to deal with an empty string in a test. So, given that the code is much easier to read, use tests for empty/non-empty strings rather than filler characters.
# Do this:
if [[ "${my_var}" = "some_string" ]]; then
  do_something
fi

# -z (string length is zero) and -n (string length is not zero) are
# preferred over testing for an empty string
if [[ -z "${my_var}" ]]; then
  do_something
fi

# This is OK (ensure quotes on the empty side), but not preferred:
if [[ "${my_var}" = "" ]]; then
  do_something
fi

# Not this:
if [[ "${my_var}X" = "some_stringX" ]]; then
  do_something
fi

To avoid confusion about what you're testing for, explicitly use -z or -n.
# Use this
if [[ -n "${my_var}" ]]; then
  do_something
fi

# Instead of this as errors can occur if ${my_var} expands to a test
# flag
if [[ "${my_var}" ]]; then
  do_something
fi


Use an explicit path when doing wildcard expansion of filenames.
As filenames can begin with a -, it's a lot safer to expand wildcards with ./* instead of *.
# Here's the contents of the directory:
# -f  -r  somedir  somefile

# This deletes almost everything in the directory by force
psa@bilby$ rm -v *
removed directory: `somedir'
removed `somefile'

# As opposed to:
psa@bilby$ rm -v ./*
removed `./-f'
removed `./-r'
rm: cannot remove `./somedir': Is a directory
removed `./somefile'

eval should be avoided.
Use process substitution or for loops in preference to piping to while. Variables modified in a while loop do not propagate to the parent because the loop's commands run in a subshell.

The implicit subshell in a pipe to while can make it difficult to track down bugs.
last_line='NULL'
your_command | while read line; do
  last_line="${line}"
done

# This will output 'NULL'
echo "${last_line}"
Use a for loop if you are confident that the input will not contain spaces or special characters (usually, this means not user input).
total=0
# Only do this if there are no spaces in return values.
for value in $(command); do
  total+="${value}"
done
Using process substitution allows redirecting output but puts the commands in an explicit subshell rather than the implicit subshell that bash creates for the while loop.
total=0
last_file=
while read count filename; do
  total+="${count}"
  last_file="${filename}"
done < <(your_command | uniq -c)

# This will output the second field of the last line of output from
# the command.
echo "Total = ${total}"
echo "Last one = ${last_file}"
Use while loops where it is not necessary to pass complex results to the parent shell - this is typically where some more complex "parsing" is required. Beware that simple examples are probably more easily done with a tool such as awk. This may also be useful where you specifically don't want to change the parent scope variables.
# Trivial implementation of awk expression:
#   awk '$3 == "nfs" { print $2 " maps to " $1 }' /proc/mounts
cat /proc/mounts | while read src dest type opts rest; do
  if [[ ${type} == "nfs" ]]; then
    echo "NFS ${dest} maps to ${src}"
  fi
done
Eval munges the input when used for assignment to variables and can set variables without making it possible to check what those variables were.
# What does this set?
# Did it succeed? In part or whole?
eval $(set_my_variables)

# What happens if one of the returned values has a space in it?
variable="$(eval some_function)"

Read-only Variables
Use readonly or declare -r to ensure they're read only.

As globals are widely used in shell, it's important to catch errors when working with them. When you declare a variable that is meant to be read-only, make this explicit.
zip_version="$(dpkg --status zip | grep Version: | cut -d ' ' -f 2)"
if [[ -z "${zip_version}" ]]; then
  error_message
else
  readonly zip_version
fi

Use Local Variables
Ensure that local variables are only seen inside a function and its children by using local when declaring them. This avoids polluting the global name space and inadvertently setting variables that may have significance outside the function.
Declaration and assignment must be separate statements when the assignment value is provided by a command substitution, as the 'local' builtin does not propagate the exit code from the command substitution.
my_func2() {
  local name="$1"

  # Separate lines for declaration and assignment:
  local my_var
  my_var="$(my_func)" || return

  # DO NOT do this: $? contains the exit code of 'local', not my_func
  local my_var="$(my_func)"
  [[ $? -eq 0 ]] || return

  ...
}

Always check return values and give informative return values.
For unpiped commands, use $? or check directly via an if statement to keep it simple.
Example:
if ! mv "${file_list}" "${dest_dir}/" ; then
  echo "Unable to move ${file_list} to ${dest_dir}" >&2
  exit "${E_BAD_MOVE}"
fi

# Or
mv "${file_list}" "${dest_dir}/"
if [[ "$?" -ne 0 ]]; then
  echo "Unable to move ${file_list} to ${dest_dir}" >&2
  exit "${E_BAD_MOVE}"
fi
Bash also has the PIPESTATUS variable that allows checking of the return code from all parts of a pipe. If it's only necessary to check success or failure of the whole pipe, then the following is acceptable:
tar -cf - ./* | ( cd "${dir}" && tar -xf - )
if [[ "${PIPESTATUS[0]}" -ne 0 || "${PIPESTATUS[1]}" -ne 0 ]]; then
  echo "Unable to tar files to ${dir}" >&2
fi
However, as PIPESTATUS will be overwritten as soon as you do any other command, if you need to act differently on errors based on where it happened in the pipe, you'll need to assign PIPESTATUS to another variable immediately after running the command (don't forget that [ is a command and will wipe out PIPESTATUS).
tar -cf - ./* | ( cd "${DIR}" && tar -xf - )
return_codes=(${PIPESTATUS[*]})
if [[ "${return_codes[0]}" -ne 0 ]]; then
  do_something
fi
if [[ "${return_codes[1]}" -ne 0 ]]; then
  do_something_else
fi
We prefer the use of builtins such as the Parameter Expansion functions in bash(1), as they are more robust and portable (especially when compared to things like sed).
Example:
# Prefer this:
addition=$((${X} + ${Y}))
substitution="${string/#foo/bar}"

# Instead of this:
addition="$(expr ${X} + ${Y})"
substitution="$(echo "${string}" | sed -e 's/^foo/bar/')"
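As a sketch of why the builtins are usually enough (the variable values below are made up for illustration), a few more parameter expansions that replace common sed/expr/basename pipelines:

```shell
string="foo.tar.gz"

# Strip the shortest match of ".*" from the end (remove one extension):
base="${string%.*}"          # foo.tar

# Strip the longest match of ".*" from the end (remove all extensions):
name="${string%%.*}"         # foo

# Strip the shortest match of "*." from the front instead:
ext="${string#*.}"           # tar.gz

# Length of the string:
len="${#string}"             # 10
```

All of these run in the shell itself, with no fork to an external command.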
https://bash.cyberciti.biz/guide/Readonly_command
readonly var
readonly var=value
readonly p=/tmp/toi.txt
# error
p=/tmp/newvale

Make function readonly

  • You need to use the -f option to make the corresponding function readonly; the syntax is:
readonly -f functionName
  • For example, write a function called hello() at a shell prompt, enter:
function hello() { echo "Hello world"; }
# invoke it 
hello
  • Make it readonly:
readonly -f hello
# invoke it 
hello
  • Now, try to update the hello(), enter:
function hello() { echo "Hello $1, let us be friends."; }
Sample outputs:
bash: hello: readonly function

Display all readonly variables

If no arguments are given, or if the -p option is used, the readonly builtin prints a list of all readonly names:
readonly
OR
readonly -p
readonly -f   # display all readonly functions
https://bash.cyberciti.biz/guide/Declare_command
  • Use the declare command to set variable and functions attributes.
  • A constant variable (also known as a readonly variable) is one whose value never changes after it is set.
  • The syntax is:
declare -r var
declare -r varName=value
  • An integer data type (variable) is any type of number without a fractional part.
  • The syntax is as follow to make variable have the integer attribute:
declare -i var
declare -i varName=value
  • For example, define y as an integer number:
declare -i y=10
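A quick sketch of what the integer attribute changes (bash-specific; the variable names are illustrative): with -i, plain assignment evaluates arithmetic on the right-hand side.

```shell
declare -i y=10

# With the integer attribute, plain assignment performs arithmetic:
y=y+5          # y is now 15, no $(( )) needed

# Without -i, the same assignment stores the literal string:
z=y+5
echo "$y $z"   # prints: 15 y+5
```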

http://kvz.io/blog/2013/11/21/bash-best-practices/
  1. Use long options (logger --priority vs logger -p). On the command line, abbreviations make sense for efficiency, but in reusable scripts a few extra keystrokes pay off in readability and save future trips to the man pages for you or your collaborators.
  2. Use set -o errexit (a.k.a. set -e) to make your script exit when a command fails.
  3. Then add || true to commands that you allow to fail.
  4. Use set -o nounset (a.k.a. set -u) to exit when your script tries to use undeclared variables.
  5. Use set -o xtrace (a.k.a set -x) to trace what gets executed. Useful for debugging.
  6. Use set -o pipefail in scripts to catch mysqldump fails in e.g. mysqldump |gzip. The exit status of the last command that threw a non-zero exit code is returned.
  7. #!/usr/bin/env bash is more portable than #!/bin/bash.
  8. Avoid using #!/usr/bin/env bash -e (vs set -e), because when someone runs your script as bash ./script.sh, the exit on error will be ignored.
  9. Surround your variables with {}. Otherwise bash will try to access the $ENVIRONMENT_app variable in /srv/$ENVIRONMENT_app, whereas you probably intended /srv/${ENVIRONMENT}_app.
  10. You don't need two equal signs when checking if [ "${NAME}" = "Kevin" ].
  11. Surround your variable with " in if [ "${NAME}" = "Kevin" ], because if $NAME isn't declared, bash will throw a syntax error (also see nounset).
  12. Use :- if you want to test variables that could be undeclared. For instance: if [ "${NAME:-}" = "Kevin" ] will set $NAME to be empty if it's not declared. You can also set it to noname like so if [ "${NAME:-noname}" = "Kevin" ]
  13. Set magic variables for current file, basename, and directory at the top of your script for convenience.
set -o errexit
set -o pipefail
set -o nounset
# set -o xtrace

# Set magic variables for current file & dir
__dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
__file="${__dir}/$(basename "${BASH_SOURCE[0]}")"
__base="$(basename "${__file}" .sh)"
__root="$(cd "$(dirname "${__dir}")" && pwd)" # <-- change this as it depends on your app

arg1="${1:-}"
https://github.com/thoughtbot/guides/tree/master/best-practices
Shell
Don't parse the output of ls. See here for details and alternatives.
Don't use cat to provide a file on stdin to a process that accepts file arguments itself.
Don't use echo with options, escapes, or variables (use printf for those cases).
Don't use a /bin/sh shebang unless you plan to test and run your script on at least: Actual Sh, Dash in POSIX-compatible mode (as it will be run on Debian), and Bash in POSIX-compatible mode (as it will be run on OSX).
Don't use any non-POSIX features when using a /bin/sh shebang.
If calling cd, have code to handle a failure to change directories.
If calling rm with a variable, ensure the variable is not empty.
Prefer "$@" over "$*" unless you know exactly what you're doing.
Prefer awk '/re/ { ... }' to grep re | awk '{ ... }'.
Prefer find -exec {} + to find -print0 | xargs -0.
Prefer for loops over while read loops.
Prefer grep -c to grep | wc -l.
Prefer mktemp over using $ to "uniquely" name a temporary file.
Prefer sed '/re/!d; s//.../' to grep re | sed 's/re/.../'.
Prefer sed 'cmd; cmd' to sed -e 'cmd' -e 'cmd'.
Prefer checking exit statuses over output in if statements (if grep -q ...;, not if [ -n "$(grep ...)" ];).
Prefer reading environment variables over process output ($TTY not $(tty), $PWD not $(pwd), etc).
Use $( ... ), not backticks for capturing command output.
Use $(( ... )), not expr for executing arithmetic expressions.
Use 1 and 0, not true and false to represent boolean variables.
Use find -print0 | xargs -0, not find | xargs.
Use quotes around every "$variable" and "$( ... )" expression unless you want them to be word-split and/or interpreted as globs.
Use the local keyword with function-scoped variables.
Identify common problems with shellcheck.
Bash
In addition to the Shell best practices above:
Prefer ${var,,} and ${var^^} over tr for changing case.
Prefer ${var//from/to} over sed for simple string replacements.
Prefer [[ over test or [.
Prefer process substitution over a pipe in while read loops.
Use (( or let, not $(( when you don't need the result
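A short sketch of the bash-only idioms just listed (requires bash 4+ for the case-conversion expansions; the variable names are illustrative):

```shell
var="Hello World"

lower="${var,,}"            # hello world
upper="${var^^}"            # HELLO WORLD
swapped="${var//o/0}"       # Hell0 W0rld  (in-shell replacement, no sed)

# [[ is safe with unquoted variables, unlike [ :
if [[ $lower == "hello world" ]]; then
  echo "match"
fi
```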
http://www.cyberciti.biz/faq/unix-linux-bash-read-comma-separated-cvsfile/
INPUT=data.csv
OLDIFS=$IFS
IFS=,
[ ! -f "$INPUT" ] && { echo "$INPUT file not found"; exit 99; }
while read flname dob ssn tel status
do
 echo "Name : $flname"
 echo "DOB : $dob"
 echo "SSN : $ssn"
 echo "Telephone : $tel"
 echo "Status : $status"
done < "$INPUT"
IFS=$OLDIFS

http://ccm.net/faq/1757-how-to-read-a-file-line-by-line
while read line
do
    echo -e "$line\n"
done < file.txt
Bash continuation lines
$ echo "continuation""lines"
continuationlines
So a continuation line without an indent is one way to break up a string:
$ echo "continuation"\
> "lines"
continuationlines
But when an indent is used:
$       echo "continuation"\
>       "lines"
continuation lines

http://tldp.org/LDP/abs/html/here-docs.html
A here document is a special-purpose code block. It uses a form of I/O redirection to feed a command list to an interactive program or a command, such as ftp, cat, or the ex text editor.

COMMAND <<InputComesFromHERE
...
...
...
InputComesFromHERE
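One detail worth a sketch: with an unquoted delimiter, the here-document body undergoes variable expansion, while quoting the delimiter (<<'EOF') takes the body literally. (The <<- form additionally strips leading tab characters so the body can be indented.)

```shell
name="world"

# Unquoted delimiter: variables in the body are expanded.
cat <<EOF
hello $name
EOF

# Quoted delimiter: the body is passed through literally.
cat <<'EOF'
hello $name
EOF
```

The first cat prints "hello world"; the second prints the literal text "hello $name".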
http://stackoverflow.com/questions/1167746/how-to-assign-a-heredoc-value-to-a-variable-in-bash
VAR=<<END
abc
END
doesn't work because you are redirecting stdin to something that doesn't care about it, namely the assignment
you should really avoid using backticks, it's better to use the command substitution notation $(..).
THIS WORKS:
export A=$(cat <<END
sdfsdf
sdfsdf
sdfsfds
END
) ; echo $A
If you don't quote the variable when you echo it, newlines are lost. Quoting it preserves them:
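A minimal sketch of that quoting difference:

```shell
A=$(cat <<END
line1
line2
END
)

echo $A      # unquoted: word-split, prints "line1 line2" on one line
echo "$A"    # quoted: the embedded newline is preserved
```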
http://wiki.bash-hackers.org/howto/mutex
https://blog.famzah.net/2013/07/31/using-flock-in-bash-without-invoking-a-subshell/
Concurrent programming in bash, and flock
In shell programming, scenarios that need real concurrency are rare. What we often do want is to keep a script from running more than once at the same time. For example, with a periodic job in crond, if one run takes longer than the scheduling interval, the next run may start before the previous one has finished; in that case we want the new run to skip execution.
#!/bin/bash

countfile=/tmp/count

if ! [ -f $countfile ]
then
    echo 0 > $countfile
fi

do_count () {
    read count < $countfile
    echo $((++count)) > $countfile
}

for i in `seq 1 100`
do
     do_count &
done

wait

cat $countfile

rm $countfile
[zorro@zorrozou-pc0 bash]$ cat repeat.sh
#!/bin/bash

exec 3> /tmp/.lock

if ! flock -xn 3
then
    echo "already running!"
    exit 1
fi

echo "running!"
sleep 30
echo "ending"

flock -u 3
exec 3>&-
rm /tmp/.lock

exit 0
The -n option lets flock probe a file for an existing lock without blocking, so an exclusive lock can guarantee that only one instance of the script runs. The lock is released when the script exits, so the explicit flock -u unlock is not strictly required here. Besides locking a file descriptor (as with -u and -x above), flock can also be used as a prefix to the command it runs; this form is well suited to preventing a script scheduled in crond from overlapping itself. For example:
*/1 * * * * /usr/bin/flock -xn /tmp/script.lock -c '/home/bash/script.sh'
#!/bin/bash

countfile=/tmp/count

if ! [ -f $countfile ]
then
    echo 0 > $countfile
fi

do_count () {
    exec 3< $countfile
    # Take an exclusive lock on descriptor 3
    flock -x 3
    read -u 3 count
    echo $((++count)) > $countfile
    # Unlock
    flock -u 3
    # Closing the descriptor also releases the lock
    exec 3>&-
}

for i in `seq 1 100`
do
    do_count &
done

wait

cat $countfile
rm $countfile
http://stackoverflow.com/questions/14066992/what-does-minus-mean-in-exec-3-and-how-do-i-use-it
  1. 3>&- means that file descriptor 3, opened for writing(same as stdout), is closed.
The 3>&- close the file descriptor number 3 (it probably has been opened before with 3>filename).
There are always three default files [1] open, stdin (the keyboard), stdout (the screen), and stderr (error messages output to the screen). These, and any other open files, can be redirected. Redirection simply means capturing output from a file, command, program, script, or even code block within a script (see Example 3-1 and Example 3-2) and sending it as input to another file, command, program, or script.
Each open file gets assigned a file descriptor. [2] The file descriptors for stdin, stdout, and stderr are 0, 1, and 2, respectively.
   : > filename
      # The > truncates file "filename" to zero length.
      # If file not present, creates zero-length file (same effect as 'touch').
      # The : serves as a dummy placeholder, producing no output.

   > filename    
      # The > truncates file "filename" to zero length.
      # If file not present, creates zero-length file (same effect as 'touch').
      # (Same result as ": >", above, but this does not work with some shells.)
   &>filename
      # Redirect both stdout and stderr to file "filename."
      # This operator is now functional, as of Bash 4, final release.
   2>&1
      # Redirects stderr to stdout.
      # Error messages get sent to same place as standard output.
        >>filename 2>&1
            bad_command >>filename 2>&1
            # Appends both stdout and stderr to the file "filename" ...
        2>&1 | [command(s)]
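One subtlety worth a sketch: redirections are processed left to right, so `cmd >file 2>&1` and `cmd 2>&1 >file` behave differently. The `msg` helper and file names below are made up for illustration:

```shell
# A helper that writes one line to stdout and one to stderr:
msg() { echo out; echo err >&2; }

# ">file 2>&1": stdout is redirected first, then stderr is duplicated
# onto it, so both lines land in the file.
msg >/tmp/demo_both.$$ 2>&1

# "2>&1 >file": stderr is duplicated onto the *current* stdout (the
# terminal) before stdout is redirected, so only "out" lands in the file.
msg 2>&1 >/tmp/demo_one.$$

wc -l </tmp/demo_both.$$   # counts 2 lines
wc -l </tmp/demo_one.$$    # counts 1 line
rm -f /tmp/demo_both.$$ /tmp/demo_one.$$
```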

Closing File Descriptors
n<&-
Close input file descriptor n.
0<&-<&-
Close stdin.
n>&-
Close output file descriptor n.
1>&->&-
Close stdout.
Child processes inherit open file descriptors. This is why pipes work. To prevent an fd from being inherited, close it.


# Redirecting only stderr to a pipe.

exec 3>&1                              # Save current "value" of stdout.
ls -l 2>&1 >&3 3>&- | grep bad 3>&-    # Close fd 3 for 'grep' (but not 'ls').
#              ^^^^   ^^^^
exec 3>&-                              # Now close it for the remainder of the script.


http://www.tldp.org/LDP/abs/html/x17974.html


An exec <filename command redirects stdin to a file. From that point on, all stdin comes from that file, rather than its normal source (usually keyboard input). This provides a method of reading a file line by line and possibly parsing each line of input using sed and/or awk.

exec 6<&0          # Link file descriptor #6 with stdin.
                   # Saves stdin.

exec < data-file   # stdin replaced by file "data-file"

read a1            # Reads first line of file "data-file".
read a2            # Reads second line of file "data-file."

echo "Following lines read from file."
echo $a1
echo $a2
exec 0<&6 6<&-
#  Now restore stdin from fd #6, where it had been saved,
#+ and close fd #6 ( 6<&- ) to free it for other processes to use.
#
# <&6 6<&-    also works.

an exec >filename command redirects stdout to a designated file. This sends all command output that would normally go to stdout to that file.

Important
exec N > filename affects the entire script or current shell: from that point on, redirection for the script's or shell's PID is changed. However . . .
N > filename affects only the newly-forked process, not the entire script or shell.

#!/bin/bash
# reassign-stdout.sh

LOGFILE=logfile.txt

exec 6>&1           # Link file descriptor #6 with stdout.
                    # Saves stdout.

exec > $LOGFILE     # stdout replaced with file "logfile.txt".
# All output from commands in this block sent to file $LOGFILE.
exec 1>&6 6>&-      # Restore stdout and close file descriptor #6.

http://www.tldp.org/LDP/abs/html/redircb.html
while [ "$name" != Smith ]  # Why is variable $name in quotes?
do
  read name                 # Reads from $Filename, rather than stdin.
  echo $name
  let "count += 1"
done <"$Filename"           # Redirects stdin to file $Filename. 
Example 20-6. Alternate form of redirected while loop
exec 3<&0                 # Save stdin to file descriptor 3.
exec 0<"$Filename"        # Redirect standard input.

count=0
echo


while [ "$name" != Smith ]
do
  read name               # Reads from redirected stdin ($Filename).
  echo $name
  let "count += 1"
done                      #  Loop reads from file $Filename
                          #+ because of line 20.

#  The original version of this script terminated the "while" loop with
#+      done <"$Filename" 
#  Exercise:
#  Why is this unnecessary?

exec 0<&3                 # Restore old stdin.
exec 3<&-                 # Close temporary fd 3.
Here documents are a special case of redirected code blocks. That being the case, it should be possible to feed the output of a here document into the stdin for a while loop.
function doesOutput()
 # Could be an external command too, of course.
 # Here we show you can use a function as well.
{
  ls -al *.jpg | awk '{print $5,$9}'
}


nr=0          #  We want the while loop to be able to manipulate these and
totalSize=0   #+ to be able to see the changes after the 'while' finished.

while read fileSize fileName ; do
  echo "$fileName is $fileSize bytes"
  let nr++
  totalSize=$((totalSize+fileSize))   # Or: "let totalSize+=fileSize"
done<<EOF
$(doesOutput)
EOF
http://pisadmin.welaika.com/post/40182393097/bash-heredoc-and-variables-a-memorandum
http://www.techug.com/bash-practice
Use consistent indentation
Provide useful information
Use comments appropriately
Return an error code when exiting on failure
Replace repeated sets of commands with functions
function lower()
{
    local str="$@"
    local output
    output=$(tr '[A-Z]' '[a-z]'<<<"${str}")
    echo "$output"
}
Give variables meaningful names
Check that arguments have the correct type
Report errors for missing arguments or arguments given in the wrong order
if ! [ "$1" -eq "$1" 2> /dev/null ]
then
  echo "ERROR: $1 is not a number!"
  exit 1
fi
Check that required files actually exist
if [ ! -f "$1" ]; then
    echo "$1 -- no such file"
fi
Send unwanted output to /dev/null

Sending command output to /dev/null and telling the user what went wrong in a friendlier way makes your script easier on its users.
if [ "$1" = "help" ]; then
    echo "Sorry -- No help available for $0"
else
    which "$1" >/dev/null 2>&1
    if [ $? != 0 ]; then
        echo "$1: No such command -- maybe misspelled or not on your search path"
        exit 2
    else
        cmd=$(basename "$1")
        whatis $cmd
    fi
fi
Use error codes
Give informative messages
Quote all parameter expansions
If you use expansions in your script, don't forget to quote them so you don't get unexpected results.

Use $@ to refer to all arguments
The $@ variable lists all arguments supplied to the script and is very easy to use:
for i in "$@"
do
    echo "$i"
done
Sleep Sort orders numbers by having multiple concurrent tasks sleep for different durations; it is trivial to implement as a shell script.
#! /bin/bash

function func() {
    sleep "$1"
    echo "$1"
}

while [ -n "$1" ]
do
    func "$1" &
    shift
done
wait
Obviously, its time complexity depends on the data being sorted. (In the vast majority of situations, it is not a practical sorting algorithm.)
http://www.devshed.com/c/a/braindump/more-amazing-things-to-do-with-pipelines/
tr -cs "A-Za-z'" '\n' |   # Replace nonletters with newlines
tr A-Z a-z |              # Map uppercase to lowercase
sort |                    # Sort the words in ascending order
uniq -c |                 # Eliminate duplicates, showing their counts
sort -k1,1nr -k2 |        # Sort by descending count, then by ascending word
sed ${1:-25}q             # Print only the first n (default: 25) lines

Starting with the magic #!

Nearly every script begins the same way: with a #! line that declares its interpreter. What distinguishes scripting languages from compiled ones is that scripts need an interpreter. A script is plain text, while the files a system can actually execute are compiled binaries. Text is convenient for people to read, but the operating system cannot run it directly, so its contents must be handed to an executable binary that parses the script and performs the behavior it describes. Any binary capable of such parse-and-execute work can be called an interpreter.
The #! at the top of a script declares which interpreter the file's text is handed to. For a bash script the usual declaration is #!/bin/bash or #!/bin/sh; for a Python script, #!/usr/bin/python. Of course, the interpreter may be installed at different paths on different systems, so hard-coding one path can make a script non-portable. Hence this form:
#!/usr/bin/env interpreter-name
This exploits env's ability to locate an executable on the current system, letting the script find its interpreter wherever it lives and making the script more portable. But consider: most scripting languages treat everything after # as a comment with no effect. How does #! avoid conflicting with that rule?
The answer lies in who acts on #!: the operating system's program loader. On Linux, every process other than PID 1 can be considered a fork of its parent, so for bash, "loading a script" is simply a parent process calling fork() and exec() to create a child; the #! is parsed by the kernel while it handles exec. The call chain in the kernel (Linux 4.4) is as follows: the exec family is implemented mainly in do_execveat_common() in fs/exec.c, which calls exec_binprm() to handle execution; that function uses search_binary_handler() to test the file being loaded against the known binary formats, of which script is one. Once the file is recognized as a script, the format's load_binary method, load_script(), is invoked, and that is where #! is parsed. The kernel takes the interpreter path following #! and passes it back to search_binary_handler() for re-parsing, ultimately finding a real executable binary to run.
So parsing the #! on the script's first line is a magic trick the kernel performs for us; when the path after #! takes effect, the script has not yet been handed to any interpreter. Many people believe the interpreter parses the #! line, but it does not. Knowing the mechanism also explains why #! must occupy the first two characters of the file: the check is hard-coded in the kernel, and it looks only at those two characters. Once the kernel has chosen the interpreter, all further work belongs to the interpreter, and the script's entire contents, including the #! line, are passed to it unchanged. Since strings starting with # are comments to the interpreter and have no effect, the interpreter is oblivious to everything after #! and simply goes on interpreting the strings that are meaningful to it.
We can observe this with a self-displaying script. What is that? Just #!/bin/cat: the file's whole contents, including the #! line, are handed to cat for display:
[zorro@zorrozou-pc0 bash]$ cat cat.sh 
#!/bin/cat

echo "hello world!"
[zorro@zorrozou-pc0 bash]$ ./cat.sh 
#!/bin/cat

echo "hello world!"
Or a self-deleting script:
[zorro@zorrozou-pc0 bash]$ cat rm.sh 
#!/bin/rm

echo "hello world!"
[zorro@zorrozou-pc0 bash]$ chmod +x rm.sh 
[zorro@zorrozou-pc0 bash]$ ./rm.sh 
[zorro@zorrozou-pc0 bash]$ cat rm.sh
cat: rm.sh: No such file or directory
That is the essence of #!.

How does bash execute shell commands?

We have just seen, via the workings of #!, how a bash script is loaded: with #!/bin/bash, the kernel starts a bash process for us and hands it the script's contents to parse and execute. In fact, bash parses text in much the same way whether it comes from a script or from the command line. First, bash splits the text into segments on certain special characters. The most important separator is the newline; the semicolon ";" serves a similar purpose, so in a bash script either a newline or a semicolon marks the end of a command line. This is essentially the first level of parsing, whose main purpose is to break a large block of text into lines.
Next comes the second level of parsing, which picks out the commands to execute. It mainly handles characters that connect commands, such as the pipe "|" and && and ||. After this level, bash has the individual commands it needs to run.
Each command then goes through a third level of parsing, which separates the command from its arguments by splitting on spaces and tabs. Only after this does bash begin interpreting the resulting strings. For the vast majority of them, bash forks and hands the string to exec for execution, then waits for completion before parsing the next line. This is why bash scripts are also called batch scripts: instructions execute serially, each finishing before the next begins. (The real interpretation process is more complex than this outline.)
To make some operations convenient and others faster, bash adds quite a few features on top of this, including aliases and a hash of external command paths. Some functionality cannot be implemented as an external command, so bash provides builtins such as cd and pwd, as well as keywords for its programming constructs, such as if and while. As a programming language, bash also implements functions; a bash function packages a group of commands under a single name, so that invoking the name runs the pre-packaged commands.
Now a question: there is a builtin called cd. If we also create an alias named cd, then when we type cd and press Enter, does bash interpret it as the builtin or as the alias? Likewise, what if cd is also an external command? Or a hash entry? Or a keyword or a function?
Bash defines, in advance, the order in which conflicting names are resolved. The precedence is:
  1. alias
  2. keyword
  3. function
  4. builtin
  5. hash
  6. external command
You can check how bash will classify any such string with the type command, e.g.:
[zorro@zorrozou-pc0 bash]$ type egrep
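For names whose classification is stable across systems, `type -t` prints just the category (egrep itself may resolve to an alias, a hashed path, or an external command depending on your setup, which is why it is not used below):

```shell
type -t cd        # builtin
type -t if        # keyword

# Define a function and check its classification:
myfunc() { :; }
type -t myfunc    # function
```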
Exit codes fall into two ranges:
  1. Program exit codes: 1-127. These are generally available for programmers to use as error exit codes of their own choosing; for example, ls returns 2 if a file does not exist, and bash uniformly returns 127 if the command to execute cannot be found. Codes 126 and 127 have special uses: 127 is the code for a command that does not exist, and 126 for a command whose file exists but is not executable.
  2. Signal-interruption exit codes: 128-255. By convention this range indicates that the process was interrupted by a signal; a process killed by a signal generally exits with 128 plus the signal number.
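The 128+signal convention can be sketched directly: SIGTERM is signal 15, so a process killed with it exits with status 143.

```shell
# Start a background process, kill it with SIGTERM, and inspect its status:
sleep 10 &
pid=$!
kill -TERM "$pid"
wait "$pid"
code=$?
echo "$code"    # 143 = 128 + 15 (SIGTERM)
```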
