Massive Technical Interviews Tips: Linux Grep

https://unix.stackexchange.com/questions/13466/can-grep-output-only-specified-groupings-that-match

GNU grep has the -P option for perl-style regexes, and the -o option to print only what matches the pattern. These can be combined using look-around assertions (described under Extended Patterns in the perlre manpage) to remove part of the grep pattern from what is determined to have matched for the purposes of -o.

$ grep -oP 'foobar \K\w+' test.txt



The \K is the short-form (and more efficient form) of (?<=pattern) which you use as a zero-width look-behind assertion before the text you want to output. (?=pattern) can be used as a zero-width look-ahead assertion after the text you want to output.

For instance, if you wanted to match the word between foo and bar, you could use:
$ grep -oP 'foo \K\w+(?= bar)' test.txt


or (for symmetry)
$ grep -oP '(?<=foo )\w+(?= bar)' test.txt
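As a minimal sketch of the two forms above, the snippet below builds a throwaway file and runs both patterns against it. The sample lines and variable names are made up for illustration, and `-P` assumes a GNU grep built with PCRE support.

```shell
# Hypothetical sample data; the contents are illustrative only.
tmp=$(mktemp)
printf '%s\n' 'foobar bash 1' 'bash' 'foo middle bar' > "$tmp"

# \K drops "foobar " from the reported match, so only the following word is printed.
after_foobar=$(grep -oP 'foobar \K\w+' "$tmp")
echo "$after_foobar"

# Look-behind plus look-ahead: print only the word between "foo " and " bar".
between=$(grep -oP '(?<=foo )\w+(?= bar)' "$tmp")
echo "$between"

rm -f "$tmp"
```

On grep builds without `-P` (for example many BSD variants), a sed or awk substitution would be needed instead.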

Well, if you know that foobar is always the first word of the line, then you can use cut. Like so:

grep "foobar" test.file | cut -d" " -f2

http://man7.org/linux/man-pages/man1/find.1.html
-exec command ;
Execute command; true if 0 status is returned. All following
arguments to find are taken to be arguments to the command
until an argument consisting of `;' is encountered. The
string `{}' is replaced by the current file name being
processed everywhere it occurs in the arguments to the
command, not just in arguments where it is alone, as in some
versions of find. Both of these constructions might need to
be escaped (with a `\') or quoted to protect them from
expansion by the shell. See the EXAMPLES section for examples
of the use of the -exec option. The specified command is run
once for each matched file. The command is executed in the
starting directory. There are unavoidable security problems
surrounding use of the -exec action; you should use the
-execdir option instead.

-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of
invocations of the command will be much less than the number
of matched files. The command line is built in much the same
way that xargs builds its command lines. Only one instance of
`{}' is allowed within the command, and (when find is being
invoked from a shell) it should be quoted (for example, '{}')
to protect it from interpretation by shells. The command is
executed in the starting directory. If any invocation returns
a non-zero value as exit status, then find returns a non-zero
exit status. If find encounters an error, this can sometimes
cause an immediate exit, so some pending commands may not be
run at all. This variant of -exec always returns true.
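The difference between the two variants can be sketched by counting how often the command runs. The scratch directory, the file names, and the `echo run` command are assumptions made for this illustration; with only three files the `+` form is expected to need a single invocation.

```shell
# Hypothetical scratch directory with three files.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt" "$dir/c.txt"

# "-exec ... ;" runs the command once per matched file, so three lines are printed.
per_file=$(find "$dir" -type f -name '*.txt' -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# "-exec ... {} +" appends the file names to one command line, so one line is printed here.
batched=$(find "$dir" -type f -name '*.txt' -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

echo "per-file invocations: $per_file"
echo "batched invocations: $batched"

rm -rf "$dir"
```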

-s, --no-messages
Silent mode. Nonexistent and unreadable files are ignored (i.e. their error messages are suppressed).
https://en.wikibooks.org/wiki/Grep

-e pattern: Use pattern as the search pattern; needed when the pattern begins with `-`.
-i: Ignore uppercase vs. lowercase.
-v: Invert match.
-c: Output count of matching lines only.
-l: Output matching files only.
-n: Precede each matching line with a line number.
-b: A historical curiosity: precede each matching line with a block number. (GNU grep prints the byte offset of each line instead.)

[Find multiple patterns across multiple files](https://stackoverflow.com/questions/7422743/grep-for-multiple-patterns-over-multiple-files)
- find . | xargs grep 'pattern1' -sl | xargs grep 'pattern2' -sl
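A small sketch of how that pipeline narrows the file list, using a scratch directory with invented file names and contents: the first grep lists files containing pattern1, and the second keeps only those that also contain pattern2.

```shell
# Hypothetical test files: only both.txt contains both patterns.
dir=$(mktemp -d)
printf 'pattern1\npattern2\n' > "$dir/both.txt"
printf 'pattern1 only\n' > "$dir/one.txt"
printf 'pattern2 only\n' > "$dir/two.txt"

# -l prints file names only; -s hides errors about unreadable or missing files.
matches=$(find "$dir" -type f | xargs grep -sl 'pattern1' | xargs grep -sl 'pattern2')
echo "$matches"

rm -rf "$dir"
```

File names containing whitespace would break this form; `find ... -print0 | xargs -0 ...` is the usual workaround where both tools support it.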
http://stackoverflow.com/questions/6637882/how-can-i-use-grep-to-show-just-filenames-no-in-line-matches-on-linux

The standard option grep -l (that is a lowercase L) could do this.

http://ftp.gnu.org/old-gnu/Manuals/grep-2.4/html_node/grep_7.html

How can I list just the names of matching files?
```
grep -l 'main' *.c
```
lists the names of all C files in the current directory whose contents mention `main'.
How do I search directories recursively?
```
grep -r 'hello' /home/gigi
```
searches for `hello' in all files under the directory `/home/gigi'. For more control of which files are searched, use find, grep and xargs. For example, the following command searches only C files:
```
find /home/gigi -name '*.c' -print | xargs grep 'hello' /dev/null
```
What if a pattern has a leading `-'?
```
grep -e '--cut here--' *
```
searches for all lines matching `--cut here--'. Without `-e', grep would attempt to parse `--cut here--' as a list of options.
Suppose I want to search for a whole word, not a part of a word?
```
grep -w 'hello' *
```
searches only for instances of `hello' that are entire words; it does not match `Othello'. For more control, use `\<' and `\>' to match the start and end of words. For example:
```
grep 'hello\>' *
```
searches only for words ending in `hello', so it matches the word `Othello'.
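The two behaviours can be compared side by side in a short sketch. The sample words are invented, and `\>` is assumed to be available (it is in GNU grep).

```shell
# Hypothetical word list for comparing -w with the \> anchor.
tmp=$(mktemp)
printf '%s\n' 'hello world' 'Othello' 'hellos' > "$tmp"

# -w: "hello" must be a whole word, so only the first line matches.
whole_word=$(grep -w 'hello' "$tmp")
echo "$whole_word"

# \>: "hello" must end a word, so "Othello" matches too but "hellos" does not.
ends_with=$(grep 'hello\>' "$tmp")
echo "$ends_with"

rm -f "$tmp"
```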
How do I output context around the matching lines?
```
grep -C 2 'hello' *
```
prints two lines of context around each matching line.
How do I force grep to print the name of the file? Append `/dev/null':
```
grep 'eli' /etc/passwd /dev/null
```
Why do people use strange regular expressions on ps output?
```
ps -ef | grep '[c]ron'
```
If the pattern had been written without the square brackets, it would have matched not only the ps output line for cron, but also the ps output line for grep.
Why does grep report "Binary file matches"? If grep listed all matching "lines" from a binary file, it would probably generate output that is not useful, and it might even muck up your display. So GNU grep suppresses output from files that appear to be binary files. To force GNU grep to output lines even from files that appear to be binary, use the `-a' or `--text' option.
Why doesn't `grep -lv' print nonmatching file names? `grep -lv' lists the names of all files containing one or more lines that do not match. To list the names of all files that contain no matching lines, use the `-L' or `--files-without-match' option.
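To make the distinction concrete, here is a sketch with three invented C files: one where every line matches, one with a mix, and one with no match at all.

```shell
# Hypothetical files for comparing -lv with -L.
dir=$(mktemp -d)
printf 'main\n' > "$dir/all.c"           # every line matches
printf 'main\nother\n' > "$dir/mixed.c"  # one matching and one non-matching line
printf 'other\n' > "$dir/none.c"         # no matching line

# -lv: files with at least one line that does NOT match.
lv=$(cd "$dir" && grep -lv 'main' *.c)
echo "$lv"

# -L: files that contain no matching line at all.
no_match=$(cd "$dir" && grep -L 'main' *.c)
echo "$no_match"

rm -rf "$dir"
```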
I can do OR with `|', but what about AND?
```
grep 'paul' /etc/motd | grep 'franc,ois'
```
finds all lines that contain both `paul' and `franc,ois'.
How can I search in both standard input and in files? Use the special file name `-':
```
cat /etc/passwd | grep 'alain' - /etc/motd
```

http://miguelcamba.com/blog/2013/09/16/quick-tip-show-surrounding-lines-with-grep/
tail -f logs/production.log | grep "500 Internal Server Error"

-A NUM, --after-context=NUM
Print NUM lines of trailing context after matching lines. Places a line containing -- between contiguous groups of matches.
-a, --text
Process a binary file as if it were text; this is equivalent to the --binary-files=text option.
-B NUM, --before-context=NUM
Print NUM lines of leading context before matching lines. Places a line containing -- between contiguous groups of matches.
-C NUM, --context=NUM
Print NUM lines of output context. Places a line containing -- between contiguous groups of matches.
tail -f logs/production.log | grep "500 Internal Server Error" -B 2 -A 5
http://stackoverflow.com/questions/9081/grep-a-file-but-show-several-surrounding-lines

For BSD or GNU grep you can use -B num to set how many lines before the match and -A num for the number of lines after the match.

grep -B 3 -A 2 foo README.txt

If you want the same number of lines before and after you can use -C num.

grep -C 3 foo README.txt

This will show 3 lines before and 3 lines after.
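A quick way to see the context options at work is a seven-line scratch file with a single match in the middle. The file contents and the smaller context sizes are chosen just for this sketch.

```shell
# Hypothetical input: "foo" sits on line 4 of 7.
tmp=$(mktemp)
printf '%s\n' one two three foo five six seven > "$tmp"

# Two lines before and one line after the match.
asym=$(grep -B 2 -A 1 foo "$tmp")
echo "$asym"

# One line on each side of the match.
sym=$(grep -C 1 foo "$tmp")
echo "$sym"

rm -f "$tmp"
```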

http://stackoverflow.com/questions/3213748/get-line-number-while-using-grep

Line numbers are printed with grep -n:

grep -n pattern file.txt

To get only the line number (without the matching line), one may use cut:

grep -n pattern file.txt | cut -d : -f 1

Lines not containing a pattern are printed with grep -v:

grep -v pattern file.txt
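The three commands above can be tried together on a small invented file; the variable names are only for illustration.

```shell
# Hypothetical input with matches on lines 2 and 4.
tmp=$(mktemp)
printf '%s\n' alpha pattern beta 'another pattern' > "$tmp"

# Matching lines prefixed with their line numbers.
numbered=$(grep -n pattern "$tmp")
echo "$numbered"

# Keep only the part before the first colon, i.e. the line numbers.
only_numbers=$(grep -n pattern "$tmp" | cut -d : -f 1)
echo "$only_numbers"

# Lines that do not contain the pattern.
inverted=$(grep -v pattern "$tmp")
echo "$inverted"

rm -f "$tmp"
```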

http://www.cyberciti.biz/faq/unix-linux-grep-show-line-numbers-on-screen/

The `-n` or `--line-number` grep option

You can pass either -n or --line-number option to the grep command to prefix each line of output with the line number within its input file. The syntax is:

grep -n 'pattern' file
grep -n 'pattern' file1 file2
grep -n [options] 'pattern' file

http://www.gnu.org/software/grep/manual/html_node/Basic-vs-Extended.html

In basic regular expressions the meta-characters ‘?’, ‘+’, ‘{’, ‘|’, ‘(’, and ‘)’ lose their special meaning; instead use the backslashed versions ‘\?’, ‘\+’, ‘\{’, ‘\|’, ‘\(’, and ‘\)’.
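The following sketch contrasts the same `+` character in both dialects. The sample strings are invented, and the backslashed `\+` in a basic regular expression is assumed to be supported, as it is in GNU grep.

```shell
# Hypothetical input lines.
tmp=$(mktemp)
printf '%s\n' 'ab' 'aab' 'a+b' 'b' > "$tmp"

# Basic regex: a bare + is a literal plus sign, so only "a+b" matches.
bre_literal=$(grep 'a+b' "$tmp")
echo "$bre_literal"

# Basic regex with a backslash: one or more "a" followed by "b".
bre_repeat=$(grep 'a\+b' "$tmp")
echo "$bre_repeat"

# Extended regex: + is the repetition operator without a backslash.
ere_repeat=$(grep -E 'a+b' "$tmp")
echo "$ere_repeat"

rm -f "$tmp"
```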

Traditional egrep did not support the ‘{’ meta-character, and some egrep implementations support ‘\{’ instead, so portable scripts should avoid ‘{’ in ‘grep -E’ patterns and should use ‘[{]’ to match a literal ‘{’.

GNU grep -E attempts to support traditional usage by assuming that ‘{’ is not special if it would be the start of an invalid interval specification. For example, the command ‘grep -E '{1'’ searches for the two-character string ‘{1’ instead of reporting a syntax error in the regular expression. POSIX allows this behavior as an extension, but portable scripts should avoid it.

http://unix.stackexchange.com/questions/17949/what-is-the-difference-between-grep-egrep-and-fgrep

egrep is 100% equivalent to grep -E
fgrep is 100% equivalent to grep -F

Historically these switches were provided in separate binaries. On some really old Unix systems you will find that you need to call the separate binaries, but on all modern systems the switches are preferred. The man page for grep has details about this.

As for what they do, -E switches grep into a special mode so that the expression is evaluated as an ERE (Extended Regular Expression) as opposed to its normal pattern matching. Details of this syntax are on the man page.

-E, --extended-regexp
Interpret PATTERN as an extended regular expression

The -F switch switches grep into a different mode where it accepts a pattern to match, but then splits that pattern up into one search string per line and does an OR search on any of the strings without doing any special pattern matching.

-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched.

Here are some example scenarios:

You have a file with a list of say ten Unix usernames in plain text. You want to search the group file on your machine to see if any of the ten users listed are in any special groups:
```
grep -F -f user_list.txt /etc/group
```
The reason the -F switch helps here is that the usernames in your pattern file are interpreted as plain text strings. Dots for example would be interpreted as dots rather than wild-cards.
You want to search using a fancy expression. For example parenthesis () can be used to indicate groups with | used as an OR operator. You could run this search using -E:
```
grep -E '^no(fork|group)' /etc/group
```
...to return lines that start with either "nofork" or "nogroup". Without the -E switch you would have to escape the special characters involved, because in normal pattern matching the unescaped characters are searched for literally:
```
grep '^no\(fork\|group\)' /etc/group
```
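Both scenarios can be tried without touching the real /etc/group. The sketch below uses a stand-in group file and a one-line pattern file, both invented for illustration.

```shell
# Hypothetical stand-ins for /etc/group and user_list.txt.
groups=$(mktemp)
patterns=$(mktemp)
printf '%s\n' 'staff:x:20:j.doe' 'wheel:x:0:jxdoe' 'nogroup:x:65534:' > "$groups"
printf '%s\n' 'j.doe' > "$patterns"

# -F: the dot is a literal dot, so only the j.doe line matches.
fixed=$(grep -F -f "$patterns" "$groups")
echo "$fixed"

# Without -F the dot is a wildcard, so jxdoe matches as well.
regex=$(grep -f "$patterns" "$groups")
echo "$regex"

# The extended and the escaped basic forms select the same lines.
ere=$(grep -E '^no(fork|group)' "$groups")
bre=$(grep '^no\(fork\|group\)' "$groups")
echo "$ere"

rm -f "$groups" "$patterns"
```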

http://www.cs.columbia.edu/~tal/3261/fall07/handout/egrep_mini-tutorial.htm

http://stackoverflow.com/questions/2914197/how-to-grep-out-specific-line-ranges-of-a-file

The following command will do what you asked for "extract the lines between 1234 and 5555" in someFile.

sed -n '1234,5555p' someFile
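The same idea on a smaller scale, as a sketch: a ten-line scratch file (invented here) from which lines 3 through 5 are extracted.

```shell
# Hypothetical ten-line input file.
tmp=$(mktemp)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > "$tmp"

# -n suppresses automatic printing; '3,5p' prints only lines 3 to 5.
range=$(sed -n '3,5p' "$tmp")
echo "$range"

rm -f "$tmp"
```

For very large files, appending a quit command such as `sed -n '3,5p;5q'` stops reading once the last wanted line has been printed.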

http://www.thegeekstuff.com/2011/01/regular-expressions-in-grep-command/
Example 1. Beginning of line ( ^ )
$ grep "^Nov 10" messages.1

Example 2. End of the line ( $ )
$ grep "terminating.$" messages

Example 4. Single Character (.)
$ grep ".ello" input

Example 5. Zero or more occurrence (*)
$ grep "kernel: *." *

Example 6. One or more occurrence (\+)
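The operators listed in these examples can be exercised on a small invented log. The sample lines below are assumptions made for this sketch, not taken from the linked article, and `\+` is assumed to be supported in basic regular expressions, as it is in GNU grep.

```shell
# Hypothetical log lines.
tmp=$(mktemp)
printf '%s\n' 'Nov 10 boot ok' 'Oct 11 then Nov 10' 'job terminating.' \
  'terminating. soon' 'kernel:   x' 'hello' 'heo' > "$tmp"

starts=$(grep '^Nov 10' "$tmp")        # ^ anchors the match at the start of the line
ends=$(grep 'terminating.$' "$tmp")    # $ anchors the match at the end of the line
any_char=$(grep '.ello' "$tmp")        # . matches any single character
spaces=$(grep 'kernel: *.' "$tmp")     # " *" matches zero or more spaces
plus=$(grep 'hel\+o' "$tmp")           # \+ matches one or more "l"

printf '%s\n' "$starts" "$ends" "$any_char" "$spaces" "$plus"
rm -f "$tmp"
```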

Friday, September 23, 2016

Linux Grep
