Friday, September 23, 2016

Linux Grep



https://unix.stackexchange.com/questions/13466/can-grep-output-only-specified-groupings-that-match

GNU grep has the -P option for perl-style regexes, and the -o option to print only what matches the pattern. These can be combined using look-around assertions (described under Extended Patterns in the perlre manpage) to remove part of the grep pattern from what is determined to have matched for the purposes of -o.
$ grep -oP 'foobar \K\w+' test.txt
The \K is the short-form (and more efficient form) of (?<=pattern) which you use as a zero-width look-behind assertion before the text you want to output. (?=pattern) can be used as a zero-width look-ahead assertion after the text you want to output.
For instance, if you wanted to match the word between foo and bar, you could use:
$ grep -oP 'foo \K\w+(?= bar)' test.txt
or (for symmetry)
$ grep -oP '(?<=foo )\w+(?= bar)' test.txt
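The patterns above can be tried on a small file. This is only a sketch: it assumes GNU grep built with PCRE support (for -P), and the sample file contents are made up for illustration.

```shell
# Hypothetical sample data; the contents are made up for this sketch.
tmp=$(mktemp -d)
printf '%s\n' 'foobar bash 1' 'bash' 'foobar happy' 'foo baz bar' > "$tmp/test.txt"

# \K discards everything matched so far, so only the word after "foobar " is printed.
after_foobar=$(grep -oP 'foobar \K\w+' "$tmp/test.txt")

# Look-behind plus look-ahead: only the word between "foo " and " bar" is printed.
between=$(grep -oP '(?<=foo )\w+(?= bar)' "$tmp/test.txt")

printf '%s\n' "$after_foobar" "$between"
rm -rf "$tmp"
```

The first pattern prints one word per matching line; lines without "foobar " produce no output at all.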

Well, if you know that foobar is always the first word of the line, then you can use cut. Like so:
grep "foobar" test.file | cut -d" " -f2

http://man7.org/linux/man-pages/man1/find.1.html
       -exec command ;
              Execute command; true if 0 status is returned.  All following
              arguments to find are taken to be arguments to the command
              until an argument consisting of `;' is encountered.  The
              string `{}' is replaced by the current file name being
              processed everywhere it occurs in the arguments to the
              command, not just in arguments where it is alone, as in some
              versions of find.  Both of these constructions might need to
              be escaped (with a `\') or quoted to protect them from
              expansion by the shell.  See the EXAMPLES section for examples
              of the use of the -exec option.  The specified command is run
              once for each matched file.  The command is executed in the
              starting directory.  There are unavoidable security problems
              surrounding use of the -exec action; you should use the
              -execdir option instead.

       -exec command {} +
              This variant of the -exec action runs the specified command on
              the selected files, but the command line is built by appending
              each selected file name at the end; the total number of
              invocations of the command will be much less than the number
              of matched files.  The command line is built in much the same
              way that xargs builds its command lines.  Only one instance of
              `{}' is allowed within the command, and (when find is being
              invoked from a shell) it should be quoted (for example, '{}')
              to protect it from interpretation by shells.  The command is
              executed in the starting directory.  If any invocation returns
              a non-zero value as exit status, then find returns a non-zero
              exit status.  If find encounters an error, this can sometimes
              cause an immediate exit, so some pending commands may not be
              run at all.  This variant of -exec always returns true.
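The difference between the two -exec forms can be seen by counting how often the command runs. A minimal sketch, assuming a scratch directory with three made-up files and using echo as the command:

```shell
# Hypothetical scratch directory with three empty files.
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/c.txt"

# "-exec ... \;" runs echo once per file, so three lines come out.
per_file=$(find "$tmp" -type f -name '*.txt' -exec echo {} \; | wc -l)

# "-exec ... {} +" appends all file names to one command line:
# a single echo prints them on a single line.
batched=$(find "$tmp" -type f -name '*.txt' -exec echo {} + | wc -l)

echo "lines with ';': $per_file, lines with '+': $batched"
rm -rf "$tmp"
```

With only three short names everything fits on one command line; for very large file sets find may still split the work into several invocations.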




-s, --no-messages
             Silent mode.  Nonexistent and unreadable files are ignored (i.e. their error messages are suppressed).
https://en.wikibooks.org/wiki/Grep
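  • -e pattern: Give the pattern explicitly; useful when the pattern starts with `-' or when several patterns are needed.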
  • -i: Ignore uppercase vs. lowercase.
  • -v: Invert match.
  • -c: Output count of matching lines only.
  • -l: Output matching files only.
  • -n: Precede each matching line with a line number.
  • -b: A historical curiosity: precede each matching line with a block number. (GNU grep prints the byte offset of each matching line instead.)
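A quick way to see what these switches do is to run them against a tiny file. The file names and contents below are assumptions made up for the sketch:

```shell
# Hypothetical sample files.
tmp=$(mktemp -d)
printf '%s\n' 'Hello world' 'hello again' 'goodbye' > "$tmp/a.txt"
printf '%s\n' 'nothing here' > "$tmp/b.txt"

count=$(grep -c -i 'hello' "$tmp/a.txt")           # -c with -i: number of matching lines
inverted=$(grep -v -i 'hello' "$tmp/a.txt")        # -v: lines that do NOT match
numbered=$(grep -n 'hello' "$tmp/a.txt")           # -n: prefix each match with its line number
files=$(cd "$tmp" && grep -l 'hello' a.txt b.txt)  # -l: names of matching files only
either=$(grep -c -e 'Hello' -e 'goodbye' "$tmp/a.txt")  # several -e patterns are ORed

printf '%s\n' "$count" "$inverted" "$numbered" "$files" "$either"
rm -rf "$tmp"
```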

[Find multiple patterns across multiple files](https://stackoverflow.com/questions/7422743/grep-for-multiple-patterns-over-multiple-files)
- find . | xargs grep 'pattern1' -sl | xargs grep 'pattern2' -sl
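The pipeline above splits file names on whitespace, so it can misbehave when names contain spaces. One possible variant, assuming GNU grep (for -r and -Z) and an xargs that supports -0, passes NUL-terminated names instead. The directory layout is made up for the sketch:

```shell
# Hypothetical files; one name contains a space on purpose.
tmp=$(mktemp -d)
printf 'pattern1\npattern2\n' > "$tmp/both words.txt"
printf 'pattern1\n' > "$tmp/only1.txt"
printf 'pattern2\n' > "$tmp/only2.txt"

# The first grep lists files containing pattern1, NUL-terminated (-Z).
# xargs -0 hands those names to a second grep, which keeps only the
# files that also contain pattern2.
matches=$(cd "$tmp" && grep -rlZ 'pattern1' . | xargs -0 grep -l 'pattern2')

echo "$matches"
rm -rf "$tmp"
```

Only the file that contains both patterns is listed, even though the patterns sit on different lines.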
http://stackoverflow.com/questions/6637882/how-can-i-use-grep-to-show-just-filenames-no-in-line-matches-on-linux
The standard option grep -l (that is a lowercase L) could do this.

http://ftp.gnu.org/old-gnu/Manuals/grep-2.4/html_node/grep_7.html
  1. How can I list just the names of matching files?
    grep -l 'main' *.c
    
    lists the names of all C files in the current directory whose contents mention `main'.
  2. How do I search directories recursively?
    grep -r 'hello' /home/gigi
    
    searches for `hello' in all files under the directory `/home/gigi'. For more control of which files are searched, use find, grep and xargs. For example, the following command searches only C files:
    find /home/gigi -name '*.c' -print | xargs grep 'hello' /dev/null
    
  3. What if a pattern has a leading `-'?
    grep -e '--cut here--' *
    
    searches for all lines matching `--cut here--'. Without `-e', grep would attempt to parse `--cut here--' as a list of options.
  4. Suppose I want to search for a whole word, not a part of a word?
    grep -w 'hello' *
    
    searches only for instances of `hello' that are entire words; it does not match `Othello'. For more control, use `\<' and `\>' to match the start and end of words. For example:
    grep 'hello\>' *
    
    searches only for words ending in `hello', so it matches the word `Othello'.
  5. How do I output context around the matching lines?
    grep -C 2 'hello' *
    
    prints two lines of context around each matching line.
  6. How do I force grep to print the name of the file? Append `/dev/null':
    grep 'eli' /etc/passwd /dev/null
    
  7. Why do people use strange regular expressions on ps output?
    ps -ef | grep '[c]ron'
    
    If the pattern had been written without the square brackets, it would have matched not only the ps output line for cron, but also the ps output line for grep.
  8. Why does grep report "Binary file matches"? If grep listed all matching "lines" from a binary file, it would probably generate output that is not useful, and it might even muck up your display. So GNU grep suppresses output from files that appear to be binary files. To force GNU grep to output lines even from files that appear to be binary, use the `-a' or `--text' option.
  9. Why doesn't `grep -lv' print nonmatching file names? `grep -lv' lists the names of all files containing one or more lines that do not match. To list the names of all files that contain no matching lines, use the `-L' or `--files-without-match' option.
  10. I can do OR with `|', but what about AND?
    grep 'paul' /etc/motd | grep 'franc,ois'
    
    finds all lines that contain both `paul' and `franc,ois'.
  11. How can I search in both standard input and in files? Use the special file name `-':
    cat /etc/passwd | grep 'alain' - /etc/motd
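A few of the FAQ answers above (whole words, `\>', and `-L') can be checked against throwaway files. This is a sketch assuming GNU grep; the file names and contents are invented:

```shell
# Hypothetical sample files.
tmp=$(mktemp -d)
printf '%s\n' 'hello there' 'Othello is a play' > "$tmp/words.txt"
printf '%s\n' 'no greeting' > "$tmp/other.txt"

# -w: "hello" must be a whole word, so the Othello line is skipped.
whole_word=$(grep -w 'hello' "$tmp/words.txt")

# \> only anchors the END of a word, so "Othello" counts as well.
word_end=$(grep -c 'hello\>' "$tmp/words.txt")

# -L: names of files that contain no matching line at all.
without=$(cd "$tmp" && grep -L 'hello' words.txt other.txt)

printf '%s\n' "$whole_word" "$word_end" "$without"
rm -rf "$tmp"
```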
http://miguelcamba.com/blog/2013/09/16/quick-tip-show-surrounding-lines-with-grep/
tail -f logs/production.log | grep "500 Internal Server Error"

-A NUM, --after-context=NUM
             Print NUM lines of trailing context after matching lines. Places a line containing -- between contiguous groups of matches.
-a, --text
             Process a binary file as if it were text; this is equivalent to the --binary-files=text option.
-B NUM, --before-context=NUM
             Print NUM lines of leading context before matching lines. Places a line containing -- between contiguous groups of matches.
-C NUM, --context=NUM
             Print NUM lines of output context. Places a line containing -- between contiguous groups of matches.
tail -f logs/production.log | grep "500 Internal Server Error" -B 2 -A 5
http://stackoverflow.com/questions/9081/grep-a-file-but-show-several-surrounding-lines
For BSD or GNU grep you can use -B num to set how many lines before the match and -A num for the number of lines after the match.
grep -B 3 -A 2 foo README.txt
If you want the same number of lines before and after you can use -C num.
grep -C 3 foo README.txt
This will show 3 lines before and 3 lines after.
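The context options are easy to check on a short made-up log. The file name and contents here are assumptions for the sketch:

```shell
# Hypothetical seven-line log with one interesting line in the middle.
tmp=$(mktemp -d)
printf '%s\n' one two three ERROR five six seven > "$tmp/log.txt"

# One line before and two lines after the match.
before_after=$(grep -B 1 -A 2 'ERROR' "$tmp/log.txt")

# The same number of lines (one) on both sides.
symmetric=$(grep -C 1 'ERROR' "$tmp/log.txt")

printf '%s\n' "$before_after" '==' "$symmetric"
rm -rf "$tmp"
```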
http://stackoverflow.com/questions/3213748/get-line-number-while-using-grep
Line numbers are printed with grep -n:
grep -n pattern file.txt
To get only the line number (without the matching line), one may use cut:
grep -n pattern file.txt | cut -d : -f 1
Lines not containing a pattern are printed with grep -v:
grep -v pattern file.txt
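A small sketch of the three commands, using an invented file. Some lines contain a colon themselves to show that the line number is still the first field for cut:

```shell
# Hypothetical input; two of the lines contain ":" in their text.
tmp=$(mktemp -d)
printf '%s\n' 'alpha' 'beta: pattern' 'gamma' 'pattern: delta' > "$tmp/file.txt"

with_numbers=$(grep -n 'pattern' "$tmp/file.txt")               # "N:line" for each match
only_numbers=$(grep -n 'pattern' "$tmp/file.txt" | cut -d : -f 1)  # keep the line number only
non_matching=$(grep -v 'pattern' "$tmp/file.txt")               # lines without the pattern

printf '%s\n' "$with_numbers" "$only_numbers" "$non_matching"
rm -rf "$tmp"
```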
http://www.cyberciti.biz/faq/unix-linux-grep-show-line-numbers-on-screen/

The -n or --line-number grep option

You can pass either -n or --line-number option to the grep command to prefix each line of output with the line number within its input file. The syntax is:
grep -n 'pattern' file
grep -n 'pattern' file1 file2
grep -n [options] 'patterns' file


http://www.gnu.org/software/grep/manual/html_node/Basic-vs-Extended.html
In basic regular expressions the meta-characters ‘?’, ‘+’, ‘{’, ‘|’, ‘(’, and ‘)’ lose their special meaning; instead use the backslashed versions ‘\?’, ‘\+’, ‘\{’, ‘\|’, ‘\(’, and ‘\)’.
Traditional egrep did not support the ‘{’ meta-character, and some egrep implementations support ‘\{’ instead, so portable scripts should avoid ‘{’ in ‘grep -E’ patterns and should use ‘[{]’ to match a literal ‘{’.
GNU grep -E attempts to support traditional usage by assuming that ‘{’ is not special if it would be the start of an invalid interval specification. For example, the command ‘grep -E '{1'’ searches for the two-character string ‘{1’ instead of reporting a syntax error in the regular expression. POSIX allows this behavior as an extension, but portable scripts should avoid it.
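The basic versus extended distinction can be illustrated with the `?' operator. This sketch assumes GNU grep (the backslashed `\?' form is an extension in basic mode); the input file is made up:

```shell
# Hypothetical input with both spellings and one line containing a literal "?".
tmp=$(mktemp -d)
printf '%s\n' 'color' 'colour' 'colou?r' > "$tmp/input.txt"

# Basic regex: a plain "?" is an ordinary character.
bre_plain=$(grep 'colou?r' "$tmp/input.txt")

# Basic regex, backslashed: "\?" makes the preceding "u" optional.
bre_escaped=$(grep 'colou\?r' "$tmp/input.txt")

# Extended regex: "?" is the operator without any backslash.
ere_plain=$(grep -E 'colou?r' "$tmp/input.txt")

printf '%s\n' "$bre_plain" '==' "$bre_escaped" '==' "$ere_plain"
rm -rf "$tmp"
```

The second and third commands select the same lines; only the spelling of the operator differs.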
http://unix.stackexchange.com/questions/17949/what-is-the-difference-between-grep-egrep-and-fgrep
  • egrep is 100% equivalent to grep -E
  • fgrep is 100% equivalent to grep -F
Historically these switches were provided in separate binaries. On some really old Unix systems you will find that you need to call the separate binaries, but on all modern systems the switches are preferred. The man page for grep has details about this.
As for what they do, -E switches grep into a special mode so that the expression is evaluated as an ERE (Extended Regular Expression) as opposed to its normal pattern matching. Details of this syntax are on the man page.
-E, --extended-regexp
Interpret PATTERN as an extended regular expression
The -F switch switches grep into a different mode where it accepts a pattern to match, but then splits that pattern up into one search string per line and does an OR search on any of the strings without doing any special pattern matching.
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched.
Here are some example scenarios:
  • You have a file with a list of say ten Unix usernames in plain text. You want to search the group file on your machine to see if any of the ten users listed are in any special groups:
    grep -F -f user_list.txt /etc/group
    
    The reason the -F switch helps here is that the usernames in your pattern file are interpreted as plain text strings. Dots for example would be interpreted as dots rather than wild-cards.
  • You want to search using a fancy expression. For example parentheses () can be used to indicate groups with | used as an OR operator. You could run this search using -E:
    grep -E '^no(fork|group)' /etc/group
    
    ...to return lines that start with either "nofork" or "nogroup". Without the -E switch you would have to escape the special characters involved, because in basic regular expressions the unescaped characters are matched literally:
    grep '^no\(fork\|group\)' /etc/group
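The effect of -F on a pattern file can be sketched with stand-in data. The user list and the group file below are invented; the real /etc/group is not touched:

```shell
# Hypothetical user list and a stand-in for /etc/group.
tmp=$(mktemp -d)
printf '%s\n' 'j.doe' > "$tmp/user_list.txt"
printf '%s\n' 'staff:x:50:j.doe' 'dev:x:51:jxdoe' > "$tmp/group"

# With -F the dot in "j.doe" is a literal dot: one line matches.
fixed=$(grep -F -f "$tmp/user_list.txt" "$tmp/group")

# Without -F the dot matches any character, so "jxdoe" matches too.
regex_count=$(grep -c -f "$tmp/user_list.txt" "$tmp/group")

printf '%s\n' "$fixed" "$regex_count"
rm -rf "$tmp"
```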
http://www.cs.columbia.edu/~tal/3261/fall07/handout/egrep_mini-tutorial.htm

http://stackoverflow.com/questions/2914197/how-to-grep-out-specific-line-ranges-of-a-file
The following command will do what you asked for "extract the lines between 1234 and 5555" in someFile.
sed -n '1234,5555p' someFile
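The same idea on a ten-line made-up file, with a smaller range so the result is easy to check. The second form is a possible refinement that quits after the last wanted line, which may save time on large files:

```shell
# Hypothetical ten-line file containing the numbers 1 to 10.
tmp=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > "$tmp/someFile"

# Print lines 3 through 5 only.
range=$(sed -n '3,5p' "$tmp/someFile")

# Same output, but stop reading the file once line 5 has been printed.
range_quit=$(sed -n '3,5p;5q' "$tmp/someFile")

printf '%s\n' "$range" '==' "$range_quit"
rm -rf "$tmp"
```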

http://www.thegeekstuff.com/2011/01/regular-expressions-in-grep-command/
Example 1. Beginning of line ( ^ )
$ grep "^Nov 10" messages.1
Example 2. End of the line ( $ )
$ grep "terminating.$" messages

Example 4. Single Character (.)
$ grep ".ello" input

Example 5. Zero or more occurrence (*)
$ grep "kernel: *." *

Example 6. One or more occurrence (\+)
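The operators from these examples, including \+, can be compared on one small file. This is a sketch assuming GNU grep (\+ is an extension in basic mode); the input lines are invented rather than taken from the linked article:

```shell
# Hypothetical input file.
tmp=$(mktemp -d)
printf '%s\n' 'Nov 10 boot' 'late Nov 10' 'hello' 'jello' 'ello' \
    'hi   hello' 'hihello' > "$tmp/input"

starts=$(grep -c '^Nov 10' "$tmp/input")       # ^ anchors the match to the line start
ends=$(grep -c 'Nov 10$' "$tmp/input")         # $ anchors the match to the line end
any_char=$(grep -c '.ello' "$tmp/input")       # . needs exactly one character before "ello"
zero_plus=$(grep -c 'hi *hello' "$tmp/input")  # " *" allows zero or more spaces
one_plus=$(grep -c 'hi \+hello' "$tmp/input")  # " \+" requires at least one space

echo "$starts $ends $any_char $zero_plus $one_plus"
rm -rf "$tmp"
```

The line `ello' is not counted by `.ello' because nothing precedes it, and `hihello' is matched by `*' but not by `\+'.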


