Saturday, June 28, 2014

Linux Command: xargs



http://man7.org/linux/man-pages/man1/xargs.1.html
-I replace-str
              Replace occurrences of replace-str in the initial-arguments
              with names read from standard input.
https://www.everythingcli.org/find-exec-vs-find-xargs/
find . [args] -exec [cmd] {} \;
{} is a placeholder for each result found by find.
\; ends the command; [cmd] is executed once for each result, with that result as its argument.
find . [args] -exec [cmd] {} +
{} is a placeholder for the results found by find.
+ appends the results to a single command line, so [cmd] is executed once with many results as arguments (find starts an additional invocation only if the results do not fit into one command line). Unlike ;, the + does not need to be escaped from the shell.
find . [args] -print0 | xargs -0 [cmd]
-print0 tells find to print all results to standard output, each terminated by the ASCII NUL character '\000'.
-0 tells xargs that the input items are separated by the ASCII NUL character '\000'.
Use both options or neither. With them, each result reaches xargs as exactly one argument, whatever characters it contains: NUL cannot occur in a file name, so names containing spaces, quotes, or newlines are passed through intact.
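A minimal sketch of why the two options belong together, assuming a find and xargs that support -print0 and -0 (GNU and BSD both do). The scratch directory and file names are made up for illustration.

```shell
# Work in a scratch directory so nothing else is touched (hypothetical file names).
demo=$(mktemp -d)
touch "$demo/plain.txt" "$demo/with space.txt"

# Newline-separated output: xargs splits "with space.txt" into two arguments.
naive_count=$(find "$demo" -type f | xargs -n1 echo | wc -l)

# NUL-separated output: every file name stays a single argument.
safe_count=$(find "$demo" -type f -print0 | xargs -0 -n1 echo | wc -l)

echo "without -print0/-0: $naive_count arguments; with: $safe_count arguments"
rm -rf "$demo"
```

Two files produce three arguments in the first pipeline and two in the second.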
find . [args] -print0 | xargs -0 -n1 [cmd]
-n1 tells xargs to execute [cmd] with only one argument at a time (in this case, one file found by find). This is equivalent to:
find . -exec [cmd] {} \;


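find . [args] -print0 | xargs -0 [cmd]
If no -n[int] is specified, the batch size depends on the implementation: BSD xargs (including macOS) defaults to -n5000, while GNU xargs has no default argument count and is limited only by the maximum command-line size (-s); see man xargs.
Either way, xargs packs as many arguments as allowed into each invocation, so 5000 files lead to one or a few executions of [cmd] instead of 5000.
This is equivalent to:
find . -exec [cmd] {} +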
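The difference between the per-file and the batched forms can be made visible by letting the command report how many arguments it received. This is only a sketch: the scratch directory, the three file names, and the small sh -c reporter are assumptions for illustration.

```shell
# Scratch directory with three files (hypothetical names).
demo=$(mktemp -d)
touch "$demo/a" "$demo/b" "$demo/c"

# -exec ... \; : one invocation per file, so three lines of output.
per_file=$(find "$demo" -type f -exec sh -c 'echo "run with $# argument(s)"' sh {} \;)

# -exec ... + : one batched invocation, so a single line of output.
batched=$(find "$demo" -type f -exec sh -c 'echo "run with $# argument(s)"' sh {} +)

# xargs without -n batches in the same way.
via_xargs=$(find "$demo" -type f -print0 | xargs -0 sh -c 'echo "run with $# argument(s)"' sh)

printf '%s\n' "$per_file" "$batched" "$via_xargs"
rm -rf "$demo"
```

The first form reports one argument three times; the other two each report three arguments once.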



xargs has some additional useful options you should be aware of:


-t   # print each command before executing it
-p   # print each command and ask for confirmation before executing it
-x   # make xargs exit if the arguments do not fit into the command-line length limit (see -s)
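A small sketch of -t, assuming an xargs that writes the trace to standard error (GNU and BSD both do). The exact spacing of the trace can differ between versions, so the example only captures it. -p behaves the same way but waits for an answer on the terminal, so it is not run here.

```shell
# Capture only the trace that -t writes to standard error;
# the normal output of echo is discarded.
trace=$(printf 'a b c\n' | xargs -t -n 2 echo 2>&1 >/dev/null)

# Two commands are traced: one for "a b" and one for "c".
printf '%s\n' "$trace"
```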


https://askubuntu.com/questions/339015/what-does-mean-in-a-linux-command
find . -name '*' -exec ls -a {} \;

When you run find with -exec, {} expands to the name of each file or directory found (so ls in your example gets every found file name as an argument; note that find calls ls, or whatever other command you specify, once for each file found).
The semicolon ; ends the command executed by -exec. It needs to be escaped with \ so that the shell in which you run find does not treat it as its own special character but passes it to find.
find also provides an optimization with -exec cmd {} +: when run like that, find appends the found files to the end of the command rather than invoking it once per file (so the command is run only once, if possible).
The difference in behavior (if not in efficiency) is easily noticeable if run with ls, e.g.
find ~ -iname '*.jpg' -exec ls {} \;
# vs
find ~ -iname '*.jpg' -exec ls {} +
Assuming you have some jpg files (with short enough paths), the result is one line per file in the first case and the standard ls behavior of displaying files in columns in the second.

Reference: xargs: How To Control and Use Command Line Arguments
xargs reads items from the standard input or pipes, delimited by blanks or newlines, and executes the command one or more times with any initial-arguments followed by items read from standard input. Blank lines on the standard input are ignored.
echo 1 2 3 4 | xargs echo
Find all .bak files in or below the current directory and delete them.
find . -name "*.bak" -type f -print | xargs /bin/rm -f
{} as the argument list marker
{} is the conventional argument marker used with -I. You need it for commands where the input item must appear somewhere other than at the end of the argument list. For example, mv needs the file name before the destination directory. The following finds all .bak files in or below the current directory and moves them to the ~/old.files directory:
find . -name "*.bak" -print0 | xargs -0 -I {} mv {} ~/old.files
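You can use a different marker instead of {}. In the following example the marker is FILE, which is more readable than the previous example. Pick a string that appears nowhere else in the command: xargs replaces every occurrence of the marker, so a lowercase file would also be substituted inside ~/old.files:
find . -name "*.bak" -print0 | xargs -0 -I FILE mv FILE ~/old.files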
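A sandboxed sketch of the -I pattern, so it can be tried without touching real files. The directory layout and file names are assumptions for illustration. With -I, xargs runs the command once per input item.

```shell
# Scratch area with a source directory and a destination directory (hypothetical names).
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/old.files"
touch "$demo/src/one.bak" "$demo/src/two words.bak" "$demo/src/keep.txt"

# {} marks where each file name is placed: before the destination directory.
find "$demo/src" -name '*.bak' -print0 | xargs -0 -I {} mv {} "$demo/old.files"

moved=$(ls "$demo/old.files" | wc -l)
left=$(ls "$demo/src")
echo "moved $moved files; still in src: $left"
rm -rf "$demo"
```

Both .bak files are moved, including the one with a space in its name, and keep.txt stays where it was.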
Avoiding errors and resource hungry problems with xargs and find combo
find /share/media/mp3/ -type f -name "*.mp3" -print0 | xargs -0 -r -I file cp -v -p file --target-directory=/bakup/iscsi/mp3
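The -r (--no-run-if-empty) option in the command above matters when find matches nothing. A short sketch, assuming GNU xargs: without -r, GNU xargs still runs the command once with no arguments, whereas BSD xargs already skips the run by default.

```shell
# Empty input without -r: GNU xargs still runs the command once.
without_r=$(printf '' | xargs echo "command ran")

# Empty input with -r: the command is not run at all.
with_r=$(printf '' | xargs -r echo "command ran")

echo "without -r: '$without_r'"
echo "with -r:    '$with_r'"
```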

Reference: http://www.computerhope.com/unix/xargs.htm
--null, -0
Input items are terminated by a null character instead of by whitespace, and the quotes and backslash are not special (every character is taken literally). Disables the end-of-file string, which is treated like any other argument.
Useful when input items might contain white space, quote marks, or backslashes.
The find -print0 option produces input suitable for this mode.

find /tmp -name core -type f -print | xargs /bin/rm -f
Find files named core in or below the directory /tmp and delete them. (Note that this will work incorrectly if there are any filenames containing newlines or spaces.)
find /tmp -name core -type f -print0 | xargs -0 /bin/rm -f
find /tmp -depth -name core -type f -delete
Find files named core in or below the directory /tmp and delete them, but more efficiently than in the previous example (because we avoid the need to use fork and exec rm, and we don't need the extra xargs process).
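The -delete variant can be tried safely in a scratch directory. This is a sketch with made-up file names. -delete is an extension provided by GNU and BSD find, not part of POSIX.

```shell
# Scratch directory containing two "core" files and one unrelated file.
demo=$(mktemp -d)
mkdir "$demo/sub dir"
touch "$demo/core" "$demo/sub dir/core" "$demo/notes"

# find removes the matches itself; no rm or xargs process is started.
find "$demo" -depth -name core -type f -delete

remaining=$(find "$demo" -type f -exec basename {} \;)
echo "remaining: $remaining"
rm -rf "$demo"
```

Only notes is left afterwards.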
cut -d: -f1 < /etc/passwd | sort | xargs echo
Uses cut to generate a compact listing of all the users on the system.

References: 10 xargs command example in Linux - Unix tutorial
with and without xargs
You can clearly see that multi-line output is converted into a single line:
find . -name "*bash*" | xargs
xargs and grep
find . -name "*.java" | xargs grep "Stock"
delete temporary file using find and xargs
find /tmp -name "*.tmp" | xargs rm
xargs -0 to handle space in file name
find /tmp -name "*.tmp" -print0 | xargs -0 rm
xargs and cut command in Unix
cut -d, -f1 smartphones.csv | sort | xargs
Counting number of lines in each file using xargs and find
ls -1 *.txt | xargs wc -l
Passing subset of arguments to xargs in Linux.
With the -n flag you tell xargs how many arguments to pass to each invocation of the given command. This option is useful in situations such as repeatedly running diff on pairs of files:
ls -1 *.txt | xargs -n 2 echo
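To see the grouping without depending on the files in the current directory, the same idea can be sketched with a fixed list of hypothetical names fed through printf:

```shell
# Five names on standard input, passed to echo two at a time.
pairs=$(printf '%s\n' a.txt b.txt c.txt d.txt e.txt | xargs -n 2 echo)

printf '%s\n' "$pairs"
# prints:
# a.txt b.txt
# c.txt d.txt
# e.txt
```

The last invocation receives only the one remaining argument.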
avoid "Argument list too long"
xargs was originally used to avoid "Argument list too long" errors. It splits the input into sub-lists that are shorter than ARG_MAX and passes each sub-list to a separate invocation of the command. You can see the current value of ARG_MAX with getconf ARG_MAX.
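The splitting can be observed by giving xargs more input than fits on one command line and letting each invocation report its argument count. This is a sketch: the figure of 200000 arguments is an arbitrary choice, and the number of invocations depends on the xargs implementation and its size limit.

```shell
# System limit on the combined size of arguments and environment.
arg_max=$(getconf ARG_MAX)

# Each invocation of sh prints how many arguments it received.
counts=$(seq 1 200000 | xargs sh -c 'echo $#' sh)

runs=$(printf '%s\n' "$counts" | wc -l)
total=$(printf '%s\n' "$counts" | awk '{ s += $1 } END { print s }')

echo "ARG_MAX=$arg_max; $total arguments were delivered in $runs invocation(s)"
```

Every argument is delivered exactly once; only the number of invocations varies.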

find -exec vs find + xargs

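Using xargs with find is much faster than using -exec ... \; on find, since -exec ... \; runs the command once for each file while xargs operates on sub-lists of files (-exec ... + batches in the same way). For example, if you need to change the permissions of 10000 files, -exec ... \; starts chmod 10000 times, while xargs starts it only once or a few times.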

http://stackoverflow.com/questions/25840713/illegal-option-when-using-find
The first argument to find is the path where it should start looking. The path . means the current directory.
find . -type f -name '*R'
You must provide at least one path, but you can actually provide as many as you want:
find ~/Documents ~/Library -type f -name '*R'
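By default, xargs simply converts \n into spaces.
So find . -maxdepth 1 | xargs du -sh
is roughly equivalent to: du -sh . ./plugins ./templates ./.git ./README.textile ./oh-my-zsh.sh ./log ./.gitignore ./custom ./cache ./MIT-LICENSE.txt ./themes ./tools ./lib
find . -maxdepth 1 -exec du -sh {} \;
As this shows, find -exec runs the command once for each matching file.
However, if the command ends with {} + instead of {} \; ({} is replaced by the matching file names), the command is executed only once, at the end. This is similar to the result of using xargs.
BTW, since xargs simply turns \n into spaces, problems can arise. For example, when a file name contains a space, xargs breaks. And if the part after the space looks like -xx, things get interesting. Of course, specifying \0 as the separator avoids this:
find . -maxdepth 1 -print0 | xargs -0 xxx
But that is so cumbersome that it is better to stick with find . -exec xxx {} \;.
Postscript: as someone pointed out, du -hd 1 already reports the size of each file (directory) under the current directory, so find -maxdepth 1 -exec du -sh {} \; is not needed.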

