Massive Technical Interviews Tips: Linux Tools

Tuesday, January 19, 2016

Linux Tools

http://fahdshariff.blogspot.com/2011/08/lessopen-powers-up-less.html
A really useful feature of the Unix less pager is LESSOPEN which is the "input preprocessor" for less. This is a script, defined in the LESSOPEN environment variable, which is invoked before the file is opened. It gives you the chance to modify the way the contents of the file are displayed. Why would you want to do this? The most common reason is to uncompress files before you view them, allowing you to less GZ files. But it also allows you to list the contents of zip files and other archives. I like to use it to format XML files and to view Java class files by invoking jad.

You can download a really useful LESSOPEN script from http://sourceforge.net/projects/lesspipe/ and then extend it if necessary.

To use it, simply add export LESSOPEN="|/path/to/bin/lesspipe.sh %s" to your bashrc.
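To make the mechanism concrete, here is a minimal sketch of an input preprocessor. It is not the real lesspipe.sh; the function name `lessopen_filter` is hypothetical and the sketch assumes gzip is installed. Because the LESSOPEN value starts with `|`, less displays whatever the command writes to standard output, and falls back to the original file when the command writes nothing.

```shell
#!/bin/sh
# Hypothetical minimal input preprocessor for less (not the real lesspipe.sh).
# Prints a decompressed view for .gz files and prints nothing for anything
# else, which tells less to open the original file unchanged.
lessopen_filter() {
    case "$1" in
        *.gz) gzip -dc -- "$1" ;;   # decompress to standard output
        *)    return 0 ;;           # no output: less shows the file as-is
    esac
}

# When saved as a standalone script, uncomment the next line so the file
# name that less substitutes for %s is passed to the function:
# lessopen_filter "$1"
```

Saved as an executable script with the last line uncommented, it would be wired in the same way as lesspipe.sh, by pointing LESSOPEN at the script's path followed by %s.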

Use `less -N` (or type -N while in less) to enable line numbers.

Enable syntax highlighting in less

First, add these two lines to ~/.bashrc:

export LESSOPEN="| /opt/local/bin/src-hilite-lesspipe.sh %s"

export LESS=" -R "

pygmentize somefile.ex | less -R

https://www.gnu.org/software/gettext/manual/html_node/Customizing-less.html

https://blog.tersmitten.nl/how-to-colorize-your-log-files-with-ccze.html
yum install ccze (Red Hat/CentOS)
tail -f -n 50 /var/log/syslog | ccze

On macOS with Homebrew, run brew install source-highlight, then export LESSOPEN="| /usr/local/bin/src-hilite-lesspipe.sh %s". Note that the script path differs from the /opt/local one above.

https://superuser.com/questions/117841/get-colors-in-less-or-more
http://www.gnu.org/software/src-highlite/

https://stackoverflow.com/questions/14573485/pygments-piped-to-less-inside-python-script-breaks-highlighting

That's less's fault, not Python's. Run less with the -R switch:

-R or --RAW-CONTROL-CHARS

Like -r, but only ANSI "color" escape sequences are output in "raw" form. Unlike -r, the screen appearance is maintained correctly in most cases. ANSI "color" escape sequences are sequences of the form ESC [ ... m, where the "..." is zero or more color specification characters.

https://github.com/garabik/grc

grc netstat
grc ping hostname
grc tail /var/log/syslog
grc ps aux

grc tail -f solr.log

brew install grc

https://bitbucket.org/birkenfeld/pygments-main/pull-requests/165/added-s-option-to-support-use-with-tail-f/diff

pygmentize currently only processes entire files. For example, it currently hangs if you try it with tail -f:

tail -f sql.log | pygmentize -l sql

This pull request adds a -s option to support the above example (and others). Like this:

tail -f sql.log | pygmentize -s -l sql

-s was chosen for 'streaming'.

http://pygments.org/docs/lexers/

class pygments.lexers.hdl.SystemVerilogLexer

Short names:	systemverilog, sv

class pygments.lexers.prolog.LogtalkLexer

Short names:	logtalk
Filenames:	*.lgt, *.logtalk
MIME types:	text/x-logtalk

For Logtalk source code.

New in version 0.10.

class pygments.lexers.prolog.PrologLexer

Short names:	prolog
Filenames:	*.ecl, *.prolog, *.pro, *.pl
MIME types:	text/x-prolog

Lexer for Prolog files.

https://lilyfeng.wordpress.com/2013/07/17/the-difference-among-virt-res-and-shr-in-top-output/

VIRT stands for the virtual size of a process, which is the sum of memory it is actually using, memory it has mapped into itself (for instance the video card’s RAM for the X server), files on disk that have been mapped into it (most notably shared libraries), and memory shared with other processes. VIRT represents how much memory the program is able to access at the present moment.

RES stands for the resident size, which is an accurate representation of how much actual physical memory a process is consuming. (This also corresponds directly to the %MEM column.) This will virtually always be less than the VIRT size, since most programs depend on the C library.

SHR indicates how much of the VIRT size is actually sharable (memory or libraries). In the case of libraries, it does not necessarily mean that the entire library is resident. For example, if a program only uses a few functions in a library, the whole library is mapped and will be counted in VIRT and SHR, but only the parts of the library file containing the functions being used will actually be loaded in and be counted under RES.

http://www.dba-oracle.com/t_linux_oracle_monitor_with_top.htm

On the lower part of the top output, the information on each process can be seen. Specifically the VIRT, RES, SHR and %MEM columns are of interest for memory usage. Here is what these values represent:

VIRT: The total amount of memory the process is using, including RAM, swap and any shared memory being accessed
RES: The amount of resident memory, or real RAM being used by this process
SHR: The amount of shared memory being accessed by this process
%MEM: The percentage of physical (resident) memory used by this process

When investigating memory usage, it may be useful to sort processes by one of these columns. In top, you can change the sort order by pressing 'F', then choosing from a list of sort columns.

https://unix.stackexchange.com/questions/128953/how-to-display-top-results-sorted-by-memory-usage-in-real-time

A quick tip using the top command on Linux/Unix:

top

Press Shift+F, choose the memory usage sort field by pressing N (without Shift), then press Enter. The active processes are now ordered by memory usage.

Or you can just press Shift+M after running the top command.

On OS X 10.10 the command top -o MEM seems to work.

https://www.jianshu.com/p/e4cd53186202

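Background: after running smoothly for a while, the service's CPU usage suddenly spiked.

Use the top command to confirm which process is actually driving the CPU up (it could be a false alarm).

In the example, the process with PID 2816 shows very high CPU usage.

Run top -Hp 2816 to look at the threads inside that process. Here thread 2825 is the one with very high CPU usage.

Base conversion

Python is a convenient way to convert the decimal thread ID to hexadecimal. Why do this?

Because the thread dump examined next identifies threads by a hexadecimal NID.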
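The source does the conversion in Python; the same step can be sketched with the shell's printf. The function name `tid_to_nid` is hypothetical, and the thread ID 2825 is the one from the example above.

```shell
# Convert a decimal thread ID (as shown by top -Hp) to the hexadecimal
# form used in the nid= field of a jstack thread dump.
tid_to_nid() {
    printf '0x%x\n' "$1"
}

tid_to_nid 2825   # prints 0xb09
```

The result can then be searched for in the dump, for example with jstack 2816 | grep "nid=0xb09".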

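In practice, run jstack pid several times: threads change state, so multiple dumps capture more information about what each thread is doing.

In the dump, one thread has acquired a lock and keeps running without releasing it, while another thread waits for that lock the whole time. From here you can go to the code and work out why the lock is held for so long.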

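Line 1:
It shows two times: the current system time and how long the machine has been up. [The uptime deserves the most attention. Why? A reboot can sometimes cause a lot of problems, as you probably know.]
How many users are logged in? [who, w and history give more detail.]
What do the three load values mean?
They are the machine's load averages over the last 1, 5 and 15 minutes. To judge whether the load is high, compare it with the number of CPU cores: on a 4-core machine, a load above 4 means the machine is heavily loaded. [Press 1 in top to see the individual CPUs.]
The same information is also available from the uptime command.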
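The rule of thumb above (load compared with core count) can be sketched as a small shell helper. The function name `load_is_high` is hypothetical, and the commented live check assumes Linux, where /proc/loadavg and nproc are available.

```shell
# Report whether a load average exceeds the number of CPU cores.
# Usage: load_is_high LOAD CORES   (exit status 0 means load > cores)
load_is_high() {
    awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load > cores) }'
}

# Example with fixed numbers: a load of 4.5 on a 4-core machine.
if load_is_high 4.5 4; then
    echo "load is above the core count"
fi

# Live check on Linux (first field of /proc/loadavg is the 1-minute load):
# load_is_high "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)" && echo "load is high"
```

Comparing against the core count is only a first approximation; a load slightly above it for a short time is not necessarily a problem.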
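Line 2:
The total number of tasks; the count to watch is the number of zombie tasks.
Line 3:
CPU information.
us/sy: the share of CPU time spent in user processes and in system (kernel) code.
ni (nice): the share of CPU time spent on user processes whose priority has been adjusted with nice; normally this should not be large.
id: idle. wa: time spent waiting for I/O; for example, if the service writes a lot of logs during a sudden traffic burst, this value spikes because of the heavy disk I/O.
hi: hardware interrupts, usually raised by peripheral devices; a spike in hi suggests a problem at the hardware level. si: software interrupts.
st (steal): present when the host is a virtual machine; the percentage of time the hypervisor gave to other guests while this VM was waiting for a CPU.

Lines 4 and 5:
Two concepts matter here: buffer and cache.
A buffer holds data waiting to be processed and mainly smooths over a speed mismatch between two systems. A cache generally holds result data, for example information loaded from a database to serve queries.
Swap uses part of the disk as overflow space for memory; if swapping is very frequent, the machine is short of memory.
Column list:
PID: process ID; USER: user; PR: priority; VIRT: virtual memory; RES: resident memory; SHR: shared memory
Note that RES is the memory the process actually occupies, not the amount it has requested. The physical memory used by the process alone is therefore RES - SHR.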

http://hellojava.info/?p=517

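Troubleshooting depends heavily on logs, so the tools for reading them matter a lot. Knowing tail, find, fgrep and awk well is usually enough. This is also why logging key exceptions and information is so important. (I have seen too much careless exception handling. A typical case is an application's own ServletContextListener implementation: many listeners end up throwing a RuntimeException, which makes Tomcat exit without printing the exception. Finding the cause then is really frustrating, although there are ways.)
Standardized logging matters just as much. It helps people like me who investigate problems across many systems; without a standard you cannot even find where the logs are. And in a distributed system, standardized logs make log tracing easy, which helps a great deal in locating problems.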

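sar
sar is useful for looking at historical metrics. Besides CPU, it covers memory, disk, network and other metrics. Most problems happened in the past, so being able to go back through history is very important.
jstack
jstack shows what the threads in a Java process are doing, which helps a lot when an application is unresponsive or very slow. By default jstack shows only the Java stack; jstack -m shows both the Java stack and the native stack of each thread, but compiled Java methods do not appear (and most frequently called Java methods have been compiled).
pstack
pstack shows the native stacks of a Java process.

perf
Simple CPU consumption problems can usually be solved with top -H plus jstack; for complex ones you need a power tool like perf.
cat /proc/interrupts
This is worth mentioning because in a distributed application, the interrupt handling caused by frequent network access is a significant cost, and NIC multi-queue support and interrupt balancing become very important. If the CPU's si value is not low, take a look at interrupts.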

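Memory tools
Tools commonly needed for memory problems:
jstat
jstat -gcutil, -gc and so on show GC activity in real time, though I still prefer reading the GC log.
jmap
When you need to dump memory and see what is in it, use jmap -dump. When you need to force a full GC (with a GC that inevitably fragments memory, such as CMS, you will always find a reason), use jmap -histo:live (obviously, do not run it casually).
gcore
I actually prefer gcore to jmap -dump because it feels faster, but some JDK versions do not seem to work well with gcore, and in those cases you still need jmap -dump.
mat
A memory dump is useless without an analysis tool. mat is an excellent one and easy enough to use that there is little more to say.
btrace
A few problems can be read straight from mat, but most need dynamic tracing with btrace. btrace is a superb tool for Java. A simple example: in a running Java application, how would you find where an ArrayList with a size greater than 1000 is being created? With btrace it takes seconds. :)
gperf
The tools above mostly cover memory use inside the Java heap, but off-heap memory is much harder. For now gperf (Google's gperftools) seems to be the only reasonably usable tool; from experience, Direct ByteBuffer and Deflater/Inflater are the common culprits.
Beyond these tools, keeping a record of memory information is as important as logging. GC logging must always be enabled so that after a problem you can check the GC log to see whether GC was at fault. Parameters such as -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc: should be standard startup options.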

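ClassLoader tools
A Java programmer is almost certain to run into ClassLoader problems. The easiest aid is -XX:+TraceClassLoading. If you know which class is involved, my suggestion is to list every jar in all the lib directories that get loaded, with something like jar -tvf *.jar, and look directly for the conflicting class. If that is not enough, call on btrace to trace ClassLoader.defineClass and the like.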
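One way to look for a class that ships in more than one jar is sketched below. This is an illustration, not the author's script: the function name `find_duplicate_classes` and the lib directory are hypothetical, and the commented loop lists each jar separately on the assumption that jar -tf is given one archive at a time.

```shell
# Read "jar-name<TAB>class-path" lines on stdin and print every class that
# appears in more than one jar, followed by the jars that contain it.
find_duplicate_classes() {
    awk -F '\t' '
        # Record each (class, jar) pair once, keeping jars in input order.
        { if (!seen[$2, $1]++) { jars[$2] = jars[$2] " " $1; count[$2]++ } }
        END { for (c in count) if (count[c] > 1) print c ":" jars[c] }
    ' | sort
}

# Building the input from real jars (lib/ is a placeholder directory):
# for j in lib/*.jar; do
#     jar -tf "$j" | awk -v j="$j" '/\.class$/ { print j "\t" $0 }'
# done | find_duplicate_classes
```

Each output line names a class and the jars that contain it, which narrows down where a conflicting definition may be coming from.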

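Other tools
jinfo
Java has a great many startup parameters and defaults, and no documentation is guaranteed to be accurate; only what jinfo -flags shows is reliable. You can also try jinfo -flag, where you will find more interesting things.
dmesg
Did your Java process suddenly disappear? Try checking dmesg first.
systemtap
Some problems cannot be solved at the Java level. When you need to trace function calls at the lower OS level, systemtap comes in handy.
gdb
More advanced users can take a core dump and use gdb to investigate stranger problems.

I have rarely investigated I/O problems, so although I know some tools, I will not cover them here.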

http://stackoverflow.com/questions/7420501/jstack-target-process-not-responding
jstack: Target process not responding

I got it working by doing two things:

Changed call to: sudo -u tomcat6 jstack -J-d64 -m pid

http://zeroturnaround.com/rebellabs/5-command-line-tools-you-should-be-using/
http://blog.jobbole.com/99902/
The first on my list is a tool called HTTPie. Fear not, this tool has nothing to do with Internet Explorer, fortunately. In essence HTTPie is a cURL-like utility that performs HTTP requests from the command line. HTTPie adds nice features like auto-formatting and intelligent colour highlighting to the output, making it much more readable and useful to the user. Additionally, it takes a very human-centric approach to its execution, not asking you to remember obscure flags and options. To perform an HTTP GET you simply run http; to POST, you run http POST.

brew install httpie

brew install icdiff
Secondly, you need to configure your VCS to actually use icdiff.

http://pandoc.org/
http://babun.github.io/

Autojump
https://github.com/wting/autojump
http://lifehacker.com/5583546/autojump-is-a-faster-way-to-browse-your-filesystem

Autojump learns your most commonly used folders, and makes switching between them an easy task, requiring you to enter only part of the directory path to switch between them.

Jump To A Directory That Contains foo:

j foo

j -s

Jump To A Child Directory:

Sometimes it's convenient to jump to a child directory (sub-directory of current directory) rather than typing out the full name.

jc bar

Open File Manager To Directories (instead of jumping):

Instead of jumping to a directory, you can open a file explorer window (Mac Finder, Windows Explorer, GNOME Nautilus, etc.) to the directory instead.

jo music

Opening a file manager to a child directory is also supported:

jco images

tldr
https://github.com/tldr-pages/tldr

http://www.codeceo.com/article/5-command-tools-you-may-overlook.html
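A cross-platform tool that lets you download YouTube videos:
brew install youtube-dl
brew install shellcheck

For system administrators and DevOps engineers: stop using tail -f and start using multitail. This ultimate log viewer lets you do some very cool things and is well worth a mention.
brew install multitail
tree prints a clean, structured directory tree so you can see the layout of your data at a glance.
brew install tree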

Amazon has their own Linux distribution based on Red Hat Enterprise Linux and later linux-xen-kernel.

https://www.digitalocean.com/community/tutorials/how-to-install-java-on-centos-and-fedora
sudo yum install java-1.7.0-openjdk

sudo yum install java-1.7.0-openjdk-devel

The alternatives command, which manages default commands through symbolic links, can be used to select the default Java command.
sudo alternatives --config java

export JAVA_HOME=/usr/java/jdk1.8.0_60/jre

If you want JAVA_HOME to be set for every user on the system by default, add the previous line to the /etc/environment file. An easy way to append it to the file is to run this command:

sudo sh -c "echo export JAVA_HOME=/usr/java/jdk1.8.0_60/jre >> /etc/environment"

http://lintut.com/how-to-install-java-8-on-rhel-centos-7-x-and-fedora-linux/

yum remove java-1.7.0-openjdk
