Friday, October 9, 2015

Netty



https://en.wikipedia.org/wiki/Netty_(software)
Netty is a non-blocking I/O (NIO) client-server framework for the development of Java network applications such as protocol servers and clients. The asynchronous event-driven network application framework and tools are used to simplify network programming such as TCP and UDP socket servers. Netty includes an implementation of the reactor pattern of programming. Originally developed by JBoss, Netty is now developed and maintained by the Netty Project Community.
Besides being an asynchronous network application framework, Netty also includes built-in HTTPprotocol support, including the ability to run inside a servlet container, support for WebSockets, integration with Google Protocol BuffersSSL/TLS support, support for SPDY protocol and support for message compression.
As of version 4.0.0, Netty also supports the usage of NIO.2 as a backend, along with NIO and blocking Java sockets.
http://www.infoq.com/cn/articles/netty-million-level-push-service-design-points
由于网络不稳定经常会导致客户端断连,如果服务端没有能够及时关闭socket,就会导致处于close_wait状态的链路过多。close_wait状态的链路并不释放句柄和内存等资源,如果积压过多可能会导致系统句柄耗尽,发生“Too many open files”异常,新的客户端无法接入,涉及创建或者打开句柄的操作都将失败。
close_wait是被动关闭连接是形成的,根据TCP状态机,服务器端收到客户端发送的FIN,TCP协议栈会自动发送ACK,链接进入close_wait状态。但如果服务器端不执行socket的close()操作,状态就不能由close_wait迁移到last_ack,则系统中会存在很多close_wait状态的连接。通常来说,一个close_wait会维持至少2个小时的时间(系统默认超时时间的是7200秒,也就是2小时)。如果服务端程序因某个原因导致系统造成一堆close_wait消耗资源,那么通常是等不到释放那一刻,系统就已崩溃。
导致close_wait过多的可能原因如下:
  1. 程序处理Bug,导致接收到对方的fin之后没有及时关闭socket,这可能是Netty的Bug,也可能是业务层Bug,需要具体问题具体分析;
  2. 关闭socket不及时:例如I/O线程被意外阻塞,或者I/O线程执行的用户自定义Task比例过高,导致I/O操作处理不及时,链路不能被及时释放。
设计要点1:不要在Netty的I/O线程上处理业务(心跳发送和检测除外)。Why? 对于Java进程,线程不能无限增长,这就意味着Netty的Reactor线程数必须收敛。Netty的默认值是CPU核数 * 2,通常情况下,I/O密集型应用建议线程数尽量设置大些,但这主要是针对传统同步I/O而言,对于非阻塞I/O,线程数并不建议设置太大,尽管没有最优值,但是I/O线程数经验值是[CPU核数 + 1,CPU核数*2 ]之间。


Labels

Review (572) System Design (334) System Design - Review (198) Java (189) Coding (75) Interview-System Design (65) Interview (63) Book Notes (59) Coding - Review (59) to-do (45) Linux (43) Knowledge (39) Interview-Java (35) Knowledge - Review (32) Database (31) Design Patterns (31) Big Data (29) Product Architecture (28) MultiThread (27) Soft Skills (27) Concurrency (26) Cracking Code Interview (26) Miscs (25) Distributed (24) OOD Design (24) Google (23) Career (22) Interview - Review (21) Java - Code (21) Operating System (21) Interview Q&A (20) System Design - Practice (20) Tips (19) Algorithm (17) Company - Facebook (17) Security (17) How to Ace Interview (16) Brain Teaser (14) Linux - Shell (14) Redis (14) Testing (14) Tools (14) Code Quality (13) Search (13) Spark (13) Spring (13) Company - LinkedIn (12) How to (12) Interview-Database (12) Interview-Operating System (12) Solr (12) Architecture Principles (11) Resource (10) Amazon (9) Cache (9) Git (9) Interview - MultiThread (9) Scalability (9) Trouble Shooting (9) Web Dev (9) Architecture Model (8) Better Programmer (8) Cassandra (8) Company - Uber (8) Java67 (8) Math (8) OO Design principles (8) SOLID (8) Design (7) Interview Corner (7) JVM (7) Java Basics (7) Kafka (7) Mac (7) Machine Learning (7) NoSQL (7) C++ (6) Chrome (6) File System (6) Highscalability (6) How to Better (6) Network (6) Restful (6) CareerCup (5) Code Review (5) Hash (5) How to Interview (5) JDK Source Code (5) JavaScript (5) Leetcode (5) Must Known (5) Python (5)

Popular Posts