Thursday, October 27, 2016

Apache HttpClient



http://stackoverflow.com/questions/5769717/how-can-i-get-an-http-response-body-as-a-string-in-java
Every library I can think of returns a stream. You could use IOUtils.toString() from Apache Commons IO to read an InputStream into a String in one method call. E.g.:
URL url = new URL("http://www.example.com/");
URLConnection con = url.openConnection();
InputStream in = con.getInputStream();
String encoding = con.getContentEncoding();
encoding = encoding == null ? "UTF-8" : encoding;
String body = IOUtils.toString(in, encoding);
System.out.println(body);
Update: I changed the example above to use the content encoding from the response if available. Otherwise it'll default to UTF-8 as a best guess, instead of using the local system default.
  1. Using EntityUtils and HttpEntity
    HttpResponse response = httpClient.execute(new HttpGet(URL));
    HttpEntity entity = response.getEntity();
    String responseString = EntityUtils.toString(entity, "UTF-8");
    System.out.println(responseString);
  2. HttpResponse response = httpClient.execute(new HttpGet(URL));
    String responseString = new BasicResponseHandler().handleResponse(response);
    System.out.println(responseString);
http://hc.apache.org/httpclient-3.x/performance.html

Reuse the HttpClient instance

Generally it is recommended to have a single instance of HttpClient per communication component or even per application. However, if the application makes use of HttpClient only very infrequently, and keeping an idle instance of HttpClient in memory is not warranted, it is highly recommended to explicitly shut down the multithreaded connection manager prior to disposing the HttpClient instance. This will ensure proper closure of all HTTP connections in the connection pool.

Concurrent execution of HTTP methods

If the application logic allows for execution of multiple HTTP requests concurrently (e.g. multiple requests against various sites, or multiple requests representing different user identities), the use of a dedicated thread per HTTP session can result in a significant performance gain. HttpClient is fully thread-safe when used with a thread-safe connection manager such as MultiThreadedHttpConnectionManager. Please note that each respective thread of execution must have a local instance of HttpMethod and can have a local instance of HttpState or/and HostConfiguration to represent a specific host configuration and conversational state. At the same time the HttpClient instance and connection manager should be shared among all threads for maximum efficiency.
For details on using multiple threads with HttpClient please refer to the HttpClient Threading Guide.

http://stackoverflow.com/questions/30528912/java-httpclient-4-x-performance-guide-to-resolve-issue
Please DO re-use the HttpClient instance! HttpClient instances are very expensive. By throwing it away not only you throw away the instance itself, you also throw away the SSL context, the connection manager, and all persistent connections that might be kept-alive by the connection manager.

http://stackoverflow.com/questions/24904275/thread-is-blocked-when-to-reuse-appache-httpclient

http://www.cnblogs.com/wasp520/archive/2012/07/06/2580101.html
我服务器端APACHE的配置 
  1. Timeout 30  
  2. KeepAlive On   #表示服务器端不会主动关闭链接   
  3. MaxKeepAliveRequests 100  
  4. KeepAliveTimeout 180   

因此这样的配置就会导致每个链接至少要过180S才会被释放,这样在大量请求访问时就必然会造成链接被占满,请求等待的情况。 
在通过DEBUH后发现HttpClient在method.releaseConnection()后并没有把链接关闭,这个方法只是将链接返回给connection manager。如果使用HttpClient client = new HttpClient()实例化一个HttpClient connection manager默认实现是使用SimpleHttpConnectionManager

代码实现很简单,所有代码就和最上面的事例代码一样。只需要在HttpMethod method = new GetMethod("http://www.apache.org");加上一行HTTP头的设置即可 
  1. method.setRequestHeader("Connection""close");  

看一下HTTP协议中关于这个属性的定义:
HTTP/1.1 defines the "close" connection option for the sender to signal that the connection will be closed after completion of the response. For example,
       Connection: close
现在再说一下客户端关闭链接和服务器端关闭链接的区别。如果采用客户端关闭链接的方法,在客户端的机器上使用netstat –an命令会看到很多TIME_WAIT的TCP链接。如果服务器端主动关闭链接这中情况就出现在服务器端。
参考WIKI上的说明http://wiki.apache.org/HttpComponents/FrequentlyAskedConnectionManagementQuestions
The TIME_WAIT state is a protection mechanism in TCP. The side that closes a socket connection orderly will keep the connection in state TIME_WAIT for some time, typically between 1 and 4 minutes.
TIME_WAIT的状态会出现在主动关闭链接的这一端。TCP协议中TIME_WAIT状态主要是为了保证数据的完整传输。具体可以参考此文档:
http://www.softlab.ntua.gr/facilities/documentation/unix/unix-socket-faq/unix-socket-faq-2.html#ss2.7
另外强调一下使用上面这些方法关闭链接是在我们的应用中明确知道不需要重用链接时可以主动关闭链接来释放资源。如果你的应用是需要重用链接的话就没必要这么做,使用原有的链接还可以提供性能。

http://jinnianshilongnian.iteye.com/blog/2089792
因为使用了连接池,但连接不够用,造成大量的等待;而且这种等待都有滚雪球的效应(和交易组最近使用的apache common dbcp存在的风险是类似的)。

httpclient 3.1 使用synchronized+wait+notifyAll,存在两个问题,量大synchronized慢和notifyAll可能造成线程饥饿;httpclient 4.2.3 使用 ReentrantLock(默认非公平) + Condition(每个线程一个)。





Labels

Review (572) System Design (334) System Design - Review (198) Java (189) Coding (75) Interview-System Design (65) Interview (63) Book Notes (59) Coding - Review (59) to-do (45) Linux (43) Knowledge (39) Interview-Java (35) Knowledge - Review (32) Database (31) Design Patterns (31) Big Data (29) Product Architecture (28) MultiThread (27) Soft Skills (27) Concurrency (26) Cracking Code Interview (26) Miscs (25) Distributed (24) OOD Design (24) Google (23) Career (22) Interview - Review (21) Java - Code (21) Operating System (21) Interview Q&A (20) System Design - Practice (20) Tips (19) Algorithm (17) Company - Facebook (17) Security (17) How to Ace Interview (16) Brain Teaser (14) Linux - Shell (14) Redis (14) Testing (14) Tools (14) Code Quality (13) Search (13) Spark (13) Spring (13) Company - LinkedIn (12) How to (12) Interview-Database (12) Interview-Operating System (12) Solr (12) Architecture Principles (11) Resource (10) Amazon (9) Cache (9) Git (9) Interview - MultiThread (9) Scalability (9) Trouble Shooting (9) Web Dev (9) Architecture Model (8) Better Programmer (8) Cassandra (8) Company - Uber (8) Java67 (8) Math (8) OO Design principles (8) SOLID (8) Design (7) Interview Corner (7) JVM (7) Java Basics (7) Kafka (7) Mac (7) Machine Learning (7) NoSQL (7) C++ (6) Chrome (6) File System (6) Highscalability (6) How to Better (6) Network (6) Restful (6) CareerCup (5) Code Review (5) Hash (5) How to Interview (5) JDK Source Code (5) JavaScript (5) Leetcode (5) Must Known (5) Python (5)

Popular Posts