Wednesday, May 25, 2016

Websocket



https://www.educative.io/collection/page/5668639101419520/5649050225344512/5715426797420544

Ajax Polling



    Polling is a standard technique used by the vast majority of AJAX applications. The basic idea is that the client repeatedly polls (or requests) a server for data. The client makes a request and waits for the server to respond with data. If no data is available, an empty response is returned.
    The problem with Polling is that the client has to keep asking the server for any new data. As a result, a lot of responses are empty, creating HTTP overhead.
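    The cost is easy to see in a small simulation. In the sketch below, the Server stub (invented for illustration) only has data ready on every fifth poll, so most of the client's requests come back empty:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class PollingDemo {
    // Hypothetical server: has an update ready only on every 5th poll.
    static class Server {
        private int calls = 0;
        Optional<String> poll() {
            calls++;
            return (calls % 5 == 0) ? Optional.of("update-" + calls) : Optional.empty();
        }
    }

    public static void main(String[] args) {
        Server server = new Server();
        int emptyResponses = 0;
        List<String> received = new ArrayList<>();
        // Each iteration stands in for a full HTTP request/response round trip.
        for (int i = 0; i < 20; i++) {
            Optional<String> data = server.poll();
            if (data.isPresent()) {
                received.add(data.get());
            } else {
                emptyResponses++;
            }
        }
        System.out.println("20 requests, " + emptyResponses + " empty, "
                + received.size() + " updates");  // 16 of the 20 round trips were wasted
    }
}
```

    In a real deployment each of those wasted round trips also carries full HTTP headers and cookies, which is the overhead the techniques below try to avoid.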

    HTTP Long-Polling

    This is a variation of the traditional polling technique that allows the server to push information to a client whenever the data is available. With Long-Polling, the client requests information from the server exactly as in normal polling, but with the expectation that the server may not respond immediately. That’s why this technique is sometimes referred to as a “Hanging GET”.
    • If the server does not have any data available for the client, instead of sending an empty response, the server holds the request and waits until some data becomes available.
    • Once the data becomes available, a full response is sent to the client. The client then immediately re-requests information from the server, so that the server will almost always have a waiting request available that it can use to deliver data in response to an event.
    The basic life cycle of an application using HTTP Long-Polling is as follows:
    1. The client makes an initial request using regular HTTP and then waits for a response.
    2. The server delays its response until an update is available or a timeout has occurred.
    3. When an update is available, the server sends a full response to the client.
    4. The client typically sends a new long-poll request, either immediately upon receiving a response or after a pause to allow an acceptable latency period.
    5. Each Long-Poll request has a timeout. The client has to reconnect periodically after the connection is closed due to a timeout.
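    The server-side "hold" in steps 2-3 can be sketched with a blocking queue whose poll takes a timeout. Everything below (class names, the 100 ms event delay, the timeout values) is illustrative:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class LongPollDemo {
    static class LongPollServer {
        private final BlockingQueue<String> updates = new LinkedBlockingQueue<>();

        void publish(String update) { updates.offer(update); }

        // Holds the "request" until data arrives or the timeout expires;
        // returns null on timeout, after which the client would re-request.
        String awaitUpdate(long timeoutMillis) throws InterruptedException {
            return updates.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        }
    }

    public static void main(String[] args) throws Exception {
        LongPollServer server = new LongPollServer();
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        // An event becomes available 100 ms after the client's request arrives.
        timer.schedule(() -> server.publish("price-tick"), 100, TimeUnit.MILLISECONDS);

        // Request 1: held open until the update arrives (well before the 2 s timeout).
        System.out.println("got: " + server.awaitUpdate(2000));
        // Request 2: no data, so the hold expires and an empty response goes back.
        System.out.println("timed out: " + server.awaitUpdate(200));
        timer.shutdown();
    }
}
```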

    WebSockets

    WebSocket provides full-duplex communication channels over a single TCP connection. It provides a persistent connection between a client and a server that both parties can use to start sending data at any time. The client establishes a WebSocket connection through a process known as the WebSocket handshake. If the process succeeds, then the server and client can exchange data in both directions at any time. The WebSocket protocol enables communication between a client and a server with lower overheads, facilitating real-time data transfer from and to the server. This is made possible by providing a standardized way for the server to send content to the browser without being asked by the client, and by allowing messages to be passed back and forth while keeping the connection open. In this way, a two-way (bi-directional) ongoing conversation can take place between a client and a server.

    Server-Sent Events (SSEs)

    Under SSE, the client establishes a persistent, long-term connection with the server. The server uses this connection to send data to the client. If the client wants to send data to the server, it requires another technology/protocol to do so.
    1. Client requests data from a server using regular HTTP.
    2. The requested webpage opens a connection to the server.
    3. The server sends the data to the client whenever there’s new information available.
    SSEs are best when we need real-time traffic from the server to the client, or when the server generates data in a loop and will send multiple events to the client.
    https://qiutc.me/post/websocket-guide.html
    Because network conditions are unpredictable, the connection may drop or error out in certain situations, so we need to listen for abnormal disconnections in the close and error events and reconnect;
    For various reasons the browser may not fire the error callback, so a safer approach is to also start a timer after open that checks the current connection state (readyState) and attempts to reconnect when it detects an abnormal state;
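    When reconnecting after an abnormal close, it is also common to back off exponentially so that a flapping network is not hammered with connection attempts. A minimal sketch; the 1-second base and 30-second cap are illustrative choices, not something the article prescribes:

```java
public class ReconnectBackoff {
    // Delay before the nth reconnect attempt (0-based): 1 s, 2 s, 4 s, ... capped at 30 s.
    static long delayMillis(int attempt) {
        long uncapped = 1000L << Math.min(attempt, 6);  // clamp shift to avoid overflow
        return Math.min(uncapped, 30_000L);
    }

    public static void main(String[] args) {
        // A close/error handler (or the readyState watchdog described above) would
        // schedule the next connection attempt after delayMillis(attempt), bumping
        // the counter on failure and resetting it once a connection opens.
        for (int attempt = 0; attempt < 7; attempt++) {
            System.out.println("attempt " + attempt + ": wait " + delayMillis(attempt) + " ms");
        }
    }
}
```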

    Heartbeat

    The WebSocket specification defines a heartbeat mechanism: one side can send a ping (opcode 0x9) message to the other, and on receiving a ping the other side should return a pong (opcode 0xA) as quickly as possible.
    The heartbeat mechanism detects whether the peer is still online. Without it, there is no way to tell whether the other side is still connected, and intermediate layers such as nginx or the browser may proactively close the connection.
    In JavaScript, the WebSocket API does not expose ping/pong. Although browsers handle heartbeats internally, implementations differ across vendors, so in practice the client and server need to agree on a self-implemented heartbeat mechanism;
    For example, in the browser: after the open event fires, start a timer that periodically sends an agreed heartbeat payload (e.g. 0x9) to the server, and the server returns 0xA as the response;
    In practice, the heartbeat timer typically fires every 15-20 seconds.
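    The client side of such an agreed heartbeat can be reduced to counting unanswered pings and declaring the connection dead past a threshold. The class below is an illustrative sketch; the timer that calls onPingSent() every 15-20 seconds, and the actual network send, are omitted:

```java
public class HeartbeatMonitor {
    // Declare the connection dead after this many consecutive unanswered pings.
    // The threshold is an illustrative convention, not mandated by the spec.
    static final int MAX_MISSED = 2;

    private int missed = 0;

    void onPingSent()     { missed++; }   // called by the periodic heartbeat timer
    void onPongReceived() { missed = 0; } // any pong proves the peer is still alive

    boolean isDead() { return missed > MAX_MISSED; }

    public static void main(String[] args) {
        HeartbeatMonitor monitor = new HeartbeatMonitor();
        monitor.onPingSent();
        monitor.onPongReceived();          // healthy round trip
        System.out.println("after pong: dead=" + monitor.isDead());      // false
        monitor.onPingSent();
        monitor.onPingSent();
        monitor.onPingSent();              // three unanswered pings in a row
        System.out.println("after 3 missed: dead=" + monitor.isDead());  // true
    }
}
```

    When isDead() turns true, the application would close the socket and hand control to the reconnect logic described earlier.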
    https://abhirockzz.gitbooks.io/java-websocket-api-handbook/content/lifecycle_and_concurrency_semantics.html
    For any serious application development, it's very important to understand the threading and concurrency aspects of both the application and the frameworks being used. This lesson covers these aspects and helps answer some common questions, such as:
    • How many instances exist?
    • Is an instance inherently thread safe?
    • Is concurrent access permitted (by the WebSocket implementation)?
    Instances: there is a unique Session instance per client-server pair, i.e. one instance of Session is created for each client that connects to the WebSocket server endpoint. In short, the number of unique Session instances equals the number of connected clients.
    Thread Safety: A Session is thread safe even though multiple threads are allowed to invoke a single instance of Session. This is because the specification mandates that implementations ensure the integrity of the mutable properties of the session under such circumstances.

    Instances: By default, there is one instance of an Endpoint per client, unless this behavior is overridden by a custom Configurator implementation
    Instances: In contrast to some of the other components, creation of a MessageHandler instance is controlled by the developer (not the container). Typically, each Session instance registers (via the addMessageHandler method) a separate instance of a MessageHandler, i.e. there is a one-to-one relation between the peer sending a message (the client), the Session (let's assume it's on the server end), and the MessageHandler instance (in this case responsible for receiving messages on the server side)
    Thread Safety: The container will do as much as it can to ensure thread safety; in the case of MessageHandlers, it makes sure that only one thread enters a specific MessageHandler instance at a time. If the developer's implementation registers a single MessageHandler instance with multiple Sessions, then concurrent access is inevitable and needs to be accounted for
    http://www.baeldung.com/java-websockets
    <dependency>
        <groupId>javax.websocket</groupId>
        <artifactId>javax.websocket-api</artifactId>
        <version>1.1</version>
    </dependency>
    • @ServerEndpoint: If decorated with @ServerEndpoint, the container ensures availability of the class as a WebSocket server listening to a specific URI space
    • @ClientEndpoint: A class decorated with this annotation is treated as a WebSocket client
    • @OnOpen: A Java method with @OnOpen is invoked by the container when a new WebSocket connection is initiated
    • @OnMessage: A Java method, annotated with @OnMessage, receives the information from the WebSocket container when a message is sent to the endpoint
    • @OnError: A method with @OnError is invoked when there is a problem with the communication
    • @OnClose: Used to decorate a Java method that is called by the container when the WebSocket connection closes
    @ServerEndpoint(value = "/chat/{username}")
    public class ChatEndpoint {
      
        private Session session;
        private static final Set<ChatEndpoint> chatEndpoints
          = new CopyOnWriteArraySet<>();
        // ConcurrentHashMap: lifecycle callbacks for different clients can run on different threads
        private static final Map<String, String> users = new ConcurrentHashMap<>();

        @OnOpen
        public void onOpen(
          Session session,
          @PathParam("username") String username) {
      
            this.session = session;
            chatEndpoints.add(this);
            users.put(session.getId(), username);
            Message message = new Message();
            message.setFrom(username);
            message.setContent("Connected!");
            broadcast(message);
        }

        @OnMessage
        public void onMessage(Session session, Message message) {
            message.setFrom(users.get(session.getId()));
            broadcast(message);
        }

        @OnClose
        public void onClose(Session session) {
            chatEndpoints.remove(this);
            Message message = new Message();
            message.setFrom(users.get(session.getId()));
            message.setContent("Disconnected!");
            broadcast(message);
        }

        @OnError
        public void onError(Session session, Throwable throwable) {
            // Do error handling here
        }

        // IOException/EncodeException are caught here rather than declared, so the
        // lifecycle methods above compile without throws clauses and a failed send
        // to one client does not block delivery to the others
        private static void broadcast(Message message) {
            chatEndpoints.forEach(endpoint -> {
                synchronized (endpoint) {
                    try {
                        endpoint.session.getBasicRemote()
                          .sendObject(message);
                    } catch (IOException | EncodeException e) {
                        e.printStackTrace();
                    }
                }
            });
        }
    }
    The WebSocket specification supports two on-wire data formats – text and binary. The API supports both these formats, adds capabilities to work with Java objects and health check messages (ping-pong) as defined in the specification:
    • Text: Any textual data (java.lang.String, primitives or their equivalent wrapper classes)
    • Binary: Binary data (e.g. audio, image etc.) represented by a java.nio.ByteBuffer or a byte[] (byte array)
    • Java objects: The API makes it possible to work with native (Java object) representations in your code and use custom transformers (encoders/decoders) to convert them into compatible on-wire formats (text, binary) allowed by the WebSocket protocol
    • Ping-Pong: A javax.websocket.PongMessage is an acknowledgment sent by a WebSocket peer in response to a health check (ping) request
    An encoder takes a Java object and produces a typical representation suitable for transmission as a message such as JSON, XML or binary representation. Encoders can be used by implementing the Encoder.Text<T> or Encoder.Binary<T> interfaces.
    @ServerEndpoint(
      value="/chat/{username}",
      decoders = MessageDecoder.class,
      encoders = MessageEncoder.class )

    Issues of comet/long polling:
    http://www.html5rocks.com/en/tutorials/websockets/basics/
    1. Long polling makes unnecessary requests and keeps a constant stream of opening and closing connections for your servers to deal with.
    2. Request size: every time you make an HTTP request, a bunch of headers and cookie data are transferred to the server.
    WebSocket is an event-driven, full-duplex asynchronous communications channel for your web applications. It has the ability to give you real-time updates that in the past you would use long polling or other hacks to achieve. The primary benefit is reducing resource needs on both the client and (more important) the server.

    http://headerlabs.com/blog/5-benefits-of-websockets/
    1. Designed As Complete Duplex Link Model for the Web:
    2. Enhances the efficiency of Client and Server Communication:

    https://en.wikipedia.org/wiki/WebSocket
    GET /chat HTTP/1.1
    Host: server.example.com
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
    Sec-WebSocket-Protocol: chat, superchat
    Sec-WebSocket-Version: 13
    Origin: http://example.com
    
    Server response:
    HTTP/1.1 101 Switching Protocols
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
    Sec-WebSocket-Protocol: chat
    
    The handshake resembles HTTP in allowing servers to handle HTTP connections as well as WebSocket connections on the same port. Once the connection is established, communication switches to a bidirectional binary protocol which doesn't conform to the HTTP protocol.
    https://pusher.com/websockets
    WebSockets don’t make AJAX obsolete but they do supersede Comet (HTTP Long-polling/HTTP Streaming) as the solution of choice for true realtime functionality. AJAX should still be used for making short-lived web service calls, and if we eventually see a good uptake in CORS supporting web services, it will get even more useful. WebSockets should now be the go to standard for realtime functionality since they offer low latency bi-directional communication over a single connection
    https://www.pubnub.com/blog/2015-01-05-websockets-vs-rest-api-understanding-the-difference/
    WebSockets are really just an extension of the socket idea. While HTTP was invented for the World Wide Web, and has been used by browsers since then, it had limitations. It was a particular protocol that worked in a particular way, and wasn’t well suited for every need. In particular was how HTTP handled connections. Whenever you made a request, say to download html, or an image, a port/socket was opened, data was transferred, and then it was closed.

    The opening and closing creates overhead, and for certain applications, especially those that want rapid responses or real time interactions or display streams of data, this just doesn’t work.
    The other limitation with HTTP was that it was a “pull” paradigm. The browser would request or pull information from servers, but the server couldn’t push data to the browser when it wanted to. This means that browsers would have to poll the server for new information by repeating requests every so many seconds or minutes to see if there was anything new.

    There are a number of WebSocket frameworks and Socket.IO is likely the most popular and widely known.
    http://blog.teamtreehouse.com/an-introduction-to-websockets
    WebSockets provide a persistent connection between a client and server that both parties can use to start sending data at any time.
    The client establishes a WebSocket connection through a process known as the WebSocket handshake. This process starts with the client sending a regular HTTP request to the server. An Upgrade header is included in this request that informs the server that the client wishes to establish a WebSocket connection.
    GET ws://websocket.example.com/ HTTP/1.1
    Connection: Upgrade
    Upgrade: websocket
    
    WebSocket URLs use the ws scheme. There is also wss for secure WebSocket connections, which is the equivalent of HTTPS.

    If the server supports the WebSocket protocol, it agrees to the upgrade and communicates this through an Upgrade header in the response.
    HTTP/1.1 101 WebSocket Protocol Handshake
    Date: Wed, 16 Oct 2013 10:07:34 GMT
    Connection: Upgrade
    Upgrade: WebSocket
    
    Now that the handshake is complete, the initial HTTP connection is replaced by a WebSocket connection that uses the same underlying TCP/IP connection. At this point either party can start sending data.
    With WebSockets you can transfer as much data as you like without incurring the overhead associated with traditional HTTP requests. Data is transferred through a WebSocket as messages, each of which consists of one or more frames containing the data you are sending (the payload). In order to ensure the message can be properly reconstructed when it reaches the client, each frame is prefixed with a small header (2-14 bytes, depending on payload size and masking) describing the payload. Using this frame-based messaging system helps to reduce the amount of non-payload data that is transferred, leading to significant reductions in latency.
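    That per-frame overhead can be computed directly from RFC 6455's framing rules: 2 base bytes, plus 2 or 8 extended-length bytes for larger payloads, plus a 4-byte masking key on client-to-server frames. A quick sketch:

```java
public class FrameOverhead {
    // Frame header size per RFC 6455.
    static int headerBytes(long payloadLength, boolean maskedClientToServer) {
        int extendedLength = payloadLength < 126 ? 0          // length fits in base header
                           : payloadLength <= 0xFFFF ? 2      // 16-bit extended length
                           : 8;                               // 64-bit extended length
        return 2 + extendedLength + (maskedClientToServer ? 4 : 0);
    }

    public static void main(String[] args) {
        System.out.println(headerBytes(100, false));    // small server->client frame: 2
        System.out.println(headerBytes(100, true));     // small client->server frame: 6
        System.out.println(headerBytes(70_000, true));  // large client->server frame: 14
        // Compare with typical HTTP request headers plus cookies, often 500+ bytes.
    }
}
```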
    var socket = new WebSocket('ws://echo.websocket.org');
    // Wait for the connection to open before sending
    socket.onopen = function(event) {
      socket.send(data);
    };
    socket.onmessage = function(event) {
      var message = event.data;
      messagesList.innerHTML += '<li class="received"><span>Received:</span>' +
                                 message + '</li>';
    };
    socket.onclose = function(event) {
      socketStatus.innerHTML = 'Disconnected from WebSocket.';
      socketStatus.className = 'closed';
    };
    // Later, when finished with the connection:
    socket.close();

    The developer tools in Google Chrome include a feature for monitoring traffic through a WebSocket. 
    https://blog.kaazing.com/2012/05/09/inspecting-websocket-traffic-with-chrome-developer-tools/
    You can access this tool by following these steps:
    Switch to the Network tab.
    Click on the entry for your WebSocket connection.
    Switch to the Frames tab.

    http://www.oracle.com/technetwork/articles/java/jsr356-1937161.html

    Lifecycle Events

    The typical lifecycle event of a WebSocket interaction goes as follows:
    • One peer (a client) initiates the connection by sending an HTTP handshake request.
    • The other peer (the server) replies with a handshake response.
    • The connection is established. From now on, the connection is completely symmetrical.
    • Both peers send and receive messages.
    • One of the peers closes the connection.
    An endpoint that is accepting incoming WebSocket requests can be a POJO annotated with the @ServerEndpoint annotation. This annotation tells the container that the given class should be considered to be a WebSocket endpoint. The required value element specifies the path of the WebSocket endpoint.
    Consider the following code snippet:
    @ServerEndpoint("/hello") 
    public class MyEndpoint { }
    

    This code will publish an endpoint at the relative path hello. The path can include path parameters that are used in subsequent method calls; for example, /hello/{userid} is a valid path, where the value of {userid} can be obtained in lifecycle method calls using the @PathParam annotation.
    In GlassFish, if your application is deployed with the context root mycontextroot in a Web container listening on port 8080 of localhost, the WebSocket will be accessible using ws://localhost:8080/mycontextroot/hello.
    An endpoint that should initiate a WebSocket connection can be a POJO annotated with the @ClientEndpoint annotation. The main difference between @ClientEndpoint and @ServerEndpoint is that the client endpoint does not accept a path value element, because it is not listening for incoming requests.
    @ClientEndpoint 
    public class MyClientEndpoint {}
    

    Initiating a WebSocket connection in Java leveraging the annotation-driven POJO approach can be done as follows:
    javax.websocket.WebSocketContainer container = 
      javax.websocket.ContainerProvider.getWebSocketContainer();
    
    container.connectToServer(MyClientEndpoint.class, 
      new URI("ws://localhost:8080/tictactoeserver/endpoint"));
    

    Hereafter, classes annotated with @ServerEndpoint or @ClientEndpoint will be called annotated endpoints.
    Once a WebSocket connection has been established, a Session is created and the method annotated with @OnOpen on the annotated endpoint will be called. This method can contain a number of parameters:
    • javax.websocket.Session parameter, specifying the created Session
    • An EndpointConfig instance containing information about the endpoint configuration
    • Zero or more string parameters annotated with @PathParam, referring to path parameters on the endpoint path
    If the return type of the method annotated with @OnMessage is not void, the WebSocket implementation will send the return value to the other peer. The following code snippet returns the received text message in capitals back to the sender:
    @OnMessage
    public String myOnMessage (String txt) {
       return txt.toUpperCase();
    } 
    

    Another way of sending messages over a WebSocket connection is shown below:
    RemoteEndpoint.Basic other = session.getBasicRemote();
    other.sendText ("Hello, world");
    

    Interface-Driven Approach

    In order to intercept messages, a javax.websocket.MessageHandler needs to be registered in the onOpen implementation:
    public void onOpen (Session session, EndpointConfig config) {
       session.addMessageHandler (new MessageHandler() {...});
    }
    

    MessageHandler is an interface with two subinterfaces: MessageHandler.Partial and MessageHandler.Whole. The MessageHandler.Partial interface should be used when the developer wants to be notified about partial deliveries of messages, and an implementation of MessageHandler.Whole should be used for notification about the arrival of a complete message.
    The following code snippet listens to incoming text messages and sends the uppercase version of the text message back to the other peer:
    public void onOpen (Session session, EndpointConfig config) {
       final RemoteEndpoint.Basic remote = session.getBasicRemote();
       session.addMessageHandler (new MessageHandler.Whole<String>() {
          public void onMessage(String text) {
             try {
                remote.sendString(text.toUpperCase());
             } catch (IOException ioe) {
                // handle send failure here
             }
          }
       });
    }
    

    @ServerEndpoint(value="/chatserver",
        encoders = ChatCommandEncoder.class,
        decoders = ChatCommandDecoder.class)
    public class ChatServerEndpoint
    https://blog.openshift.com/how-to-build-java-websocket-applications-using-the-jsr-356-api/


    https://en.wikipedia.org/wiki/Push_technology#Long_polling
    Long polling is itself not a true push; long polling is a variation of the traditional polling technique, but it allows emulating a push mechanism under circumstances where a real push is not possible, such as sites with security policies that require rejection of incoming HTTP/S Requests.
    With long polling, the client requests information from the server exactly as in normal polling, but with the expectation the server may not respond immediately. If the server has no new information for the client when the poll is received, instead of sending an empty response, the server holds the request open and waits for response information to become available. Once it does have new information, the server immediately sends an HTTP/S response to the client, completing the open HTTP/S Request. Upon receipt of the server response, the client often immediately issues another server request. In this way the usual response latency otherwise associated with polling clients is eliminated


    https://samsaffron.com/archive/2015/12/29/websockets-caution-required
    WebSockets provides simple APIs to broadcast information to clients and simple APIs to ship information from the clients to the web server.
    A realtime channel to send information from the server to the client is very welcome. In fact it is a part of HTTP 1.1.
    2) Web browsers allow huge numbers of open WebSockets
    The infamous 6 connections per host limit does not apply to WebSockets. Instead a far bigger limit holds (255 in Chrome and 200 in Firefox)

    https://stackoverflow.com/questions/25327039/can-websockets-work-on-mobile-phones
    I built several WebSocket web apps with real-time data and they perform very well on the iPhone and mobile. WebSockets keep a ping/pong connection to see if the connection is still alive. Things that have caused disconnection:
    • If you close down the app the connection will be dropped (on iOS webapps).
    • If the network does go down (wifi/3g/4g) you will be dropped and not recover anything that was sent in that dropped time.
    https://deepstreamhub.com/blog/load-balancing-websocket-connections/
    Websockets on the other hand are persistent - this means that a large number of connections needs to be kept open simultaneously. This comes with a number of challenges:

    File Descriptor Limits

    File descriptors are used by operating systems to allocate files, connections and a number of other concepts. Every time a loadbalancer proxies a connection, it creates two file descriptors - one for the incoming and one for the outgoing part.
    Each open file descriptor consumes a tiny amount of memory, the limits of which can be freely assigned - a good rule of thumb is to allow 256 descriptors for every 4MB of RAM available. For a system with 8GB of RAM, this gets us about half a million concurrent connections - a good start, but not exactly Facebook dimensions just yet.

    Ephemeral Port Limits

    Every time a loadbalancer connects to a backend server, it uses an "Ephemeral Port". Theoretically, 65,535 of these ports are available, yet most modern Linux distributions limit the range to 28,232 by default. This still doesn't sound too bad, but ports don't become available straight away after they've been used. Instead they enter a TIME_WAIT state to make sure they're not missing any packets. This state can last up to a minute, severely limiting the range of outgoing ports.
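    A back-of-the-envelope calculation shows how tight this gets: if every closed backend connection parks its port in TIME_WAIT for a minute, the default range caps the steady-state rate of new connections to a single backend IP:

```java
public class EphemeralPortBudget {
    static int maxNewConnectionsPerSecond(int usablePorts, int timeWaitSeconds) {
        // Each closed connection keeps its port unusable for timeWaitSeconds,
        // so in steady state only this many ports free up per second.
        return usablePorts / timeWaitSeconds;
    }

    public static void main(String[] args) {
        int usablePorts = 28_232;   // default Linux ephemeral port range size
        int timeWait = 60;          // worst-case TIME_WAIT duration in seconds
        System.out.println(maxNewConnectionsPerSecond(usablePorts, timeWait)
                + " new backend connections/second per backend IP");  // 470
    }
}
```

    Long-lived WebSockets suffer less from this than rapid-fire long polling, since each client holds one backend connection open instead of cycling through ports.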

    Session allocation for multi-protocol requests

    Most real world bi-directional connectivity implementations (e.g. socket.io or SignalR ) use a mix of Websockets and a supporting protocol, usually HTTP long-polling. This was traditionally done as a fallback for browsers lacking Websocket support, but is still a good idea as the leading HTTP request can help convince Firewalls and network switches to process the following Websocket request.
    The trouble is: Both HTTP and WebSocket requests need to be routed to the same backend server by the load-balancer (sticky sessions). There are two ways to do this, both of which come with their own set of problems:
    • source-IP-port Hashing calculates a hash based on the client's signature. This is a simple and - most importantly - stateless way to allocate incoming connections to the same endpoint, but it's very coarse. If a large company's internal network lives behind a single NAT (Network Address Translation) gateway, it will look to the loadbalancer like a single client and all connections will be routed to the same endpoint.
    • cookie injection adds a cookie to the incoming HTTP and Websocket requests. Depending on the implementation this can mean that all loadbalancers need to keep a shared table of cookie-to-endpoint mappings. It also requires the loadbalancer to be the SSL-Termination point (the bit of the network infrastructure that decrypts incoming HTTPS and WSS traffic) in order to be able to manipulate the request.
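    The first option can be sketched as a plain hash-modulo over the client's IP. Only the IP is hashed here, since the client's ephemeral port changes between the leading HTTP request and the WebSocket upgrade; the backend names are invented for illustration:

```java
import java.util.Arrays;
import java.util.List;

public class SourceIpHash {
    // Stateless sticky routing: the same client IP always maps to the same backend.
    static String route(String clientIp, List<String> backends) {
        return backends.get(Math.floorMod(clientIp.hashCode(), backends.size()));
    }

    public static void main(String[] args) {
        List<String> backends = Arrays.asList("backend-a", "backend-b", "backend-c");
        // The long-poll request and the later WebSocket upgrade from the same
        // client hash identically, so both land on the same backend:
        String first  = route("203.0.113.7", backends);
        String second = route("203.0.113.7", backends);
        System.out.println(first.equals(second));  // true
        // The coarseness problem: every client behind one NAT gateway shares an
        // IP, so they all pile onto the same backend.
    }
}
```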
    Kemal Handling 61189 concurrent WebSocket connections :) It’s not the limit but the end for now.

    Server Port Numbers

    A common misunderstanding is that a server cannot accept more than 65,536 (2^16) TCP sockets because TCP ports are 16-bit integers.
    First, the number of ports is limited to 65,536, but this limitation applies only to a single IP address. Even supposing the port count limited us to 65,536 clients, adding more IP addresses to the server machine (either by adding new network cards, or simply by using IP aliasing on the existing network card) would solve the problem (even if, to open 12 million client connections, we would need 184 network cards or IP aliases on the server machine).
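    The 184 figure quoted above is just ceiling division of the client count over the 16-bit port space available per IP address:

```java
public class PortMath {
    // Ceiling division: how many IPs/aliases are needed for `clients` concurrent
    // connections if each IP can anchor at most `portsPerIp` distinct endpoints.
    static long ipsNeeded(long clients, long portsPerIp) {
        return (clients + portsPerIp - 1) / portsPerIp;
    }

    public static void main(String[] args) {
        System.out.println(ipsNeeded(12_000_000L, 65_536L));  // 184
    }
}
```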

    Server Socket Descriptors

    While the MigratoryData server uses a single port to accept any number of clients, it uses a different socket descriptor for each client. So, to open 12 million sockets, the process of the MigratoryData server should be able to use 12 million socket descriptors. Increasing the maximum number of socket descriptors per process is possible using the command ulimit. Consequently, we increased this limit to about 20 million socket descriptors as follows:
    ulimit -n 20000500
    Because one cannot increase the maximum number of socket descriptors per process to a value larger than the current kernel maximum (fs.nr_open), and because the kernel maximum defaults to 1,048,576 (1024²), prior to running the ulimit command we increased the kernel maximum as follows:
    echo 20000500 > /proc/sys/fs/nr_open
    The JVM parameter UseCompressedOops compresses the 64-bit pointers and offers non-negligible memory optimization. 


