https://slack.engineering/flannel-an-application-level-edge-cache-to-make-slack-scale-b8a6400e2f6b
Upon client startup, Flannel caches relevant data about users, channels, bots, and more. It then provides query APIs that clients use to fetch data on demand. It also anticipates which data will be requested next and pushes it to clients proactively. For example, let’s say you mention a colleague in a channel: while broadcasting that message to people in the channel, Flannel sees that some clients have not recently loaded the information about the mentioned user. It sends the user data to those clients just before sending the message, saving them a round-trip query.
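The mention example can be sketched roughly as follows. This is only an illustration, not Flannel's actual code: `ClientConnection`, `broadcast_message`, the `mentions` field, and the payload shapes are all hypothetical, and the sketch tracks "loaded at all" rather than "loaded recently".

```python
from dataclasses import dataclass, field


@dataclass
class ClientConnection:
    """Hypothetical per-client state kept by the edge cache."""
    client_id: str
    loaded_users: set[str] = field(default_factory=set)
    outbox: list[dict] = field(default_factory=list)

    def send(self, payload: dict) -> None:
        # Stand-in for writing to the client's WebSocket.
        self.outbox.append(payload)


def broadcast_message(message: dict,
                      clients: list[ClientConnection],
                      user_cache: dict[str, dict]) -> None:
    """Send a message to every client, pushing unseen mentioned users first."""
    for client in clients:
        for user_id in message.get("mentions", []):
            if user_id not in client.loaded_users and user_id in user_cache:
                # Push the user record ahead of the message so the client
                # does not need a follow-up query to render the mention.
                client.send({"type": "user_data", "user": user_cache[user_id]})
                client.loaded_users.add(user_id)
        client.send({"type": "message", "message": message})
```

A client that already holds the mentioned user receives only the message; a client that does not receives the user record first, then the message.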
- The Slack client connects to Flannel.
- Behind the scenes, Flannel gathers the full client startup data. It also opens a WebSocket connection to Slack servers in the main AWS region so that it stays current by consuming real-time events.
- Flannel returns a slimmed-down version of this startup data to the client, allowing it to bootstrap itself.
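The startup steps above could look something like the sketch below. `TeamCache`, its methods, and the contents of the slim payload are assumptions made for illustration; the source does not say which fields the slimmed-down startup data contains. The two event types handled are used only as examples of real-time updates.

```python
class TeamCache:
    """Hypothetical edge-side cache holding one team's startup data."""

    def __init__(self, full_startup_data: dict) -> None:
        # The edge host gathers the full startup data once per team.
        self.users = {u["id"]: u for u in full_startup_data["users"]}
        self.channels = {c["id"]: c for c in full_startup_data["channels"]}

    def apply_event(self, event: dict) -> None:
        """Keep the cache current from the real-time event stream."""
        if event["type"] == "user_change":
            self.users[event["user"]["id"]] = event["user"]
        elif event["type"] == "channel_rename":
            channel = event["channel"]
            self.channels[channel["id"]]["name"] = channel["name"]

    def slim_startup(self, user_id: str) -> dict:
        """Return only what this client needs to bootstrap (assumed shape)."""
        return {
            "self": self.users[user_id],
            "channels": [c for c in self.channels.values()
                         if user_id in c["members"]],
            # The full user list is withheld; clients query it on demand.
            "user_count": len(self.users),
        }
```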
We use consistent hashing to choose which Flannel host a user connects to in order to maintain team affinity: users on the same team who are in the same networking region are directed to the same Flannel instance to achieve optimal cache efficiency. When new or recently disconnected users connect, they are served directly from the Flannel cache, which reduces the impact of reconnect storms on the Slack backend servers.
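A minimal consistent-hash ring shows how team affinity can fall out of the routing. The host names, the `region:team_id` key format, and the use of virtual nodes are assumptions for this sketch; the source does not describe Flannel's actual ring.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, hosts: list[str], vnodes: int = 64) -> None:
        if not hosts:
            raise ValueError("at least one host is required")
        # Each host owns several points so load spreads more evenly.
        self._ring = sorted(
            (_hash(f"{host}#{i}"), host) for host in hosts for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def host_for(self, team_id: str, region: str) -> str:
        """Route every user of a team in a region to the same host."""
        key = _hash(f"{region}:{team_id}")
        # First ring point clockwise from the key, wrapping at the end.
        idx = bisect.bisect_right(self._points, key) % len(self._ring)
        return self._ring[idx][1]
```

Because the key depends only on team and region, every connection from that pair lands on one host, and removing a host remaps only the keys that host owned.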
- We are moving event fanout into Flannel. Today, when a message is broadcast in a channel, the message server recognizes multiple recipients and forwards a copy to Flannel for each connected user. However, it’s much more efficient to send only one copy to each Flannel server and then fan out to multiple destinations. Besides messages, other event types traverse Slack servers in the same fashion and can be optimized in the same way. Such a structure will substantially reduce network bandwidth consumption and backend CPU overhead.
- Finally, we are moving clients to a pub/sub model. Today, clients listen to all events happening on a team, including messages in all channels you are in, user profile updates, user presence changes, etc. This doesn’t have to be the case: clients can subscribe to the series of events that are relevant in the current view, and change subscriptions when users switch to another view. In fact, we’ve already moved user presence updates to the pub/sub model with great results: the number of presence events received by clients was reduced by a factor of 5. Moving more events to pub/sub will further improve client performance.
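Both ideas, one copy per Flannel host and per-view subscriptions, can be combined in a small sketch. `FlannelHost`, `backend_broadcast`, and the `presence:U1` topic naming are hypothetical; they only illustrate the shape of the design described above.

```python
from collections import defaultdict


class FlannelHost:
    """Hypothetical edge host that fans events out to its own connections."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.subscriptions: dict[str, set[str]] = defaultdict(set)
        self.delivered: list[tuple[str, dict]] = []

    def subscribe(self, client_id: str, topic: str) -> None:
        self.subscriptions[topic].add(client_id)

    def unsubscribe(self, client_id: str, topic: str) -> None:
        # Called when the user switches to a view that no longer needs the topic.
        self.subscriptions[topic].discard(client_id)

    def publish(self, topic: str, event: dict) -> None:
        """Receive one copy from the backend and fan it out locally."""
        for client_id in sorted(self.subscriptions.get(topic, ())):
            self.delivered.append((client_id, event))


def backend_broadcast(hosts: list[FlannelHost], topic: str, event: dict) -> int:
    """Send one copy per interested host; return the number of copies sent."""
    copies = 0
    for host in hosts:
        if host.subscriptions.get(topic):
            host.publish(topic, event)
            copies += 1
    return copies
```

With three subscribers spread over two hosts, the backend sends two copies instead of three, and a client that has unsubscribed receives nothing.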
Flannel was a lazy-loading cache service that provided query APIs for clients to fetch data on demand. Whereas clients previously received the entire application state on startup, they now used Flannel to request only what they needed to create a reasonable user interface (UI) for human users, and then made subsequent requests to update their local state.
https://slack.engineering/flannel-an-application-level-edge-cache-to-make-slack-scale-b8a6400e2f6b
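From the client's side, lazy loading might look like the sketch below: local state starts empty and is filled by queries only when the UI needs a record. `LazyUserStore` and the batch-query callable are stand-ins invented for this example, not a real Slack client API.

```python
from typing import Callable


class LazyUserStore:
    """Hypothetical client-side store that fetches user records on first use."""

    def __init__(self, query_api: Callable[[list[str]], dict[str, dict]]) -> None:
        self._query_api = query_api      # stand-in for the edge cache's query API
        self._users: dict[str, dict] = {}
        self.round_trips = 0

    def get_many(self, user_ids: list[str]) -> list[dict]:
        """Return user records, querying only for those not held locally."""
        missing = [u for u in dict.fromkeys(user_ids) if u not in self._users]
        if missing:
            self._users.update(self._query_api(missing))
            self.round_trips += 1
        # Unknown ids are skipped rather than raising.
        return [self._users[u] for u in user_ids if u in self._users]
```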
Slack started out with a Real-Time Messaging (RTM) API, which allowed developers to build apps and bots that could respond in real time to activities in Slack. The API delivered events from Slack over a WebSocket. As time went on, Slack discovered that even though the RTM API was great for its own clients, it provided too much data for developers to handle well. It was also difficult for both Slack and developers to scale: developers with many users had to maintain many concurrent open WebSocket connections, at least one per user, and Slack, as the API provider, had to manage just as many connections on its side.
Developers can use the Events API to subscribe to only the events that they care about, delivered via HTTP.
https://api.slack.com/events-api
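On the receiving end, an Events API endpoint is an ordinary HTTP handler that accepts JSON POSTs. The sketch below covers only the payload logic: it echoes the `challenge` for a `url_verification` request and dispatches `event_callback` payloads by event type. The function name and the handler registry are assumptions, the HTTP server is left out, and a real app should also verify Slack's request signature, which is omitted here.

```python
import json
from typing import Callable


def handle_slack_request(
    body: bytes, handlers: dict[str, Callable[[dict], None]]
) -> tuple[int, dict]:
    """Process one Events API POST body; return (status code, JSON response)."""
    payload = json.loads(body)

    if payload.get("type") == "url_verification":
        # One-time handshake: echo the challenge back to confirm the endpoint.
        return 200, {"challenge": payload["challenge"]}

    if payload.get("type") == "event_callback":
        event = payload["event"]
        handler = handlers.get(event["type"])
        if handler is not None:
            handler(event)
        # Acknowledge even when no handler is registered for this event type.
        return 200, {}

    return 400, {"error": "unsupported payload type"}
```

Only the event types the app subscribed to arrive at this endpoint, so the handler registry stays small compared with filtering a full RTM stream.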