Tuesday, March 1, 2016

Cloud Design Patterns: Prescriptive Architecture Guidance for Cloud Applications



https://puncsky.com/hacking-the-software-engineer-interview#data-stores-todo

Availability patterns 

  • Health Endpoint Monitoring: Implement functional checks in an application that external tools can access through exposed endpoints at regular intervals.
  • Queue-Based Load Leveling: Use a queue that acts as a buffer between a task and a service that it invokes in order to smooth intermittent heavy loads.
  • Throttling: Control the consumption of resources used by an instance of an application, an individual tenant, or an entire service.

Data Management patterns 

  • Cache-Aside: Load data on demand into a cache from a data store.
  • Command and Query Responsibility Segregation: Segregate operations that read data from operations that update data by using separate interfaces.
  • Event Sourcing: Use an append-only store to record the full series of events that describe actions taken on data in a domain.
  • Index Table: Create indexes over the fields in data stores that are frequently referenced by queries.
  • Materialized View: Generate prepopulated views over the data in one or more data stores when the data isn’t ideally formatted for required query operations.
  • Sharding: Divide a data store into a set of horizontal partitions or shards.
  • Static Content Hosting: Deploy static content to a cloud-based storage service that can deliver it directly to the client.

Security Patterns 

  • Federated Identity: Delegate authentication to an external identity provider.
  • Gatekeeper: Protect applications and services by using a dedicated host instance that acts as a broker between clients and the application or service, validates and sanitizes requests, and passes requests and data between them.
  • Valet Key: Use a token or key that provides clients with restricted direct access to a specific resource or service.

Bulkhead Pattern

https://docs.microsoft.com/en-us/azure/architecture/patterns/bulkhead
Isolate elements of an application into pools so that if one fails, the others will continue to function.
Partition service instances into different groups, based on consumer load and availability requirements. This design helps to isolate failures, and allows you to sustain service functionality for some consumers, even during a failure.
A consumer can also partition resources, to ensure that resources used to call one service don't affect the resources used to call another service. For example, a consumer that calls multiple services may be assigned a connection pool for each service. If a service begins to fail, it only affects the connection pool assigned for that service, allowing the consumer to continue using the other services.
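A minimal in-process sketch of that connection-pool idea, assuming two hypothetical downstream services: each service gets its own fixed-size thread pool, so a hang or overload in one cannot exhaust the resources used to call the other. All names are illustrative.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Bulkhead sketch: one bounded pool per downstream service, so a failure in
// service A cannot consume the threads reserved for calls to service B.
public class BulkheadDemo {
    private static final ExecutorService serviceAPool = Executors.newFixedThreadPool(4);
    private static final ExecutorService serviceBPool = Executors.newFixedThreadPool(4);

    public static void main(String[] args) {
        serviceAPool.submit(() -> System.out.println("calling service A"));
        serviceBPool.submit(() -> System.out.println("calling service B"));
        // even if serviceAPool is saturated by slow calls, serviceBPool is unaffected
        serviceAPool.shutdown();
        serviceBPool.shutdown();
    }
}
```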
The benefits of this pattern include:
  • Isolates consumers and services from cascading failures. An issue affecting a consumer or service can be isolated within its own bulkhead, preventing the entire solution from failing.
  • Allows you to preserve some functionality in the event of a service failure. Other services and features of the application will continue to work.
  • Allows you to deploy services that offer a different quality of service for consuming applications. A high-priority consumer pool can be configured to use high-priority services.
https://msdn.microsoft.com/en-us/library/dn568099.aspx

Competing Consumers Pattern


Enable multiple concurrent consumers to process messages received on the same messaging channel. This pattern enables a system to process multiple messages concurrently to optimize throughput, to improve scalability and availability, and to balance the workload.
Use a message queue to implement the communication channel between the application and the instances of the consumer service. The application posts requests in the form of messages to the queue, and the consumer service instances receive messages from the queue and process them. This approach enables the same pool of consumer service instances to handle messages from any instance of the application. Figure 1 illustrates this architecture.
Figure 1 - Using a message queue to distribute work to instances of a service
Related sections of the pattern guidance cover detecting poison messages and designing services for resiliency.
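A minimal in-process sketch of the pattern, using a java.util.concurrent queue in place of a durable cloud message queue; the consumer count and message names are illustrative.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Competing consumers sketch: several workers take from the same channel,
// so whichever worker is free picks up the next message.
public class CompetingConsumersDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> channel = new LinkedBlockingQueue<>();
        for (int c = 0; c < 3; c++) {                      // pool of consumer instances
            final int id = c;
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        String msg = channel.take();       // blocks until a message arrives
                        System.out.println("consumer " + id + " processed " + msg);
                    }
                } catch (InterruptedException e) { /* shut down */ }
            });
            worker.setDaemon(true);
            worker.start();
        }
        for (int m = 0; m < 10; m++) channel.put("message-" + m); // the producer
        Thread.sleep(500);                                 // let the pool drain the queue
    }
}
```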

Command and Query Responsibility Segregation (CQRS) Pattern

Segregate operations that read data from operations that update data by using separate interfaces. This pattern can maximize performance, scalability, and security; support evolution of the system over time through higher flexibility; and prevent update commands from causing merge conflicts at the domain level.
The query model for reading data and the update model for writing data may access the same physical store, perhaps by using SQL views or by generating projections on the fly. However, it is common to separate the data into different physical stores to maximize performance, scalability, and security, as shown in Figure 3.
Figure 3 - A CQRS architecture with separate read and write stores

The read store can be a read-only replica of the write store, or the read and write stores may have a different structure altogether. Using multiple read-only replicas of the read store can considerably increase query performance and application UI responsiveness, especially in distributed scenarios where read-only replicas are located close to the application instances. Some database systems, such as SQL Server, provide additional features such as failover replicas to maximize availability.
Separation of the read and write stores also allows each to be scaled appropriately to match the load. For example, read stores typically encounter a much higher load than write stores.
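A toy sketch of that separation, assuming a single in-memory process: commands write to the authoritative store and refresh a denormalized read view, while queries touch only the view. All names are illustrative.

```java
import java.util.*;

// CQRS sketch: the write model holds authoritative order data; the read model
// is a denormalized projection that queries read directly.
public class CqrsDemo {
    private final Map<String, List<String>> writeStore = new HashMap<>(); // write model
    private final Map<String, String> readView = new HashMap<>();         // read model

    // Command side: update the write store, then project into the read view.
    public void placeOrder(String orderId, List<String> items) {
        writeStore.put(orderId, items);
        readView.put(orderId, orderId + ": " + items.size() + " item(s)");
    }

    // Query side: serve reads from the prepopulated view only.
    public String getOrderSummary(String orderId) {
        return readView.get(orderId);
    }

    public static void main(String[] args) {
        CqrsDemo app = new CqrsDemo();
        app.placeOrder("o-1", List.of("book", "pen"));
        System.out.println(app.getOrderSummary("o-1")); // o-1: 2 item(s)
    }
}
```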

https://martinfowler.com/bliki/CQRS.html
CQRS stands for Command Query Responsibility Segregation.  At its heart is the notion that you can use a different model to update information than the model you use to read information. For some situations, this separation can be valuable, but beware that for most systems CQRS adds risky complexity.
The mainstream approach people use for interacting with an information system is to treat it as a CRUD datastore.

Event Sourcing Pattern

Use an append-only store to record the full series of events that describe actions taken on data in a domain, rather than storing just the current state, so that the store can be used to materialize the domain objects.
    The CRUD approach has some limitations:
    • In a collaborative domain with many concurrent users, data update conflicts are more likely to occur because the update operations take place on a single item of data.
    • Unless there is an additional auditing mechanism, which records the details of each operation in a separate log, history is lost.

    Solution

    The Event Sourcing pattern defines an approach to handling operations on data that is driven by a sequence of events, each of which is recorded in an append-only store. Application code sends a series of events that imperatively describe each action that has occurred on the data to the event store, where they are persisted. Each event represents a set of changes to the data (such as AddedItemToOrder).
    The events are persisted in an event store that acts as the source of truth or system of record (the authoritative data source for a given data element or piece of information) about the current state of the data. The event store typically publishes these events so that consumers can be notified and can handle them if needed.

    Typical uses of the events published by the event store are to maintain materialized views of entities as actions in the application change them, and for integration with external systems. For example, a system may maintain a materialized view of all customer orders that is used to populate parts of the UI. As the application adds new orders, adds or removes items on the order, and adds shipping information, the events that describe these changes can be handled and used to update the materialized view.
    • Events are immutable and so can be stored using an append-only operation. The user interface, workflow, or process that initiated the action that produced the events can continue, and the tasks that handle the events can run in the background. This, combined with the fact that there is no contention during the execution of transactions, can vastly improve performance and scalability for applications, especially for the presentation level or user interface.
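    A toy sketch of the approach, with illustrative event names: changes are appended as immutable events, and the current state is materialized by replaying the stream.

```java
import java.util.ArrayList;
import java.util.List;

// Event sourcing sketch: the event list is append-only; state is never
// updated in place, only derived by replaying the events.
public class EventSourcingDemo {
    record Event(String type, String item) {}                  // e.g., AddedItemToOrder

    private final List<Event> eventStore = new ArrayList<>();  // append-only store

    public void append(Event e) { eventStore.add(e); }

    // Materialize current state by replaying the full event stream.
    public List<String> currentItems() {
        List<String> items = new ArrayList<>();
        for (Event e : eventStore) {
            switch (e.type()) {
                case "AddedItemToOrder" -> items.add(e.item());
                case "RemovedItemFromOrder" -> items.remove(e.item());
            }
        }
        return items;
    }

    public static void main(String[] args) {
        EventSourcingDemo order = new EventSourcingDemo();
        order.append(new Event("AddedItemToOrder", "book"));
        order.append(new Event("AddedItemToOrder", "pen"));
        order.append(new Event("RemovedItemFromOrder", "pen"));
        System.out.println(order.currentItems()); // [book]
    }
}
```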
    External Configuration Store Pattern

    Move configuration information out of the application deployment package to a centralized location. This pattern can provide opportunities for easier management and control of configuration data, and for sharing configuration data across applications and application instances.
    Federated Identity Pattern

    Delegate authentication to an external identity provider. This pattern can simplify development, minimize the requirement for user administration, and improve the user experience of the application.
    Implement an authentication mechanism that can use federated identity. Separating user authentication from the application code, and delegating authentication to a trusted identity provider, can considerably simplify development and allow users to authenticate using a wider range of identity providers (IdPs) while minimizing the administrative overhead. It also allows you to clearly decouple authentication from authorization.
    Gatekeeper Pattern

    Protect applications and services by using a dedicated host instance that acts as a broker between clients and the application or service, validates and sanitizes requests, and passes requests and data between them.
    To minimize the risk of clients gaining access to sensitive information and services, decouple hosts or tasks that expose public endpoints from the code that processes requests and accesses storage. This can be achieved by using a façade or a dedicated task that interacts with clients and then hands off the request (perhaps through a decoupled interface) to the hosts or tasks that will handle the request. Figure 1 shows a high-level view of this approach.
    Figure 1 - High level overview of this pattern
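    A minimal sketch with hypothetical names: the gatekeeper owns the public endpoint, performs validation and sanitization only, and hands the request to a trusted host through a decoupled interface. Real validation rules would be far richer than this.

```java
// Gatekeeper sketch: suspect input is rejected at the public edge; only
// sanitized requests reach the trusted host that does the real work.
public class GatekeeperDemo {
    interface TrustedHost { String handle(String sanitizedRequest); }

    static class Gatekeeper {
        private final TrustedHost backend;
        Gatekeeper(TrustedHost backend) { this.backend = backend; }

        String receive(String rawRequest) {
            // validate and sanitize here; storage keys live only on the trusted host
            if (rawRequest == null || rawRequest.length() > 1024 || rawRequest.contains(";")) {
                return "400 Bad Request";                  // reject suspect input outright
            }
            return backend.handle(rawRequest.trim());      // hand off via the decoupled interface
        }
    }

    public static void main(String[] args) {
        Gatekeeper edge = new Gatekeeper(req -> "200 OK: processed '" + req + "'");
        System.out.println(edge.receive("get order 42"));
        System.out.println(edge.receive("get order 42; drop table orders"));
    }
}
```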

    Health Endpoint Monitoring Pattern

    Implement health monitoring by sending requests to an endpoint on the application. The application should perform the necessary checks, and return an indication of its status.
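    A minimal sketch using the JDK's built-in com.sun.net.httpserver: an external monitoring tool polls /health, and the handler runs the application's own checks (placeholders here) and maps the result to an HTTP status code.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

// Health endpoint sketch: /health returns 200 when all checks pass, 503 otherwise.
public class HealthEndpointDemo {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/health", exchange -> {
            boolean healthy = checkDatabase() && checkDiskSpace(); // app-specific checks
            byte[] body = (healthy ? "OK" : "UNHEALTHY").getBytes();
            exchange.sendResponseHeaders(healthy ? 200 : 503, body.length);
            try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
        });
        server.start(); // server threads keep the JVM alive; poll http://localhost:8080/health
    }

    // Placeholder checks; a real endpoint would verify its actual dependencies.
    static boolean checkDatabase() { return true; }
    static boolean checkDiskSpace() { return true; }
}
```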

    Compensating Transaction Pattern

    Undo the work performed by a series of steps, which together define an eventually consistent operation, if one or more of the steps fail. Operations that follow the eventual consistency model are commonly found in cloud-hosted applications that implement complex business processes and workflows.
    Applications running in the cloud frequently modify data. This data may be spread across an assortment of data sources held in a variety of geographic locations. To avoid contention and improve performance in a distributed environment such as this, an application should not attempt to provide strong transactional consistency. Rather, the application should implement eventual consistency. In this model, a typical business operation consists of a series of autonomous steps. While these steps are being performed the overall view of the system state may be inconsistent, but when the operation has completed and all of the steps have been executed the system should become consistent again.

    Solution

    Implement a compensating transaction. The steps in a compensating transaction must undo the effects of the steps in the original operation. A compensating transaction might not be able to simply replace the current state with the state the system was in at the start of the operation because this approach could overwrite changes made by other concurrent instances of an application. Rather, it must be an intelligent process that takes into account any work done by concurrent instances. This process will usually be application-specific, driven by the nature of the work performed by the original operation.
    A common approach to implementing an eventually consistent operation that requires compensation is to use a workflow. As the original operation proceeds, the system records information about each step and how the work performed by that step can be undone. If the operation fails at any point, the workflow rewinds back through the steps it has completed and performs the work that reverses each step. Note that a compensating transaction might not have to undo the work in the exact mirror-opposite order of the original operation, and it may be possible to perform some of the undo steps in parallel.
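    A toy sketch of the rewind idea, with illustrative steps: each completed step pushes its undo action onto a stack, and a failure replays the stack in reverse order.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Compensating transaction sketch: record how to undo each step as it
// completes; on failure, rewind through the recorded compensations.
public class CompensatingTransactionDemo {
    public static void main(String[] args) {
        Deque<Runnable> compensations = new ArrayDeque<>();
        try {
            System.out.println("book flight");
            compensations.push(() -> System.out.println("cancel flight"));

            System.out.println("book hotel");
            compensations.push(() -> System.out.println("cancel hotel"));

            throw new IllegalStateException("car rental failed"); // step 3 fails
        } catch (RuntimeException e) {
            // rewind: undo the completed steps in reverse order
            while (!compensations.isEmpty()) compensations.pop().run();
        }
    }
}
```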
    Compute Resource Consolidation Pattern

    Consolidate multiple tasks or operations into a single computational unit. This pattern can increase compute resource utilization, and reduce the costs and management overhead associated with performing compute processing in cloud-hosted applications.
    Tasks can be grouped according to a variety of criteria based on the features provided by the environment, and the costs associated with these features. A common approach is to look for tasks that have a similar profile concerning their scalability, lifetime, and processing requirements. Grouping these items together allows them to scale as a unit. The elasticity provided by many cloud environments enables additional instances of a computational unit to be started and stopped according to the workload.


    Queue-based Load Leveling Pattern

    Use a queue that acts as a buffer between a task and a service that it invokes in order to smooth intermittent heavy loads that may otherwise cause the service to fail or the task to time out. This pattern can help to minimize the impact of peaks in demand on availability and responsiveness for both the task and the service.
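    A minimal in-process sketch: bursts from the producer are absorbed by a bounded queue, while a single worker drains it at the service's sustainable pace. Queue size and delays are illustrative.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Load leveling sketch: the queue smooths a sudden burst of requests so the
// service sees a steady rate instead of a spike.
public class LoadLevelingDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(100);

        Thread service = new Thread(() -> {
            try {
                while (true) {
                    String request = buffer.take();
                    Thread.sleep(100);                 // the service's sustainable pace
                    System.out.println("handled " + request);
                }
            } catch (InterruptedException e) { /* shut down */ }
        });
        service.setDaemon(true);
        service.start();

        for (int i = 0; i < 20; i++) buffer.put("req-" + i); // a sudden burst
        Thread.sleep(2500);                                  // the queue absorbs the spike
    }
}
```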
    Runtime Reconfiguration Pattern

    Design an application so that it can be reconfigured without requiring redeployment or restarting the application. This helps to maintain availability and minimize downtime.

    Throttling Pattern

    Control the consumption of resources used by an instance of an application, an individual tenant, or an entire service. This pattern can allow the system to continue to function and meet service level agreements, even when an increase in demand places an extreme load on resources.
    There are many strategies available for handling varying load in the cloud, depending on the business goals for the application. One strategy is to use autoscaling to match the provisioned resources to the user needs at any given time. This has the potential to consistently meet user demand, while optimizing running costs. However, while autoscaling may trigger the provisioning of additional resources, this provisioning is not instantaneous. If demand grows quickly, there may be a window of time where there is a resource deficit.
    An alternative strategy to autoscaling is to allow applications to use resources only up to some soft limit, and then throttle them when this limit is reached. The system should monitor how it is using resources so that, when usage exceeds some system-defined threshold, it can throttle requests from one or more users to enable the system to continue functioning and meet any service level agreements (SLAs) that are in place. For more information on monitoring resource usage, see the Instrumentation and Telemetry Guidance.
    The system could implement several throttling strategies, including:
    • Rejecting requests from an individual user who has already accessed system APIs more than n times per second over a given period of time. This requires that the system meters the use of resources for each tenant or user running an application. For more information, see the Service Metering Guidance. (A minimal sketch of this strategy follows this list.)
    • Disabling or degrading the functionality of selected nonessential services so that essential services can run unimpeded with sufficient resources. For example, if the application is streaming video output, it could switch to a lower resolution.
    • Using load leveling to smooth the volume of activity (this approach is covered in more detail by the Queue-based Load Leveling pattern). In a multitenant environment, this approach will reduce the performance for every tenant. If the system must support a mix of tenants with different SLAs, the work for high-value tenants might be performed immediately. Requests for other tenants can be held back, and handled when the backlog has eased. The Priority Queue pattern could be used to help implement this approach.
    • Deferring operations being performed on behalf of lower priority applications or tenants. These operations can be suspended or curtailed, with an exception generated to inform the tenant that the system is busy and that the operation should be retried later.
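    A toy sketch of the first strategy above, rejecting a user's requests beyond n per second, using a simple fixed one-second window. The limit and the window handling are illustrative, not production-grade rate limiting.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Throttling sketch: count each user's calls in the current one-second window
// and reject calls over the limit.
public class ThrottlingDemo {
    private static final int LIMIT_PER_SECOND = 5;
    private final Map<String, AtomicInteger> counts = new ConcurrentHashMap<>();
    private volatile long windowStart = System.currentTimeMillis();

    public boolean allow(String userId) {
        long now = System.currentTimeMillis();
        if (now - windowStart >= 1000) {   // start a new one-second window
            counts.clear();
            windowStart = now;
        }
        int used = counts.computeIfAbsent(userId, k -> new AtomicInteger()).incrementAndGet();
        return used <= LIMIT_PER_SECOND;   // false => throttle this request
    }

    public static void main(String[] args) {
        ThrottlingDemo limiter = new ThrottlingDemo();
        for (int i = 1; i <= 7; i++) {
            System.out.println("request " + i + " allowed: " + limiter.allow("tenant-1"));
        }
    }
}
```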
    Valet Key Pattern

    Use a token or key that provides clients with restricted direct access to a specific resource or service in order to offload data transfer operations from the application code. This pattern is particularly useful in applications that use cloud-hosted storage systems or queues, and can minimize cost and maximize scalability and performance.
    To resolve the problem of controlling access to a data store where the store itself cannot manage authentication and authorization of clients, one typical solution is to restrict access to the data store’s public connection and provide the client with a key or token that the data store itself can validate.
    This key or token is usually referred to as a valet key. It provides time-limited access to specific resources and allows only predefined operations such as reading and writing to storage or queues, or uploading and downloading in a web browser. Applications can create and issue valet keys to client devices and web browsers quickly and easily, allowing clients to perform the required operations without requiring the application to directly handle the data transfer. This removes the processing overhead, and the consequent impact on performance and scalability, from the application and the server.
    The client uses this token to access a specific resource in the data store for only a specific period, and with specific restrictions on access permissions, as shown in Figure 1. After the specified period, the key becomes invalid and will not allow subsequent access to the resource.
    Figure 1 - Overview of the pattern


    Storage resources can be exposed through either public access or temporary access. Key points about temporary access URLs:
    • The temporary access URLs are time-limited by your application-supplied expiration.
    • Temporary access URLs need to be generated on the server and made available to the client.
    • Any code (in the cloud or elsewhere) with access to the storage access key will be able to create temporary access URLs.

    Temporary access URLs are secured through hashing, a proven cryptographic technique that requires access to the storage key and uses it to create a unique signature for any string, in this case the temporary access URL. The associated hash is checked every time a client attempts to use the temporary access URL; without a correct one, access is denied. Adversaries without access to the storage key cannot tamper with an existing temporary access URL, create a new one, or guess a valid one.
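    A sketch of the general hash-based scheme just described, not the actual Azure SAS wire format: the server signs the resource path plus expiry with an HMAC keyed by the storage key, and later re-computes the HMAC to verify the signature and expiry before granting access.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Valet key sketch: only a holder of the storage key can produce a valid
// signature, so the URL cannot be forged or tampered with.
public class ValetKeyDemo {
    public static String sign(String resource, long expiresAtEpochSec, byte[] storageKey)
            throws Exception {
        String payload = resource + "?expires=" + expiresAtEpochSec;
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(storageKey, "HmacSHA256"));
        String sig = Base64.getUrlEncoder()
                .encodeToString(mac.doFinal(payload.getBytes(StandardCharsets.UTF_8)));
        return payload + "&sig=" + sig;   // server later re-computes the HMAC and
    }                                     // checks both the signature and the expiry

    public static void main(String[] args) throws Exception {
        long expiry = System.currentTimeMillis() / 1000 + 600; // valid for 10 minutes
        System.out.println(sign("/container/blob.txt", expiry,
                "secret-storage-key".getBytes(StandardCharsets.UTF_8)));
    }
}
```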

    Follow the principle of least privilege and provide only those rights necessary, and only for as long as necessary. Further, when transporting a temporary access URL, do so over a secure channel: HTTPS protects the query string during transport, and the cloud storage service handles authorization.

    A Shared Access Signature (SAS) is the Windows Azure Storage feature used to construct temporary access URLs that grant time-limited permission to read or write blobs.

    Permissions are part of the special URL and will not allow one user to interfere with blobs belonging to another user.
    The SAS technique can also be used to provide temporary access URLs for reading non-public blob resources.

    Sharding Pattern

    Divide a data store into a set of horizontal partitions, or shards. This pattern can improve scalability when storing and accessing large volumes of data.
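    A minimal sketch of hash-based shard routing, with illustrative shard names: the shard key is hashed to select the partition that holds the record.

```java
import java.util.List;

// Sharding sketch: route each key to one of a fixed set of shards by hashing.
public class ShardingDemo {
    private static final List<String> SHARDS = List.of("shard-0", "shard-1", "shard-2");

    // Math.floorMod keeps the index non-negative even for negative hash codes.
    static String shardFor(String shardKey) {
        return SHARDS.get(Math.floorMod(shardKey.hashCode(), SHARDS.size()));
    }

    public static void main(String[] args) {
        for (String customer : List.of("alice", "bob", "carol")) {
            System.out.println(customer + " -> " + shardFor(customer));
        }
    }
}
```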
    Static Content Hosting Pattern
    Deploy static content to a cloud-based storage service that can deliver it directly to the client. This pattern can reduce the requirement for potentially expensive compute instances.
    Index Table Pattern
    https://msdn.microsoft.com/en-us/library/dn589791.aspx
    Create indexes over the fields in data stores that are frequently referenced by query criteria. This pattern can improve query performance by allowing applications to more quickly retrieve data from a data store.
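    A toy sketch of the idea: the primary store is keyed by customer ID, and a secondary index table maps a frequently queried field (town, here) back to the matching primary keys. Data and field names are illustrative.

```java
import java.util.*;

// Index table sketch: query by a non-key field via a secondary index instead
// of scanning every record in the primary store.
public class IndexTableDemo {
    public static void main(String[] args) {
        Map<String, String> customersById = Map.of(
                "c1", "Alice,Redmond", "c2", "Bob,Seattle", "c3", "Carol,Redmond");

        // build the index over the frequently queried field
        Map<String, List<String>> idsByTown = new HashMap<>();
        customersById.forEach((id, record) -> {
            String town = record.split(",")[1];
            idsByTown.computeIfAbsent(town, t -> new ArrayList<>()).add(id);
        });

        // look up matching primary keys through the index, then fetch the records
        for (String id : idsByTown.getOrDefault("Redmond", List.of())) {
            System.out.println(customersById.get(id));
        }
    }
}
```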
    Leader Election Pattern

    Coordinate the actions performed by a collection of collaborating task instances in a distributed application by electing one instance as the leader that assumes responsibility for managing the other instances. This pattern can help to ensure that tasks do not conflict with each other, cause contention for shared resources, or inadvertently interfere with the work that other task instances are performing.
    Materialized View Pattern

    Generate prepopulated views over the data in one or more data stores when the data is formatted in a way that does not favor the required query operations.
    To support efficient querying, a common solution is to generate, in advance, a view that materializes the data in a format most suited to the required results set. The Materialized View pattern describes generating prepopulated views of data in environments where the source data is not in a format that is suitable for querying, where generating a suitable query is difficult, or where query performance is poor due to the nature of the data or the data store.
    These materialized views, which contain only data required by a query, allow applications to quickly obtain the information they need. In addition to joining tables or combining data entities, materialized views may include the current values of calculated columns or data items, the results of combining values or executing transformations on the data items, and values specified as part of the query. A materialized view may even be optimized for just a single query.
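    A toy sketch: the write path keeps a precomputed per-customer total up to date, so the query path reads a single value instead of aggregating the source data each time. All names are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Materialized view sketch: the view (a running total) is refreshed on every
// write, making the query a constant-time lookup.
public class MaterializedViewDemo {
    private final Map<String, Integer> orderTotalsByCustomer = new HashMap<>(); // the view

    public void recordOrder(String customerId, int amount) {
        // the write path also refreshes the materialized view
        orderTotalsByCustomer.merge(customerId, amount, Integer::sum);
    }

    public int totalFor(String customerId) {        // query path: O(1) lookup
        return orderTotalsByCustomer.getOrDefault(customerId, 0);
    }

    public static void main(String[] args) {
        MaterializedViewDemo view = new MaterializedViewDemo();
        view.recordOrder("alice", 30);
        view.recordOrder("alice", 12);
        System.out.println(view.totalFor("alice")); // 42
    }
}
```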
    Pipes and Filters Pattern

    Decompose a task that performs complex processing into a series of discrete elements that can be reused.
    Decompose the processing required for each stream into a set of discrete components (or filters), each of which performs a single task. By standardizing the format of the data that each component receives and emits, these filters can be combined into a pipeline. This helps to avoid duplicating code, and makes it easy to remove, replace, or integrate additional components if the processing requirements change. Figure 2 shows an example of this structure.
    Figure 2 - A solution implemented by using pipes and filters
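    A minimal sketch using java.util.function.Function as the standardized filter contract: each filter does one job, and filters compose into a pipeline that is easy to reorder or extend.

```java
import java.util.function.Function;

// Pipes and filters sketch: every filter is String -> String, so any filter
// can be inserted, removed, or replaced without touching the others.
public class PipesAndFiltersDemo {
    public static void main(String[] args) {
        Function<String, String> trim = String::trim;
        Function<String, String> toLower = s -> s.toLowerCase();
        Function<String, String> collapseSpaces = s -> s.replaceAll("\\s+", " ");

        // pipeline = trim | toLower | collapseSpaces
        Function<String, String> pipeline = trim.andThen(toLower).andThen(collapseSpaces);
        System.out.println(pipeline.apply("  Hello   CLOUD Patterns  "));
        // prints: hello cloud patterns
    }
}
```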

    Priority Queue Pattern

    Prioritize requests sent to services so that requests with a higher priority are received and processed more quickly than those of a lower priority. This pattern is useful in applications that offer different service level guarantees to individual clients.
    A queue is usually a first-in, first-out (FIFO) structure, and consumers typically receive messages in the same order that they were posted to the queue. However, some message queues support priority messaging; the application posting a message can assign a priority to a message and the messages in the queue are automatically reordered so that messages with a higher priority will be received before those of a lower priority. 
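    A minimal in-process sketch using java.util.concurrent.PriorityBlockingQueue as the priority-aware channel; the priorities and message bodies are illustrative.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

// Priority queue sketch: messages carry a priority, and the consumer receives
// higher-priority messages first regardless of arrival order.
public class PriorityQueueDemo {
    record Message(int priority, String body) {}   // higher number = more urgent

    public static void main(String[] args) throws InterruptedException {
        PriorityBlockingQueue<Message> queue = new PriorityBlockingQueue<>(
                16, Comparator.comparingInt((Message m) -> m.priority()).reversed());

        queue.put(new Message(1, "low: nightly report"));
        queue.put(new Message(9, "high: premium-tenant request"));
        queue.put(new Message(5, "medium: standard request"));

        while (!queue.isEmpty()) {
            System.out.println(queue.take().body()); // high, then medium, then low
        }
    }
}
```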
    Messaging is a key strategy employed in many distributed environments such as the cloud. It enables applications and services to communicate and cooperate, and can help to build scalable and resilient solutions. Messaging supports asynchronous operations, enabling you to decouple a process that consumes a service from the process that implements the service.
    Autoscaling is the process of dynamically allocating the resources required by an application to match performance requirements and satisfy service level agreements (SLAs). As the volume of work grows, an application may require additional resources to enable it to perform its tasks in a timely manner.
    Autoscaling is often an automated process that can help to ease management overhead by reducing the need for an operator to continually monitor the performance of a system and make decisions about adding or removing resources.
    Autoscaling should also be an elastic process; more resources can be provisioned as the load increases on the system, but as demand slackens resources can be de-allocated to minimize costs while still maintaining adequate performance and meeting SLAs.
    A cache can be implemented in two main ways:
    • An in-memory cache, where data is held locally on the computer running an instance of an application.
    • A shared cache, which can be accessed by several instances of an application running on different computers.
