http://dancres.github.io/Pages/
Thought Provokers
Ramblings that make you think about the way you design. Not everything can be solved with big servers, databases and transactions.
- Harvest, Yield and Scalable Tolerant Systems - Real world applications of CAP from Brewer et al
- On Designing and Deploying Internet Scale Services - James Hamilton
- Latency Exists, Cope! - Commentary on coping with latency and its architectural impacts
- Latency - the new web performance bottleneck - not at all new (see Patterson), but noteworthy
- The Perils of Good Abstractions - Building the perfect API/interface is difficult
- Chaotic Perspectives - Large scale systems are everything developers dislike - unpredictable, unordered and parallel
- Website Architecture - A collection of scalable architecture papers from various large websites
- Data on the Outside versus Data on the Inside - Pat Helland
- Memories, Guesses and Apologies - Pat Helland
- SOA and Newton's Universe - Pat Helland
- Building on Quicksand - Pat Helland
- Why Distributed Computing? - Jim Waldo
- A Note on Distributed Computing - Waldo, Wollrath et al
- Stevey's Google Platforms Rant - Yegge's SOA platform experience
Amazon
Somewhat about the technology, but more interesting is the culture and organization they've created to work with it.
- A Conversation with Werner Vogels - Coverage of Amazon's transition to a service-based architecture
- Discipline and Focus - Additional coverage of Amazon's transition to a service-based architecture
- Vogels on Scalability
- SOA creates order out of chaos @ Amazon
Google
Current "rocket science" in distributed systems.
- MapReduce
- Chubby Lock Manager
- Google File System
- BigTable
- Data Management for Internet-Scale Single-Sign-On
- Dremel: Interactive Analysis of Web-Scale Datasets
- Large-scale Incremental Processing Using Distributed Transactions and Notifications
- Megastore: Providing Scalable, Highly Available Storage for Interactive Services - Smart design for low latency Paxos implementation across datacentres.
- Spanner - Google's scalable, multi-version, globally-distributed, and synchronously-replicated database.
- Photon - Fault-tolerant and Scalable Joining of Continuous Data Streams. Joins are tough especially with time-skew, high availability and distribution.
- Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing - Data warehousing system that stores critical measurement data related to Google's Internet advertising business.
eBay
Interestingly, they dumped most of J2EE and use a lot of database partitioning. Check out their site upgrade tool as well.
Consistency Models
Key to building systems that suit their environments is finding the right tradeoff between consistency and availability.
- CAP Conjecture - Consistency, Availability, Partition Tolerance cannot all be satisfied at once
- Consistency, Availability, and Convergence - Proves the upper bound for consistency possible in a typical system
- CAP Twelve Years Later: How the "Rules" Have Changed - Eric Brewer expands on the original tradeoff description
- Consistency and Availability - Vogels
- Eventual Consistency - Vogels
- Avoiding Two-Phase Commit - Two phase commit avoidance approaches
- 2PC or not 2PC, Wherefore Art Thou XA? - Two phase commit isn't a silver bullet
- Life Beyond Distributed Transactions - Helland
- If you have too much data, then 'good enough' is good enough - NoSQL, Future of data theory - Pat Helland
- Starbucks doesn't do two phase commit - Asynchronous mechanisms at work
- You Can't Sacrifice Partition Tolerance - Additional CAP commentary
- Optimistic Replication - Relaxed consistency approaches for data replication
Theory
Papers that describe various important elements of distributed systems design.
- Distributed Computing Economics - Jim Gray
- Rules of Thumb in Data Engineering - Jim Gray and Prashant Shenoy
- Fallacies of Distributed Computing - Peter Deutsch
- Impossibility of distributed consensus with one faulty process - also known as FLP [access requires account and/or payment, a free version can be found here]
- Unreliable Failure Detectors for Reliable Distributed Systems - A method for handling the challenges of FLP
- Lamport Clocks - How do you establish a global view of time when each computer's clock is independent?
- The Byzantine Generals Problem
- Lazy Replication: Exploiting the Semantics of Distributed Services
- Scalable Agreement - Towards Ordering as a Service
- Scalable Eventually Consistent Counters over Unreliable Networks - Scalable counting is tough in an unreliable world
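As a rough illustration of the question posed by the Lamport Clocks entry above, the sketch below shows the basic logical-clock rule: each process keeps a counter, increments it on every local event, and on receipt of a message jumps past the sender's timestamp. The class and method names (`LamportClock`, `tick`, `send`, `receive`) are hypothetical and chosen for this example only; this is a minimal sketch, not code from any of the listed papers.

```python
class LamportClock:
    """Logical clock: a counter that orders events without wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending counts as an event; the returned timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Move past the sender's timestamp so the receive is ordered after the send.
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two independent processes, each with its own clock (hypothetical scenario).
a, b = LamportClock(), LamportClock()
a.tick()                    # a performs a local event
stamp = a.send()            # a sends a message carrying its timestamp
b_time = b.receive(stamp)   # b's clock jumps past the timestamp it received
```

The only guarantee is that if one event causally precedes another, its timestamp is smaller; the reverse does not hold, so timestamps alone cannot tell you whether two events were concurrent.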
Languages and Tools
Issues of distributed systems construction with specific technologies.
- Programming Distributed Erlang Applications: Pitfalls and Recipes - Building reliable distributed applications isn't as simple as merely choosing Erlang and OTP.
Infrastructure
- Principles of Robust Timing over the Internet - Managing clocks is essential for even basics such as debugging
Storage
Paxos Consensus
Understanding this algorithm is the challenge. I would suggest reading "Paxos Made Simple" before the other papers and again afterward.
- The Part-Time Parliament - Leslie Lamport
- Paxos Made Simple - Leslie Lamport
- Paxos Made Live - An Engineering Perspective - Chandra et al
- Revisiting the Paxos Algorithm - Lynch et al
- How to build a highly available system with consensus - Butler Lampson
- Reconfiguring a State Machine - Lamport et al - changing cluster membership
- Implementing Fault-Tolerant Services Using the State Machine Approach: a Tutorial - Fred Schneider
Other Consensus Papers
- Mencius: Building Efficient Replicated State Machines for WANs - a consensus algorithm for wide-area networks
Gossip Protocols (Epidemic Behaviours)
- Epidemic Routing Bibliography
- How robust are gossip-based communication protocols?
- Astrolabe: A Robust and Scalable Technology For Distributed Systems Monitoring, Management, and Data Mining
- Epidemic Computing at Cornell
- Fighting Fire With Fire: Using Randomized Gossip To Combat Stochastic Scalability Limits
- Bi-Modal Multicast
- ACM SIGOPS Operating Systems Review - Gossip-based computer networking
- SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol
P2P
- Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications
- Kademlia: A Peer-to-peer Information System Based on the XOR Metric
- Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems
- PAST: A large-scale, persistent peer-to-peer storage utility - storage system atop Pastry
- SCRIBE: A large-scale and decentralised application-level multicast infrastructure - wide area messaging atop Pastry
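The "XOR metric" in the Kademlia title above can be illustrated briefly: the distance between two node IDs is their bitwise XOR read as an integer, and a lookup moves towards the nodes with the smallest distance to the target. The sketch below assumes tiny 4-bit IDs for readability; the function names `xor_distance` and `closest_nodes` are hypothetical, and this is not a full Kademlia routing table.

```python
def xor_distance(a: int, b: int) -> int:
    # Distance between two IDs is their bitwise XOR, read as an integer.
    return a ^ b


def closest_nodes(target: int, node_ids: list[int], k: int = 2) -> list[int]:
    # Return the k known nodes whose IDs are nearest to the target under XOR.
    return sorted(node_ids, key=lambda n: xor_distance(n, target))[:k]


# Hypothetical 4-bit node IDs; real deployments use much longer IDs.
nodes = [0b0001, 0b0100, 0b0111, 0b1100]
nearest = closest_nodes(0b0101, nodes, k=2)
```

Here the distances from `0b0101` are 4, 1, 2 and 9 respectively, so the two nearest nodes are `0b0100` and `0b0111`. The metric is symmetric, and an ID is at distance zero only from itself.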
Thought Provokers
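Ramblings that make you think about the way you design. Not everything can be solved with big servers, databases and transactions.
- Harvest, Yield and Scalable Tolerant Systems – Real world applications of CAP from Brewer et al
- On Designing and Deploying Internet Scale Services – James Hamilton
- Latency Exists, Cope! – Commentary on coping with latency and its architectural impacts
- Latency – the new web performance bottleneck – not exactly new, but noteworthy
- The Perils of Good Abstractions – Building the perfect API/interface is difficult
- Chaotic Perspectives – Large scale systems are everything developers dislike: unpredictable, unordered and parallel
- Website Architecture – A collection of scalable architecture papers from various large websites
- Data on the Outside versus Data on the Inside – Pat Helland
- Memories, Guesses and Apologies – Pat Helland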
- SOA and Newton’s Universe – Pat Helland
- Building on Quicksand – Pat Helland
- Why Distributed Computing – Jim Waldo
- A Note on Distributed Computing – Waldo, Wollrath et al
- Stevey’s Google Platforms Rant – Yegge's SOA platform experience
Amazon
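Somewhat about the technology, but more interesting is the culture and organization they've created to work with it.
- A Conversation with Werner Vogels – Coverage of Amazon's transition to a service-based architecture
- Discipline and Focus – Additional coverage of Amazon's transition to a service-based architecture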
- Vogels on Scalability
- SOA creates order out of chaos @ Amazon
Google
Current "rocket science" in distributed systems.
- MapReduce
- Chubby Lock Manager
- Google File System
- BigTable
- Data Management for Internet-Scale Single-Sign-On
- Dremel: Interactive Analysis of Web-Scale Datasets
- Large-scale Incremental Processing Using Distributed Transactions and Notifications
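- Megastore: Providing Scalable, Highly Available Storage for Interactive Services – Smart design for low latency Paxos implementation across datacentres.
- Spanner – Google's scalable, multi-version, globally-distributed, and synchronously-replicated database.
- Photon – Fault-tolerant and Scalable Joining of Continuous Data Streams. Joins are tough, especially with time-skew, high availability and distribution.
- Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing – Data warehousing system that stores critical measurement data related to Google's Internet advertising business.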
eBay
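Interestingly, they dumped most of J2EE and use a lot of database partitioning. Check out their site upgrade tool as well.
Consistency Models
Key to building systems that suit their environments is finding the right tradeoff between consistency and availability.
- CAP Conjecture – Consistency, Availability, Partition Tolerance cannot all be satisfied at once
- Consistency, Availability, and Convergence – Proves the upper bound for consistency possible in a typical system
- CAP Twelve Years Later: How the “Rules” Have Changed – Eric Brewer expands on the original tradeoff description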
- Consistency and Availability – Vogels
- Eventual Consistency – Vogels
- Avoiding Two-Phase Commit – Two phase commit avoidance approaches
- 2PC or not 2PC, Wherefore Art Thou XA – Two phase commit isn't a silver bullet
- Life Beyond Distributed Transactions – Helland
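- If you have too much data, then ‘good enough’ is good enough – NoSQL, Future of data theory – Pat Helland
- Starbucks doesn’t do two phase commit – Asynchronous mechanisms at work
- You Can’t Sacrifice Partition Tolerance – Additional CAP commentary
- Optimistic Replication – Relaxed consistency approaches for data replication
Theory
Papers that describe various important elements of distributed systems design.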
- Distributed Computing Economics – Jim Gray
- Rules of Thumb in Data Engineering – Jim Gray and Prashant Shenoy
- Fallacies of Distributed Computing – Peter Deutsch
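- Impossibility of distributed consensus with one faulty process – also known as FLP [access requires account and/or payment, a free version can be found here]
- Unreliable Failure Detectors for Reliable Distributed Systems – A method for handling the challenges of FLP
- Lamport Clocks – How do you establish a global view of time when each computer's clock is independent?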
- The Byzantine Generals Problem
- Lazy Replication: Exploiting the Semantics of Distributed Services
- Scalable Agreement – Towards Ordering as a Service
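- Scalable Eventually Consistent Counters over Unreliable Networks – Scalable counting is tough in an unreliable world
Languages and Tools
Issues of distributed systems construction with specific technologies.
- Programming Distributed Erlang Applications: Pitfalls and Recipes – Building reliable distributed applications isn't as simple as merely choosing Erlang and OTP.
Infrastructure
- Principles of Robust Timing over the Internet – Managing clocks is essential for even basics such as debugging
Storage
Paxos Consensus
Understanding this algorithm is the challenge. I would suggest reading "Paxos Made Simple" before the other papers and again afterward.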
- The Part-Time Parliament – Leslie Lamport
- Paxos Made Simple – Leslie Lamport
- Paxos Made Live – An Engineering Perspective – Chandra et al
- Revisiting the Paxos Algorithm – Lynch et al
- How to build a highly available system with consensus – Butler Lampson
- Reconfiguring a State Machine – Lamport et al – changing cluster membership
- Implementing Fault-Tolerant Services Using the State Machine Approach: a Tutorial – Fred Schneider
Other Consensus Papers
Gossip Protocols (Epidemic Behaviours)
- Epidemic Routing Bibliography
- How robust are gossip-based communication protocols
- Astrolabe: A Robust and Scalable Technology For Distributed Systems Monitoring, Management, and Data Mining
- Epidemic Computing at Cornell
- Fighting Fire With Fire: Using Randomized Gossip To Combat Stochastic Scalability Limits
- Bi-Modal Multicast
- ACM SIGOPS Operating Systems Review – Gossip-based computer networking
- SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol