Friday, December 11, 2015

Reddit Architecture



http://highscalability.com/blog/2013/8/26/reddit-lessons-learned-from-mistakes-made-scaling-to-1-billi.html
http://blog.jobbole.com/47630/
  • Think of SSDs as cheap RAM, not expensive disk
  • Give users a little bit of power, see what they do with it, and turn the good stuff into features
  • It’s not necessary to build a scalable architecture from the start.
  • Treat non-logged-in users as second-class citizens. By always serving cached content to logged-out users, Akamai bears the brunt of reddit’s traffic. Huge performance improvement.
Started in a datacenter and moved functionality to EC2 over time.
  • Originally started with EC2 in 2006 to store and serve logos using S3.
  • In 2007 used S3 for thumbnails.
  • In 2008 EC2 used for batch processing using a VPN tunnel to the datacenter.
  • In 2009 EC2 used for the entire site. Took site down for an entire day to move the data over to EC2. Great example of Data Gravity that is talked about later.
  • EC2 is not a magic bullet. You suffer from higher network latency and noisy neighbors, so plan to work around it. Benefit is that you can grow as you need to.
  • Keep track of resource limits on EC2
    • All resources have per account limits.
    • Amazon doesn’t even know what some of their limits are.
    • Track limits and get them raised ahead of when you need them.
    • Catch exceptions to notice when limits have been reached.
  • Reddit architecture is straightforward. Users connect to a web tier which talks to an application tier. The application tier talks to memcache, Cassandra, and Postgres. Postgres uses a master-slave configuration. A batch system makes use of Cassandra and Postgres.
  • In contrast, Netflix uses a service-oriented architecture where components talk to each other using REST APIs.
    • Advantages: Easier auto-scaling because just the service that is having the problem needs to scale; Easier capacity planning; Problems can be identified more easily because they are isolated behind REST calls; Effects of change are narrowed; More efficient local caching.
    • Disadvantages: Need multiple dev teams or devs to work on multiple services so you need more people; Need a common platform to prevent work duplication; Too much overhead for a small team just starting out.
  • Data gravity. Wherever you put your data, you’re going to need to put your application near it. All your applications end up orbiting your data. Data creates a gravity well where everything needs to be near it, because data is the hardest thing to move. The bigger the dataset, the harder it is to move. Currently it would be very expensive to move out of EC2. This is why EC2 lets you ingress data for free and charges you to take it out. They want you to put all your data in the cloud.
  • Relational vs. Non-relational. Most data at reddit is key-value and it is stored in Postgres. Everything that deals with money is kept in a relational database because of transactions and easier analytics.
    • Postgres is very fast and now natively supports KV.
  • Sharding. Writes are split across four master databases: links, accounts, and subreddits; comments; votes; and misc.
    • Each has slaves. The vote database has one master and one slave. The comment database has one master and 12 slaves.
    • Avoid reading from the master if possible and direct reads to the slaves to keep the master dedicated to writes.
    • Client libraries would load balance across slaves and try a new slave if one was busy.
    • Wrote a database access layer called ‘thing’
    • This approach worked for a long time. A combination of sharded databases, read slaves, and tracking read slave performance for load balancing.
  • Cassandra.
    • Fast writes, fast negative lookups, easy incremental scalability, no single point of failure
    • At Netflix data is distributed around to three different zones. A copy of all data is in all three zones. If a zone is lost then they can still run.
    • Switching vote data into Cassandra was a huge win at reddit. Cassandra’s bloom filters enabled really fast negative lookups. For comments it’s very fast to tell which comments you didn’t vote on, so the negative answers come back quickly.
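
The fast negative lookups credited to bloom filters above can be illustrated with a minimal sketch. This is not Cassandra’s implementation: `BloomFilter`, `VoteStore`, the `user:comment` key format, and the filter sizes are all hypothetical, and a plain dict stands in for the slow on-disk data. The point is only that a "definitely not present" answer can be given without touching the slow store.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class VoteStore:
    """Hypothetical vote store; self.disk stands in for the slow lookup."""

    def __init__(self):
        self.filter = BloomFilter()
        self.disk = {}
        self.disk_reads = 0

    def record_vote(self, user, comment_id, direction):
        key = f"{user}:{comment_id}"
        self.filter.add(key)
        self.disk[key] = direction

    def get_vote(self, user, comment_id):
        key = f"{user}:{comment_id}"
        if not self.filter.might_contain(key):
            return None  # definite miss: answered without a disk read
        self.disk_reads += 1  # possible hit (or rare false positive)
        return self.disk.get(key)
```

When rendering a comment page, most `get_vote` calls are for comments the user never voted on, so in this sketch nearly all of them return before `disk_reads` is incremented.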

Mistakes

  • Did not account for increased latency after moving to EC2. In the datacenter they had sub-millisecond access between machines, so it was possible to make 1,000 calls to memcache for one page load. Not so on EC2. Memcache access times increased 10x to a millisecond, which made their old approach unusable. Fix was to batch calls to memcache so a large number of gets are in one request.
  • Promises, promises. Amazon doesn’t always deliver on its promises, so work around that. Design around failures instead of attempting to fix them. (There’s no reference here, maybe EBS?)
  • Used bleeding edge products in production. Cassandra was used when it was still early in its development cycle. Really great now, but it was problematic.
  • Should have offloaded a lot more of the work to the client earlier. The server did a lot of the page rendering when it could have been pushed to the client. Facebook is the master of this. You get a rectangle with a lot of divs and API calls are made to fill out all the divs. That’s how they wish reddit was done in the first place. It would have scaled much better. It also helps debugging as it's easy to determine which API calls become problems.
  • Not having enough monitoring and using a monitoring system that isn’t virtualization friendly. Started with Ganglia, which produced great graphs but was hard to use and hard to change quickly, especially in a virtual environment with instances coming and going all the time.
  • Did not expire data. At reddit you can see comments all the way back to the beginning of time. This causes data to grow and grow over time, which makes it more and more difficult to keep the hot data in the database. They’ve started to put limits in place so you can’t vote on old comments or add comments to old threads.
  • Not using consistent hashing. When hashing keys to caches, the problem is that if you need to add more cache you are stuck, because all your data is already assigned to however many caches you are hashing to. You can’t rebalance when growing the cache. Consistent hashing is a way around this problem. Solved by moving to Cassandra.
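
The consistent hashing point above can be made concrete with a small sketch. It is not reddit’s or Cassandra’s code: `HashRing`, `modulo_node`, `fraction_moved`, the node names, and the choice of 100 virtual points per node are assumptions for illustration. It compares how many keys change owner when a fourth cache is added under plain modulo hashing versus a hash ring.

```python
import bisect
import hashlib


def _hash(value):
    # Stable 64-bit hash (Python's built-in hash() is salted per process).
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hash ring with virtual points per node (illustrative)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.replicas):
            point = _hash(f"{node}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = node

    def node_for(self, key):
        if not self._points:
            raise ValueError("ring has no nodes")
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


def modulo_node(key, nodes):
    """Naive placement: the owner depends on the total number of nodes."""
    return nodes[_hash(key) % len(nodes)]


def fraction_moved(before, after, keys):
    """Share of keys whose owner differs between two placement functions."""
    return sum(1 for k in keys if before(k) != after(k)) / len(keys)


if __name__ == "__main__":
    keys = [f"key{i}" for i in range(2000)]
    old_nodes = ["cache-a", "cache-b", "cache-c"]
    new_nodes = old_nodes + ["cache-d"]
    ring_old, ring_new = HashRing(old_nodes), HashRing(new_nodes)
    print("modulo:", fraction_moved(lambda k: modulo_node(k, old_nodes),
                                    lambda k: modulo_node(k, new_nodes), keys))
    print("ring:  ", fraction_moved(ring_old.node_for, ring_new.node_for, keys))
```

With the ring, a key only changes owner when one of the new node’s points lands between it and its previous owner, so every moved key ends up on the new node and the rest of the cache stays warm. With modulo hashing, changing the node count reassigns most keys.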

Lessons Learned

  • The key to scaling is finding the bottlenecks before your users do.
  • Using a proxy was a huge boon to scaling. Users could be routed based on the URLs they were hitting. Reddit had a system that would monitor how long every URL took to service. People were put into different lanes. Slow traffic went one place and fast traffic went another. Splitting traffic based on the median speed of response was a huge boost.
  • Automate all the things. Your life will be so much easier if you treat your infrastructure like you treat your code. Everything should come up and configure itself automatically.
  • It’s not necessary to build a scalable architecture from the start. You don’t know what your feature set will be when you start out, so you won’t know what your scaling problems will be. Wait until your site grows so you can learn where your scaling problems are going to be.
  • Don’t use a Service Oriented Architecture when just starting out. Keep it in mind, it’s a good place to go when you’re medium sized, otherwise there’s just too much overhead.
  • Don’t follow fads. Sometimes fads do work, node.js for example.
  • Put a limit on everything. Put a high limit on anything that can happen repeatedly, and raise or lower the limit as needed. Block users if the limit is passed. This protects the service. One example is logo uploads for subreddits. Users figured out they could upload really big files and harm the system. Don’t accept huge text blobs either. Someone will figure out how to send you 5GB of text.
  • Plan for 3. When designing always assume there’s going to be a whole bunch of what you are doing. Application servers, databases, caches. Always assume you are going to have more than one from the start. It will be much easier to horizontally scale in the future.
  • Recode Python functions in C.
  • Expire data. Lock out old threads and create a fully rendered page and cache it. This is how to deal with all the old data that threatens to overwhelm your database. At the same time, don’t allow voting on old comments or adding comments to old threads. Users will rarely notice.
  • Each tool has a different use case. Memcache has no guarantees about durability, but is very fast, so the vote data is stored there to make rendering of pages as quick as possible. Cassandra is durable and fast, and gives fast negative lookups because of its bloom filter, so it was good for storing a durable copy of the votes for when the data isn't in memcache. Postgres is rock solid and relational, so it’s a good place to store votes as a backup for Cassandra (all data in Cassandra could be generated from Postgres if necessary) and also for doing batch processing, which sometimes needed the relational capabilities.
  • Treat non-logged-in users as second-class citizens. Logged-out users were about 80% of the traffic; now it’s closer to 50%. By always serving cached content to logged-out users, Akamai bears the brunt of reddit’s traffic. Huge performance improvement. Side benefit: if reddit goes down and you aren’t logged in, you might never know.
  • Put everything into a queue. Votes, comments, thumbnail creation, precomputed queries, spam processing and corrections. Queues allow you to know when there’s a problem by monitoring queue lengths. Side benefit is queues hide problems from users because things like vote requests are in the queue and if they aren’t applied immediately nobody notices.
  • Keep data in multiple Availability Zones.
  • Avoid keeping state on a single instance.
  • Take frequent snapshots of EBS disks.
  • Do not keep secret keys on the instance. Amazon now provides a service to give you on-instance keys.
  • Divide functions by Security Group.
  • Provide an API. Programmers will build on your platform. The reddit iPhone applications, for example, are built by other people using the API.
  • Be active in your own community. Reddit users love that reddit admins are actually on the site and interacting with them.
  • Let users do the work for you. On a site with user input one problem is always cheating, spam, and fraud. Most of the work of moderation is done by thousands of volunteers who take care of most of the spam problem. It works amazingly well and is one of the reasons the reddit team can remain small.
  • Give users a little bit of power, see what they do with it, and turn the good stuff into features. For example, when the ability to add CSS to subreddits was added they saw what people were doing and added many of the common things as features for everyone. This also makes users excited about doing stuff on reddit because they like that sense of control. Lots of other examples.
  • Listen to your users. Users are going to tell you a lot of things you don’t know that you probably want to know. For example, reddit gold started as a joke in the community. They made it a product and users love it.
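
The "each tool has a different use case" item above describes votes living in memcache for speed, in Cassandra as the durable copy, and in Postgres as the backup from which Cassandra could be regenerated. A rough sketch of a read path that follows that description is below. The class name `TieredVoteReader`, the key format, and the backfill behavior are assumptions, and plain dicts stand in for the three stores, so no real client library calls are shown.

```python
class TieredVoteReader:
    """Illustrative read path: cache first, then durable store, then backup.

    The three stores are dict-like stand-ins (hypothetical), not real
    memcache, Cassandra, or Postgres clients.
    """

    def __init__(self, cache, durable, relational):
        self.cache = cache            # fast, may lose data at any time
        self.durable = durable        # durable copy used on cache misses
        self.relational = relational  # source the durable copy can be rebuilt from
        self.served_from = []         # record of which tier answered each read

    def get_vote(self, key):
        if key in self.cache:
            self.served_from.append("cache")
            return self.cache[key]

        value = self.durable.get(key)
        source = "durable"
        if value is None:
            value = self.relational.get(key)
            source = "relational"
            if value is not None:
                # Assumed behavior: regenerate the durable copy on the way back.
                self.durable[key] = value

        if value is None:
            self.served_from.append("miss")
            return None

        self.cache[key] = value  # warm the cache for the next page render
        self.served_from.append(source)
        return value
```

In this sketch, losing the cache only costs latency, and losing the durable copy only costs a slower first read per key, which matches the roles the list assigns to each store.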
https://news.ycombinator.com/item?id=6280625
