Showing posts with label Resource-System Desgin. Show all posts

Thursday, July 9, 2015

(3) How should I prepare system design questions for Google/Facebook Interview? - Quora



1) Asynchronous is good  (use Queues, Topics etc.)
2) Parallel is good  (Multi-threading, load balancing etc.)
3) Avoid points of contention e.g. synchronization
4) Avoid writing to disk until you must - cache like crazy
5) Scale out not up
6) At web scale the speed of light is a problem
7) At web scale everything fails - networks, load balancers etc.

And here are some good articles to read and presentations to watch

http://s3.amazonaws.com/AllThing...
High Scalability - Amazon Architecture
http://www.addsimplicity.com/dow...
Scalability Best Practices: Lessons from eBay
http://www.hpts.ws/papers/2009/session9/shoup.pdf
Shopzilla - Performance By Design
http://static.googleusercontent....
facebook architecture for 600M users
Infrastructure at Facebook
Scaling Memcache at Facebook
Scuba: Diving into Data at Facebook
Unicorn: A System for Searching the Social Graph
http://www.cs.cmu.edu/~pavlo/cou...
Scale at Facebook
Scalability, Availability & Stability Patterns
LinkedIn - A Professional Network built with Java Technologies and ...
LinkedIn Data Infrastructure (QCon London 2012)
Global Netflix Platform
Timelines at Scale
Big Data in Real-Time at Twitter

https://github.com/checkcheckzz/system-design-interview
How should I prepare system design questions for Google/Facebook Interview?
In my view there are two components to solving system design questions at interviews:

1) Having enough knowledge and skills to come up with good solutions
2) Using the right process to approach the questions

The idea is that at an interview you have 30-40 minutes to answer a question like: "How would you design a search engine", or "Create this hierarchy of classes, which does X".

Obviously, to solve such a question in full, companies need weeks to years and big teams of engineers. So how on Earth should one person answer the question in 30 minutes?

Usually, such a discussion needs to start at the higher level where the interviewee presents their idea of how the architecture would be structured. Only after that, if the interviewer is ok with the higher level design one could go into more details about how they would tackle specific bottlenecks and other limitations. Most often, the interviewer would lead the discussion in one or another direction based on their goals.

In short, it's important to practice this on enough system design problems. Just take a few from the internet and give them a try. Here are a few examples:

1) Design a URL shortening service.
2) How would you design the feature in LinkedIn where it computes how many hops there are between you and another person?
3) If you were to design a web platform for online chess games, how would you do that?

As for some good resources to start with, I would recommend this video lecture from prof. David Malan from Harvard, which covers many useful aspects of designing scalable systems:

Read full article from (3) How should I prepare system design questions for Google/Facebook Interview? - Quora

Monday, June 29, 2015

A Coder's Leap to the US: System Design Preparation | ITint5



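What is really being tested is distributed backend service design under heavy traffic. These are all large companies handling at least billions of requests per day, so you need a deep understanding of design at that scale. The central idea of design is the trade-off. No solution is perfect, so the positive way to respond is to offer several options, state the pros and cons of each as you present it, evaluate each against the scenario under discussion (the given constraints), and finally choose the one that fits best.

The template goes roughly like this:
1. use case: discuss with the interviewer what the system needs to do. Most designs need two basic functions, "read" and "write". This fixes the functional boundary (and keeps you and the interviewer from getting carried away inventing new features and drifting off topic).

2. system constraint: confirm the basic system constraints with the interviewer: max requests/day, QPS, storage size, latency, availability, and so on. The interviewer may not give you these and may ask you to estimate instead. For example, if told that this is a feature on the Facebook home page, you need to estimate from Facebook's own traffic, active users, and similar figures (which requires a rough sense of these numbers for each company). This step fixes the performance boundary and also shapes your design framework.

3. abstract design: give the high-level design first, then refine it top-down. A high-level design usually has three layers: API, application layer, and data layer. The API is the interface that deals with callers. The application layer implements all business logic. To be distributed and scalable, the application layer must be stateless (no server keeps any request-related data, and every server runs exactly the same code base). That way, when the application cannot host more requests, you simply add more machines (horizontal scalability). For load balancing and failover, put a load balancer at the entrance of the application layer (ZooKeeper, for example, can handle the service discovery and failover coordination behind it). The data layer handles all reads and writes of business data. It is usually split into a cache and persistent storage. The cache makes frequently requested data fast to read and write (write through/write back), and persistent storage ensures that no data is lost.

4. drill down: the interviewer will usually start asking questions and go deep on specific parts.
Examples: how the business logic is implemented (say, whether the Twitter timeline uses fan-out-write or fan-out-read), or how the data layer is implemented (cache design, from a basic LRU up to a distributed cache, or cache + storage design, such as BigTable+GFS or Cassandra or Dynamo). For each idea you give, state its pros and cons, then choose the best fit in light of steps 1 and 2 (the functional boundary and the system constraints).

5. bottleneck: if time remains, review the system you have designed, point out where it is most likely to hit a bottleneck overall, and discuss it interactively with the interviewer.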
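
The drill-down step mentions starting from a basic LRU cache. As a minimal sketch of what that starting point might look like (the class and method names are illustrative assumptions, not taken from the original post), an LRU cache can be built on Python's `OrderedDict`:

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # a read counts as a use
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now the most recently used
cache.put("c", 3)    # capacity exceeded, so "b" is evicted
```

A distributed cache would add partitioning and replication on top of this, which is usually where the interviewer takes the discussion next.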

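With this framework in hand, you need to study the mainstream technologies it mentions in depth. When reading documentation or papers, think, summarize, compare, and abstract.

Take the Google troika (MapReduce, BigTable, GFS), which everyone knows. I asked myself why these three systems laid the foundation of cloud computing. My understanding is that they split the three basic functions of a traditional database across a distributed environment (GFS: distributed persistent storage; BigTable: distributed structured data access/indexing; MapReduce: distributed data processing). Cassandra/Dynamo are essentially also about distributed structured data access and persistent storage, so functionally they are close to BigTable+GFS. Coming back to one of the basic principles of NoSQL: it does not guarantee strong data consistency, and the join/processing of data is left to the application layer, which is where MapReduce comes in. So the troika together amounts to a database for a distributed environment.

Another example: both are timelines, yet Twitter uses fan-out-write (a new feed item is written directly into each follower's timeline) while Facebook uses fan-out-read (at read time, fetch the relevant users' feeds in real time and merge/rank them). Why? My understanding is that the two timelines are fundamentally different. Twitter's timeline is the more traditional one, showing the feeds of the people you follow in chronological order. Facebook's news feed is more like a timeline + social graph search: it contains friends' latest posts plus related results from searching the social graph (perhaps friends of friends, or photos a friend liked), possibly personalized. A fundamental difference between fan-out-read and fan-out-write is this: the less predictable the request/answer pattern (web search is the extreme case, with a nearly unbounded query pattern), or the more the answer features need to change in an agile way, the better fan-out-read fits (read all candidates, then organize the results flexibly).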
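
The contrast between the two strategies can be sketched in a few lines. This is a toy in-memory model, assuming plain dictionaries in place of real storage; every function and variable name here is hypothetical:

```python
from collections import defaultdict

followers = defaultdict(set)   # author -> users who follow the author
following = defaultdict(set)   # user -> authors the user follows
posts = defaultdict(list)      # author -> [(timestamp, text)], read by fan-out-read
timelines = defaultdict(list)  # user -> [(timestamp, author, text)], filled by fan-out-write


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def post(author, timestamp, text):
    posts[author].append((timestamp, text))
    # fan-out-write: pay at write time, one insert per follower
    for user in followers[author]:
        timelines[user].append((timestamp, author, text))


def read_fanout_write(user, limit=10):
    # the timeline is precomputed, so a read is a cheap lookup
    return sorted(timelines[user], reverse=True)[:limit]


def read_fanout_read(user, limit=10):
    # fan-out-read: pay at read time, gather candidates then merge/rank
    candidates = [
        (ts, author, text)
        for author in following[user]
        for ts, text in posts[author]
    ]
    return sorted(candidates, reverse=True)[:limit]


follow("carol", "alice")
follow("carol", "bob")
post("alice", 1, "hello")
post("bob", 2, "hi")
```

Here both reads return the same newest-first list. The difference is where the cost lands: `post` does work proportional to the follower count, while `read_fanout_read` does work proportional to the number of followed authors and can swap in a different ranking without rewriting stored timelines.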

Systems and technologies to focus on:
  • The Google troika papers (MapReduce, BigTable, GFS), whose open-source counterparts are Hadoop MapReduce, HBase, and HDFS respectively
  • Cassandra, Dynamo papers (focus: consistent hash, decentralized system design (gossip))
  • Memcached (distributed cache)
  • Redis (all-in-memory solution)
  • ZooKeeper (coordination for load balancing/configuration center/name service/distributed lock etc.)
  • Thrift (cross language interface and RPC)
  • Storm (realtime stream processing, for questions such as counting events over the past minute), Spark
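
The "count over the past minute" style of question mentioned for Storm can be illustrated on a single machine. The sketch below assumes timestamps arrive in non-decreasing order and uses one bucket per second; it is not how Storm itself does it, and the class name is made up for this example:

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        self._buckets = deque()  # (second, count) pairs, oldest first
        self._total = 0

    def _expire(self, now):
        # drop buckets that have fallen out of the window
        while self._buckets and self._buckets[0][0] <= now - self.window_seconds:
            _, count = self._buckets.popleft()
            self._total -= count

    def record(self, now):
        second = int(now)
        self._expire(second)
        if self._buckets and self._buckets[-1][0] == second:
            self._buckets[-1] = (second, self._buckets[-1][1] + 1)
        else:
            self._buckets.append((second, 1))
        self._total += 1

    def count(self, now):
        self._expire(int(now))
        return self._total


counter = SlidingWindowCounter(window_seconds=60)
for t in (0, 10, 10, 59):
    counter.record(t)
```

Memory is bounded by the window length rather than the event rate, since each second holds a single bucket. Hour and day counters can reuse the same idea with coarser buckets.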

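Experts on mitbbs have already put together a thorough summary of common interview questions. I repost it below and basically follow it. One addition:
 Prepare tinyurl very thoroughly. I was asked this question in interviews at all three of F, L, and G. It looks simple, but most of what is discussed above can be worked into it, so there is a lot to talk about.
An article I found very valuable for this question:
http://www.hiredintech.com/app#system-design   (covers a framework for design; I borrowed much of its methodology, and it uses tinyurl as its worked example, explained very thoroughly)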
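
One common way to approach the tinyurl core (an assumption for illustration; neither this post nor the linked article is being quoted here) is to give every long URL an integer ID and encode that ID in base 62. In the sketch below a Python list stands in for the ID generator and the key-value store:

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # 62 symbols


def encode_base62(number):
    """Turn a non-negative integer ID into a short base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode_base62(code):
    """Recover the integer ID from its base-62 string."""
    number = 0
    for char in code:
        number = number * 62 + ALPHABET.index(char)
    return number


class UrlShortener:
    """In-memory stand-in for the ID counter and the key-value store."""

    def __init__(self):
        self._urls = []  # list index doubles as the auto-increment ID

    def shorten(self, long_url):
        self._urls.append(long_url)
        return encode_base62(len(self._urls) - 1)

    def resolve(self, code):
        return self._urls[decode_base62(code)]


shortener = UrlShortener()
code = shortener.shorten("http://www.hiredintech.com/app#system-design")
```

Six base-62 characters cover 62**6 (about 56.8 billion) IDs. The interview discussion then moves to how the ID generation and the lookup table are distributed across servers.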

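Preparing for one company's design round is really a process of learning that company's technology stack. Once you are familiar with it, using those system design ideas flexibly in your answers builds more rapport with the interviewer (who may well use those systems every day). That is much more relaxed and friendly than improvising on the spot, with the two of you each holding your own view and challenging each other :)

The other area to prepare thoroughly is your own background projects, which the design round may also touch. For projects from previous jobs, aim for no blind spots and a full, deep understanding. Going further, be able to view the project from a higher level. In one round of my Google interview, for instance, most of the time went to discussing a project I had done. I had thought deeply about that project beforehand, for example about how to generalize its approach to broader semantic search. In that interview, once the existing project was clear, the interviewer asked exactly that question, and because I had already worked out some ideas I could speak with ease. I learned later that the interviewer did similar work at G, and the ideas I had summarized gave him quite a few new thoughts too. You could call it mutual respect, and the atmosphere was very friendly :). My impression is that these companies (in most cases) match you with interviewers whose background closely fits your resume. So if you are unfamiliar with what is on your resume, or your understanding is incomplete or shallow, the interviewer will spot it easily. But if you can pleasantly surprise them, congratulations: you have a good chance of getting a strong hire.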

=============The following is a reposted summary by an expert on MITBBS================
1. Entry-level news feed
http://www.quora.com/What-are-best-practices-for-building-somet
http://www.infoq.com/presentations/Scale-at-Facebook
http://www.infoq.com/presentations/Facebook-Software-Stack
The usual follow-up question is to estimate how many servers are needed
This thread also discusses it
http://www.mitbbs.ca/article_t/JobHunting/32463885.html
This article touches on how to approach this kind of question and is worth a quick look
http://book.douban.com/reading/23757677/


2. facebook chat, which is also asked fairly often
http://www.erlang-factory.com/upload/presentations/31/EugeneLet
https://www.facebook.com/note.php?note_id=14218138919
http://www.cnblogs.com/piaoger/archive/2012/08/19/2646530.html
http://essay.utwente.nl/59204/1/scriptie_J_Schipers.pdf

3. typeahead search/search suggestion, also common
https://www.facebook.com/video/video.php?v=432864835468
The question is discussed in this thread, and the video answers basically every point raised
http://www.mitbbs.com/article_t/JobHunting/32438927.html


4. Facebook Messaging System (inbox search is mentioned, which has been asked before)
The messaging system is a system that combines chat/sms/email and the like
http://www.infoq.com/presentations/HBase-at-Facebook
http://sites.computer.org/debull/A12june/facebook.pdf
http://www.slideshare.net/brizzzdotcom/facebook-messages-hbase/
https://www.youtube.com/watch?v=UaGINWPK068


5. Given a phone's location signal (latitude and longitude), return the POIs within 5 miles
Discussed here. This one seems to be a favorite in nyc...
http://www.mitbbs.ca/article0/JobHunting/32476139_0.html


6. Implement second/minute/hour/day counters
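I really don't think this one is system design, but prepare in case it comes up. It seems to be asked in interviews at headquarters....
This thread discusses it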
http://www.mitbbs.com/article_t/JobHunting/32458451.html


7. facebook photo storage, unlikely to be asked but good to know
https://www.usenix.org/legacy/event/osdi10/tech/full_papers/Beaver.pdf
https://www.facebook.com/note.php?note_id=76191543919


8. facebook timeline, also not really an interview question; just skim it
https://www.facebook.com/note.php?note_id=10150468255628920
http://highscalability.com/blog/2012/1/23/facebook-timeline-bro

Besides these, prepare the following questions
implement memcache
http://www.adayinthelifeof.nl/2011/02/06/memcache-internals/

implement tinyurl (and distribute across multiple servers)
http://stackoverflow.com/questions/742013/how-to-code-a-url-sho

determine trending topics(twitter)
http://www.americanscientist.org/issues/pub/the-britney-spears-
http://www.michael-noll.com/blog/2013/01/18/implementing-real-t

copy one file to multiple servers
http://vimeo.com/11280885

Know a bit about the dynamo key value store, and google's gfs and big table

Some recommended sites
http://highscalability.com/blog/category/facebook
high scalability has a lot of material on system design, not only facebook's. If you are short on time, just read about the company you are interviewing with..
facebook engineering blog
http://www.quora.com/Facebook-Engineering/What-is-Facebooks-arc
http://stackoverflow.com/questions/3533948/facebook-architectur

Other companies
http://www.quora.com/What-are-the-top-startup-engineering-blogs

a) First, get an overall picture of facebook's architecture
http://www.quora.com/Facebook-Engineering/What-is-Facebooks-arc
http://www.ece.lsu.edu/hpca-18/files/HPCA2012_Facebook_Keynote.
http://www.quora.com/Facebook-Engineering/What-have-been-Facebo
Besides the material given below, the fb engineering page has plenty of other good content
https://www.facebook.com/Engineering

b) news feed
Here is a talk
http://www.infoq.com/presentations/Facebook-News-Feed
The corresponding slides
http://readme.skplanet.com/wp-content/uploads/2012/11/0-3_Faceb
Some discussions on quora
http://www.quora.com/Activity-Streams/What-are-the-scaling-issu
http://www.quora.com/What-are-best-practices-for-building-somet
http://www.quora.com/What-is-the-best-storage-solution-for-buil

c) facebook chat
Here are two notes; the second also has links to the related tech talks
https://www.facebook.com/notes/facebook-engineering/facebook-chat/14218138919
https://www.facebook.com/notes/facebook-engineering/chat-stability-and-scalability/51412338919

d) typeahead search & graph search
Tech talk and notes on typeahead search
https://www.facebook.com/video/video.php?v=432864835468
https://www.facebook.com/note.php?note_id=365915113919
https://www.facebook.com/note.php?note_id=389105248919

Paper, tech talk, and notes on graph search. The paper is well worth reading.
http://db.disi.unitn.eu/pages/VLDBProgram/pdf/industry/p871-cur
https://newsroom.fb.com/Photos-and-B-Roll/4362/Graph-Search-Whiteboard
https://www.facebook.com/note.php?note_id=10151240856103920
https://www.facebook.com/note.php?note_id=10151347573598920
https://www.facebook.com/note.php?note_id=10151361720763920
https://www.facebook.com/note.php?note_id=10151432733048920
https://www.facebook.com/note.php?note_id=10151755593228920

e) facebook messages
Two tech talks
http://www.youtube.com/watch?v=XAuwAHWpzPc
http://www.infoq.com/presentations/HBase-at-Facebook
And eng notes
https://www.facebook.com/note.php?note_id=10150148835363920
https://www.facebook.com/note.php?note_id=10150162742108920

f) photo storage
Related papers and notes
https://www.usenix.org/conference/osdi10/finding-needle-haystack-facebooks-photo-storage
https://www.usenix.org/legacy/events/osdi10/tech/full_papers/Beaver.pdf
https://www.usenix.org/legacy/events/osdi10/tech/slides/beaver.pdf
https://www.facebook.com/note.php?note_id=76191543919

g) social graph data store
Related note, video, paper
https://www.facebook.com/notes/facebook-engineering/tao-the-power-of-the-graph/10151525983993920
https://www.usenix.org/conference/atc13/technical-sessions/presentation/bronson
http://www.cs.cmu.edu/~pavlo/courses/fall2013/static/papers/117

h) tiny URL
Some discussions here
http://n00tc0d3r.blogspot.com/2013/09/big-data-tinyurl.html
http://stackoverflow.com/questions/742013/how-to-code-a-url-sho
http://stackoverflow.com/questions/3376163/what-are-the-things-

i) POI
See here
http://www.slideshare.net/mmalone/scaling-gis-data-in-nonrelati
http://www.mitbbs.ca/article_t/JobHunting/32476139.html

Read full article from A Coder's Leap to the US: System Design Preparation | ITint5

Sunday, June 28, 2015

Design Review - careerInUS



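  • How to crush 99% of massive-data questions!!! http://blog.csdn.net/v_july_v/article/details/7382693
  • Ten massive-data questions: http://blog.csdn.net/v_july_v/article/details/6279498
    • In short: hash the data into small files, process each file, then merge the results
    • Techniques to note
      • bloom filter
        • k hash functions mapping into a bit vector
        • fast and space-efficient, but has false positives and does not support deletion
      • hashing
      • bit-map
      • heap
      • external sort
      • trie
      • mapreduce
  • Quora on system design questions http://www.quora.com/Job-Interviews/How-should-I-prepare-system-design-questions-for-Google-Facebook-Interview
  • A very insightful series of blog posts!! link
    • Focus on the dissections of the three GFT companies!!!
    • "The Art of Scaling Up" also discusses distributed systems
    • Caching popular authors
  • A very good summary!!! link
  • facebook design summary!! It is actually also a very, very good general summary link
    • Memcache (not memcached, the open-source in-memory cache)
    • Scalable Web Architecture and Distributed Systems
      • For a site like flickr, request time matters more than upload time, so optimization only needs to focus on the read side rather than the upload side
  • flickr architecture link
  • Pinterest architecture link
  • Advice from mitbbs on preparation beyond grinding practice problems link
  • Look at a complete implementation of a short-URL service
  • Dissecting Twitter http://blog.sina.com.cn/s/blog_46d0a3930100fai8.html
  • How Google Calendar is implemented
  • Google system design question: http://www.mitbbs.com/article_t/JobHunting/32562595.html
    • How to implement an activity broadcast like facebook's.
    • When a friend f1 posts an update, all of f1's friends must see it.
    • Conditions:
    • 1. The number of friends is very large.
    • 2. Some friends are online and some are offline; those offline must see it after logging in.
    • 3. What to do if storage space is limited.
  • How to build a chat room http://book.mixu.net/node/ch3.html
  • Terasort link
    • The bottleneck of an ordinary merge sort is that the merge step is sequential
    • Terasort range-partitions the data so that every key in partition i is smaller than every key in partition i + 1, so the sorted partitions concatenate naturally with no merge step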
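
The bloom filter bullets above (k hashes into a bit vector, false positives, no deletion) can be made concrete with a short sketch. The sizes and the trick of salting SHA-256 to obtain k hash functions are illustrative assumptions, not taken from the linked posts:

```python
import hashlib


class BloomFilter:
    """Set membership with possible false positives but no false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((num_bits + 7) // 8)  # the bit vector

    def _positions(self, item):
        # derive k positions by salting one hash function with the index
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # all k bits must be set; a single clear bit proves absence
        return all(
            self._bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


seen = BloomFilter(num_bits=1024, num_hashes=3)
for url in ("http://a.example/1", "http://a.example/2"):
    seen.add(url)
```

Deletion is unsupported because clearing a bit could also erase evidence of another item that hashed to the same position.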

Keywords for Design

  • sharding (partition)
  • A very good explanation of the difference between sharding and replication! link
  • load balancer
  • master-slave replication
  • consistent hashing
  • bloom filter
  • Kafka vs RabbitMQ
    • Both are message queues (message brokers)
  • CDN
  • peak throughput

Hired in Tech - System Design link

  • Process
    • Step 1 - Constraints and Use Cases (5 min!)
      • Very first thing to do when faced with the problem: clarify the constraints and use case.
      • Don't panic about the correctness, write some numbers down!
      • Write on board: the use cases like
        • (Basic features)
        1. take a url => return a shorter one (shortening)
        2. take a short => return the original (redirection)
        3. (ask some more augmented features?) 
        4. customized url?
        5. analytics?
        6. automatic link expiration?
        7. UI? API?
      • Write on board, constraints. Traffic and Data
        1. scale down to seconds! Estimate the traffic and data
        2. 100M new urls per month
        3. What can be cached?? 10% from shortening and 90% from redirection
        4. how long is url?
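      • Always work out the totals over years or months first, then scale down to seconds!
      • Read heavy? Write heavy?
    • Step 2 - Abstract Design
      • draw a sketch and ignore the details! Get the basic functionality working first!
      • Application service layer and Data Storage Layer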
    • Step 3 - Understand Bottlenecks
      • Need a load balancer? Data is so huge that need to distribute the database?
      • Talk about the tradeoff!!!
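      • This step combines 1 and 2. For example, the data per sec from step 1 can be used to estimate whether the database can keep up. Or, if every request has to scan the whole database, will it be too slow? Bottlenecks identified!!!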
    • Step 4 - Actual Design of a Scalable System!
      • (next section)
      • whenever the interviewer challenges your architectural choices, acknowledge that an idea is rarely perfect, and outline the advantages and disadvantages of your choice
      • a load balancer is not only for spreading traffic, but also for dealing with spiky traffic!! Extra servers can be brought up during a spike and shut down when the traffic goes back to normal.
      • Don't forget to mention that we need to benchmark, profile, and load test the architecture
  • Fundamentals
    • (Scalability @Harvard)
    • (Scalability for Dummies)
    • (Database sharding)
    • and don't forget about monitoring
    • sometimes there will be flash traffic when many people want to watch/read the same thing
  • Examples
    • (highscalability blog)
  • Wrap-up
  • Practice
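
Step 1 of the process above says to write numbers down and scale them to seconds. Using the figures from these notes (100M new URLs per month, a 10%/90% split between shortening and redirection), the arithmetic can be sketched as follows. The 30-day month, 500 bytes per record, and 5-year retention are assumptions added for illustration:

```python
SECONDS_PER_MONTH = 30 * 24 * 3600   # assumption: a 30-day month

new_urls_per_month = 100_000_000     # from the notes above
read_to_write_ratio = 9              # 90% redirection vs 10% shortening

redirects_per_month = new_urls_per_month * read_to_write_ratio

writes_per_second = new_urls_per_month / SECONDS_PER_MONTH
reads_per_second = redirects_per_month / SECONDS_PER_MONTH

bytes_per_record = 500               # assumption: URL plus metadata
years_retained = 5                   # assumption
storage_bytes = new_urls_per_month * 12 * years_retained * bytes_per_record

print(f"writes/s ~ {writes_per_second:.0f}")
print(f"reads/s  ~ {reads_per_second:.0f}")
print(f"storage  ~ {storage_bytes / 1e12:.0f} TB over {years_retained} years")
```

Under these assumptions the system sees roughly 39 writes and 347 reads per second and stores about 3 TB over five years, which already suggests a read-heavy service where caching the redirection path pays off.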

Scalability @Harvard

  • Vertical scaling
    • CPU cores, RAM, disks, etc.
    • why not a full solution? Real-world constraints...
    • Updating drive - updating IO of databases
    • RAID
  • Horizontal scaling
    • Use more cheaper hardware, multiple servers
    • problem of CONSISTENCY, SYNCHRONIZATION
    • The load balancer answers client requests: it exposes a single public IP to the outside, all servers use internal IPs, and every client request is dispatched by the load balancer!
  • Load balancing
    • Round-robin requests across the servers
    • Implement load balancer:
      • Software: HAProxy etc.
      • Hardware: 100K for a decent one...
    • Use multiple load balancers to avoid a single point of failure at the load balancer
    • (from HiredInTech) a load balancer is not only for spreading traffic, but also for dealing with spiky traffic!! Extra servers can be brought up during a spike and shut down when the traffic goes back to normal. Also, when one server is down, use others (reliability)
  • Caching
    • .html - redundancy, no template, must write <header> ... lots of find and replace 
    • MySQL query cache
    • memcached 
      • memory cache, which is also a server
      • if read-heavy, then this is very efficient!
  • DB replication
    • master-slave, benefit for read-heavy websites
    • master-master, 
    • the load balancer itself can also become a single point of failure
    • use a heartbeat to detect activity
  • DB partitioning
    • different multiple servers: mit.thefacebook.com, harvard.thefacebook.com
    • A-H to server1 I-Z to server 2, etc.
  • There is also a worked example at the end, very good~
    • Sync between data center?
      • load balancing on DNS level!

Scalability for Dummies link

  • part 1, replicas
  • part 2, database can become the bottleneck. Switch to NoSQL.
  • part 3, caching - in-memory caching like Redis, not file-based caching!
    • cache queries (but hard to remove)
    • cache objects, makes asynchronous processing possible
  • part 4, asynchronism
    • async 1: cache the dynamic content, pre-rendered HTML
    • async 2: (computing-intensive tasks) pass to backend, constantly check if the job is done

Database Sharding

  • A way of partitioning data that relies mainly not on replicas but on each cheap server serving a subset of the users
    • high availability
    • faster queries
    • more write bandwidth (parallel writing!)
    • no replica (???)

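Summary of Approaches to System Design Interview Questions link

  • Most questions can be solved with the Google troika: MapReduce, GFS, BigTable
  • Before the troika, the usual architecture was: database + sharding + cache
  • Virtual node technique
    • Virtual node technique: commonly used for sharding data in distributed systems. The scenario: we have a big pile of data (maybe TB or PB scale) that must be sharded by some field (key) across dozens of machines (or more), while keeping the load as balanced as possible and making it easy to scale. The traditional approach is Hash(key) mod N. Its biggest drawback is that it does not scale easily: adding or removing a machine redistributes nearly all of the data, which is far too costly. So new techniques emerged. One is the DHT mentioned above, now adopted by many large systems. Another improves on "Hash(key) mod N": suppose we want to spread the data across 20 machines. The traditional approach is hash(key) mod 20. In the improved version, N is chosen far larger than 20, say 20000000, and an extra table records which mod values each node stores, for example node1: 0~1000000; node2: 1000001~2000000; and so on. When a new node is added, we only need to move part of the data on each node to the new node and update the table
  • Q: design a system that speeds up writes -> BigTable: concurrent writes go into an in-memory cache (random writes), and when the cache is full it is flushed to disk (sequential writes, fast!)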
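
The virtual node scheme described above can be sketched as a slot table. The slot count follows the example in the notes; the hash function, the policy of carving the tail off each existing range, and all names are assumptions made for this illustration:

```python
import bisect
import hashlib

NUM_SLOTS = 20_000_000  # N, chosen far larger than the machine count


def slot_for(key):
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SLOTS


class SlotTable:
    """Maps contiguous slot ranges to nodes; the table is the routing state."""

    def __init__(self, nodes):
        step = NUM_SLOTS // len(nodes)
        # each entry is (first_slot, node); a range runs until the next entry
        self._ranges = [(i * step, node) for i, node in enumerate(nodes)]

    def node_for_slot(self, slot):
        starts = [start for start, _ in self._ranges]
        return self._ranges[bisect.bisect_right(starts, slot) - 1][1]

    def node_for(self, key):
        return self.node_for_slot(slot_for(key))

    def add_node(self, new_node):
        # hand the tail of every existing range to the new node
        node_count = len({node for _, node in self._ranges})
        updated = []
        for i, (start, node) in enumerate(self._ranges):
            end = self._ranges[i + 1][0] if i + 1 < len(self._ranges) else NUM_SLOTS
            give = (end - start) // (node_count + 1)
            updated.append((start, node))
            if give > 0:
                updated.append((end - give, new_node))
        self._ranges = updated


table = SlotTable([f"node{i}" for i in range(1, 21)])
keys = [f"user:{i}" for i in range(2000)]
before = {k: table.node_for(k) for k in keys}
table.add_node("node21")
after = {k: table.node_for(k) for k in keys}
```

Because `slot_for` never changes, adding a machine only edits the table: each key either stays where it was or moves to the new node, in contrast to `hash(key) mod 20` becoming `hash(key) mod 21`.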

High Scalability

  • YouTube architecture
    • Most popular content is moved to CDN
  • 8 important scalable design patterns link
  • What is your scalability plan link
    • Well written! It goes step by step
  • The coffee shop analogy, nice! link

Review of Companies' Core Technologies

  • Must-read papers
    • Google: Google File System, MapReduce, Bigtable
      • Bigtable
        • Similar to a distributed kvs
        • the key is <row, column, timestamp>
        • the tablet is the smallest unit
    • Facebook: Cassandra
    • Amazon: Dynamo
  • Pay attention to distributed data stores
    • Non-relational, therefore quick
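
To make the Bigtable key shape from the notes above concrete, here is a toy single-process store keyed by row, column, and timestamp. It only illustrates the data model (latest version wins, older versions stay readable) and makes no claim about how Bigtable is implemented; the class and the sample row are hypothetical:

```python
from collections import defaultdict


class VersionedTable:
    """Toy cell store keyed by (row, column, timestamp)."""

    def __init__(self):
        self._cells = defaultdict(list)  # (row, column) -> [(timestamp, value)]

    def put(self, row, column, timestamp, value):
        versions = self._cells[(row, column)]
        versions.append((timestamp, value))
        versions.sort(key=lambda version: version[0])  # oldest first

    def get(self, row, column, timestamp=None):
        versions = self._cells.get((row, column), [])
        if timestamp is not None:
            # keep only versions written at or before the requested time
            versions = [v for v in versions if v[0] <= timestamp]
        return versions[-1][1] if versions else None


table = VersionedTable()
table.put("com.example", "contents:html", 1, "v1")
table.put("com.example", "contents:html", 3, "v3")
```

A read without a timestamp returns the newest version, and a read with one returns the value as of that time.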

Google

  • Web crawling and indexes link
    • the crawler should be distributed
    • robots.txt
    • DNS resolution is a well-known bottleneck in web crawling
    • Use a priority queue during the BFS, keyed by weight
  • DESIGN
    • A big collection!!!
    • Distributed crawler link
    • autocomplete
      • trie
      • HMM?
      • popularity
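    • Read-write locks
      • read write lock, also called shared lock and exclusive lock
      • Difference between lock() and trylock(): one blocks, the other does not
      • This one is well written link
      • Two interview questions link link
      • Watch out for writer starvation
      • producer consumer link
      • Implementing a semaphore with a mutex and cv link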
    • distributed kv storage (DHT B+ tree)
    • distributed lock
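      • In fact still based on Paxos
    • distributed hash 
      • Basic introduction to distributed hashing link link
      • Consistent hashing link
        • The drawback of ordinary linear hashing is that removing or adding machines requires migrating a large amount of data!
      • GFS is a metadata-based implementation! Read it!! -> single point of failure -> consistent hashing
  • This high scalability post link
  • how to implement online booking link
  • Shopping cart link
    • Implementation of cookies and sessions and how they relate to the database, well explained!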
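
The autocomplete bullets above (trie, popularity) suggest a structure like the following sketch. It ranks completions by a stored count and is an illustrative assumption about how the two ideas combine; the class and method names are invented:

```python
import heapq


class TrieNode:
    def __init__(self):
        self.children = {}
        self.count = 0  # popularity of the word ending here, 0 if none


class Autocomplete:
    def __init__(self):
        self._root = TrieNode()

    def record(self, word, times=1):
        node = self._root
        for char in word:
            node = node.children.setdefault(char, TrieNode())
        node.count += times

    def suggest(self, prefix, limit=3):
        node = self._root
        for char in prefix:
            node = node.children.get(char)
            if node is None:
                return []
        # collect every completed word under the prefix
        found = []
        stack = [(node, prefix)]
        while stack:
            current, word = stack.pop()
            if current.count:
                found.append((current.count, word))
            for char, child in current.children.items():
                stack.append((child, word + char))
        # most popular first, ties broken alphabetically
        best = heapq.nsmallest(limit, found, key=lambda item: (-item[0], item[1]))
        return [word for _, word in best]


index = Autocomplete()
for word, times in (("face", 1), ("facebook", 5), ("facetime", 3), ("fan", 2)):
    index.record(word, times)
```

Walking the whole subtree on each query is fine for a sketch. A production design would typically cache the top results at each node so that a lookup costs only the length of the prefix.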

Facebook

  • Zuck's law
  • This article!!!! http://www.mitbbs.com/article_t/JobHunting/32741713.html
  • Hiphop compiler
Facebook Stats
Facebook Infrastructure

Facebook feeds 
  • http://readme.skplanet.com/wp-content/uploads/2012/11/0-3_Facebook1.pdf very well explained
  • A few systematic articles
    • General discussion of feeds!!!! link
    • On SO link
    • The links in here are all very good!!!
      • push model for people who don't have many followers
      • pull model for the other way around
Tech Talks
  • "Facebook is not about actions, but interactions"
Read full article from Design Review - careerInUS
