Friday, October 23, 2015

Google Calendar Architecture



How to design a system like Google Calendar
http://computer.howstuffworks.com/internet/basics/google-calendar.htm
You can choose to view the calendar by day, week, month or a view that presents just the next four days. You can also choose an "agenda" view, which presents all scheduled events as a list rather than as a calendar view.

You can also use the "repeat" function for events that occur regularly, such as a weekly meeting or an annual event like a birthday.
In Google Calendar, you can use Google's search technology to search not only your own calendars, but also any public calendar on Google's system.

One of the Web services Google takes advantage of is short message service (SMS) support. This is the format cell phones use to send text messages. Users can allow Google Calendar to send reminders via SMS to their cell phones. As a scheduled event draws near, Google Calendar sends an alert via SMS to a phone number registered by the respective user.

API as Service
Google fosters a growing community of developers who use Google's application programming interface (API) to build new programs based on Google technology.
Google Calendar makes it pretty easy to send invitations to other people. First, you create an event in your own calendar and fill out the details. Then, you can click on the "add guests" option. This opens up a field in which you can type e-mail addresses. Once you save the event, Google Calendar sends e-mails to the invite list. As guests respond to the invite, Google Calendar displays the results within the event listing on your calendar.
If a user chooses to share or publish a calendar, other users can leave comments on event entries. This allows people to discuss upcoming appointments or debrief after a meeting. The event page becomes a forum for guests and calendar viewers.
http://www.cnblogs.com/jcli/p/calendar_recur_rule.html
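I think there are two hard parts to designing and implementing a "calendar" service: computing recurring-event rules, and delivering event reminders exactly on time.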

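Google Calendar's settings page for a recurring event (shown as a screenshot in the original post) has quite a few options: "repeat type" (daily, weekly, monthly, or yearly); "repeat frequency" (once every N days, N weeks, and so on); "repeat on" (for a monthly repeat, a day of the month or a day of the week); and "end date" (when the recurring event stops). From Google's product features and this description, accurately describing a "recurring event" requires the following elements:
  • Repeat type: daily, weekly, monthly, or yearly.
  • Repeat frequency: how many periods pass between occurrences, e.g. I play ball with friends once every two months.
  • Days within a period: for a weekly repeat, which weekdays does it fall on (Tuesday and Thursday, or only Monday)? For a monthly repeat, is it the third Saturday of each month, or the 23rd of each month?
  • End condition: when the recurring event itself stops. A monthly mortgage payment eventually has a final payment, say 30 years out; a monthly training course might end after only 5 sessions.
How should we define the data structure for a "recurrence rule"? These rules are complex and open-ended (open-ended because we can't guarantee the product will never need custom rules, such as supporting the lunar calendar and representing a date like the Qingming Festival), so defining the rule as a string expression and persisting that string is the better choice. Like a regular expression, a single string can carry arbitrarily rich information. For describing and designing recurring events we can also follow an existing industry standard: RFC 2445 defines this in detail.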
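To make the "rule as a string" idea concrete, here is a minimal sketch that splits an RFC 2445 style recurrence string into its parts. The function name `parse_rrule` is hypothetical, and only a handful of fields (FREQ, INTERVAL, COUNT, BYDAY, BYMONTHDAY) are treated specially; a real parser would have to cover the full grammar in the RFC.

```python
def parse_rrule(rule: str) -> dict:
    """Split a recurrence string such as 'FREQ=MONTHLY;INTERVAL=2' into a dict.

    Illustrative only: it does not validate values or handle every RFC 2445 field.
    """
    parts = {}
    for field in rule.split(";"):
        if not field:
            continue
        name, _, value = field.partition("=")
        name = name.strip().upper()
        value = value.strip()
        if name in ("INTERVAL", "COUNT"):
            parts[name] = int(value)          # numeric fields
        elif name in ("BYDAY", "BYMONTHDAY"):
            parts[name] = value.split(",")    # comma-separated lists
        else:
            parts[name] = value               # kept as a raw string
    parts.setdefault("INTERVAL", 1)           # the RFC default interval is 1
    return parts


# Every two months, on the third Saturday, five times in total.
rule = parse_rrule("FREQ=MONTHLY;INTERVAL=2;BYDAY=3SA;COUNT=5")
print(rule)
```

The four elements listed earlier map directly onto this string: repeat type is FREQ, repeat frequency is INTERVAL, the days within a period are BYDAY or BYMONTHDAY, and the end condition is COUNT or UNTIL.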
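We first define an interface, Rule: the rule-engine instance produced by parsing a recurring event. It should have the following methods:
  • nextOccurDate: given a time, compute the next occurrence of the event, taking that time as the starting point
  • includes(theDay): determine whether the given time is one of the event's occurrences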
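A minimal sketch of such an interface, with one simple implementation, might look as follows. This is not the code from the linked repository: the Python method names, the choice that `next_occur_date` returns the first occurrence strictly after the given day, and the `DailyRule` constructor arguments are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from datetime import date, timedelta
from typing import Optional


class Rule(ABC):
    """Hypothetical Python rendering of the Rule interface described above."""

    @abstractmethod
    def next_occur_date(self, after: date) -> Optional[date]:
        """First occurrence strictly after `after` (an assumed convention)."""

    @abstractmethod
    def includes(self, the_day: date) -> bool:
        """True if `the_day` is one of the event's occurrences."""


class DailyRule(Rule):
    """Repeats every `interval` days from `start`, with no end date."""

    def __init__(self, start: date, interval: int = 1):
        if interval < 1:
            raise ValueError("interval must be at least 1")
        self.start = start
        self.interval = interval

    def includes(self, the_day: date) -> bool:
        delta = (the_day - self.start).days
        return delta >= 0 and delta % self.interval == 0

    def next_occur_date(self, after: date) -> Optional[date]:
        if after < self.start:
            return self.start
        # Number of whole periods elapsed, plus one to move past `after`.
        periods = (after - self.start).days // self.interval + 1
        return self.start + timedelta(days=periods * self.interval)


every_three_days = DailyRule(date(2015, 10, 23), interval=3)
print(every_three_days.next_occur_date(date(2015, 10, 24)))
print(every_three_days.includes(date(2015, 10, 29)))
```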
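Real-time calculation method

Real-time calculation can be understood as "stateless" on-demand evaluation: every call computes its result from the arguments passed in and returns it.
Each calculation has no context. Knowing only the start time and the "recurrence rule configuration", we compute the next occurrence from a formula on the spot. Is this feasible? From the event's first start time and the time passed in, we know the difference between the two; from the repeat interval we can then work out which period the next occurrence falls in. Having narrowed it down to that period, the specific day, month, or year information in the rule gives the final next occurrence. So in theory this looks feasible. But a few obstacles make me think this approach is not ideal:
  1. For lunar-calendar calculation I don't think there is a formula relating this time difference to the interval, mainly because of leap months. One workaround is to accumulate year by year when computing the difference between two lunar dates, but I don't find that satisfactory.
  2. The "repeat count" (Count) basically cannot be computed in real time, because some recurrence rules skip certain periods. A rule that repeats on the 31st of every month does not fire in shorter months. Likewise, a rule for the fifth Wednesday of each month skips many months: every month has four of each weekday, but only some have a fifth.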
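The second obstacle can be seen in a small sketch. The helper below (a hypothetical name, `monthly_on_day`) walks month by month and keeps only the months that actually contain the requested day. It relies on Python's standard `calendar.monthrange`, whose second element is the number of days in the month.

```python
import calendar
from datetime import date
from typing import List


def monthly_on_day(start_year: int, start_month: int, day: int, count: int) -> List[date]:
    """First `count` dates that fall on `day` of the month, skipping short months."""
    if not 1 <= day <= 31:
        raise ValueError("day must be between 1 and 31")
    out = []
    year, month = start_year, start_month
    while len(out) < count:
        days_in_month = calendar.monthrange(year, month)[1]
        if day <= days_in_month:          # months without this day are skipped
            out.append(date(year, month, day))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return out


occurrences = monthly_on_day(2015, 10, 31, count=5)
for d in occurrences:
    print(d)
```

Starting from October 2015, the fifth occurrence lands in May 2016, seven months after the start rather than four, so "start plus (count - 1) intervals" does not give the end date. The skipped months have to be counted by stepping through them.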

Enumeration method

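Enumeration can be understood as "stateful" comparison: every call compares the value passed in against precomputed, stored values.
We first compute every point on the timeline at which the recurring event will occur and store them. After that, any call can immediately find the previous and next occurrence for the value passed in. Compared with the real-time calculation method above, the advantages are obvious: it is simple and fast, and it handles the two cases that method cannot. The drawback is just as obvious: how much space do the precomputed values take? But every product has a realistic usage pattern. Anyone using a calendar product cares about a time range centered on today and extending in both directions, and that range usually does not exceed one or two years. So we can precompute all occurrences within, say, ten years before and after today (tune this as you see fit).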
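A minimal sketch of the enumeration idea, assuming a weekly event and a window of roughly ten years on either side of today. The class and function names are hypothetical. Lookups use the standard `bisect` module on a sorted list, so each query is a binary search over the stored dates.

```python
from bisect import bisect_left, bisect_right
from datetime import date, timedelta
from typing import Iterable, List, Optional


class EnumeratedOccurrences:
    """Stores precomputed occurrence dates and answers lookups by binary search."""

    def __init__(self, occurrences: Iterable[date]):
        self._dates = sorted(occurrences)

    def includes(self, the_day: date) -> bool:
        i = bisect_left(self._dates, the_day)
        return i < len(self._dates) and self._dates[i] == the_day

    def next_after(self, the_day: date) -> Optional[date]:
        i = bisect_right(self._dates, the_day)
        return self._dates[i] if i < len(self._dates) else None

    def previous_before(self, the_day: date) -> Optional[date]:
        i = bisect_left(self._dates, the_day)
        return self._dates[i - 1] if i > 0 else None


def weekly_in_window(start: date, window_start: date, window_end: date) -> List[date]:
    """Enumerate a weekly event from `start`, keeping dates inside the window."""
    out = []
    current = start
    while current <= window_end:
        if current >= window_start:
            out.append(current)
        current += timedelta(weeks=1)
    return out


today = date(2015, 10, 23)
window = timedelta(days=3650)  # roughly ten years on each side
table = EnumeratedOccurrences(
    weekly_in_window(date(2015, 10, 5), today - window, today + window)
)
print(table.previous_before(today), table.next_after(today))
```

For a weekly event over ten years this stores about 520 dates, which suggests the storage concern is modest for a single event; the cost grows with the number of recurring events, not with the lookup rate.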
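The class diagram (an image in the original post) looks like it has a lot of classes, but it is not complicated at all; its layers follow the business model directly. A brief description of the classes:
  • Rule is the top-level interface. It is the only type users work with directly, so they don't need to know the details.
  • AbstractRule defines the basic framework methods for the time-related properties of an event.
  • OnceTimeRule is a one-off event, i.e. a non-recurring event that occurs only once.
  • AbstractRecurRule is the abstract class for all recurring events.
  • DailyRule, WeeklyRule, AbstractMonthlyRule, and AbstractYearlyRule represent rules for events that repeat daily, weekly, monthly, and yearly.
  • GregorianMonthlyRule, LunarMonthlyRule, GregorianYearlyRule, and LunarYearlyRule are the monthly and yearly recurrence rules for the Gregorian and lunar calendars.
  • AbstractMutliCalendarRuleHelper, GregorianCalenarRuleHelper, and LunarCalenarRuleHelper are helper classes used in the Gregorian and lunar rule calculations.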
Code: https://github.com/hongfuli/simplecal/tree/redis

http://www.mitbbs.com/article_t1/JobHunting/32635005_0_1.html
scheduler
event
notifier
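For a highly abstract question like this, I think the first thing the interviewer expects is that you can narrow it down into individual basic problems. Any one of those small problems is enough to fill the whole interview.
The usual three: Kafka, Cassandra, Spark
show and resolve conflicts among multiple users' calendars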
scheduling periodic event
alarm before event occurrence
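For the "show conflicts" item, one possible starting point is plain interval-overlap detection across everyone's events. The sketch below assumes half-open intervals (an event ending at 11:00 does not conflict with one starting at 11:00); the `Event` type and `find_conflicts` function are hypothetical names.

```python
from datetime import datetime
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    owner: str
    title: str
    start: datetime
    end: datetime


def find_conflicts(events: List[Event]) -> List[Tuple[Event, Event]]:
    """Return every pair of events whose [start, end) intervals overlap."""
    ordered = sorted(events, key=lambda e: e.start)
    conflicts = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.start >= first.end:
                break  # later events start even later, so none can overlap `first`
            conflicts.append((first, second))
    return conflicts


day = datetime(2015, 10, 23)
events = [
    Event("alice", "standup", day.replace(hour=10), day.replace(hour=11)),
    Event("bob", "review", day.replace(hour=10, minute=30), day.replace(hour=11, minute=30)),
    Event("carol", "lunch", day.replace(hour=11), day.replace(hour=12)),
]
conflicts = find_conflicts(events)
for a, b in conflicts:
    print(a.owner, a.title, "overlaps", b.owner, b.title)
```

Resolving a conflict (for example, proposing a free slot) is a separate problem; this only covers detection.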

new
invite
display
modify
notify
delete
email function
holiday automatic mark
provide a link to the map - integrate with other systems: Gmail, Maps, search, etc.
there also seem to be multiple calendars
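The "notify" feature, together with the "alarm before event occurrence" item earlier, could be sketched as a min-heap keyed by the time each reminder should fire. This is an in-memory illustration only; `ReminderQueue` and its methods are hypothetical names, and a production notifier would need persistence and distribution that this sketch ignores.

```python
import heapq
from datetime import datetime, timedelta
from typing import List


class ReminderQueue:
    """Keeps reminders ordered by fire time using a binary heap."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so equal fire times pop in insertion order

    def schedule(self, event_id: str, start: datetime, lead: timedelta) -> None:
        fire_at = start - lead
        heapq.heappush(self._heap, (fire_at, self._seq, event_id))
        self._seq += 1

    def pop_due(self, now: datetime) -> List[str]:
        """Remove and return the ids of all reminders due at or before `now`."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, _, event_id = heapq.heappop(self._heap)
            due.append(event_id)
        return due


queue = ReminderQueue()
queue.schedule("standup", datetime(2015, 10, 23, 10, 0), lead=timedelta(minutes=10))
queue.schedule("lunch", datetime(2015, 10, 23, 12, 0), lead=timedelta(minutes=30))
first_batch = queue.pop_due(datetime(2015, 10, 23, 9, 55))
second_batch = queue.pop_due(datetime(2015, 10, 23, 12, 0))
print(first_batch, second_batch)
```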

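What do you start with? Pseudo code? A flowchart? Or coding?
This kind of interview puts heavy demands on your ability to interact and think on your feet.
Someone who doesn't have that ability day to day can't fake it for more than a few seconds.
I would first sort out the core functionality, then start designing.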

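1. What kind of fancy features do they expect?
2. Feature cases:
1) non-special
2) special

Assume case 1) first; then the basic data structures need to be listed out.

This in turn splits into several cases.
a) The most important structure in a calendar is timing.
This part is made up of several components:
i. the current date (year, month, day), taken from the system time
ii. future dates
iii. past dates
iv. "on this day in history", including national holidays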

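1. Build the basic features first.
2. On top of that, add the additional ones, one at a time.

The basic requirements are usually fairly fixed, just those few modules; the timing part needs modules for several branches.

There's not much challenge here; it all revolves around expressing different times, plus repeatable alerts.
For a topic like this, get a rough idea of the DB storage models that are currently hot in industry, and map the problem onto one of them in the interview.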

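Assume there are 1 billion users,
each user creates one new event per day on average,
and reads 10 times per day on average.

That works out to roughly 10k writes and 100k reads per second.
If each user may use 1 MB of storage, the total is 1 PB, which counts as big data.
Actual usage is probably smaller than that, though potentially it could get there; I think 100 TB should be enough in practice (90% of users barely use the calendar).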
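The back-of-envelope numbers above can be checked with a few lines of arithmetic. The inputs are the poster's assumptions; the 10% active fraction is derived from the "90% of users barely use the calendar" remark, and decimal units (1 MB = 10^6 bytes) are assumed.

```python
SECONDS_PER_DAY = 24 * 60 * 60

users = 1_000_000_000
writes_per_user_per_day = 1
reads_per_user_per_day = 10
bytes_per_user = 1_000_000          # 1 MB per user, decimal units

writes_per_second = users * writes_per_user_per_day / SECONDS_PER_DAY
reads_per_second = users * reads_per_user_per_day / SECONDS_PER_DAY
total_bytes = users * bytes_per_user
active_bytes = total_bytes // 10    # only about 10% of users are active

print(f"writes/s: {writes_per_second:,.0f}")
print(f"reads/s:  {reads_per_second:,.0f}")
print(f"total storage:  {total_bytes / 10**15:.0f} PB")
print(f"active storage: {active_bytes / 10**12:.0f} TB")
```

The exact averages come out a little above 10k and 100k per second, and peak load would be higher than the average, so the rounded figures are best read as orders of magnitude.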

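From this analysis, Cassandra should handle it without trouble and is a good choice; a typical SQL database is not well suited to this volume.

Start by analyzing the core functionality.
It's basically CRUD, which REST handles well; leave the frontend aside for now, since that's all JS work.
The important step here is designing the C* (Cassandra) schema.

I took a quick look, and the main fields are roughly these: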
user gmail account: String
subject: String
start_at: timestamp
end_at: timestamp
Repeat: fairly complex, could use a map
Where: string
Description: string
Reminder: map
Guests: set

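NoSQL schema design generally follows the queries. The most typical calendar query is by time, so:
partition key: email
clustering key: start_at

But one user can have multiple events with the same start_at, so the primary key needs one more component. I think each event can get a generated uuid that is added to the primary key.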
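The key layout can be illustrated with an in-memory model: one sorted list per partition key (email), ordered by the clustering columns (start_at, then a generated uuid). This is not Cassandra driver code; `EventStore` and its methods are hypothetical and only mimic how rows would be grouped and ordered.

```python
import uuid
from bisect import bisect_left, insort
from collections import defaultdict
from datetime import datetime
from typing import List, Tuple

Row = Tuple[datetime, uuid.UUID, str]  # (start_at, event_id, subject)


class EventStore:
    """In-memory stand-in for a table keyed by ((email), start_at, event_id)."""

    def __init__(self):
        # partition key -> rows kept sorted by the clustering columns
        self._partitions = defaultdict(list)

    def insert(self, email: str, start_at: datetime, subject: str) -> uuid.UUID:
        event_id = uuid.uuid4()  # makes the primary key unique per event
        insort(self._partitions[email], (start_at, event_id, subject))
        return event_id

    def query(self, email: str, start: datetime, end: datetime) -> List[Row]:
        """Rows for one user with start <= start_at < end, in time order."""
        rows = self._partitions.get(email, [])
        lo = bisect_left(rows, (start,))  # a 1-tuple sorts before any row with the same start_at
        hi = bisect_left(rows, (end,))
        return rows[lo:hi]


store = EventStore()
nine = datetime(2015, 10, 23, 9, 0)
id_a = store.insert("user@example.com", nine, "standup")
id_b = store.insert("user@example.com", nine, "dentist")  # same start_at, different uuid
store.insert("user@example.com", datetime(2015, 10, 24, 9, 0), "brunch")
rows = store.query("user@example.com", datetime(2015, 10, 23), datetime(2015, 10, 24))
print([subject for _, _, subject in rows])
```

Both events at 09:00 survive because the uuid distinguishes them, and the time-range query touches only a single user's partition, which is the access pattern the schema is designed around.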

Once you're at big-data scale you generally move to NoSQL, and the NoSQL stores that scale best are C* and HBase.

https://medium.com/adventures-in-consumer-technology/google-calendar-concept-7291a9923711
Your content will be available directly within your calendar.
Navigating dozens of apps to find your content is cumbersome and time consuming. As a consequence, platforms that bundle content from various sources have become extremely valuable and popular. A calendar platform can automatically pull relevant events from your apps, which means that you don’t have to spend time digging through those apps or manually creating events.

Your calendar will get you excited for upcoming events.
Tapping the location reveals options to view the address in Google Maps or call a car (on the day of the event). The QR code expands when tapped, making it easy to check in once you arrive at the venue. Your calendar should make you feel at ease, knowing that everything you need is instantly accessible.

Your calendar will evoke nostalgia.
Your calendar should encompass all the fun things you do, and make it easy to look back and relive your favorite experiences. In this example, a calendar event summarizes a weekend trip with friends using photos, videos, profile pictures, and a map.
Integrate and share with G+

Your calendar will suggest relevant events.
Your calendar will respond dynamically to changing contexts.
This flow demonstrates how your calendar will update automatically based on new messages you receive. In the future, you won’t have to adjust calendar events as details change. More importantly, you’ll never show up to the wrong restaurant because you missed a text.
Your calendar will help you better understand your behavior, and recommend adjustments.
https://medium.com/adventures-in-consumer-technology/facebook-events-concept-29ca03297e19#.b8c33x4lx
Facebook Messenger, Instagram, and WhatsApp show that the long-term benefits of a standalone app far outweigh the short-term costs of migration.
Tickets will be one tap away.
Events will spur activity.
Events will match preferences among friends.
Events will be integrated with your favorite services.
Google Calendar Link shows Google's Strong URL architecture
Internet Calendaring and Scheduling Core Object Specification (iCalendar)
