Massive Technical Interviews Tips: Cracking the coding interview--Q12.1 Stock Price

Saturday, July 25, 2015

Cracking the coding interview--Q12.1 Stock Price

Cracking the coding interview--Q12.1
If you were integrating a feed of end of day stock price information ( open, high, low,and closing price) for 5,000 companies, how would you do it? You are responsible for the development, rollout and ongoing monitoring and maintenance of the feed. Describe the different methods you considered and why you would recommend your approach. The feed is delivered once per trading day in a comma-separated format via an FTP site. The feed will be used by 1000 daily users in a web application.
译文：
如果你要为5000家公司的股价信息整合摘要，你会怎么做？你要负责摘要的开发，部署，监控和维护。描述你能想到的不同方法，及为什么你会推荐这些方法。摘要以逗号分隔的格式经由FTP进行交付，每个交易日一次。每日有1000个用户在web应用程序中使用这些摘要信息。

https://tianrunhe.wordpress.com/2012/04/08/design-a-feed-system-of-stock-informaiton/
Let’s assume we have some scripts which are scheduled to get the data via FTP at the end of the day. Where do we store the data? How do we store the data in such a way that we can do various analyses of it?
Proposal #1
Keep the data in text files. This would be very difficult to manage and update, as well as very hard to query. Keeping unorganized text files would lead to a very inefficient data model.
Proposal #2
We could use a database. This provides the following benefits:

Logical storage of data.
Facilitates an easy way of doing query processing over the data.

Example: return all stocks having open > N AND closing price < M.
Advantages:
Makes the maintenance easy once installed properly.
Roll back, backing up data, and security could be provided using standard database features. We don’t have to “reinvent the wheel.”
Proposal #3
If requirements are not that broad and we just want to do a simple analysis and distribute the data, then XML could be another good option.
Our data has fixed format and fixed size: company_name, open, high, low, closing price.

Benefits:
Very easy to distribute. This is one reason that XML is a standard data model to share /distribute data.
Efficient parsers are available to parse the data and extract out only desired data.
We can add new data to the XML file by carefully appending data. We would not have to re-query the database.

解答
假设我们有一些脚本在每日结束时通过FTP获取数据。我们把数据存储在哪？我们怎样存储数据有助于我们对数据进行分析？
方案一
将数据保存在文本文件中。这样子的话，管理和更新都非常麻烦，而且难以查询。保存无组织的文本文件是一种非常低效的方法。
方案二
使用数据库。这将带来以下的好处：

数据的逻辑存储
提供了非常便捷的数据查询方式

例子：return all stocks having open > N AND closing price < M (返回开盘价大于N且收盘价小于M的所有股票)
优势：

使得维护更加简单
标准数据库提供了回滚，数据备份和安全保证等功能。我们不需要重复造轮子。

方案三
如果要求不是那么宽泛，我们只想做简单的分析和数据分发，那么XML是另一个很好的选择。我们的数据有固定的格式和大小：公司名称，开盘价，最高价，最低价和收盘价。 XML看起来应当如下所示：

<root>
  <date value=“2008-10-12”>
    <company name=“foo”>
      <open>126.23</open>
      <high>130.27</high>
      <low>122.83</low>
      <closingPrice>127.30</closingPrice>
    </company>
    <company name=“bar”>
      <open>52.73</open>
      <high>60.27</high>
      <low>50.29</low>
      <closingPrice>54.91</closingPrice>
    </company>
  </date>
  <date value=“2008-10-11”> . . . </date>
</root>

优势：

便于数据分发。这就是为什么XML是共享及分发数据的一个标准模型。
我们有高效的解析器来提取出想要的数据
我们可以往XML文件中追加新数据

不足之处是数据查询比较麻烦(这点比不上数据库)。
全书题解目录：
Cracking the coding interview–问题与解答

Read full article from Cracking the coding interview--Q12.1

Saturday, July 25, 2015

Cracking the coding interview--Q12.1 Stock Price

Labels

Popular Posts