Tuesday, May 17, 2016

programming development misc


http://www.cnblogs.com/baochuan/p/4625262.html
  有一张收藏表,里面存储的是用户和图书ID。数据量为1亿。现在要求分页获取所有用户ID(不重复),写下你的sql语句。
三种设计
 
  最容易想到的第一种分页语句是(这也是我们最容易想到的语句):
1
2
select distinct uid from 收藏表 order by uid desc limit 0, 10;
select distinct uid from 收藏表 order by uid desc limit 11, 10;
  再高级点语句,第二种($last_min_uid表示上一次读到的最后一个uid):
1
2
select distinct uid from 收藏表 order by uid desc limit 10;
select distinct uid from 收藏表 where uid < $last_min_uid order by uid desc limit 10;
  最高级的方式
1
2
select uid from 收藏表 group by uid order by uid desc limit 10;
select uid from 收藏表 group by uid having uid < $last_min_uid order by uid desc limit 10;
  以上三种方式都可以实现分页获取到用户ID列表,那么区别是什么?我现在就把每一种跟大家分析下。

  第一种在业务场景中,会出现丢数据的情况。——这是比较严重的情况,不予采纳。
  具体的业务场景是这样的:当你读取第5页的时候,前四页的用户id列表中,假如有一页的用户ID从库中删除掉,那么你这时读到的第5页(limit 51, 10),就是原来的第6页,你会把1页的用户ID丢失掉。

  第二种的第二条语句,通过explain分析,实际并没有命中唯一索引,而只是命中了一般索引,数据查询范围在7百万级别,故explain建议我们使用group by。——这个查询会有严重的性能问题
+----+--------------+---------------+-------+-------------------------------------------------------------+-------------+----------+-------+------------+------------------------------------------------------------------------+
| id   | select_type | table           | type  | possible_keys                                                      | key           | key_len | ref     | rows        | Extra                                                                               |
+----+--------------+---------------+-------+-------------------------------------------------------------+---------------------+---------+------+---------+---------------------------------------------------------------------+
| 1    | SIMPLE      | ubook_room | range | uid_book_id                                                       | uid_status  | 4          | NULL | 7066423  | Using where; Using index for group-by; Using temporary; Using filesort |
+----+--------------+---------------+-------+-------------------------------------------------------------+-------------+----------+-------+------------+------------------------------------------------------------------------+
  
  第三种explain分析,数据查询范围在12万级别(跟第二种相差一个数量级),查询性能高
+----+---------------+------------+-------+-----------------+-----------------+---------+----------+----------+-------------+
| id   | select_type  | table       | type  | possible_keys  | key               | key_len | ref       | rows      | Extra         |
+----+---------------+------------+-------+-----------------+-----------------+---------+----------+----------+-------------+
| 1    | SIMPLE       | 收藏表      | index | NULL             | uid_book_id   | 12         | NULL  | 121719   | Using index | 


No comments:

Post a Comment

Labels

Review (554) System Design (293) System Design - Review (189) Java (178) Coding (75) Interview-System Design (65) Interview (60) Book Notes (59) Coding - Review (59) to-do (45) Knowledge (39) Linux (39) Interview-Java (35) Knowledge - Review (32) Database (30) Design Patterns (29) Product Architecture (28) Big Data (27) Soft Skills (27) Miscs (25) MultiThread (25) Concurrency (24) Cracking Code Interview (24) Career (22) Interview - Review (21) Java - Code (21) Operating System (21) Distributed (20) Interview Q&A (20) OOD Design (20) System Design - Practice (19) Security (17) Algorithm (15) How to Ace Interview (15) Brain Teaser (14) Google (13) Linux - Shell (13) Spark (13) Spring (13) Code Quality (12) How to (12) Interview-Database (12) Interview-Operating System (12) Redis (12) Tools (12) Architecture Principles (11) Company - LinkedIn (11) Testing (11) Resource (10) Solr (10) Amazon (9) Cache (9) Search (9) Web Dev (9) Architecture Model (8) Better Programmer (8) Company - Uber (8) Interview - MultiThread (8) Java67 (8) Math (8) OO Design principles (8) SOLID (8) Scalability (8) Cassandra (7) Git (7) Interview Corner (7) JVM (7) Java Basics (7) Machine Learning (7) NoSQL (7) C++ (6) Design (6) File System (6) Highscalability (6) How to Better (6) Kafka (6) Network (6) Restful (6) Trouble Shooting (6) CareerCup (5) Code Review (5) Company - Facebook (5) Hash (5) How to Interview (5) JDK Source Code (5) JavaScript (5) Leetcode (5) Must Known (5) Be Architect (4) Big Fata (4) C (4) Company Product Architecture (4) Data structures (4) Design Principles (4) Facebook (4) GeeksforGeeks (4) Generics (4) Google Interview (4) Hardware (4) JDK8 (4) Optimization (4) Product + Framework (4) Shopping System (4) Source Code (4) Web Service (4) node.js (4) Back-of-Envelope (3) Company - Pinterest (3) Company - Twiiter (3) Company - Twitter (3) Consistent Hash (3) GOF (3) Game Design (3) GeoHash (3) Growth (3) Guava (3) Interview-Big Data (3) Interview-Linux (3) Interview-Network (3) Java EE Patterns (3) Javarevisited (3) Map Reduce (3) Math - Probabilities (3) Performance (3) Puzzles (3) Python (3) Resource-System Desgin (3) Scala (3) UML (3) geeksquiz (3) AI (2) API Design (2) AngularJS (2) Behavior Question (2) Bugs (2) Coding Interview (2) Company - Netflix (2) Crawler (2) Cross Data Center (2) Data Structure Design (2) Database-Shard (2) Debugging (2) Docker (2) Elasticsearch (2) Garbage Collection (2) Go (2) Hadoop (2) Html (2) Interview - Soft Skills (2) Interview-Miscs (2) Interview-Web (2) JDK (2) Logging (2) POI (2) Papers (2) Programming (2) Project Practice (2) Random (2) Software Desgin (2) System Design - Feed (2) Thread Synchronization (2) Video (2) ZooKeeper (2) reddit (2) Ads (1) Advanced data structures (1) Algorithm - Review (1) Android (1) Approximate Algorithms (1) Base X (1) Bash (1) Books (1) C# (1) CSS (1) Chrome (1) Client-Side (1) Cloud (1) CodingHorror (1) Company - Yelp (1) Counter (1) DSL (1) Dead Lock (1) Difficult Puzzles (1) Distributed ALgorithm (1) Eclipse (1) Facebook Interview (1) Function Design (1) Functional (1) GoLang (1) How to Solve Problems (1) ID Generation (1) IO (1) Important (1) Internals (1) Interview - Dropbox (1) Interview - Project Experience (1) Interview Tips (1) Interview-Brain Teaser (1) Interview-How (1) Interview-Mics (1) Interview-Process (1) Jeff Dean (1) Joda (1) LeetCode - Review (1) Library (1) LinkedIn (1) LintCode (1) Mac (1) Micro-Services (1) Mini System (1) MySQL (1) Nigix (1) NonBlock (1) Process (1) Productivity (1) Program Output (1) Programcreek (1) Quora (1) RPC (1) Raft (1) RateLimiter (1) Reactive (1) Reading (1) Reading Code (1) Refactoring (1) Resource-Java (1) Resource-System Design (1) Resume (1) SQL (1) Sampling (1) Shuffle (1) Slide Window (1) Spotify (1) Stability (1) Storm (1) Summary (1) System Design - TODO (1) Tic Tac Toe (1) Time Management (1) Web Tools (1) algolist (1) corejavainterviewquestions (1) martin fowler (1) mitbbs (1)

Popular Posts