[NineChap System Design] Class 3.2: Web Service - Shuatiblog.com
Fix the MP3 problem
The process of fetching an MP3 (from the CDN):
1. acquire the MP3 link and send a request
2. send the request to the CDN
3. the CDN receives the request and finds the MP3
4. the CDN responds to the client
5. the client plays the music
Question: step 2 produces mostly network errors, while step 4 produces no network errors but does produce timeouts. Why?
Fix step 2, Network error
Problem: the MP3 URL is invalid. It actually comes from a failed CDN server.
Solution: fix the server.
Fix step 3, CDN can't find MP3
Problem: this one is associated with anti-leech protection. A leech is one who benefits, usually deliberately, from others' information or effort but does not offer anything in return.
Some P2P and leeching software will steal your URL links, so the MP3 URL is given an expiration time of 5 minutes.
Examples: Wi-Fi leeches, direct linking (or hot-linking), and, in most P2P networks, peers who download too much.
Anti-leech protection specializes in protecting file downloads and stopping bandwidth leeching.
So when the CDN server's clock and the web server's clock are not well synchronized, an MP3 URL can appear expired even though it was just issued.
Solution: sync the CDN clock with the web server clock every 10 minutes.
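To make the clock problem concrete, here is a minimal sketch of an expiring, signed MP3 link. It is an illustration only, not the scheme the course describes: the shared key `SECRET`, the `expires`/`sig` query parameters, and the function names `sign_url` and `verify_url` are all assumptions. The web server stamps the link using its own clock, and the CDN checks it using the CDN's clock.

```python
import hashlib
import hmac

SECRET = b"shared-secret"  # hypothetical key known to both web server and CDN
TTL_SECONDS = 5 * 60       # the 5-minute expiration time from the notes


def sign_url(path: str, web_server_now: float) -> str:
    """Web server side: attach an expiry timestamp and an HMAC signature."""
    expires = int(web_server_now) + TTL_SECONDS
    base = f"{path}?expires={expires}"
    sig = hmac.new(SECRET, base.encode(), hashlib.sha256).hexdigest()
    return f"{base}&sig={sig}"


def verify_url(url: str, cdn_now: float) -> bool:
    """CDN side: reject tampered links, then compare expiry to the CDN clock."""
    base, _, sig = url.rpartition("&sig=")
    expected = hmac.new(SECRET, base.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    expires = int(base.rpartition("expires=")[2])
    return cdn_now <= expires


web_clock = 1_000_000.0
link = sign_url("/music/song.mp3", web_clock)
in_sync = verify_url(link, web_clock + 1)          # clocks agree: link accepted
drifted = verify_url(link, web_clock + 6 * 60)     # CDN clock 6 minutes ahead
```

With the CDN clock running six minutes ahead, a link that the web server issued a moment ago is already past its five-minute expiry from the CDN's point of view, which is why the clocks are synced periodically.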
Fix step 4, Timeout error
Problem: some MP3 files are relatively large, so the download times out. MP3 performs better at higher bitrates, and AAC (Advanced Audio Coding) works better at lower bitrates.
Solution:
- compress the MP3 to 48 kbps, or use the AAC format, so that the player starts with lower-bitrate music and then switches automatically
- preload the next track while the previous one is playing
- optimize the CDN
A CDN (content delivery network) is a large distributed system of servers deployed in multiple data centers across the Internet.
The goal of a CDN is to serve content to end-users with high availability and high performance.
Which CDN should the client choose?
The web server, not DNS, calculates which CDN to choose. The choice can be based on IP distance or on the client's ISP, but neither is accurate.
We can also use local desktop apps (in different locations) to ping CDN servers. This may violate user privacy, though.
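As a rough sketch of the server-side choice described above, the web server could rank CDN nodes by how well they match the client's ISP and region. The `CdnNode` type, the `example.com` host names, and the rule that an ISP match outweighs a region match are all assumptions made for illustration. The notes only say that IP distance or ISP can be used and that the result is not accurate.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CdnNode:
    host: str
    isp: str
    region: str


def choose_cdn(nodes: Sequence[CdnNode], client_isp: str, client_region: str) -> CdnNode:
    """Pick the node with the best (assumed) match score; ties keep list order."""
    def score(node: CdnNode) -> int:
        # Assumption: same ISP counts for more than same region.
        return 2 * (node.isp == client_isp) + (node.region == client_region)
    return max(nodes, key=score)


nodes = [
    CdnNode("cdn-a.example.com", isp="isp-1", region="north"),
    CdnNode("cdn-b.example.com", isp="isp-2", region="south"),
    CdnNode("cdn-c.example.com", isp="isp-2", region="north"),
]
best = choose_cdn(nodes, client_isp="isp-2", client_region="north")
```

Measured ping times from real client locations, as mentioned above, could replace the score function when such data is available.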
Fix step 5, Fail to play
Problem: some files were decoded incorrectly.
Fix step 6, Unknown error
Problem: some users close the page while the MP3 is loading.
Question 5
Fix the player problem
Problem: iOS devices can never play Flash.
Solution: develop an HTML5 player.
5.2 How to evaluate whether you solved the problem
- user complaints
- important: daily retention rate!
One-day retention rate:
Today's visitors = {U1, U3, U7, U9, U10}
Tomorrow's visitors = {U2, U3, U9}
Two of today's five visitors (U3 and U9) come back tomorrow, so today's one-day retention rate = 2/5
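The retention calculation above can be written as a set intersection. This is a minimal sketch; the function name `one_day_retention` is made up for the example.

```python
def one_day_retention(today: set, tomorrow: set) -> float:
    """Fraction of today's visitors who also visit tomorrow."""
    if not today:
        return 0.0  # no visitors today, so nothing to retain
    return len(today & tomorrow) / len(today)


today = {"U1", "U3", "U7", "U9", "U10"}
tomorrow = {"U2", "U3", "U9"}
rate = one_day_retention(today, tomorrow)  # 2 returning visitors out of 5
```

New visitors such as U2 do not affect the rate, because only today's visitors appear in the denominator.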
Question 6: flash sale (Miao Sha)
Design
Queue A and Queue B
Queue A
There are many of these queues, each located on an individual web server or reverse proxy. Queue A is mainly used to accept the large volume of requests coming from clients.
Each machine may take 10,000 or more requests per second.
Queue A will redirect most requests to a static page (cached).
Queue B
Queue B is a single machine, to avoid distributed-systems problems. It takes in a small number of requests and decides the results (e.g. redirect to the payment page).
Now, why do we need 2 queues, not 1?
Think about a hospital. Queue A is the hospital lobby and security guard, and Queue B is the queue of patients.
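The two-tier idea can be sketched in a few lines. This is a simplified, single-process illustration under stated assumptions: the class names `FrontQueue` and `DecisionQueue`, the per-front `admit_limit`, and the result strings are all hypothetical, and a real Queue A would run on many separate machines.

```python
from collections import deque


class DecisionQueue:
    """Queue B: one machine, processes requests in order, no locks needed."""

    def __init__(self, stock: int):
        self.stock = stock
        self.pending = deque()
        self.log = []  # stand-in for an append-only log file

    def enqueue(self, user_id: str) -> None:
        self.pending.append(user_id)

    def drain(self) -> dict:
        results = {}
        while self.pending:
            user = self.pending.popleft()
            if self.stock > 0:
                self.stock -= 1
                results[user] = "payment_page"
            else:
                results[user] = "sold_out"
            self.log.append((user, results[user]))
        return results


class FrontQueue:
    """Queue A: absorbs traffic, forwards only a few requests to Queue B."""

    def __init__(self, backend: DecisionQueue, admit_limit: int):
        self.backend = backend
        self.admit_limit = admit_limit  # assumed cap per front machine
        self.admitted = 0

    def handle(self, user_id: str) -> str:
        if self.admitted >= self.admit_limit:
            return "static_page"  # cached page, Queue B is never touched
        self.admitted += 1
        self.backend.enqueue(user_id)
        return "queued"


queue_b = DecisionQueue(stock=2)
fronts = [FrontQueue(queue_b, admit_limit=2), FrontQueue(queue_b, admit_limit=2)]
responses = [fronts[i % 2].handle(f"u{i}") for i in range(10)]
results = queue_b.drain()
```

Here ten requests arrive, only four reach Queue B, and two of those get the payment page. Because Queue B handles one request at a time on one machine, the stock counter needs no lock.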
How to reduce traffic
- no images
- cache more static pages
- reverse proxy: send the requests in batches
How to keep it simple?
- no DB: keep only the basic logic, but remember to write a log file
- no lock
How to improve stability
- use a new server for the flash sale (Miao Sha), in case of a crash
- process everything asynchronously! Don't make other people wait, in case of a crash.
How to defend against hackers
- limit by IP address (to defend against automated software), though it's easy for hackers to fake an IP address
- CAPTCHA
CAPTCHA (an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart") is a type of challenge-response test used in computing to determine whether or not the user is human.
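For the IP-based defense, one possible sketch is a sliding-window counter per address. The class name `IpRateLimiter`, the limit of 3 requests, and the 1-second window are assumptions chosen for the example, not values from the course. As noted above, this does not stop an attacker who fakes IP addresses.

```python
from collections import defaultdict, deque


class IpRateLimiter:
    """Allow at most max_requests per IP within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now: float) -> bool:
        recent = self.hits[ip]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True


limiter = IpRateLimiter(max_requests=3, window_seconds=1.0)
burst = [limiter.allow("203.0.113.7", t) for t in (0.0, 0.1, 0.2, 0.3)]
later = limiter.allow("203.0.113.7", 1.05)    # oldest hit has expired
other = limiter.allow("198.51.100.9", 0.3)    # different IP, separate budget
```

Passing `now` explicitly keeps the sketch deterministic; a real server would read the time from its own clock.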
Follow-up
How to design 12306 (supporting several million QPS)?