Massive Technical Interviews Tips: Build APIs You Won't Hate

Monday, September 28, 2015

Build APIs You Won't Hate
https://github.com/philsturgeon/build-apis-you-wont-hate
Notes on "Build APIs You Won't Hate"
1 Useful Database Seeding
Don’t use integers as timestamps.
If you store an IANA time zone name such as America/New_York or Asia/Khandyga for each user, the UTC offset and daylight saving time can be calculated automatically.
java-faker (a Java library for generating fake seed data)
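As a rough illustration of why a zone name is more useful than a fixed offset, the Python sketch below looks up the offset for the same zone on two dates. It assumes Python 3.9+ and that the system time zone database is available; the helper name utc_offset_hours is made up for this example.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def utc_offset_hours(tz_name: str, local_time: datetime) -> float:
    """Return the UTC offset, in hours, in effect in tz_name at local_time."""
    aware = local_time.replace(tzinfo=ZoneInfo(tz_name))
    return aware.utcoffset().total_seconds() / 3600


# Same zone name, two dates: the offset follows daylight saving time.
winter = utc_offset_hours("America/New_York", datetime(2015, 1, 15, 12, 0))
summer = utc_offset_hours("America/New_York", datetime(2015, 7, 15, 12, 0))
print(winter, summer)  # -5.0 -4.0
```

A stored offset of -5 would be wrong for half the year, while the stored name stays correct.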

2 Planning and Creating Endpoints
GET /resources/X,Y,Z - The client wants multiple things, so give them multiple things.
• GET /places/X/checkins - Find all the checkins for a specific place.
• GET /users/X/checkins - Find all the checkins for a specific user.
• GET /users/X/checkins/Y - Find a specific checkin for a specific user.

Auto-Increment is the Devil - Use UUID
DELETE /places/X,Y,Z - Delete a bunch of places.
• DELETE /places
• DELETE /places/X/image - Delete the image for a place, or:
• DELETE /places/X/images - If you chose to have multiple images this would remove all of them.

PUT is used if you know the entire URL beforehand and the action is idempotent.
Use plural nouns

3 Input and Output Theory
No Form Data
x-www-form-urlencoded is a mess
Don't mix JSON with form data
Much Namespace, Nice Output
{
"data": {
"name": "Phil Sturgeon",
"id": "511501255"
}
}

Ask for a collection of things and get a collection of things, but namespaced:
{
"data": [
{
"name": "Hulk Hogan",
"id": "100002"
},
{
"name": "Mick Foley",
"id": "100003"
}
]
}
By placing the collection into the "data" namespace you can easily add other content next to it which relates to the response but is not part of the list of resources at all. Counts, links, etc. can all go here.
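A minimal Python sketch of this namespacing, assuming a hypothetical helper called envelope (not something from the book): it wraps either a single resource or a collection under "data" and keeps any metadata next to it.

```python
import json


def envelope(data, **meta):
    """Wrap a resource (dict) or a collection (list) under the "data" key."""
    body = {"data": data}
    body.update(meta)  # metadata such as pagination or links sits beside "data"
    return body


single = envelope({"name": "Phil Sturgeon", "id": "511501255"})
collection = envelope(
    [
        {"name": "Hulk Hogan", "id": "100002"},
        {"name": "Mick Foley", "id": "100003"},
    ],
    pagination={"count": 2},  # illustrative metadata only
)
print(json.dumps(collection, indent=2))
```

Because the resources never share a level with the metadata, a client can always read body["data"] without checking for reserved keys.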

4 Status Codes, Errors and Messages
• 202 - Accepted but is being processed async (for a video means encoding, for an image means resizing, etc.)
• 400 - Bad Request (should really be for invalid syntax, but some folks use it for validation)
• 403 - The current user is forbidden from accessing this data
• 405 - Method Not Allowed (your framework will probably do this for you)
• 410 - Data has been deleted, deactivated, suspended, etc.
• 500 - Something unexpected happened and it is the API's fault
• 503 - API is not here right now, please try again later
Some 5xx codes come from the infrastructure rather than your code, for example when your Amazon Elastic Load Balancer has no healthy instances (503) or when your hard drive fills up somehow (507).

{
"error": {
"type": "OAuthException",
"code": "ERR-01234",
"documentation_url": "http://example.com/docs/errors/#ERR-01234",
"message": "..."
}
}

4.4 Error or Errors
One option is to simply stop processing (exit out) after the first error, which avoids further controller interaction.

5 Endpoint Testing

6 Outputting Data
Performance: If you return “all” items then that will be fine during development, but suck when you have a thousand records in that table… or a million.

6.4 Hiding Schema Updates
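When a stored value is renamed, map it back in the output so that clients keep seeing the value they expect (here the database's 'available' is still returned as 'active'):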
'status' => $place->status === 'available' ? 'active' : $place->status,
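The same idea as a small Python sketch. The function name transform_place and the sample row are assumptions made for illustration; the point is that the output layer lists fields explicitly and translates the renamed value, so internal columns and schema changes never leak to clients.

```python
def transform_place(row: dict) -> dict:
    """Build the public representation of a place from a raw database row."""
    return {
        "id": row["id"],
        "name": row["name"],
        # The column now stores "available"; clients still expect "active".
        "status": "active" if row["status"] == "available" else row["status"],
    }


# Hypothetical row, including a column that must not be exposed.
row = {"id": 2, "name": "Example Place", "status": "available", "internal_notes": "private"}
public = transform_place(row)
print(public)  # {'id': 2, 'name': 'Example Place', 'status': 'active'}
```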

7 Data Relationships
The balance between “downloading enough data to avoid making the user wait for subsequent loads” and “downloading too much data to make them wait for the initial load” is hard to strike.

An API needs flexibility, and making sub-resources the only way to load related data is restrictive for the API consumer.

7.3 Foreign Key Arrays
7.4 Compound Documents (a.k.a Side-Loading)
7.5 Embedded Documents (a.k.a Nesting)
/places?embed=checkins,merchant
/places?embed=checkins,merchant,current_opp.images
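One way a server might read the embed parameter, sketched in Python with only the standard library. The function name requested_embeds is hypothetical, and treating a dot as "nested relationship" is an assumption based on the current_opp.images example above.

```python
from urllib.parse import parse_qs, urlsplit


def requested_embeds(url: str) -> list[list[str]]:
    """Return each requested embed as a path of relationship names."""
    query = parse_qs(urlsplit(url).query)
    raw = query.get("embed", [""])[0]
    # "current_opp.images" becomes ["current_opp", "images"] (a nested embed).
    return [item.split(".") for item in raw.split(",") if item]


embeds = requested_embeds("/places?embed=checkins,merchant,current_opp.images")
print(embeds)  # [['checkins'], ['merchant'], ['current_opp', 'images']]
```

The server can then check each path against a whitelist of relationships it is willing to embed before loading anything.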

8 Debugging
curl -X POST http://localhost/places/fg345d/checkins --data @payload.json
Postman
Debug Panel
RailsPanel - Chrome-only DevTools panel with logging and profiling for Ruby on Rails (see also the RailsCasts video).
Clockwork - Chrome DevTools panel and standalone web app with logging and profiling for PHP.
Chrome Logger - Chrome-only logging for Python, PHP, Ruby, Node, .NET, CF and Go.

8.4 Network Debugging
Charles

9 Authentication
Approach #1: Basic Authentication
php -a
echo base64_decode('QWxhZGRpbjpvcGVuIHNlc2FtZQ==');

Approach #2: Digest Authentication
Approach #4: OAuth 2.0 - requires SSL; requests carry an access token
file_get_contents('https://graph.facebook.com/me?access_token=DFGJKHDFGHDIFHGFKDJGHIU');
The token can go in the query string as above, but use the Authorization header to send your tokens whenever possible.
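A short Python sketch of both points, using only the standard library. The Basic credential and the token are the sample values from these notes; the "Bearer" scheme is the usual OAuth 2.0 convention, and whether a particular provider accepts it is an assumption here. No request is actually sent.

```python
import base64
import urllib.request

# Basic authentication: the header value is only base64("username:password"),
# so anyone who sees it can reverse it. This is why SSL matters.
decoded = base64.b64decode("QWxhZGRpbjpvcGVuIHNlc2FtZQ==").decode("utf-8")
username, password = decoded.split(":", 1)
print(username, password)  # Aladdin open sesame

# OAuth 2.0: send the access token in the Authorization header
# instead of putting it in the query string.
token = "DFGJKHDFGHDIFHGFKDJGHIU"  # placeholder token from the notes
request = urllib.request.Request(
    "https://graph.facebook.com/me",
    headers={"Authorization": f"Bearer {token}"},
)
# urllib.request.urlopen(request) would send it; omitted so this runs offline.
```

Keeping the token out of the URL also keeps it out of server access logs and browser history.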

“Short”-life Tokens
Grant Types
OpenID
Hawk
Oz

10 Pagination
Define a Maximum
"pagination" : {
"total": 1000,
"count": 12,
"per_page": 12,
"current_page": 1,
"total_pages": 84,
"next_url": "https://api.example.com/places?page=2&number=12"
}
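The numbers in this block can be derived rather than typed by hand. The Python sketch below is one possible way to do it; the function name pagination_meta and the cap of 100 items per page are assumptions for illustration, not values from the book.

```python
import math

MAX_PER_PAGE = 100  # hypothetical upper bound ("Define a Maximum")


def pagination_meta(total: int, page: int, per_page: int, base_url: str) -> dict:
    """Build the "pagination" block for a page/per_page style paginator."""
    per_page = max(1, min(per_page, MAX_PER_PAGE))  # never trust the client's size
    total_pages = math.ceil(total / per_page)
    count = max(0, min(per_page, total - (page - 1) * per_page))
    meta = {
        "total": total,
        "count": count,
        "per_page": per_page,
        "current_page": page,
        "total_pages": total_pages,
    }
    if page < total_pages:  # no next_url on the last page
        meta["next_url"] = f"{base_url}?page={page + 1}&number={per_page}"
    return meta


meta = pagination_meta(1000, 1, 12, "https://api.example.com/places")
print(meta["total_pages"], meta["next_url"])
```

With 1000 records and 12 per page this gives 84 pages, the last of which holds only 4 records.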

Counting lots of Data is Hard - SELECT count(*) is expensive
10.3 Offsets and Cursors
If there is more data to be found, the API will return that data. If there is not more data, then either an error (404) or an empty collection will be returned.
"pagination" : {
"cursors": {
"after": 12,
"next_url": "https://api.example.com/places?cursor=12&number=12"
}
}

Obscuring Cursors

11 Documentation
API Reference
Sample Code
Guides or Tutorials
Tools: Swagger, API Blueprint

12 HATEOAS
1. Content negotiation
2. Hypermedia controls
"links": [
{
"rel": "self",
"uri": "/places/2"
},
{
"rel": "place.checkins",
"uri": "/places/2/checkins"
},
{
"rel": "place.image",
"uri": "/places/2/image"
}
]

http://www.iana.org/assignments/link-relations/link-relations.xhtml
$response = $client->options('/places/2/checkins');
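To make the hypermedia controls concrete, here is a small Python sketch that produces the links array shown above and lets a client follow a relation by name. The function names place_links and find_link are invented for this example.

```python
from typing import Optional


def place_links(place_id: int) -> list[dict]:
    """Server side: build the links for one place resource."""
    base = f"/places/{place_id}"
    return [
        {"rel": "self", "uri": base},
        {"rel": "place.checkins", "uri": f"{base}/checkins"},
        {"rel": "place.image", "uri": f"{base}/image"},
    ]


def find_link(links: list[dict], rel: str) -> Optional[str]:
    """Client side: look a URI up by its relation instead of hard-coding it."""
    for link in links:
        if link["rel"] == rel:
            return link["uri"]
    return None


links = place_links(2)
print(find_link(links, "place.checkins"))  # /places/2/checkins
```

A client written this way keeps working if the server later changes its URL structure, as long as the rel names stay stable.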

13 API Versioning
Versioning and Types in REST/HTTP API Resources
http://thereisnorightway.blogspot.com/2011/02/versioning-and-types-in-resthttp-api.html

How are REST APIs versioned?
http://www.lexicalscope.com/blog/2012/03/12/how-are-rest-apis-versioned/

Approach #1: URI
https://api.example.com/v1/places

As long as you share a Git history you can pull from the other repository or branch, and merge changes up from older versions to newer versions.

Approach #2: Hostname
https://api-v1.example.com/places

Approach #3: Body and Query Params
{"version" : "1.0"}

Approach #4: Custom Request Header
BadAPIVersion: 1.1
Vary: BadAPIVersion
• Cache systems can get confused

Approach #5: Content Negotiation
Accept: application/vnd.github.v3+json
• Keeps URLs the same
• HATEOAS-friendly
• Cache-friendly

Approach #6: Content Negotiation for Resources
Accept: application/vnd.github.user.v4+json
Alternatively, the Accept header is capable of containing arbitrary parameters.
Accept: application/vnd.github.user+json; version=4.0

application/vnd.example.place.v1+json and application/vnd.example.place.json;version=1

Approach #7: Feature Flagging
Expires, Etag, Retry-After

http://barelyenough.org/blog/2008/05/versioning-rest-web-services/
http://nvie.com/posts/a-successful-git-branching-model/

https://pastewall.com/sticker/c98b00bf9fbd46789a9302c5b98cbdfa

Ch1. Useful Database Seeding

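Almost every program that has tests also does seeding: preparing some random data, loading it into the database, and running the tests against it. Seeding is also useful during local development, when fake data is needed to try out a feature.

One key point concerns recording a user's time zone: do not store just an offset such as +8; store the full time zone name. Beijing and Taipei are both +8, but many people care about the difference in the zone name.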

Ch2. Planning and Creating Endpoints

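Here is a small trick for getting started on an API: first list every object ("everything") the API deals with, then write down the actions to be performed on each of them. The actions are not necessarily just Create, Read, Update and Delete; they can also be List (list the objects), Image (attach an image to the object) or Like (a user likes the object). List as many as you can. After that you can define the API's URLs.

Tips for defining the URLs:

Make good use of the HTTP methods: GET, POST, PUT, DELETE and PATCH. POST and PUT can both write data; the difference is that POST is not idempotent, so repeated calls may give different results, while PUT should give the same result however many times it is called. For example, POST /users/1/messages creates a new message, so sending it many times creates many messages; a PUT instead targets a URL known in advance (a single message under /users/1/messages, say) and only overwrites that message's content.
Use plural names for all objects in URLs, so that a single endpoint can also accept multiple identifiers, for example:
/users - list multiple users
/users/1 - a specific user
/users/1,2 - several specific users
URLs should contain only nouns, because the verb is already expressed by the HTTP method.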
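The POST versus PUT distinction can be shown with an in-memory stand-in for a messages table. This is only a sketch: the function names and the dictionary store are assumptions, and UUIDs are used for new IDs in line with the "use UUID" advice earlier in these notes.

```python
import uuid

messages: dict[str, str] = {}  # in-memory stand-in for a messages table


def post_message(text: str) -> str:
    """POST to the collection: the server picks the ID, so each call adds a row."""
    message_id = str(uuid.uuid4())
    messages[message_id] = text
    return message_id


def put_message(message_id: str, text: str) -> None:
    """PUT to a known URL: repeating the call leaves the same final state."""
    messages[message_id] = text


post_message("hello")
post_message("hello")           # not idempotent: a second message now exists
put_message("welcome", "hi")
put_message("welcome", "hi")    # idempotent: still exactly one "welcome" message
print(len(messages))  # 3
```

This is why a client can safely retry a PUT after a timeout, while retrying a POST risks creating duplicates.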

Ch3. Input and Output Theory

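Browsers and web technology are now advanced enough that calls to a server API should, where possible, send their parameters as application/json.

Server responses should also be mainly application/json. XML cannot express data types other than strings, whereas JSON distinguishes boolean, number and string, so XML responses are not recommended. If XML must be offered, generate it automatically from the JSON representation.

The response should wrap the returned data in a "data" key:

GET /users/1 returns a single record:

{
"data" : { "id": 1, "username": "Foo" }
}

GET /users returns a list:

{
"data": [ { "id": 1, "username": "Foo" }, { "id": 2, "username": "Bar" } ]
}

This avoids returning a bare { "id": 1, "username": "Foo" }, which leaves no room for metadata such as pagination or error messages: adding them would mix them in with the object's own data.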

Ch4. Status Codes, Errors and Messages

Make good use of HTTP status codes:

2xx is all about success
3xx is all about redirection
4xx is all about client errors
5xx is all about service errors

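But don't go overboard with status codes either; as the book puts it, "You won't unlock any achievements for using them all."

When an error occurs, the recommendation is to return a list of errors:

{
"errors": [
{
"code": "109",
"title": "OAuth Exception",
"details": "There is no access_token in the request header",
"href": "http://doc.example.com/errors/109"
}
]
}

Combining the HTTP status code with an error code keeps the meaning understandable while giving programs a code to react to. title and details are meant for humans, and can also be used as the error message shown to end users. href is a link to detailed information about the error.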
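A minimal Python sketch of building such a response. The helper names make_error and error_response are hypothetical, and pairing this particular error with HTTP 401 is an assumption made for the example, not something stated in the notes.

```python
def make_error(code: str, title: str, details: str, href: str) -> dict:
    """One entry of the "errors" list: a machine code plus human-readable text."""
    return {"code": code, "title": title, "details": details, "href": href}


def error_response(status: int, *errors: dict) -> tuple[int, dict]:
    """Pair an HTTP status code with a body that lists one or more errors."""
    return status, {"errors": list(errors)}


status, body = error_response(
    401,  # assumed status for a missing token
    make_error(
        "109",
        "OAuth Exception",
        "There is no access_token in the request header",
        "http://doc.example.com/errors/109",
    ),
)
print(status, body["errors"][0]["code"])  # 401 109
```

A client can branch on the status code for the general class of failure and on code for the specific case, without parsing the human-readable text.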

Ch5. Endpoint Testing

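This depends on the framework or language in use, so it is not covered in detail here.

Ch6. Outputting Data

When outputting data, do not naively convert whatever the ORM returns straight to JSON; doing so can expose fields that should not be public.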

Ch7. Data Relationships

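This chapter covers what to do when objects are related, for example when object A contains several object Bs: how do you get B's data when reading A?

The simplest approach is one API for A and another for B. But once the client has A it must send many requests to collect all the Bs, and that many requests makes performance very poor. So, based on how clients can reasonably be expected to use the API, related data should be included in A's response. There are several ways to do this:

Foreign Key Arrays: add a list of B keys to object A, and let the client call B's API afterwards to fetch them. The fetches can be combined, for example with /b-resource/1,2, but it is still a bit cumbersome and tiring for the front end.

Compound Documents (Side-Loading): in the response, object A looks the same as with Foreign Key Arrays, but next to "data" there is an extra "included" block holding B's data:

{
"data": { "id": 1, "title": "Foo", "content": "Bar", "comments": [ "1", "2" ] },
"included": [{
"type": "comment",
"id": 1,
"author": "Baz",
"content": "Baz"
}]
}

This still leaves the client to stitch the data together by hand, so it is probably not highly recommended either.

Embedded Documents: put B's data directly inside object A, and let the request URL specify which related data to expand:

http://api.example.com/a-resource?include=comments,related_posts

This is probably the most convenient to use, but the server has extra work to do: the field filtering mentioned in Ch6 now has to follow include and add the extra fields.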

Ch8. Debugging

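This chapter just mentions how to inspect the values an API returns, introducing tools such as curl, Postman and Fiddler.

Ch9. Authentication

The author recommends OAuth 2.0, but suggests using an existing, proven implementation rather than writing your own, to avoid problems. Besides the four standard OAuth 2.0 grant types, the author also implemented a custom "social" grant type for signing in through third parties such as Facebook or Google.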

Ch10. Pagination

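When listing data you usually need a way to fetch only one block of it, which calls for pagination. There are two main designs:

Paginators: use page and per_page to say how many records each page holds and which page is wanted.

The server's response can look like:

{
"data": [ ... ],
"pagination": {
"total": 1000,
"count": 12,
"per_page": 12,
"current_page": 1,
"total_pages": 84,
"next_url": "/a_resources?page=2&per_page=12"
}
}

total is the total number of records of this resource, count is the number of records in the current data, per_page and current_page echo the per_page and page the client sent, total_pages is the precomputed number of pages, and next_url is the API link to the next page.

The drawback is that records keep being added and removed, so fetching the next page can return records already seen. In practice this is hard to avoid: when the ordering of the data can change arbitrarily, there is probably no simple fix.

Cursors: use a string that represents a position, for example a record's ID. Passing this cursor means "find the records after (or before) this ID", not including the ID itself. Alternatively the offset can serve as the cursor, meaning "the records after this many"; that is much like a paginator, where the cursor for the next page is simply page * per_page.

The server's response can look like:

{
"data": [ ... ],
"pagination": {
"after": "MjAwMA==",
"before": "MTk4MA==",
"next_url": "/a_resources?after=MjAwMA%3d%3d&per_page=20"
}
}

after and before only need to be meaningful to the server. The reason for making them opaque strings is not security; it is to keep clients from easily working out other cursor values on their own.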
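The sample cursors above are simply base64-encoded record IDs ("MjAwMA==" is 2000 and "MTk4MA==" is 1980). The Python sketch below shows one way such cursors could be produced and consumed; the function names are hypothetical, and it assumes positive integer IDs and a list already sorted by ascending ID.

```python
import base64
from typing import Optional


def encode_cursor(record_id: int) -> str:
    """Obscure a record ID as an opaque cursor string."""
    return base64.b64encode(str(record_id).encode("ascii")).decode("ascii")


def decode_cursor(cursor: str) -> int:
    """Recover the record ID from a cursor produced by encode_cursor."""
    return int(base64.b64decode(cursor).decode("ascii"))


def page_after(records: list[dict], cursor: Optional[str], limit: int):
    """Return up to `limit` records with an ID greater than the cursor's ID."""
    after_id = decode_cursor(cursor) if cursor else 0
    page = [r for r in records if r["id"] > after_id][:limit]
    next_cursor = encode_cursor(page[-1]["id"]) if page else None
    return page, next_cursor


records = [{"id": i} for i in range(1, 6)]
first, cursor = page_after(records, None, 2)     # ids 1 and 2
second, cursor2 = page_after(records, cursor, 2)  # ids 3 and 4
print(encode_cursor(2000))  # MjAwMA==
```

Because the cursor refers to a record rather than a position, rows inserted earlier in the list do not shift the next page the way they do with page numbers.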

Ch11. Documentation

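The essentials of documentation are an API reference, sample code, and guides or tutorials. Sample code lets readers quickly see how the API is actually called, and guides help them learn how to use it.

The author then recommends a few documentation tools, such as Swagger and API Blueprint.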

Ch12. HATEOAS

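The full name is "Hypermedia as the Engine of Application State". The main idea is that an API, like a web page, can be browsed from one point to others. The two key parts are:

Content negotiation: similar to what is covered under API Versioning below. In the request, the client uses the HTTP Accept header to specify which response format it wants (json, xml, yaml).
Hypermedia controls: put simply, add links to related data alongside the data itself; for example, an article can carry the API URL of its comments. An API basically has to do this to be called 100% RESTful.

The server's response can look like:

{
"data": {
"id": 1,
"title": "Foo",
"content": "Bar",
"links": {
"rel": "post.comments",
"uri": "/post/1/comments"
}
}
}

links holds links to data related to this document. rel describes the relationship; as long as you define a naming convention of your own under which the relationships between objects cannot be confused, the design is acceptable. uri is the API URL for interacting with that resource. To learn which HTTP methods can be used on it, send an HTTP OPTIONS request to the API server, which gives a rough idea of the operations available on the resource.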

Ch13. API Versioning

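There are no real best practices for this chapter; the approaches in the book each have pros and cons. There is also a web page listing which approach people commonly use for versioning.

Version number in the URI
http://api.example.com/v1/users
Pro: very easy for clients, and fairly easy to write on the server. However, if different API versions are to be handled by different servers, you may need special rules in Apache or Nginx.
Con: not RESTful at all (but who cares, everyone uses it XD)
Version number in the hostname
http://api.v1.example.com/users
Pros and cons are the same as for "Version number in the URI"; only the location of the version number differs.
Version number in the request body or query
http://api.example.com/users?version=1
http://api.example.com/users (with { "version": 1 } in the request body)
Pro: the API URLs stay the same; only the version parameter changes.
Con: when the client sends data that is not JSON, it is harder to put the version number in the request body.
Custom version header
http://api.example.com/users (with the header X-Api-Version: 1)
Pro: same as "Version number in the request body or query".
Con: caching on the server becomes more troublesome, because version 1.1 data might be returned for a version 1.0 request. Adding Vary: X-Api-Version is needed to keep the cache correct.
Content Negotiation
http://api.example.com/users (with Accept set to application/vnd.service[.version]+json)
This approach is more obscure. GitHub uses it, so GitHub's documentation is a good place to learn how this kind of versioning works.
Pro: HATEOAS-friendly, cache-friendly
Con: clients may not know how to use it at first
Content Negotiation for Resources
http://api.example.com/users (with Accept set to application/vnd.service.user+json; version=1)
Much like "Content Negotiation", but I am really not familiar with this approach; it looks hard to use XD
Feature Flagging
The API version is not part of the request. Instead, a backend setting tied to the token used to call the API decides which version of the response to return. Facebook uses this approach: it gives you time to migrate, but you must migrate within a set period or you can no longer use that API.
Pro: version control becomes simpler.
Con: clients may need extra conditionals to handle both the old and the new API version, deploy them, and only then have the backend switch versions, so that the service keeps working without interruption. This makes client maintenance more complicated, and the version checks have to be removed by hand afterwards, so maintenance becomes a two-stage process.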
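For the two content negotiation approaches in the list above, the server has to pull the version out of the Accept header. The Python sketch below shows one way to do that; the function name version_from_accept and the default of "1" are assumptions, and it handles only a single media type (no comma-separated alternatives or q-values).

```python
import re


def version_from_accept(accept: str, default: str = "1") -> str:
    """Extract an API version from a single-valued Accept header."""
    media_type, *params = [part.strip() for part in accept.split(";")]
    # Style: application/vnd.service.user+json; version=4.0
    for param in params:
        key, _, value = param.partition("=")
        if key.strip() == "version" and value:
            return value.strip()
    # Style: application/vnd.service.v3+json
    match = re.search(r"\.v(\d+)\+json$", media_type)
    if match:
        return match.group(1)
    return default  # no version requested


print(version_from_accept("application/vnd.github.v3+json"))                 # 3
print(version_from_accept("application/vnd.github.user+json; version=4.0"))  # 4.0
print(version_from_accept("application/json"))                               # 1
```

Since the version travels in Accept, responses that differ by version should also be served with Vary: Accept so caches keep them apart.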

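After a week of effort I finally finished the book. It is only about 180 pages, so you should be able to get through it quickly too. Overall it is a very good introductory book on API design, though the finer details will have to be filled in by real-world experience.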
