Tuesday, February 16, 2016

API Design Tutorial



http://tutorials.jenkov.com/api-design/index.html
The primary goal of an API or component is to solve some problem the user has. The secondary, but still important, goal is to do so with as little effort required from the user as possible. Third, your API should not create any new problems for the user. I have summed that up in this sentence, which is, by the way, also how I define the concept of "Functional Software Elegance":
Solve my problem, with minimal effort from me, and don't get in my way.
This means that the API should be:
  • As easy to use as possible.
  • As easy to learn as possible.
  • As flexible as possible.
  • Able to actually solve the user's problem.
  • Free of new problems (e.g. annoying limitations).
http://tutorials.jenkov.com/api-design/lots-of-upfront-design.html
The agile community has long promoted the idea that "Change is Cheap". Don't over-design now. You can always change the design later.

This may be true when developing an application in which you have control over all parts. But when you are designing an API that is to be used by external users, the situation is different. The API becomes a part of somebody else's application. Change in the API may be cheap for you, but expensive for the users of your API.

Some of the issues you might want to think about could be:
  • How will the public interface look?
  • How will the API be configured?
  • What defaults should the API assume?
  • Should any of the API's abstraction layers be optional?
If you find out that you really really need to change the public interface of your API, here is what you might consider doing:
  1. Provide an alternative interface, and leave the old interface in the API too.
  2. Deprecate the old interface to signal to users that they should switch to the new interface.
After a few releases with a deprecated interface you might consider removing it completely. But give users of your API a chance to upgrade at their own pace.
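The two steps above could look roughly like this in Java. This is only a minimal sketch: the `TextDecoder` class and both of its methods are hypothetical, chosen to show an old entry point kept alive next to its replacement.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical API class, used only to illustrate the deprecation steps.
class TextDecoder {

    /**
     * Old interface, kept so that existing callers still compile and run.
     *
     * @deprecated use {@link #decode(byte[], Charset)} instead.
     */
    @Deprecated
    public String decode(byte[] bytes) {
        // Delegate to the new method so both share one implementation.
        return decode(bytes, StandardCharsets.UTF_8);
    }

    /** New interface: the caller states the charset explicitly. */
    public String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }
}

class DeprecationDemo {
    public static void main(String[] args) {
        TextDecoder decoder = new TextDecoder();
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);

        // The old call still works; it is merely marked as deprecated.
        System.out.println(decoder.decode(data));
        // The new call that users are expected to migrate to.
        System.out.println(decoder.decode(data, StandardCharsets.UTF_8));
    }
}
```

Because the old method delegates to the new one, there is only one implementation to maintain while both interfaces coexist.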
For every feature you want to add to your API, you should give careful thought to whether that feature actually belongs in your API.
It is easy to feel tempted to add features to your API once your users send you emails with all kinds of suggestions. You should resist the temptation to implement a suggestion right away, unless you know for certain that the suggestion lies perfectly within the core problem domain addressed by your API.

Almost any API could have lots of little nice-to-have features added. Some of these features make the API easier to use, and are thus justified. Other features may seem like a good idea at first, but aren't really core features, or they only apply to a limited set of the total use cases within the domain they address. These should perhaps be left out. They may end up cluttering your API more than they improve it.

Implement the 90-10% Cases
If a feature is only used by 10% (or less) of the API users, or only in 10% (or less) of the use cases, it is not a core feature, and should probably be left out. It is probably better to just show a code example of how the user can implement this herself, outside the API.

Yet, if the feature can be implemented in a way that does not bother the users who aren't using it, perhaps you could still implement it.

Don't Expose More than Necessary
Say I had the Crawler and Indexer separated internally, so I could change one without affecting the other. To the user on the outside though, I do not want to expose the Indexer. Writing an indexer is a complicated matter, and not one that I expect users of the web crawler to undertake themselves.

Notice how the Indexer is not visible to the outside world. In other words, the user doesn't have to learn about Indexer in order to use the Crawler. The Crawler works as a central point of access (a facade really) to the crawler API.
Notice how I could also change the API to allow an Indexer to be plugged in, if necessary:
public class Crawler{

  protected Indexer         indexer  = null;
  protected CrawlerListener listener = null;

  public Crawler(CrawlerListener listener){
    this.indexer  = new IndexerImpl();
    this.listener = listener;
  }

  public Crawler(CrawlerListener listener, Indexer indexer){
    this.indexer  = indexer;
    this.listener = listener;
  }


  public void crawl(String url){
    ...
  }

}
Now the user can plug in an Indexer if she is up to the task of implementing one. The API is now exposing the Indexer as well as the Crawler, which isn't desirable most of the time. However, the user is still able to just ignore it, and use the first constructor. The user still doesn't need to know about the IndexerImpl class.
The ability to plug in an Indexer could be useful during unit testing of the crawler. Plugging in a mock Indexer would make it possible to test the Crawler in isolation. This is not nearly as easy when it is not possible to plug in a mock Indexer. So, this code now adheres to the tip Design for Testing. Note however, that this is a bit of a tradeoff between exposing as little as possible, and designing for testability. The world isn't perfect. Neither is software design.


Notice also how this code uses the tip Provide Sensible Defaults. The first constructor creates an instance internally of the default Indexer implementation IndexerImpl.
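The unit-testing point can be sketched as follows. The text does not define the methods of `Indexer` or `CrawlerListener`, nor the body of `crawl()`, so the `index(String)` and `pageCrawled(String)` signatures, the placeholder `crawl()` body and the `RecordingIndexer` mock are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// The method signatures on these two interfaces are assumptions made for
// this sketch; the text does not define them.
interface Indexer {
    void index(String url);
}

interface CrawlerListener {
    void pageCrawled(String url);
}

class IndexerImpl implements Indexer {
    @Override
    public void index(String url) {
        // Stand-in for the real (complicated) indexing logic.
    }
}

class Crawler {
    protected Indexer indexer;
    protected CrawlerListener listener;

    public Crawler(CrawlerListener listener) {
        this(listener, new IndexerImpl()); // sensible default
    }

    public Crawler(CrawlerListener listener, Indexer indexer) {
        this.indexer = indexer;
        this.listener = listener;
    }

    public void crawl(String url) {
        // Placeholder body: a real crawler would fetch and parse the page.
        indexer.index(url);
        listener.pageCrawled(url);
    }
}

// Hand-written mock that only records what the Crawler asked it to index.
class RecordingIndexer implements Indexer {
    final List<String> indexedUrls = new ArrayList<>();

    @Override
    public void index(String url) {
        indexedUrls.add(url);
    }
}

class CrawlerMockDemo {
    public static void main(String[] args) {
        RecordingIndexer mock = new RecordingIndexer();
        List<String> notified = new ArrayList<>();

        // Second constructor: plug in the mock instead of IndexerImpl.
        Crawler crawler = new Crawler(notified::add, mock);
        crawler.crawl("http://example.com/");

        System.out.println("indexed:  " + mock.indexedUrls);
        System.out.println("notified: " + notified);
    }
}
```

The test never touches `IndexerImpl`, so the Crawler's own behavior can be checked in isolation.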

Provide Sensible Defaults
By "defaults" I mean that if a certain parameter value, interface implementation, or subclass is used most of the time, you should provide a method that doesn't take that parameter. Instead, the method should use that value internally. In other words, "hardcode" it.
An API typically consists of various components and methods. One such method could be:
public class MyClass {

  public void readStream(InputStream stream, boolean closeStreamAfterReading){
     ...
  }
}
If in most use cases the users just want the InputStream closed after calling the readStream() method, then most of the time the users will pass in true in the boolean parameter. Rather than burden the user with that, provide a convenience method, like this:
public class MyClass {

  public void readStream(InputStream stream){
    readStream(stream, true);
  }

  public void readStream(InputStream stream, boolean closeStreamAfterReading){
     ...
  }
}

Provide Default Dependencies

The same is true for dependencies used internally. Say you have some interface called MyDependency which MyComponent needs an implementation of. Here is how that could look:
public class MyComponent{

  protected MyDependency dependency = null;

  public MyComponent(MyDependency dependency){
    this.dependency = dependency;
  }
}
If the same implementation of MyDependency is used most of the time, it can be a good idea to provide a default constructor that initializes the internal member dependency with that implementation. Say the implementation is called MyDefaultImpl, then the constructor could look like this:
public class MyComponent{

  protected MyDependency dependency = null;

  public MyComponent(){
    this.dependency = new MyDefaultImpl();
  }

  public MyComponent(MyDependency dependency){
    this.dependency = dependency;
  }
}
Just keep in mind that this default constructor creates a hidden dependency on the MyDefaultImpl class, from the MyComponent class. In many cases, this dependency is harmless though. Dependency injection fanatics may disagree with this. They would claim that you should mark the default implementation with an annotation stating it is the default implementation, and then have the DI container inject the default implementation when you ask the container for a MyComponent instance.

As long as the dependencies do not have side effects, like requiring some JAR file to be present on the classpath which isn't already present, a dependency like the one shown here will most likely not cause any problems. If you really need something other than the default implementation, the second constructor allows you to plug that implementation in. Thus you are never stuck with the default implementation. There is no reason to be over-religious about decoupling dependencies. Remember, looking at the constructor and seeing only a version that takes a MyDependency instance can also be confusing, if you do not at the same time mention which implementations of MyDependency exist, and which is the default implementation.

Optional Abstractions
Software is often built in multiple layers on top of each other, each layer calling down through the layer below to obtain some service. Each layer is an abstraction which makes the layers below easier to work with for the layer above. Your API can be thought of as one of these layers.


One of the main concepts of layered software is that each layer only talks to the layer directly below it. In other words, each layer is an abstraction of the layers below it.

As mentioned earlier, each layer in a layered software model is only supposed to communicate with the layers just below and just above itself. However, what often makes an API really flexible is the ability to bypass a layer and communicate directly with the lower layers. In other words, the layers (abstractions) are optional. This is important to make sure that an abstraction (layer) does not "get in the user's way".

I had to allow the flexibility of bypassing the automatic mapping. In fact, I made it possible to combine automatic and manual mapping, for increased ease of use and flexibility.
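A rough sketch of what an optional mapping layer might look like. Every name here (`RowReader`, `PersonDao`, `Person` and their methods) is invented for illustration and is not the API the text refers to; the "automatic" mapping is a hardcoded stand-in.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Lower layer (hypothetical): returns raw rows as column-name -> value maps.
class RowReader {
    List<Map<String, Object>> readRows() {
        Map<String, Object> row = new HashMap<>();
        row.put("name", "Ann");
        row.put("age", 42);
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row);
        return rows;
    }
}

class Person {
    String name;
    int age;
}

// Upper layer (hypothetical): turns rows into objects.
class PersonDao {
    private final RowReader reader = new RowReader();

    // Stand-in for automatic mapping: the easy path most users take.
    List<Person> readPersons() {
        return read(row -> {
            Person p = new Person();
            p.name = (String) row.get("name");
            p.age = (Integer) row.get("age");
            return p;
        });
    }

    // Manual mapping: the caller supplies the row-to-object logic.
    <T> List<T> read(Function<Map<String, Object>, T> mapper) {
        List<T> result = new ArrayList<>();
        for (Map<String, Object> row : reader.readRows()) {
            result.add(mapper.apply(row));
        }
        return result;
    }

    // Bypass: direct access to the layer below.
    RowReader rowReader() {
        return reader;
    }
}

class OptionalLayerDemo {
    public static void main(String[] args) {
        PersonDao dao = new PersonDao();
        System.out.println(dao.readPersons().get(0).name);          // automatic
        System.out.println(dao.read(row -> row.get("name") + "!")); // manual
        System.out.println(dao.rowReader().readRows().size());      // bypassed
    }
}
```

The automatic path is itself built on the manual one, so the two can be mixed freely, and the lower layer stays reachable for cases neither covers.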

Central Point of Access
One of the goals of an API is to make it as easy to learn as possible. One way to do that is to keep down the number of classes the user needs to know before she can use the API. A way to achieve this is to provide a central point of access to the API.

Think Code Completion

Factories as Central Point of Access
Managers as Central Point of Access
Service Proxies as Central Point of Access
Facades as Central Point of Access
Another way to provide a central point of access to an API is by providing a Facade (the design pattern) for the API. Rather than accessing all the classes of the API directly, the user will access the services provided by the API via this Facade class.
Providing a Facade can be handy if it is not possible or does not make sense to have a single central factory or manager class (well, a manager class can also be thought of as a kind of Facade). For instance, your API may have several different factories each responsible for creating part of the objects needed to perform the service the API provides. And, you might want to make it possible to replace factory implementations too. In that case it may not really make sense to have a central factory class.

One of the consequences of the popularity of dependency injection is, unfortunately, a sometimes over-religious belief that every dependency should be injected. This is a false belief in my opinion, especially in the case of dependencies used internally in an API.
Even if only a single class of an API is exposed to the outside world, you may still decide to split up the internal implementation into several smaller classes for various reasons. Let's say that the exposed component A needs a B, a C and a D internally to do its job. Let's also say that C needs E and F to do its job. Here is how the dependency hierarchy looks:
    A
      --> B
      --> C
            --> E
            --> F
      --> D
Let's take a step further, and assert that you will never need a different implementation of either B, C, D, E or F, nor a different configuration of any of these instances. You know that for a fact. In that case there is no reason at all to have A (and C for that matter) assembled via dependency injection, as I have also stated in the text When to use Dependency Injection. You might as well have A instantiate B, C and D internally, and have C instantiate E and F internally. There is no reason to expose B, C, D, E and F to the world.
If you do force the user to assemble the whole hierarchy, the user will have to learn more details of your API than necessary. Even if you have a DI container inject all the instances, looking at the API docs may still confuse the user more than necessary.
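The hierarchy above, wired up internally, might look like the following sketch. The method names and their trivial bodies are placeholders; only the class names A to F come from the text.

```java
// Internal collaborators: package-private, so users of the API never see them.
class E { String e() { return "e"; } }
class F { String f() { return "f"; } }
class B { String b() { return "b"; } }
class D { String d() { return "d"; } }

class C {
    // C instantiates its own collaborators.
    private final E e = new E();
    private final F f = new F();

    String c() {
        return "c(" + e.e() + "," + f.f() + ")";
    }
}

// In a real API, A would be the only public class.
class A {
    // A instantiates B, C and D itself; nothing is injected.
    private final B b = new B();
    private final C c = new C();
    private final D d = new D();

    public String doJob() {
        return b.b() + " " + c.c() + " " + d.d();
    }
}

class InternalWiringDemo {
    public static void main(String[] args) {
        // The user only needs to know about A.
        System.out.println(new A().doJob());
    }
}
```

All the user has to learn is `new A()`.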

When you are implementing an API, it may sometimes be tempting to use external libraries in it, for instance Apache Commons or Log4J.
Don't do it, unless there is absolutely no way around it!
External dependencies make your API code swell quickly. Just look at Spring or Apache Axis for proof of that. This means larger code bases for the end user of the API, and thus sometimes slower build times. Slow build times can be really annoying during development, and a real time robber and productivity killer.
Additionally, the version of the external dependency you are using may clash with the version used in other APIs, or in the final application your API is being used in.
External dependencies are, in my opinion, primarily for use in final applications, not in APIs and frameworks, unless you know for sure that the final application will also use the same version of that dependency, or that the dependency can be swapped for a different version without problems.

Logging inside your API is really just a special case of the Avoid External Dependencies case. When you log inside your API, you call a log API to do so. By doing so, you choose a logging API on behalf of the user of your API. If the user is using a different logging API than yours, the user now has to deal with two log APIs.
If you really, really need to allow logging of actions inside your API, have the API take a custom event listener. This event listener is an interface you specify. For every interesting event happening inside your API, you call a corresponding method on this event listener. The user of your API is thus free to plug in whatever log API she wants.
Here is a simple code example:
public interface MyEventListener {

  public void onEvent(String msg);

}
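How an API component might use such a listener is sketched below. `OrderProcessor` and its `process()` method are hypothetical; only the `MyEventListener` interface comes from the text, and it is repeated here so the sketch is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

interface MyEventListener {
    void onEvent(String msg);
}

// Hypothetical API component; the class and its method are illustrative only.
class OrderProcessor {
    private final MyEventListener listener;

    OrderProcessor(MyEventListener listener) {
        this.listener = listener;
    }

    void process(String item) {
        listener.onEvent("processing " + item);
        // ... the actual work would happen here ...
        listener.onEvent("finished " + item);
    }
}

class EventListenerDemo {
    public static void main(String[] args) {
        // The user decides what "logging" means, here simply System.out.
        new OrderProcessor(System.out::println).process("order-1");

        // In a test, the same hook can collect events in memory instead.
        List<String> events = new ArrayList<>();
        new OrderProcessor(events::add).process("order-2");
        System.out.println(events.size() + " events captured");
    }
}
```

The API itself never imports a logging library; whichever one the application uses is wired in from the outside.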

Don't Log Exceptions Either

You definitely don't want to log any exceptions that occur inside your API either. Nor do you want to call the event listener with an exception. The user of your API is notified of exceptions by the thrown exception. The user of your API will then decide whether to log that exception, or propagate it up the call stack to be logged in a central place.

  1. Testability of the API itself
  2. Testability of code that uses the API
The easiest way to test code is typically via mock testing. This means that it should be easy to mock up the classes of your API, and easy to inject those mocks into the components you want to test.
To be able to inject mocks into the internals of your classes, you will unfortunately have to expose methods on the class interfaces that enable you to do so, for instance a constructor or setter method that takes the mock to inject as a parameter. To avoid exposing these constructors or setters to users of the API, consider making them either package access scoped or protected. If you put your test code in the same package (not necessarily the same directory) as the class(es) you need to inject the mocks into, you will be able to access these extra injection methods.
Below is a code example. Imagine that the member variable dependency does not need to be exposed to the user of this class. It is only exposed so that a mock implementation can be injected. It is the protected constructor and setter that expose the dependency member variable.
public class MyAPIComponent {

  protected Dependency dependency = null;

  public MyAPIComponent(){
    this.dependency = new DependencyImpl();
  }

  protected MyAPIComponent(Dependency dependency){
    this.dependency = dependency;
  }

  protected void setDependency(Dependency dependency){
    this.dependency = dependency;
  }

}

Designing for Testability of Code Using the API

Use Interfaces
The easiest way to make your classes mockable is to have them implement an interface. Dynamic mock APIs can then create a mock of that interface at runtime, during the unit test. A mock can also wrap the original implementation, merely recording all calls and forwarding them to the real implementation.
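As an illustration of the recording-and-forwarding idea, here is a minimal sketch built on the JDK's own `java.lang.reflect.Proxy` rather than a mock library. The `Greeter` interface, `GreeterImpl` and the `recordingProxy()` helper are hypothetical names used only for this example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical API interface and its real implementation.
interface Greeter {
    String greet(String name);
}

class GreeterImpl implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

class RecordingProxyDemo {

    // Creates a proxy that records each method name, then forwards the call
    // to the real implementation.
    static Greeter recordingProxy(Greeter real, List<String> calls) {
        InvocationHandler handler = (proxy, method, args) -> {
            calls.add(method.getName());
            return method.invoke(real, args);
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        Greeter greeter = recordingProxy(new GreeterImpl(), calls);

        System.out.println(greeter.greet("Ann"));
        System.out.println("recorded calls: " + calls);
    }
}
```

This only works because `Greeter` is an interface; `Proxy` cannot generate proxies for concrete classes.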

Use Extendable Classes

If your classes do not, or cannot, implement interfaces, you should at least consider making them easy to subclass. That way a mock can be created by subclassing your API classes and overriding the methods that need to be mocked / stubbed.
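A small sketch of stubbing by subclassing. `PriceService`, its methods and the 25% tax rate are all made up for the example.

```java
// Hypothetical API class that does not implement any interface.
class PriceService {

    // Non-final and protected, so a test subclass can override it.
    protected double fetchPrice(String product) {
        throw new IllegalStateException("would call a remote system");
    }

    public double priceWithTax(String product) {
        return fetchPrice(product) * 1.25;
    }
}

// Stub created by subclassing: only the awkward method is replaced.
class StubPriceService extends PriceService {
    @Override
    protected double fetchPrice(String product) {
        return 100.0;
    }
}

class SubclassStubDemo {
    public static void main(String[] args) {
        PriceService service = new StubPriceService();
        // The tax logic under test runs unchanged against the stubbed price.
        System.out.println(service.priceWithTax("book"));
    }
}
```

Declaring a class or its key methods `final` would rule this technique out, which is worth keeping in mind when designing the API.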
Design for Easy Configuration
APIs often need some kind of configuration before they can perform their service. A persistence API may need a JDBC driver, database URL, user name and password, plus perhaps some object-to-table mappings. A dependency injection container needs instantiation configurations. Etc.
Some of the most common API configuration mechanisms are:
  1. Method Calls on Components
  2. Annotations
  3. JVM Parameters
  4. Command Line Arguments
  5. Property Files
  6. XML Files
  7. A Domain Specific Language
Which of these configuration mechanisms is most appropriate for your API depends on several factors, like:
  1. How much configuration is needed?
  2. Is configuration an implementation choice or deployment choice?
  3. Is the configuration mechanism easy to learn, easy to use and concise?
  4. Which limitations does the configuration mechanism have?

Implementation configuration choices are often best made in code. That way no external configuration files are needed for configurations that are really part of the code.
Deployment configuration parameters must almost always be externalized from the application code, and separated into configuration files, databases etc. However, even if the client of the API will externalize the configuration of your API, it may still be more appropriate to allow your API to be configured via code. Then the user of your API can externalize these settings in whatever configuration mechanism they deem appropriate for their application.
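A sketch of that division of labor: the API is configured by method calls, and the application externalizes the values however it likes. `ConnectionSettings`, its methods, the `db.*` keys and the example values are all hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical API class, configured purely through method calls.
class ConnectionSettings {
    private String url;
    private String user;
    private int poolSize = 5; // a default, so most callers never set it

    ConnectionSettings url(String url) { this.url = url; return this; }
    ConnectionSettings user(String user) { this.user = user; return this; }
    ConnectionSettings poolSize(int poolSize) { this.poolSize = poolSize; return this; }

    String url() { return url; }
    String user() { return user; }
    int poolSize() { return poolSize; }
}

// Application-side glue code: this application happens to use properties.
class PropertiesConfigDemo {

    static ConnectionSettings fromProperties(Properties props) {
        ConnectionSettings settings = new ConnectionSettings()
                .url(props.getProperty("db.url"))
                .user(props.getProperty("db.user"));
        String poolSize = props.getProperty("db.poolSize");
        if (poolSize != null) {
            settings.poolSize(Integer.parseInt(poolSize));
        }
        return settings;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Inline text stands in for a property file on disk.
        props.load(new StringReader("db.url=jdbc:example:demo\ndb.user=app\n"));

        ConnectionSettings settings = fromProperties(props);
        System.out.println(settings.url() + " as " + settings.user()
                + ", pool size " + settings.poolSize());
    }
}
```

The API never learns where the values came from, so another application could feed the same methods from XML, a database or command line arguments.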

I have seen several APIs choose Java annotations as a configuration mechanism. For instance, a persistence API may allow the user to mark the classes to be persisted with annotations saying which fields should be stored in which columns, and what table objects of this class should be persisted in.
Annotations, however, are class static. This means that it is not possible to have two different configurations of the same class. You can have only one. This is a serious limitation of annotations.
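The limitation can be seen in a small sketch. The `@Table` annotation below is a made-up stand-in for the kind of mapping annotation a persistence API might define, and the two maps are only one possible way to hold a code-based configuration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.Map;

// Hypothetical mapping annotation, readable at runtime via reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Table {
    String value();
}

// The table name is baked into the class: exactly one value per class.
@Table("persons")
class Person {
}

class AnnotationLimitDemo {
    public static void main(String[] args) {
        // Annotation-based: every user of Person sees the same table name.
        String fromAnnotation = Person.class.getAnnotation(Table.class).value();
        System.out.println("annotation: " + fromAnnotation);

        // Code-based: the same class can be mapped differently per configuration.
        Map<Class<?>, String> liveMapping = new HashMap<>();
        liveMapping.put(Person.class, "persons");

        Map<Class<?>, String> archiveMapping = new HashMap<>();
        archiveMapping.put(Person.class, "persons_archive");

        System.out.println("live:       " + liveMapping.get(Person.class));
        System.out.println("archive:    " + archiveMapping.get(Person.class));
    }
}
```

With the annotation alone there is no way to express the second mapping without a second class.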


Similarly, ordinary property files may also impose limitations on your configuration options. For instance, it is hard to express hierarchical settings in them. For this purpose an XML file would be much more suitable.
