Monday, March 7, 2016

Repository Pattern



http://thinkinginobjects.com/2012/08/26/dont-use-dao-use-repository/
Data Access Object (DAO) is a commonly used pattern to persist domain objects into a database. The most common form of a DAO pattern is a class that contains CRUD methods for a particular domain entity type.
The AccountDAO interface may have multiple implementations, which might use some kind of O/R mapper or execute plain SQL queries.
The pattern has these advantages:
  • It separates the domain logic that uses it from any particular persistence mechanism or API.
  • The interface's method signatures are independent of the content of the Account class. When you add a telephone number field to Account, you don’t need to change the AccountDAO interface or its callers.
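The AccountDAO interface discussed above is not reproduced in these notes. A minimal sketch, assuming it carries only the four CRUD methods that also open the BloatAccountDAO listing, might look like the following. The Account fields and the in-memory implementation are hypothetical stand-ins for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain object; only the fields needed for the sketch.
class Account {
    private final String userName;
    private final String email;

    Account(String userName, String email) {
        this.userName = userName;
        this.email = email;
    }

    String getUserName() { return userName; }
    String getEmail() { return email; }
}

// Assumed shape of the basic DAO: one method per CRUD operation, keyed by userName.
interface AccountDAO {
    Account get(String userName);
    void create(Account account);
    void update(Account account);
    void delete(String userName);
}

// One possible implementation. An O/R mapper or plain SQL version
// would implement the same interface without callers noticing.
class InMemoryAccountDAO implements AccountDAO {
    private final Map<String, Account> store = new HashMap<>();

    @Override
    public Account get(String userName) { return store.get(userName); }

    @Override
    public void create(Account account) { store.put(account.getUserName(), account); }

    @Override
    public void update(Account account) { store.put(account.getUserName(), account); }

    @Override
    public void delete(String userName) { store.remove(userName); }
}
```

Note that none of the signatures mention email or any other Account field, which is why adding a field leaves the interface untouched.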
The pattern leaves many questions unanswered, however. What if I need to query a list of accounts having a specific last name? Am I allowed to add a method to update only the email field of an account? What if I change to use a long id instead of userName? What exactly is a DAO responsible for?
The problem with the DAO pattern is that its responsibility is not well defined. Many people think of it as a gateway to the database and add methods to it whenever they find new ways they’d like to talk to the database. Hence it is not uncommon to see a DAO getting bloated like the one below.
package com.thinkinginobjects.dao;
import java.util.List;
import com.thinkinginobjects.domainobject.Account;
public interface BloatAccountDAO {
    Account get(String userName);
    void create(Account account);
    void update(Account account);
    void delete(String userName);
    List<Account> getAccountByLastName(String lastName);
    List<Account> getAccountByAgeRange(int minAge, int maxAge);
    void updateEmailAddress(String userName, String newEmailAddress);
    void updateFullName(String userName, String firstName, String lastName);
}
In the BloatAccountDAO, I added two query methods to look up Accounts with different parameters. If I had more fields and more use cases that query the account differently, I might end up writing even more query methods. The consequences are:
  1. Mocking the DAO interface becomes harder in unit tests. I need to implement more methods in the DAO even if my particular test scenario only uses one of them.
  2. The DAO interface becomes more coupled to the fields of the Account object. I have to change the interface and all its implementations if I change the type of the fields stored in Account.
To make things even worse, I added two additional update methods to the DAO as well. They are the direct result of two new use cases which update different subsets of the fields of an account. They seem like harmless optimisations and fit into the AccountDAO interface if I naively treat the interface as a gateway to the persistence store. Again, the DAO pattern and its class name “AccountDAO” are too loosely defined to stop me from doing this.
I end up with a fat DAO interface, and I am sure it will only encourage my colleagues to add even more methods to it in the future. One year later I will have a DAO class with 20+ methods, and I can only blame myself for having chosen this weakly defined pattern.
Repository Pattern:

“A Repository represents all objects of a certain type as a conceptual set. It acts like a collection, except with more elaborate querying capability.”
public interface AccountRepository {
    void addAccount(Account account);
    void removeAccount(Account account);
    void updateAccount(Account account); // Think of it as replacing an element of a set
    List<Account> query(AccountSpecification specification);
}
The “add” and “update” methods look identical to the create and update methods of my original AccountDAO. The “remove” method differs from the DAO’s delete method by taking an Account object rather than the userName (the Account’s identifier). If you think of the Repository as a Collection, this change makes a lot of sense. You avoid exposing the type of the Account’s identity in the Repository interface. It makes my life easier if I’d later like to use long values to identify the accounts.
The “query” method takes a specification describing which accounts to return. The Repository may decide to generate SQL against the database if it is backed by a database table, or it may simply iterate through its collection if it is backed by a collection in memory.
One common implementation of such a criterion is the Specification pattern. A specification is a simple predicate that takes a domain object and returns a boolean.
public interface AccountSpecification {
    boolean specified(Account account);
}
Therefore, I can create one implementation for each different way I’d like to query the AccountRepository.
The standard Specification works well with an in-memory Repository, but it is too inefficient for a database-backed repository, which would have to load every account into memory just to evaluate the predicate.
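As a rough illustration of the in-memory case, the sketch below backs AccountRepository with a list and answers query() by testing every element against the specification. The Account class, its lastName field, and AccountSpecificationByLastName are assumptions made for the example, not part of the original listings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain object, reduced to what the sketch needs.
class Account {
    private final String userName;
    private final String lastName;

    Account(String userName, String lastName) {
        this.userName = userName;
        this.lastName = lastName;
    }

    String getUserName() { return userName; }
    boolean hasUserName(String candidate) { return userName.equals(candidate); }
    boolean hasLastName(String candidate) { return lastName.equals(candidate); }
}

interface AccountSpecification {
    boolean specified(Account account);
}

interface AccountRepository {
    void addAccount(Account account);
    void removeAccount(Account account);
    void updateAccount(Account account);
    List<Account> query(AccountSpecification specification);
}

// Hypothetical specification: matches accounts by last name.
class AccountSpecificationByLastName implements AccountSpecification {
    private final String desiredLastName;

    AccountSpecificationByLastName(String desiredLastName) {
        this.desiredLastName = desiredLastName;
    }

    @Override
    public boolean specified(Account account) {
        return account.hasLastName(desiredLastName);
    }
}

class InMemoryAccountRepository implements AccountRepository {
    private final List<Account> accounts = new ArrayList<>();

    @Override
    public void addAccount(Account account) { accounts.add(account); }

    @Override
    public void removeAccount(Account account) {
        // Identity is resolved inside the repository, not by the caller.
        accounts.removeIf(existing -> existing.hasUserName(account.getUserName()));
    }

    @Override
    public void updateAccount(Account account) {
        // Replace the stored element that has the same identity.
        removeAccount(account);
        addAccount(account);
    }

    @Override
    public List<Account> query(AccountSpecification specification) {
        // Linear scan: acceptable in memory, too slow against a large table.
        List<Account> matches = new ArrayList<>();
        for (Account account : accounts) {
            if (specification.specified(account)) {
                matches.add(account);
            }
        }
        return matches;
    }
}
```

The same class can double as a stub in unit tests, since it needs no database.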
To work with a SQL-backed AccountRepository implementation, my specifications need to implement the SqlSpecification interface as well.
public interface SqlSpecification {
    String toSqlClauses();
}
A plain SQL-backed repository can take advantage of this interface and use the partial SQL clauses it produces to query the database. If I use a Hibernate-backed repository, I may use the HibernateSpecification interface instead, which generates a Hibernate Criterion when invoked.
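To make the SQL-backed case concrete, here is a small sketch of how a repository might turn a specification into a full statement. The table name account, the AgeRangeSqlSpecification class, and the buildQuery helper are hypothetical. A real implementation would go on to execute the statement through JDBC and should prefer bound parameters over string building for user-supplied values.

```java
interface SqlSpecification {
    String toSqlClauses();
}

// Hypothetical specification producing a WHERE-clause fragment.
class AgeRangeSqlSpecification implements SqlSpecification {
    private final int minAge;
    private final int maxAge;

    AgeRangeSqlSpecification(int minAge, int maxAge) {
        this.minAge = minAge;
        this.maxAge = maxAge;
    }

    @Override
    public String toSqlClauses() {
        // Only ints are concatenated here, so no quoting is required.
        return "age between " + minAge + " and " + maxAge;
    }
}

class SqlAccountQueryBuilder {
    // Assumes the accounts live in a table called "account".
    static String buildQuery(SqlSpecification specification) {
        return "select * from account where " + specification.toSqlClauses();
    }
}
```

The repository never inspects the specification's fields. It only asks for the clause, so new query types do not change the repository interface.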
The SQL- and Hibernate-backed repositories do not use the “specified” method, but I have found it very beneficial to implement it in all cases. That way I can use the same implementation classes with a stub AccountRepository for testing purposes, and also with a caching implementation of the repository that answers queries before they hit the real one.
We can even go a step further and compose Specifications with ConjunctionSpecification and DisjunctionSpecification to perform more complicated queries.
import org.hibernate.criterion.Criterion;
import org.hibernate.criterion.Restrictions;

// Matches accounts by user name, either in memory or through a Hibernate criterion.
public class AccountSpecificationByUserName implements AccountSpecification, HibernateSpecification {
    private final String desiredUserName;
    public AccountSpecificationByUserName(String desiredUserName) {
        this.desiredUserName = desiredUserName;
    }
    @Override
    public boolean specified(Account account) {
        return account.hasUserName(desiredUserName);
    }
    @Override
    public Criterion toCriteria() {
        return Restrictions.eq("userName", desiredUserName);
    }
}
public class AccountSpecificationByAgeRange implements AccountSpecification, SqlSpecification {
    private int minAge;
    private int maxAge;
    public AccountSpecificationByAgeRange(int minAge, int maxAge) {
        super();
        this.minAge = minAge;
        this.maxAge = maxAge;
    }
    @Override
    public boolean specified(Account account) {
        return account.ageBetween(minAge, maxAge);
    }
    @Override
    public String toSqlClauses() {
        return String.format("age between %s and %s", minAge, maxAge);
    }
}
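The ConjunctionSpecification and DisjunctionSpecification mentioned earlier are not shown in these notes. A minimal sketch of the in-memory half, assuming each one simply wraps two other specifications, could look like this. The Account fields are hypothetical, and the SQL or Hibernate side (joining the clauses with "and" / "or") is left out.

```java
// Hypothetical domain object for the sketch.
class Account {
    private final String lastName;
    private final int age;

    Account(String lastName, int age) {
        this.lastName = lastName;
        this.age = age;
    }

    String getLastName() { return lastName; }
    int getAge() { return age; }
}

interface AccountSpecification {
    boolean specified(Account account);
}

// Satisfied only when both wrapped specifications are satisfied.
class ConjunctionSpecification implements AccountSpecification {
    private final AccountSpecification left;
    private final AccountSpecification right;

    ConjunctionSpecification(AccountSpecification left, AccountSpecification right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public boolean specified(Account account) {
        return left.specified(account) && right.specified(account);
    }
}

// Satisfied when at least one wrapped specification is satisfied.
class DisjunctionSpecification implements AccountSpecification {
    private final AccountSpecification left;
    private final AccountSpecification right;

    DisjunctionSpecification(AccountSpecification left, AccountSpecification right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public boolean specified(Account account) {
        return left.specified(account) || right.specified(account);
    }
}
```

Because the composites are themselves specifications, they can be nested to express queries such as "(A and B) or C" without adding methods to the repository.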
The DAO pattern offers only a loosely defined contract. It is prone to misuse and bloated implementations. The repository pattern uses the metaphor of a Collection. This metaphor gives the pattern a tight contract and makes it easier for your colleagues to understand.
If we treat our repositories as simple collections then we are giving them a single responsibility. I don't want collection classes that are also factories.
The primary benefit of repositories is to abstract the storage mechanism for the authoritative collection of entities.
interface MemberRepository {
    public function save(Member $member);
    public function getAll();
    public function findById(MemberId $memberId);
}
Repository interfaces belong to the domain layer. The implementations of repositories belong to the application-service layer. This means that we're free to type-hint for our repositories in our domain layer without ever having to depend on the service layer.
  • ..it's important to give repositories the singular task of functioning as collection objects.
  • ..we shouldn't use repositories to create new object instances.
Do not add anything into the repository class until the very moment that you need it.

The Repository pattern provides an abstraction of data, so that your application can work with a simple abstraction that has an interface approximating that of a collection. Adding, removing, updating, and selecting items from this collection is done through a series of straightforward methods, without the need to deal with database concerns like connections, commands, cursors, or readers.
Repository Per Entity or Business Object
The simplest approach, especially with an existing system, is to create a new Repository implementation for each business object you need to store to or retrieve from your persistence layer. Further, you should only implement the specific methods you are calling in your application. Avoid the trap of creating a “standard” repository class, base class, or default interface that you must implement for all repositories. Yes, if you need to have an Update or a Delete method, you should strive to make its interface consistent (does Delete take an ID, or does it take the object itself?), but don’t implement a Delete method on your LookupTableRepository that you’re only ever going to be calling List() on. The biggest benefit of this approach is YAGNI – you won’t waste any time implementing methods that never get called.

Generic Repository Interface

Another approach is to go ahead and create a simple, generic interface for your Repository. You can constrain what kind of types it works with to be of a certain type, or to implement a certain interface (e.g. ensuring it has an Id property, as is done below using a base class).
public interface IRepository<T> where T : EntityBase
{
    T GetById(int id);
    IEnumerable<T> List();
    IEnumerable<T> List(Expression<Func<T, bool>> predicate);
    void Add(T entity);
    void Delete(T entity);
    void Edit(T entity);
}
public abstract class EntityBase
{
   public int Id { get; protected set; }
}

Greg Young talks about the generic repository pattern and how to reduce the architectural seam of the contract between the domain layer and the persistence layer. The Repository is the contract of the domain layer with the persistence layer, hence it makes sense to keep the contract of the repository as close to the domain as possible. Instead of a contract as opaque as Repository.FindAllMatching(QueryObject o), it is always recommended that the domain layer looks at something as self-revealing as CustomerRepository.getCustomerByName(String name), which explicitly states the participating entities of the domain.

 I had suggested the use of the Bridge pattern to allow independent evolution of the interface and the implementation hierarchies. The interface side of the bridge will model the domain aspect of the repository and will ultimately terminate at the contracts that the domain layer will use. The implementation side of the bridge will allow for multiple implementations of the generic repository, e.g. JPA, native Hibernate or even, with some tweaking, some other storage technologies like CouchDB or the file system. After all, the premise of the Repository is to offer a transparent storage and retrieval engine, so that the domain layer always has the feel that it is operating on an in-memory collection.
// root of the repository interface
public interface IRepository<T> {
  List<T> read(String query, Object[] params);
}

public class Repository<T> implements IRepository<T> {

  private RepositoryImpl repositoryImpl;

  public List<T> read(String query, Object[] params) {
    return repositoryImpl.read(query, params);
  }

  //..
}
Base class of the implementation side of the Bridge ..
public abstract class RepositoryImpl {
  public abstract <T> List<T> read(String query, Object[] params);
}
One concrete implementation using JPA ..
public class JpaRepository extends RepositoryImpl {

  // to be injected through DI in Spring
  private EntityManagerFactory factory;

  @Override
  public <T> List<T> read(String query, Object[] params) {
    // .. JPA based implementation
  }
}
Another implementation using Hibernate. We can have similar implementations for a file system based repository as well ..
public class HibernateRepository extends RepositoryImpl {
  @Override
  public <T> List<T> read(String query, Object[] params) {
    // .. hibernate based implementation
  }
}
Domain contract for the repository of the entity Restaurant. It is not opaque or narrow, uses the Ubiquitous language and is self-revealing to the domain user ..
public interface IRestaurantRepository {
  List<Restaurant> restaurantsByName(final String name);
  //..
}
A concrete implementation of the above interface. Implemented in terms of the implementation artifacts of the Bridge pattern. At the same time the implementation is not hardwired with any specific concrete repository engine (e.g. JPA or filesystem). This wiring will be done during runtime using dependency injection.
public class RestaurantRepository extends Repository<Restaurant>
  implements IRestaurantRepository {

  public List<Restaurant> restaurantsByEntreeName(String entreeName) {
    Object[] params = new Object[1];
    params[0] = entreeName;
    return read(
      "select r from Restaurant r where r.entrees.name like ?1",
      params);
  }
  // .. other methods implemented
}
One argument could be that the query string passed to the read() method is dependent on the specific engine used. But it can very easily be abstracted using a factory that returns the appropriate metadata required for the query (e.g. named queries for JPA).
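One way to picture that factory, purely as a sketch: the repository asks a catalog for the engine-specific form of a logical query name instead of embedding the query string itself. QueryCatalog, JpqlQueryCatalog, NamedQueryCatalog, and the "Restaurant." naming prefix are all hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction: maps a logical query name to whatever
// the active storage engine needs in order to run it.
interface QueryCatalog {
    String lookup(String logicalName);
}

// Catalog for an engine that accepts query text directly.
class JpqlQueryCatalog implements QueryCatalog {
    private final Map<String, String> queries = new HashMap<>();

    JpqlQueryCatalog() {
        queries.put("restaurantsByEntreeName",
            "select r from Restaurant r where r.entrees.name like ?1");
    }

    @Override
    public String lookup(String logicalName) {
        String query = queries.get(logicalName);
        if (query == null) {
            throw new IllegalArgumentException("Unknown query: " + logicalName);
        }
        return query;
    }
}

// Catalog for an engine that resolves named queries on its own:
// it only hands back the name under which the query is registered.
class NamedQueryCatalog implements QueryCatalog {
    @Override
    public String lookup(String logicalName) {
        return "Restaurant." + logicalName;
    }
}
```

With something like this injected, RestaurantRepository would pass catalog.lookup("restaurantsByEntreeName") to read() and stay unaware of which engine is wired in.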
http://stackoverflow.com/questions/8550124/what-is-the-difference-between-dao-and-repository-patterns
DAO is an abstraction of data persistence. Repository is an abstraction of a collection of objects.
DAO would be considered closer to the database, often table-centric. Repository would be considered closer to the Domain, dealing only in Aggregate Roots. A Repository could be implemented using DAOs, but you wouldn't do the opposite.
Also, a Repository is generally a narrower interface. It should simply be a collection of objects, with Get(id), Find(ISpecification), and Add(Entity) methods.
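The layering described in that answer can be sketched as follows: two table-centric DAOs, and a repository that uses them to store and rebuild a whole aggregate. Every name here (Order, OrderDao, OrderLineDao, OrderRepository and the in-memory stand-ins) is hypothetical and exists only to illustrate the idea.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical aggregate root: an order together with its line items.
class Order {
    private final long id;
    private final List<String> lines;

    Order(long id, List<String> lines) {
        this.id = id;
        this.lines = new ArrayList<>(lines);
    }

    long getId() { return id; }
    List<String> getLines() { return lines; }
}

// Table-centric DAOs: one per table, with no knowledge of the aggregate.
interface OrderDao {
    void insertOrderRow(long orderId);
    boolean orderRowExists(long orderId);
}

interface OrderLineDao {
    void insertLineRow(long orderId, String item);
    List<String> selectLineRows(long orderId);
}

// Repository: a collection-like view of whole aggregates, built on the DAOs.
class OrderRepository {
    private final OrderDao orderDao;
    private final OrderLineDao lineDao;

    OrderRepository(OrderDao orderDao, OrderLineDao lineDao) {
        this.orderDao = orderDao;
        this.lineDao = lineDao;
    }

    void add(Order order) {
        orderDao.insertOrderRow(order.getId());
        for (String item : order.getLines()) {
            lineDao.insertLineRow(order.getId(), item);
        }
    }

    // Returns null when no order with this id has been added.
    Order get(long id) {
        if (!orderDao.orderRowExists(id)) {
            return null;
        }
        return new Order(id, lineDao.selectLineRows(id));
    }
}

// In-memory stand-ins for the two tables, so the sketch runs without a database.
class InMemoryOrderDao implements OrderDao {
    private final Set<Long> rows = new HashSet<>();

    @Override
    public void insertOrderRow(long orderId) { rows.add(orderId); }

    @Override
    public boolean orderRowExists(long orderId) { return rows.contains(orderId); }
}

class InMemoryOrderLineDao implements OrderLineDao {
    private final Map<Long, List<String>> rows = new HashMap<>();

    @Override
    public void insertLineRow(long orderId, String item) {
        rows.computeIfAbsent(orderId, key -> new ArrayList<>()).add(item);
    }

    @Override
    public List<String> selectLineRows(long orderId) {
        return rows.getOrDefault(orderId, new ArrayList<>());
    }
}
```

Callers see only add and get on complete orders. The split into two tables stays a private detail of the repository.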
http://blog.lowendahl.net/data-access/the-repository-pattern-explained-and-implemented/

https://msdn.microsoft.com/en-us/library/ff649690.aspx
