Tuesday, April 16, 2019

Eviction policy - Design Pattern



关于cache的 eviction policy设计
TODO: https://github.com/apache/ignite/tree/master/modules/core/src/main/java/org/apache/ignite/cache/eviction
https://www.1point3acres.com/bbs/forum.php?mod=viewthread&tid=316982

我觉得可能strategy看起来更好些,考虑到是个behavioral pattern,不同algorithm 重写同一个方法, 但是需要考虑的重点应该变成了如何使得这个重写的方法尽可能够都使用公共的数据存储结构。最坏不做抽象的情况,就是数据结构保存在当前类中,就变成了类似于factory模式的模式了

我刚搜了圈,发现 apache ignite 有类似的抽象的例子
https://ignite.apache.org/releas ... EvictionPolicy.html,不过好像就是把evictionPloicy抽象成abstract class了,然后数据结构也是在子类中定义的。就如同你说的,不一样的policy的数据结构是不一样的,强行去abstract common 数据结构会有点得不偿失,所以用个工厂模式可能会比较好懂,代码也比较直观

就是设计一个cache的eviction policy,你可以选择是LRU, 还是LFU,还是其他eviction policy. 选择哪个,就按照哪个eviction policy执行。
大家认为这种情况,用哪个pattern比较好?


首先回顾一下LeetCode的LRUCache这道题目所设计的接口:
  1. class LRUCache {
  2. public:
  3.   // Get the value (will always be positive) of the key if the key exists in the
  4.   // cache, otherwise return -1.
  5.   int get(int key);

  6.   // Set or insert the value if the key is not already present. When the cache
  7.   // reached its capacity, it should invalidate the least recently used item
  8.   // before inserting a new item.
  9.   void put(int key, int value);
  10. };
复制代码


当然这个接口非常简陋,至少有两个问题:

(1) Cache和eviction policy(LRU)的逻辑是合并在一起的

这也正是我们这里OOD设计所要解决的问题:将eviction policy解耦(decouple)出来,使得其可以作为一个参数(或者模板参数)被传入Cache类,那么得到的Cache对象就会使用对应的policy。

(2) 没有触及cache设计的一个重要部分:value的生存期的管理

如果value是占用较多内存空间的大对象(而不是LeetCode题中仅占用4个字节的栈区int),或者是C/C++写的代码,一般是由cache管理生存期。
也就是说当evict一个value时,其占用内存应该立刻被回收,而这时所有对该value对象的引用应当立刻失效 —— 在这种情况下要避免代码出bug(非法内存访问),必须由cache对value进行引用计数,且返回值使用"counted reference"(类似于C++ shared pointer)。
不过如果返回的是栈区对象/对象的拷贝/带GC语言下(例如Java)的对象引用,那么不需要考虑引用计数问题,但是这时cache就无法保证内存的使用量被控制在规定范围内。

第(2)点其实是为了说一下eviction policy设计时关于引用计数的接口 —— 下面简化的代码中没有实现这一接口,但是需要了解一下,因为很多实际系统中是有相关内容的,例如PostgreSQL:https://github.com/postgres/postgres/blob/master/src/backend/storage/buffer/bufmgr.c#L139

现在回到(1),下面贴的代码主要包含了:
(a) EvictionPolicy的抽象类
(b) 继承了EvictionPolicy的LRU/LFU实现
(c) 一个EvictionPolicy工厂,使得可以比较容易地添加新的Policy实现
(c) 使用EvictionPolicy实现的Cache类


  1. // Eviction policy的抽象
  2. template <typename Entry>
  3. class EvictionPolicy {
  4. public:
  5.   virtual Entry chooseEntryToEvict() = 0;

  6.   virtual void entryCreated(const Entry &entry) = 0;

  7.   virtual void entryReferenced(const Entry &entry) = 0;

  8.   // 如果维护引用计数的话需要这个接口
  9.   // virtual void entryUnreferenced(const Entry &entry) = 0;

  10.   virtual void entryEvicted(const Entry &entry) = 0;
  11. };


  12. // Eviction policy的类型
  13. enum class EvictionPolicyType : int {
  14.   kLRU = 0,
  15.   kLFU,
  16.   kClock,
  17.   // ...
  18. };


  19. // 一个简单的LRU实现,不支持并发访问
  20. template <typename Entry>
  21. class LRUPolicy : public EvictionPolicy<Entry> {
  22. public:
  23.   Entry chooseEntryToEvict() override {
  24.     return list_.back();
  25.   }

  26.   void entryCreated(const Entry &entry) override {
  27.     list_.push_front(entry);
  28.     index_.emplace(entry, list_.begin());
  29.   }

  30.   void entryReferenced(const Entry &entry) override {
  31.     const auto &it = index_.at(entry);
  32.     if (it != list_.begin()) {
  33.       list_.splice(list_.begin(), list_, it, std::next(it));
  34.     }
  35.   }

  36.   void entryEvicted(const Entry &entry) override {
  37.      const auto it = index_.find(entry);
  38.      list_.erase(it->second);
  39.      index_.erase(it);
  40.   }

  41. private:
  42.   using EntryList = std::list<Entry>;
  43.   using EntryListIterator = typename EntryList::iterator;

  44.   std::list<Entry> list_;
  45.   std::unordered_map<Entry, const EntryListIterator> index_;
  46. };


  47. // 一个简单的LFU实现,不支持并发访问
  48. template <typename Entry>
  49. class LFUPolicy : public EvictionPolicy<Entry> {
  50. public:
  51.   Entry chooseEntryToEvict() override {
  52.     return std::get<0>(*list_.begin());
  53.   }

  54.   void entryCreated(const Entry &entry) override {
  55.     const auto ret = list_.emplace(entry, Clock::now(), 1);
  56.     index_.emplace(entry, ret.first);
  57.   }

  58.   void entryReferenced(const Entry &entry) override {
  59.     auto &it = index_.at(entry);
  60.     const std::size_t count = std::get<2>(*it);
  61.     list_.erase(it);
  62.     const auto ret = list_.emplace(entry, Clock::now(), count + 1);
  63.     it = ret.first;
  64.   }

  65.   void entryEvicted(const Entry &entry) override {
  66.     const auto it = index_.find(entry);
  67.     list_.erase(it->second);
  68.     index_.erase(it);
  69.   }

  70. private:
  71.   using Clock = std::chrono::high_resolution_clock;
  72.   using TimePoint = std::chrono::time_point<Clock>;
  73.   using Record = std::tuple<Entry, TimePoint, std::size_t>;
  74.   struct RecordComparator {
  75.     inline bool operator()(const Record &lhs, const Record &rhs) const {
  76.       if (std::get<2>(lhs) != std::get<2>(rhs)) {
  77.         return std::get<2>(lhs) < std::get<2>(rhs);
  78.       }
  79.       if (std::get<1>(lhs) != std::get<1>(rhs)) {
  80.         return std::get<1>(lhs) < std::get<1>(rhs);
  81.       }
  82.       return std::get<0>(lhs) < std::get<0>(rhs);
  83.     }
  84.   };
  85.   using SortedList = std::set<Record, RecordComparator>;
  86.   using SortedListIterator = typename SortedList::iterator;

  87.   SortedList list_;
  88.   std::unordered_map<Entry, SortedListIterator> index_;
  89. };


  90. // Eviction policy的单例工厂
  91. template <typename Entry>
  92. class EvictionPolicyFactory {
  93. public:
  94.   static const EvictionPolicyFactory& Instance() {
  95.     static EvictionPolicyFactory<Entry> instance;
  96.     return instance;
  97.   }

  98.   EvictionPolicy<Entry>* createEvictionPolicy(const EvictionPolicyType type) const {
  99.     switch (type) {
  100.       case EvictionPolicyType::kLRU:
  101.         return new LRUPolicy<Entry>();
  102.       case EvictionPolicyType::kLFU:
  103.         return new LFUPolicy<Entry>();
  104.       default:
  105.         break;
  106.     }
  107.     throw std::runtime_error("Unsupported eviction policy type");
  108.   };
  109. };


  110. // 一个简单的泛型Cache实现
  111. template <typename Key, typename Value>
  112. class Cache {
  113. private:
  114.   using Storage = std::unordered_map<Key, Value>;
  115.   using Entry = const typename Storage::value_type*;
  116.   using Factory = EvictionPolicyFactory<Entry>;

  117. public:
  118.   Cache(const std::size_t capacity, const EvictionPolicyType type)
  119.       : capacity_(capacity),
  120.         eviction_policy_(Factory::Instance().createEvictionPolicy(type)) {
  121.   }

  122.   const Value* get(const Key &key) {
  123.     const auto it = storage_.find(key);
  124.     if (it == storage_.end()) {
  125.       return nullptr;
  126.     }

  127.     eviction_policy_->entryReferenced(&(*it));
  128.     return &it->second;
  129.   }

  130.   void put(const Key &key, const Value &value) {
  131.     const auto it = storage_.find(key);
  132.     if (it == storage_.end()) {
  133.       if (storage_.size() >= capacity_) {
  134.         const auto ev = eviction_policy_->chooseEntryToEvict();
  135.         eviction_policy_->entryEvicted(ev);
  136.         storage_.erase(ev->first);
  137.       }
  138.       const auto ret = storage_.emplace(key, value);
  139.       eviction_policy_->entryCreated(&(*ret.first));
  140.     } else {
  141.       it->second = value;
  142.       eviction_policy_->entryReferenced(&(*it));
  143.     }
  144.   }

  145. private:
  146.   const std::size_t capacity_;
  147.   Storage storage_;
  148.   std::unique_ptr<EvictionPolicy<Entry>> eviction_policy_;
  149. };

  150. /*******************************************************************************
  151. * 可以用于提交LeetCode 146的代码

  152. class LRUCache {
  153. public:
  154.   LRUCache(int capacity)
  155.       : cache_(capacity, EvictionPolicyType::kLRU) {
  156.   }

  157.   int get(int key) {
  158.     const int *value = cache_.get(key);
  159.     return value == nullptr ? -1 : *value;
  160.   }

  161.   void put(int key, int value) {
  162.     cache_.put(key, value);
  163.   }

  164. private:
  165.   Cache<int, int> cache_;
  166. };

  167. *******************************************************************************/

  168. template <typename C, typename T>
  169. void Print(C &cache, const T &key) {
  170.   const auto *value = cache.get(key);
  171.   if (value == nullptr) {
  172.     std::cout << "[" << key << "] NULL\n";
  173.   } else {
  174.     std::cout << "[" << key << "] " << *value << "\n";
  175.   }
  176. }

Labels

Review (572) System Design (334) System Design - Review (198) Java (189) Coding (75) Interview-System Design (65) Interview (63) Book Notes (59) Coding - Review (59) to-do (45) Linux (43) Knowledge (39) Interview-Java (35) Knowledge - Review (32) Database (31) Design Patterns (31) Big Data (29) Product Architecture (28) MultiThread (27) Soft Skills (27) Concurrency (26) Cracking Code Interview (26) Miscs (25) Distributed (24) OOD Design (24) Google (23) Career (22) Interview - Review (21) Java - Code (21) Operating System (21) Interview Q&A (20) System Design - Practice (20) Tips (19) Algorithm (17) Company - Facebook (17) Security (17) How to Ace Interview (16) Brain Teaser (14) Linux - Shell (14) Redis (14) Testing (14) Tools (14) Code Quality (13) Search (13) Spark (13) Spring (13) Company - LinkedIn (12) How to (12) Interview-Database (12) Interview-Operating System (12) Solr (12) Architecture Principles (11) Resource (10) Amazon (9) Cache (9) Git (9) Interview - MultiThread (9) Scalability (9) Trouble Shooting (9) Web Dev (9) Architecture Model (8) Better Programmer (8) Cassandra (8) Company - Uber (8) Java67 (8) Math (8) OO Design principles (8) SOLID (8) Design (7) Interview Corner (7) JVM (7) Java Basics (7) Kafka (7) Mac (7) Machine Learning (7) NoSQL (7) C++ (6) Chrome (6) File System (6) Highscalability (6) How to Better (6) Network (6) Restful (6) CareerCup (5) Code Review (5) Hash (5) How to Interview (5) JDK Source Code (5) JavaScript (5) Leetcode (5) Must Known (5) Python (5)

Popular Posts