Tuesday, May 10, 2016

SimpleArrayMap



https://segmentfault.com/a/1190000005052408
https://github.com/android/platform_frameworks_support/blob/master/v4/java/android/support/v4/util/SimpleArrayMap.java
SimpleArrayMap跟v4包中的ArrayMap最大的区别,证明就是ArrayMap继承了SimpleArrayMap,又实现了Map的接口;主要的操作,则是通过引入MapCollections类,使用Map中的Entry结构,这样在ArrayMap中就可以通过Iterator来进行数据的的迭代操作。
  • 思想:SimpleArrayMap采用了两个数组来进行hash值与key、value值得保存,另外,数组大小超过8时,并需要进行扩容时,只增大当前数组大小的一半,并对大小为4和8的数组进行缓存。这样最后带来的好处就是最大程度保证了数组空间都能够被使用,一定程度上避免了内存空间的浪费。
  • 数据结构方式:使用了两个数组,一个是Hash数组,另一个是大小*2的Array数组,为了保证通用性,这里所使用的是Object数组。Array数组中使用key+value间隔存取的方式,偶数为即0 -> key1 1 -> value1 2 -> key2 3 -> value2 。另外Hash数组,则是对应的Key的Hash值数组,并且这是一个有序的int数组,这样在进行Key的查找时,使用二分查找则是最有效率的方式
int[] mHashes;
Object[] mArray;
int mSize;
代码中,mHashes数组为mArray中的key对应的hash值得数组,而mArray即是HashMap中key与value间隔混合的一个数组。
public void clear() {
   if (mSize != 0) {
      freeArrays(mHashes, mArray, mSize);
      mHashes = ContainerHelpers.EMPTY_INTS;
      mArray = ContainerHelpers.EMPTY_OBJECTS;
      mSize = 0;
   }
}
代码中提及的EMPTY_INTSEMPTY_OBJECTS,仅仅如下的两个空数组:
static final int[] EMPTY_INTS = new int[0];

static final Object[] EMPTY_OBJECTS = new Object[0];
在Hash数组中,采取二分查找,定位到指定hash值所对应的index值;之后根据index值,来调整并存放key跟value的值

public V get(Object key) {
   final int index = indexOfKey(key);
   return index >= 0 ? (V)mArray[(index<<1)+1] : null;
}
通过key来获取数据就非常简单了,根据key获取到相应的index值,在array数据中根据index乘2加1返回相应的value即可。
public int indexOfKey(Object key) {
   return key == null ? indexOfNull() : indexOf(key, key.hashCode());
}

由上发现允许key为null,进行index的查询,当key不为空时,通过key及其key的hashCode,来进行查询。
代码中,是先对Hash数组进行二分查找,获取index,之后根据index获取hash数组中对应的值,通过与key来比较是否相等,相等则直接返回,若不相等,则先从index之后的数据进行比较,没找到,则再找之前的数据。可以看出这样是支持存在多个key的hash值相同的情况,那再看看支不支持多个key为null的情况呢?
int indexOf(Object key, int hash) {
    final int N = mSize;

    // Important fast case: if nothing is in here, nothing to look for.
    if (N == 0) {
        return ~0;
    }

    int index = ContainerHelpers.binarySearch(mHashes, N, hash);

    // If the hash code wasn't found, then we have no entry for this key.
    if (index < 0) {
        return index;
    }

    // If the key at the returned index matches, that's what we want.
    if (key.equals(mArray[index<<1])) {
        return index;
    }

    // Search for a matching key after the index.
    int end;
    for (end = index + 1; end < N && mHashes[end] == hash; end++) {
        if (key.equals(mArray[end << 1])) return end;
    }

    // Search for a matching key before the index.
    for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) {
        if (key.equals(mArray[i << 1])) return i;
    }

    // Key not found -- return negative value indicating where a
    // new entry for this key should go.  We use the end of the
    // hash chain to reduce the number of array entries that will
    // need to be copied when inserting.
    return ~end;
}
从上可以看出当key为null的时候,采取获取的方法跟key不为null获取是很相似的了,都要进行整个数组的遍历,不过这里对应的hash都是为0。但key为null只能在数组中存在一个的,因为在数据的put操作的时候,会对key进行检查,这样保证了key为null只能存在一个。
int indexOfNull() {
    final int N = mSize;

    // Important fast case: if nothing is in here, nothing to look for.
    if (N == 0) {
        return ~0;
    }

    int index = ContainerHelpers.binarySearch(mHashes, N, 0);

    // If the hash code wasn't found, then we have no entry for this key.
    if (index < 0) {
        return index;
    }

    // If the key at the returned index matches, that's what we want.
    if (null == mArray[index<<1]) {
        return index;
    }

    // Search for a matching key after the index.
    int end;
    for (end = index + 1; end < N && mHashes[end] == 0; end++) {
        if (null == mArray[end << 1]) return end;
    }

    // Search for a matching key before the index.
    for (int i = index - 1; i >= 0 && mHashes[i] == 0; i--) {
        if (null == mArray[i << 1]) return i;
    }

    // Key not found -- return negative value indicating where a
    // new entry for this key should go.  We use the end of the
    // hash chain to reduce the number of array entries that will
    // need to be copied when inserting.
    return ~end;
}

public V remove(Object key) {
   final int index = indexOfKey(key);
   if (index >= 0) {
      return removeAt(index);
   }

   return null;
}
这里先忽略hash数组长度的判断(主要进行数组缓存的操作)只看主要的代码,即最后的一个else的代码,使用System.arraycopy方法将hash数组跟array数组中index之后的数据往前移动1位,而将最后一位的数据进行至空。
public V removeAt(int index) {
   final Object old = mArray[(index << 1) + 1];
   if (mSize <= 1) {
      // Now empty.
      if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to 0");
      freeArrays(mHashes, mArray, mSize);
      mHashes = ContainerHelpers.EMPTY_INTS;
      mArray = ContainerHelpers.EMPTY_OBJECTS;
      mSize = 0;
   } else {
      // 满足条件,对数组进行加入缓存的操作。
      if (mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3) {
         // Shrunk enough to reduce size of arrays.  We don't allow it to
         // shrink smaller than (BASE_SIZE*2) to avoid flapping between
         // that and BASE_SIZE.
         final int n = mSize > (BASE_SIZE*2) ? (mSize + (mSize>>1)) : (BASE_SIZE*2);

         if (DEBUG) Log.d(TAG, "remove: shrink from " + mHashes.length + " to " + n);

         final int[] ohashes = mHashes;
         final Object[] oarray = mArray;
         allocArrays(n);

         mSize--;
         if (index > 0) {
            if (DEBUG) Log.d(TAG, "remove: copy from 0-" + index + " to 0");
            System.arraycopy(ohashes, 0, mHashes, 0, index);
            System.arraycopy(oarray, 0, mArray, 0, index << 1);
         }
         if (index < mSize) {
            if (DEBUG) Log.d(TAG, "remove: copy from " + (index+1) + "-" + mSize
                  + " to " + index);
            System.arraycopy(ohashes, index + 1, mHashes, index, mSize - index);
            System.arraycopy(oarray, (index + 1) << 1, mArray, index << 1,
                  (mSize - index) << 1);
         }
      } else {
         mSize--;
         if (index < mSize) {
            if (DEBUG) Log.d(TAG, "remove: move " + (index+1) + "-" + mSize
                  + " to " + index);
            System.arraycopy(mHashes, index + 1, mHashes, index, mSize - index);
            System.arraycopy(mArray, (index + 1) << 1, mArray, index << 1,
                  (mSize - index) << 1);
         }
         mArray[mSize << 1] = null;
         mArray[(mSize << 1) + 1] = null;
      }
   }
   return (V)old;
}
public V get(Object key) {
   final int index = indexOfKey(key);
   return index >= 0 ? (V)mArray[(index<<1)+1] : null;
}
代码中,可以看出arrayMap允许key为空,所有的key都不能重复。
另外,在进行容量修改的时候,进行的操作是:mSize跟hash数组长度的判断,当大于等于的时候,需要对数组的容量进行一些扩容,并拷贝数组到新的数组中。(扩容操作:当size大于8, 取size + size /2 ; 当size大于4小于8时, 取8 ,当size小于4时,取4)
public V put(K key, V value) {
   final int hash;
   int index;
   if (key == null) {
      // 查找key为null的情况
      hash = 0;
      index = indexOfNull();
   } else {
      hash = key.hashCode();
      index = indexOf(key, hash);
   }
   if (index >= 0) {
      // 数组中存在相同的key,则更新并返回旧的值
      index = (index<<1) + 1;
      final V old = (V)mArray[index];
      mArray[index] = value;
      return old;
   }

   index = ~index;
   if (mSize >= mHashes.length) {
      // 当容量不够时,需要建立一个新的数组,来进行扩容操作。
      final int n = mSize >= (BASE_SIZE*2) ? (mSize+(mSize>>1))
         : (mSize >= BASE_SIZE ? (BASE_SIZE*2) : BASE_SIZE);

      if (DEBUG) Log.d(TAG, "put: grow from " + mHashes.length + " to " + n);

      final int[] ohashes = mHashes;
      final Object[] oarray = mArray;
      allocArrays(n);

      if (mHashes.length > 0) {
         if (DEBUG) Log.d(TAG, "put: copy 0-" + mSize + " to 0");
         System.arraycopy(ohashes, 0, mHashes, 0, ohashes.length);
         System.arraycopy(oarray, 0, mArray, 0, oarray.length);
      }

      freeArrays(ohashes, oarray, mSize);
   }

   // 将index之后的数据进行后移
   if (index < mSize) {
      if (DEBUG) Log.d(TAG, "put: move " + index + "-" + (mSize-index)
            + " to " + (index+1));
      System.arraycopy(mHashes, index, mHashes, index + 1, mSize - index);
      System.arraycopy(mArray, index << 1, mArray, (index + 1) << 1, (mSize - index) << 1);
   }

   // 赋值给index位置上hash值
   mHashes[index] = hash;
   // 更新array数组中对应的key跟value值。
   mArray[index<<1] = key;
   mArray[(index<<1)+1] = value;
   mSize++;
   return null;
}
讲到这里,就基本可以结束了,而源码中看到了两个神奇的数组,他俩主要的目的是对固定的数组来进行缓存,官方给的说法是避免内存抖动,毕竟这里是纯数组来实现的,而当数组容量不够的时候,就需要建立一个新的数组,这样旧的数组不就浪费了,所以这里的缓存还是灰常必要的。接下来看看他俩是怎样玩的
private static final int BASE_SIZE = 4;

/**
 * Maximum number of entries to have in array caches.
 */
private static final int CACHE_SIZE = 10;

/**
 * Caches of small array objects to avoid spamming garbage.  The cache
 * Object[] variable is a pointer to a linked list of array objects.
 * The first entry in the array is a pointer to the next array in the
 * list; the second entry is a pointer to the int[] hash code array for it.
 */
static Object[] mBaseCache;
static int mBaseCacheSize;
static Object[] mTwiceBaseCache;
static int mTwiceBaseCacheSize;
代码中有两个静态的Object数组,这两个静态数组采用链表的方式来缓存所有的数组。即Object数组会用来指向array数组,而这个array的第一个值为指针,指向下一个array,而第二个值是对应的hash数组,其他的值则为空。另外,缓存数组即baseCache和twiceBaseCache,它俩大小容量的限制:最小值为4,最大值为10,而BaseCache数组主要存储的是容量为4的数组,twiceBaseCache主要存储容量为8的数组。
这个时候,当size跟缓存的数组大小相同,即要么等于4,要么等于8,即可从缓存中拿取数组来用。这里主要的操作就是baseCache指针的移动,指向array[0]指向的指针,hash数组即为array[0],而当前的这个array咱们就可以使用了。
  • SimpleArrayMap是可以替代ArrayMap来使用的,区别只是其内部采用单纯的数组来实现,而ArrayMap中采用了EntrySet跟KeySet的结构,这样方便使用Iterator来数据的遍历获取。
  • ArrayMap适用于少量的数据,因为存取的复杂度,对数量过大的就不太合适。这个量笔者建议破百就放弃ArrayMap的使用吧。
  • ArrayMap支持key为null,但数组只能有一个key为null的存在。另外,允许多个key的hash值相同,不过尽量避免吧,不然二分查找获取不到,又会进行遍历查找;而key都必须是唯一,不能重复的。
  • 主要目的是避免占用大量的内存切无法得到地充分利用。
  • 对容量为4和容量为8的数组,进行缓存,来防止内存抖动的发生。


Labels

Review (572) System Design (334) System Design - Review (198) Java (189) Coding (75) Interview-System Design (65) Interview (63) Book Notes (59) Coding - Review (59) to-do (45) Linux (43) Knowledge (39) Interview-Java (35) Knowledge - Review (32) Database (31) Design Patterns (31) Big Data (29) Product Architecture (28) MultiThread (27) Soft Skills (27) Concurrency (26) Cracking Code Interview (26) Miscs (25) Distributed (24) OOD Design (24) Google (23) Career (22) Interview - Review (21) Java - Code (21) Operating System (21) Interview Q&A (20) System Design - Practice (20) Tips (19) Algorithm (17) Company - Facebook (17) Security (17) How to Ace Interview (16) Brain Teaser (14) Linux - Shell (14) Redis (14) Testing (14) Tools (14) Code Quality (13) Search (13) Spark (13) Spring (13) Company - LinkedIn (12) How to (12) Interview-Database (12) Interview-Operating System (12) Solr (12) Architecture Principles (11) Resource (10) Amazon (9) Cache (9) Git (9) Interview - MultiThread (9) Scalability (9) Trouble Shooting (9) Web Dev (9) Architecture Model (8) Better Programmer (8) Cassandra (8) Company - Uber (8) Java67 (8) Math (8) OO Design principles (8) SOLID (8) Design (7) Interview Corner (7) JVM (7) Java Basics (7) Kafka (7) Mac (7) Machine Learning (7) NoSQL (7) C++ (6) Chrome (6) File System (6) Highscalability (6) How to Better (6) Network (6) Restful (6) CareerCup (5) Code Review (5) Hash (5) How to Interview (5) JDK Source Code (5) JavaScript (5) Leetcode (5) Must Known (5) Python (5)

Popular Posts