The Java Collections Kingdom — Minister HashMap of the Map Department Has a Few Words
2018-08-06 09:00:53 · Source: cnblogs (博客园)
Contents
1. HashMap's job description
2. The services HashMap provides
2.1 Obtaining a HashMap instance
2.2 Common HashMap operations
3. How HashMap is put together
3.1 Fields
3.2 Constructors
3.3 Data structures
4. How HashMap does its business (the put and get implementations)
4.1 How put works
4.2 How get works
5. HashMap's brain — the hash algorithm
6. HashMap's workload — resize
7. HashMap's efficiency — performance
8. An even better HashMap — ConcurrentHashMap
Main text
I have read plenty of HashMap write-ups and clicked through the HashMap source myself. Much of what others wrote is good, but something always felt a little off to me here or there, so I am writing down my own understanding and summary. If you spot mistakes, please point them out, thanks! (ps: the JDK version I am using is 1.8.0_144)
1. HashMap's job description
HashMap is the hash-table-based implementation of the Map interface. It provides all of the optional map operations (i.e. key-value mappings) and permits null values and the null key. (HashMap is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.
Note that HashMap is not thread-safe. If you want a thread-safe HashMap, you can obtain one via the static method Collections.synchronizedMap.
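As a quick sketch (class and variable names are my own), wrapping a HashMap with Collections.synchronizedMap looks like this. Note that the wrapper synchronizes individual calls, but iteration still requires locking the wrapper manually:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SyncMapDemo {
    // Every Map method on the returned wrapper is synchronized on the wrapper itself.
    static Map<String, String> makeSyncMap() {
        return Collections.synchronizedMap(new HashMap<String, String>());
    }

    public static void main(String[] args) {
        Map<String, String> m = makeSyncMap();
        m.put("book", "java");
        // Iteration is a compound action: it must hold the lock explicitly.
        synchronized (m) {
            for (Map.Entry<String, String> e : m.entrySet()) {
                System.out.println(e.getKey() + "=" + e.getValue());
            }
        }
    }
}
```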
2. The services HashMap provides (common usage)
2.1 Obtaining a HashMap instance (HashMap is generic, but generics are not the focus of this article, so the code below does not spell out type parameters)
HashMap hashMap = new HashMap<>(); // obtain a default, empty HashMap
2.2 Common HashMap operations
HashMap hashMap = new HashMap<>();
hashMap.put("book", "java程序语言设计");    // add a key-value pair
hashMap.put("language", "java");
System.out.println(hashMap);
String book = (String) hashMap.get("book"); // look up a value by key
System.out.println(book);
hashMap.put("book", "C语言程序设计");       // overwrite the value of an existing key
System.out.println(hashMap);
hashMap.remove(book);                       // careful: book holds a VALUE, not a key, so this removes nothing
System.out.println(hashMap);
int hashSize = hashMap.size();              // number of entries
System.out.println(hashSize);
The output is:
{book=java程序语言设计, language=java}
java程序语言设计
{book=C语言程序设计, language=java}
{book=C语言程序设计, language=java}
2
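Note that in the snippet above `hashMap.remove(book)` passes the old *value* ("java程序语言设计"), not the key "book", so nothing is removed — which is exactly why the last two printed lines are identical and size is still 2. A small sketch of the difference:

```java
import java.util.HashMap;

public class RemoveDemo {
    public static void main(String[] args) {
        HashMap<String, String> map = new HashMap<>();
        map.put("book", "C语言程序设计");
        map.put("language", "java");

        map.remove("C语言程序设计"); // this is a value, not a key: removes nothing
        System.out.println(map.size()); // still 2

        map.remove("book");            // removes the mapping whose KEY is "book"
        System.out.println(map.size()); // now 1
    }
}
```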
3. How HashMap is put together (a look at the source)
The declaration below shows which classes and interfaces HashMap is related to:
public class HashMap<K, V> extends AbstractMap<K, V> implements Map<K, V>, Cloneable, Serializable {
3.1 Fields
private static final long serialVersionUID = 362498820763181265L;
static final int DEFAULT_INITIAL_CAPACITY = 16; // default table size
static final int MAXIMUM_CAPACITY = 1073741824; // 2^30, the upper bound
static final float DEFAULT_LOAD_FACTOR = 0.75F;
static final int TREEIFY_THRESHOLD = 8;         // chain length at which a bin becomes a tree
static final int UNTREEIFY_THRESHOLD = 6;       // tree size at which a bin reverts to a chain
static final int MIN_TREEIFY_CAPACITY = 64;     // minimum table size before treeifying
transient HashMap.Node<K, V>[] table;           // the array that holds the entries — note this field; we come back to it later
transient Set<Entry<K, V>> entrySet;
transient int size;                             // number of stored mappings
transient int modCount;                         // number of structural modifications
int threshold;                                  // resize trigger: threshold = loadFactor * capacity; exceeding it causes a resize
final float loadFactor;                         // how full the hash table may get before resizing
(modCount counts structural modifications and lets iterators fail fast with a ConcurrentModificationException when the map is modified during iteration; note that in HashMap it is not declared volatile.)
The larger the load factor, the more elements fill the table before a resize. The benefit is high space utilization, but the chance of collisions grows, chains get longer and longer, and lookups slow down.
Conversely, the smaller the load factor, the fewer elements before resizing. The benefit is fewer collisions, but more space is wasted and the table stays sparse (it resizes while much of it is still unused).
The greater the chance of collision, the higher the cost of a lookup.
So a balance must be struck between "chance of collision" and "space utilization" — in essence the classic time-space trade-off of data structures.
If memory is plentiful and you want faster lookups, set the load factor a bit lower; if memory is tight and lookup speed does not matter much, set it a bit higher. In practice we usually leave it alone and let it take the default of 0.75.
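A hedged illustration of the trade-off (variable names are my own): both capacity and load factor can be passed to the constructor, though the defaults serve almost every use case:

```java
import java.util.HashMap;

public class LoadFactorDemo {
    public static void main(String[] args) {
        // Large table, small load factor: fewer collisions, more memory used.
        HashMap<String, Integer> sparse = new HashMap<>(1024, 0.5f);
        // Defaults: capacity 16, load factor 0.75 — a good compromise.
        HashMap<String, Integer> normal = new HashMap<>();

        sparse.put("a", 1);
        normal.put("a", 1);
        System.out.println(sparse.get("a") + " " + normal.get("a"));
    }
}
```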
3.2 Constructors
/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and load factor.
 *
 * @param  initialCapacity the initial capacity
 * @param  loadFactor      the load factor
 * @throws IllegalArgumentException if the initial capacity is negative
 *         or the load factor is nonpositive
 */
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0) // the initial capacity must not be negative
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY) // requests above the maximum are capped at the maximum
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor)) // the load factor must be a positive number
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity); // the smallest power of two >= initialCapacity
}

/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and the default load factor (0.75).
 *
 * @param  initialCapacity the initial capacity.
 * @throws IllegalArgumentException if the initial capacity is negative.
 */
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

/**
 * Constructs an empty <tt>HashMap</tt> with the default initial capacity
 * (16) and the default load factor (0.75).
 */
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

/**
 * Constructs a new <tt>HashMap</tt> with the same mappings as the
 * specified <tt>Map</tt>. The <tt>HashMap</tt> is created with
 * default load factor (0.75) and an initial capacity sufficient to
 * hold the mappings in the specified <tt>Map</tt>.
 *
 * @param   m the map whose mappings are to be placed in this map
 * @throws  NullPointerException if the specified map is null
 */
public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}
/**
 * Returns a power of two size for the given target capacity.
 */
static final int tableSizeFor(int cap) {
    // computes the smallest power of two >= cap
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
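To see the bit trick in action, here is a self-contained copy of the same method (reimplemented outside HashMap for demonstration). The OR-shift cascade smears the highest set bit of `cap - 1` into every lower bit, so `n + 1` is the next power of two:

```java
public class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same algorithm as HashMap.tableSizeFor in JDK 8.
    static int tableSizeFor(int cap) {
        int n = cap - 1; // subtract 1 first so exact powers of two map to themselves
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;   // after this, n is 0b0111...1 below the highest bit of cap-1
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(17)); // 32
        System.out.println(tableSizeFor(16)); // 16 (exact power of two is kept)
        System.out.println(tableSizeFor(1));  // 1
    }
}
```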
Note that the constructors never allocate the table array; it is actually instantiated on the first put:
/**
 * Associates the specified value with the specified key in this map.
 * If the map previously contained a mapping for the key, the old
 * value is replaced.
 *
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key
 * @return the previous value associated with <tt>key</tt>, or
 *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
 *         (A <tt>null</tt> return can also indicate that the map
 *         previously associated <tt>null</tt> with <tt>key</tt>.)
 */
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

/**
 * Implements Map.put and related methods
 *
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode.
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length; // lazily initialize the array
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null); // empty bucket: place the new node directly
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null); // key not found: append at the tail of the chain
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
As well as the method that turns a chain into a red-black tree:
/**
 * Replaces all linked nodes in bin at index for given hash unless
 * table is too small, in which case resizes instead.
 */
final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
    else if ((e = tab[index = (n - 1) & hash]) != null) {
        TreeNode<K,V> hd = null, tl = null;
        do {
            TreeNode<K,V> p = replacementTreeNode(e, null);
            if (tl == null)
                hd = p;
            else {
                p.prev = tl;
                tl.next = p;
            }
            tl = p;
        } while ((e = e.next) != null);
        if ((tab[index] = hd) != null)
            hd.treeify(tab);
    }
}

/**
 * Copies all of the mappings from the specified map to this map.
 * These mappings will replace any mappings that this map had for
 * any of the keys currently in the specified map.
 *
 * @param m mappings to be stored in this map
 * @throws NullPointerException if the specified map is null
 */
public void putAll(Map<? extends K, ? extends V> m) {
    putMapEntries(m, true);
}

/**
 * Implements Map.putAll and Map constructor
 *
 * @param m the map
 * @param evict false when initially constructing this map, else
 * true (relayed to method afterNodeInsertion).
 */
final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
    int s = m.size();
    if (s > 0) {
        if (table == null) { // pre-size
            float ft = ((float)s / loadFactor) + 1.0F;
            int t = ((ft < (float)MAXIMUM_CAPACITY) ?
                     (int)ft : MAXIMUM_CAPACITY);
            if (t > threshold)
                threshold = tableSizeFor(t);
        }
        else if (s > threshold)
            resize();
        for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
            K key = e.getKey();
            V value = e.getValue();
            putVal(hash(key), key, value, false, evict);
        }
    }
}
The low-level allocation really happens in resize:
/**
 * Initializes or doubles table size. If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; // the array is allocated here
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
newNode looks like this:
// Create a regular (non-tree) node
Node<K,V> newNode(int hash, K key, V value, Node<K,V> next) {
    return new Node<>(hash, key, value, next);
}
As the constructors show, if we specify a load factor and an initial capacity, the first constructor is called; otherwise the defaults apply: initial capacity 16, load factor 0.75. In the code above, tableSizeFor() ensures the capacity is a power of two — it makes capacity the smallest power of two that is greater than or equal to initialCapacity. As to why the capacity must be a power of two, we will see shortly.
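The power-of-two capacity is what makes the index computation `(n - 1) & hash` work: when n is a power of two, `n - 1` is an all-ones bit mask, so the AND is equivalent to `hash % n` for non-negative hashes — but a single cheap instruction, and it behaves correctly even for negative hash values. A small sketch:

```java
public class IndexMaskDemo {
    public static void main(String[] args) {
        int n = 16; // table length: always a power of two, so n - 1 = 0b1111
        for (int hash : new int[]{5, 21, 37, -3}) {
            int index = (n - 1) & hash; // same as hash % n for hash >= 0
            System.out.println(hash + " -> bucket " + index);
        }
        // 5 -> 5, 21 -> 5, 37 -> 5 (all share low bits 0101); -3 -> 13
    }
}
```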
3.3 Data structures
In JDK 8 there are two storage node classes in total:
// 1. Chaining node used when hashes collide; a bin with more than 8 elements is converted to TreeNodes
static class Node<K,V> implements Map.Entry<K,V>
// 2. Red-black tree node used when hashes collide; a bin that shrinks below 6 elements reverts to Nodes
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V>
HashMap is implemented mainly on top of an array plus linked lists (and, since JDK 8, red-black trees). Its fast lookups come from computing a hash code to decide where each entry is stored. HashMap derives the hash from the key's hashCode, so identical hashCodes always produce the same hash. As more objects are stored, different objects may well produce the same hash — the so-called hash collision. As anyone who has studied data structures knows, there are many ways to resolve collisions; HashMap resolves them by chaining (treeifying long chains). An illustration:
(In the figure, the purple part is the hash table, also called the hash array. Each array element is the head node of a singly linked list; the lists exist to resolve collisions — when different keys map to the same array slot, they are placed into that slot's list.)
As the figure shows, the bottom layer of HashMap is an array (the table field from section 3.1: transient HashMap.Node<K, V>[] table;), each of whose slots is in turn a linked list. (As noted earlier, the array itself is allocated lazily on the first put.) The source of the node type, HashMap.Node<K, V>, is:
/**
 * Node is a singly linked list node — the list used by HashMap's chaining scheme.
 * It implements the Map.Entry interface, i.e. getKey(), getValue(), setValue(V value),
 * equals(Object o) and hashCode().
 */
/**
 * Basic hash bin node, used for most entries. (See below for
 * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
 */
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next; // reference to the next node in the chain

    // Constructor. Parameters: hash value (hash), key, value, and next node (next).
    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    // Two entries are equal iff both their keys and their values are equal.
    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}
A HashMap is, in effect, an array of Map.Entry nodes. Each node holds a key and a value, plus a next reference to another node — and that next chain is precisely what handles hash collisions, forming a linked list.
When we put an element into a HashMap, we first compute the array index from the key's hash and place the element at that slot. If the slot already holds other elements, the elements at that slot are stored as a linked list; in JDK 8 the new node is appended at the tail of the chain (in JDK 7 it was inserted at the head). When we get an element, we compute the key's hash, jump to the corresponding array slot, and then use the key's equals method to locate the right element within that slot's chain. From this you can see that if every slot held exactly one element, get would be as efficient as possible — but the ideal is always rosy, and reality always hands us difficulties to overcome, haha~
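To make collisions tangible, here is a deliberately bad key class (my own, for illustration) whose hashCode is constant, so every entry lands in the same bucket. Lookups still succeed, because equals walks the chain (or, past TREEIFY_THRESHOLD, the red-black tree) — they are just slower:

```java
import java.util.HashMap;

public class CollisionDemo {
    // Every instance hashes to 42: all entries collide into one bucket.
    static final class BadKey {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
    }

    public static void main(String[] args) {
        HashMap<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 100; i++)
            map.put(new BadKey(i), i); // 100 entries, one bucket
        // Correctness is preserved; only performance suffers.
        System.out.println(map.get(new BadKey(57)));
    }
}
```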
Next, TreeNode, the red-black tree class that also serves to resolve hash collisions:
/**
 * Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn
 * extends Node) so can be used as extension of either regular or
 * linked node.
 */
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // red-black tree links
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;
    TreeNode(int hash, K key, V val, Node<K,V> next) {
        super(hash, key, val, next);
    }

    /**
     * Returns root of tree containing this node.
     */
    final TreeNode<K,V> root() {
        for (TreeNode<K,V> r = this, p;;) {
            if ((p = r.parent) == null)
                return r;
            r = p;
        }
    }

    /**
     * Ensures that the given root is the first node of its bin.
     */
    static <K,V> void moveRootToFront(Node<K,V>[] tab, TreeNode<K,V> root) {
        int n;
        if (root != null && tab != null && (n = tab.length) > 0) {
            int index = (n - 1) & root.hash;
            TreeNode<K,V> first = (TreeNode<K,V>)tab[index];
            if (root != first) {
                Node<K,V> rn;
                tab[index] = root;
                TreeNode<K,V> rp = root.prev;
                if ((rn = root.next) != null)
                    ((TreeNode<K,V>)rn).prev = rp;
                if (rp != null)
                    rp.next = rn;
                if (first != null)
                    first.prev = root;
                root.next = first;
                root.prev = null;
            }
            assert checkInvariants(root);
        }
    }

    /**
     * Finds the node starting at root p with the given hash and key.
     * The kc argument caches comparableClassFor(key) upon first use
     * comparing keys.
     */
    final TreeNode<K,V> find(int h, Object k, Class<?> kc) {
        TreeNode<K,V> p = this;
        do {
            int ph, dir; K pk;
            TreeNode<K,V> pl = p.left, pr = p.right, q;
            if ((ph = p.hash) > h)
                p = pl;
            else if (ph < h)
                p = pr;
            else if ((pk = p.key) == k || (k != null && k.equals(pk)))
                return p;
            else if (pl == null)
                p = pr;
            else if (pr == null)
                p = pl;
            else if ((kc != null ||
                      (kc = comparableClassFor(k)) != null) &&
                     (dir = compareComparables(kc, k, pk)) != 0)
                p = (dir < 0) ? pl : pr;
            else if ((q = pr.find(h, k, kc)) != null)
                return q;
            else
                p = pl;
        } while (p != null);
        return null;
    }

    /**
     * Calls find for root node.
     */
    final TreeNode<K,V> getTreeNode(int h, Object k) {
        return ((parent != null) ? root() : this).find(h, k, null);
    }

    /**
     * Tie-breaking utility for ordering insertions when equal
     * hashCodes and non-comparable. We don't require a total
     * order, just a consistent insertion rule to maintain
     * equivalence across rebalancings. Tie-breaking further than
     * necessary simplifies testing a bit.
     */
    static int tieBreakOrder(Object a, Object b) {
        int d;
        if (a == null || b == null ||
            (d = a.getClass().getName().
             compareTo(b.getClass().getName())) == 0)
            d = (System.identityHashCode(a) <= System.identityHashCode(b) ?
                 -1 : 1);
        return d;
    }

    /**
     * Forms tree of the nodes linked from this node.
     * @return root of tree
     */
    final void treeify(Node<K,V>[] tab) {
        TreeNode<K,V> root = null;
        for (TreeNode<K,V> x = this, next; x != null; x = next) {
            next = (TreeNode<K,V>)x.next;
            x.left = x.right = null;
            if (root == null) {
                x.parent = null;
                x.red = false;
                root = x;
            }
            else {
                K k = x.key;
                int h = x.hash;
                Class<?> kc = null;
                for (TreeNode<K,V> p = root;;) {
                    int dir, ph;
                    K pk = p.key;
                    if ((ph = p.hash) > h)
                        dir = -1;
                    else if (ph < h)
                        dir = 1;
                    else if ((kc == null &&
                              (kc = comparableClassFor(k)) == null) ||
                             (dir = compareComparables(kc, k, pk)) == 0)
                        dir = tieBreakOrder(k, pk);

                    TreeNode<K,V> xp = p;
                    if ((p = (dir <= 0) ? p.left : p.right) == null) {
                        x.parent = xp;
                        if (dir <= 0)
                            xp.left = x;
                        else
                            xp.right = x;
                        root = balanceInsertion(root, x);
                        break;
                    }
                }
            }
        }
        moveRootToFront(tab, root);
    }

    /**
     * Returns a list of non-TreeNodes replacing those linked from
     * this node.
     */
    final Node<K,V> untreeify(HashMap<K,V> map) {
        Node<K,V> hd = null, tl = null;
        for (Node<K,V> q = this; q != null; q = q.next) {
            Node<K,V> p = map.replacementNode(q, null);
            if (tl == null)
                hd = p;
            else
                tl.next = p;
            tl = p;
        }
        return hd;
    }

    /**
     * Tree version of putVal.
     */
    final TreeNode<K,V> putTreeVal(HashMap<K,V> map, Node<K,V>[] tab,
                                   int h, K k, V v) {
        Class<?> kc = null;
        boolean searched = false;
        TreeNode<K,V> root = (parent != null) ? root() : this;
        for (TreeNode<K,V> p = root;;) {
            int dir, ph; K pk;
            if ((ph = p.hash) > h)
                dir = -1;
            else if (ph < h)
                dir = 1;
            else if ((pk = p.key) == k || (k != null && k.equals(pk)))
                return p;
            else if ((kc == null &&
                      (kc = comparableClassFor(k)) == null) ||
                     (dir = compareComparables(kc, k, pk)) == 0) {
                if (!searched) {
                    TreeNode<K,V> q, ch;
                    searched = true;
                    if (((ch = p.left) != null &&
                         (q = ch.find(h, k, kc)) != null) ||
                        ((ch = p.right) != null &&
                         (q = ch.find(h, k, kc)) != null))
                        return q;
                }
                dir = tieBreakOrder(k, pk);
            }

            TreeNode<K,V> xp = p;
            if ((p = (dir <= 0) ? p.left : p.right) == null) {
                Node<K,V> xpn = xp.next;
                TreeNode<K,V> x = map.newTreeNode(h, k, v, xpn);
                if (dir <= 0)
                    xp.left = x;
                else
                    xp.right = x;
                xp.next = x;
                x.parent = x.prev = xp;
                if (xpn != null)
                    ((TreeNode<K,V>)xpn).prev = x;
                moveRootToFront(tab, balanceInsertion(root, x));
                return null;
            }
        }
    }

    /**
     * Removes the given node, that must be present before this call.
     * This is messier than typical red-black deletion code because we
     * cannot swap the contents of an interior node with a leaf
     * successor that is pinned by "next" pointers that are accessible
     * independently during traversal. So instead we swap the tree
     * linkages. If the current tree appears to have too few nodes,
     * the bin is converted back to a plain bin. (The test triggers
     * somewhere between 2 and 6 nodes, depending on tree structure).
     */
    final void removeTreeNode(HashMap<K,V> map, Node<K,V>[] tab,
                              boolean movable) {
        int n;
        if (tab == null || (n = tab.length) == 0)
            return;
        int index = (n - 1) & hash;
        TreeNode<K,V> first = (TreeNode<K,V>)tab[index], root = first, rl;
        TreeNode<K,V> succ = (TreeNode<K,V>)next, pred = prev;
        if (pred == null)
            tab[index] = first = succ;
        else
            pred.next = succ;
        if (succ != null)
            succ.prev = pred;
        if (first == null)
            return;
        if (root.parent != null)
            root = root.root();
        if (root == null || root.right == null ||
            (rl = root.left) == null || rl.left == null) {
            tab[index] = first.untreeify(map);  // too small
            return;
        }
        TreeNode<K,V> p = this, pl = left, pr = right, replacement;
        if (pl != null && pr != null) {
            TreeNode<K,V> s = pr, sl;
            while ((sl = s.left) != null) // find successor
                s = sl;
            boolean c = s.red; s.red = p.red; p.red = c; // swap colors
            TreeNode<K,V> sr = s.right;
            TreeNode<K,V> pp = p.parent;
            if (s == pr) { // p was s's direct parent
                p.parent = s;
                s.right = p;
            }
            else {
                TreeNode<K,V> sp = s.parent;
                if ((p.parent = sp) != null) {
                    if (s == sp.left)
                        sp.left = p;
                    else
                        sp.right = p;
                }
                if ((s.right = pr) != null)
                    pr.parent = s;
            }
            p.left = null;
            if ((p.right = sr) != null)
                sr.parent = p;
            if ((s.left = pl) != null)
                pl.parent = s;
            if ((s.parent = pp) == null)
                root = s;
            else if (p == pp.left)
                pp.left = s;
            else
                pp.right = s;
            if (sr != null)
                replacement = sr;
            else
                replacement = p;
        }
        else if (pl != null)
            replacement = pl;
        else if (pr != null)
            replacement = pr;
        else
            replacement = p;
        if (replacement != p) {
            TreeNode<K,V> pp = replacement.parent = p.parent;
            if (pp == null)
                root = replacement;
            else if (p == pp.left)
                pp.left = replacement;
            else
                pp.right = replacement;
            p.left = p.right = p.parent = null;
        }

        TreeNode<K,V> r = p.red ? root : balanceDeletion(root, replacement);

        if (replacement == p) {  // detach
            TreeNode<K,V> pp = p.parent;
            p.parent = null;
            if (pp != null) {
                if (p == pp.left)
                    pp.left = null;
                else if (p == pp.right)
                    pp.right = null;
            }
        }
        if (movable)
            moveRootToFront(tab, r);
    }

    /**
     * Splits nodes in a tree bin into lower and upper tree bins,
     * or untreeifies if now too small. Called only from resize;
     * see above discussion about split bits and indices.
     *
     * @param map the map
     * @param tab the table for recording bin heads
     * @param index the index of the table being split
     * @param bit the bit of hash to split on
     */
    final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
        TreeNode<K,V> b = this;
        // Relink into lo and hi lists, preserving order
        TreeNode<K,V> loHead = null, loTail = null;
        TreeNode<K,V> hiHead = null, hiTail = null;
        int lc = 0, hc = 0;
        for (TreeNode<K,V> e = b, next; e != null; e = next) {
            next = (TreeNode<K,V>)e.next;
            e.next = null;
            if ((e.hash & bit) == 0) {
                if ((e.prev = loTail) == null)
                    loHead = e;
                else
                    loTail.next = e;
                loTail = e;
                ++lc;
            }
            else {
                if ((e.prev = hiTail) == null)
                    hiHead = e;
                else
                    hiTail.next = e;
                hiTail = e;
                ++hc;
            }
        }

        if (loHead != null) {
            if (lc <= UNTREEIFY_THRESHOLD)
                tab[index] = loHead.untreeify(map);
            else {
                tab[index] = loHead;
                if (hiHead != null) // (else is already treeified)
                    loHead.treeify(tab);
            }
        }
        if (hiHead != null) {
            if (hc <= UNTREEIFY_THRESHOLD)
                tab[index + bit] = hiHead.untreeify(map);
            else {
                tab[index + bit] = hiHead;
                if (loHead != null)
                    hiHead.treeify(tab);
            }
        }
    }

    /* ------------------------------------------------------------ */
    // Red-black tree methods, all adapted from CLR

    static <K,V> TreeNode<K,V> rotateLeft(TreeNode<K,V> root,
                                          TreeNode<K,V> p) {
        TreeNode<K,V> r, pp, rl;
        if (p != null && (r = p.right) != null) {
            if ((rl = p.right = r.left) != null)
                rl.parent = p;
            if ((pp = r.parent = p.parent) == null)
                (root = r).red = false;
            else if (pp.left == p)
                pp.left = r;
            else
                pp.right = r;
            r.left = p;
            p.parent = r;
        }
        return root;
    }

    static <K,V> TreeNode<K,V> rotateRight(TreeNode<K,V> root,
                                           TreeNode<K,V> p) {
        TreeNode<K,V> l, pp, lr;
        if (p != null && (l = p.left) != null) {
            if ((lr = p.left = l.right) != null)
                lr.parent = p;
            if ((pp = l.parent = p.parent) == null)
                (root = l).red = false;
            else if (pp.right == p)
                pp.right = l;
            else
                pp.left = l;
            l.right = p;
            p.parent = l;
        }
        return root;
    }

    static <K,V> TreeNode<K,V> balanceInsertion(TreeNode<K,V> root,
                                                TreeNode<K,V> x) {
        x.red = true;
        for (TreeNode<K,V> xp, xpp, xppl, xppr;;) {
            if ((xp = x.parent) == null) {
                x.red = false;
                return x;
            }
            else if (!xp.red || (xpp = xp.parent) == null)
                return root;
            if (xp == (xppl = xpp.left)) {
                if ((xppr = xpp.right) != null && xppr.red) {
                    xppr.red = false;
                    xp.red = false;
                    xpp.red = true;
                    x = xpp;
                }
                else {
                    if (x == xp.right) {
                        root = rotateLeft(root, x = xp);
                        xpp = (xp = x.parent) == null ? null : xp.parent;
                    }
                    if (xp != null) {
                        xp.red = false;
                        if (xpp != null) {
                            xpp.red = true;
                            root = rotateRight(root, xpp);
                        }
                    }
                }
            }
            else {
                if (xppl != null && xppl.red) {
                    xppl.red = false;
                    xp.red = false;
                    xpp.red = true;
                    x = xpp;
                }
                else {
                    if (x == xp.left) {
                        root = rotateRight(root, x = xp);
                        xpp = (xp = x.parent) == null ? null : xp.parent;
                    }
                    if (xp != null) {
                        xp.red = false;
                        if (xpp != null) {
                            xpp.red = true;
                            root = rotateLeft(root, xpp);
                        }
                    }
                }
            }
        }
    }

    static <K,V> TreeNode<K,V> balanceDeletion(TreeNode<K,V> root,
                                               TreeNode<K,V> x) {
        for (TreeNode<K,V> xp, xpl, xpr;;) {
            if (x == null || x == root)
                return root;
            else if ((xp = x.parent) == null) {
                x.red = false;
                return x;
            }
            else if (x.red) {
                x.red = false;
                return root;
            }
            else if ((xpl = xp.left) == x) {
                if ((xpr = xp.right) != null && xpr.red) {
                    xpr.red = false;
                    xp.red = true;
                    root = rotateLeft(root, xp);
                    xpr = (xp = x.parent) == null ? null : xp.right;
                }
                if (xpr == null)
                    x = xp;
                else {
                    TreeNode<K,V> sl = xpr.left, sr = xpr.right;
                    if ((sr == null || !sr.red) &&
                        (sl == null || !sl.red)) {
                        xpr.red = true;
                        x = xp;
                    }
                    else {
                        if (sr == null || !sr.red) {
                            if (sl != null)
                                sl.red = false;
                            xpr.red = true;
                            root = rotateRight(root, xpr);
                            xpr = (xp = x.parent) == null ?
                                null : xp.right;
                        }
                        if (xpr != null) {
                            xpr.red = (xp == null) ? false : xp.red;
                            if ((sr = xpr.right) != null)
                                sr.red = false;
                        }
                        if (xp != null) {
                            xp.red = false;
                            root = rotateLeft(root, xp);
                        }
                        x = root;
                    }
                }
            }
            else { // symmetric
                if (xpl != null && xpl.red) {
                    xpl.red = false;
                    xp.red = true;
                    root = rotateRight(root, xp);
                    xpl = (xp = x.parent) == null ? null : xp.left;
                }
                if (xpl == null)
                    x = xp;
                else {
                    TreeNode<K,V> sl = xpl.left, sr = xpl.right;
                    if ((sl == null || !sl.red) &&
                        (sr == null || !sr.red)) {
                        xpl.red = true;
                        x = xp;
                    }
                    else {
                        if (sl == null || !sl.red) {
                            if (sr != null)
                                sr.red = false;
                            xpl.red = true;
                            root = rotateLeft(root, xpl);
                            xpl = (xp = x.parent) == null ?
                                null : xp.left;
                        }
                        if (xpl != null) {
                            xpl.red = (xp == null) ? false : xp.red;
                            if ((sl = xpl.left) != null)
                                sl.red = false;
                        }
                        if (xp != null) {
                            xp.red = false;
                            root = rotateRight(root, xp);
                        }
                        x = root;
                    }
                }
            }
        }
    }

    /**
     * Recursive invariant check
     */
    static <K,V> boolean checkInvariants(TreeNode<K,V> t) {
        TreeNode<K,V> tp = t.parent, tl = t.left, tr = t.right,
            tb = t.prev, tn = (TreeNode<K,V>)t.next;
        if (tb != null && tb.next != t)
            return false;
        if (tn != null && tn.prev != t)
            return false;
        if (tp != null && t != tp.left && t != tp.right)
            return false;
        if (tl != null && (tl.parent != t || tl.hash > t.hash))
            return false;
        if (tr != null && (tr.parent != t || tr.hash < t.hash))
            return false;
        if (t.red && tl != null && tl.red && tr != null && tr.red)
            return false;
        if (tl != null && !checkInvariants(tl))
            return false;
        if (tr != null && !checkInvariants(tr))
            return false;
        return true;
    }
}
4. How HashMap does its business (the put and get implementations)
Below we focus on the two most heavily used HashMap methods, put and get — storing a key-value pair and retrieving one.
4.1 How put works
The put code is as follows (it appeared above, but we repeat it here):
/**
 * Associates the specified value with the specified key in this map.
 * If the map previously contained a mapping for the key, the old
 * value is replaced.
 *
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key
 * @return the previous value associated with <tt>key</tt>, or
 *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
 *         (A <tt>null</tt> return can also indicate that the map
 *         previously associated <tt>null</tt> with <tt>key</tt>.)
 */
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}
As you can see, put delegates to putVal, after first computing the key's hash with hash(key). (Think of depositing money: the bank first asks for your card number so it can check whether the card already exists in its system.)
/**
 * Computes key.hashCode() and spreads (XORs) higher bits of hash
 * to lower.  Because the table uses power-of-two masking, sets of
 * hashes that vary only in bits above the current mask will
 * always collide. (Among known examples are sets of Float keys
 * holding consecutive whole numbers in small tables.)  So we
 * apply a transform that spreads the impact of higher bits
 * downward. There is a tradeoff between speed, utility, and
 * quality of bit-spreading.  Because many common sets of hashes
 * are already reasonably distributed (so don't benefit from
 * spreading), and because we use trees to handle large sets of
 * collisions in bins, we just XOR some shifted bits in the
 * cheapest possible way to reduce systematic lossage, as well as
 * to incorporate impact of the highest bits that would otherwise
 * never be used in index calculations because of table bounds.
 */
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    // note: a null key is supported here — it hashes to 0 and lands in bucket 0
}
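The spreading step matters because index selection only looks at the low bits of the hash. A small sketch (the demo class is my own; the transform is the one from hash() above):

```java
public class HashSpreadDemo {
    // Same transform as HashMap.hash(): fold the high 16 bits into the low 16.
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16; // small table: masking keeps only the low 4 bits
        int a = 0x10000, b = 0x20000; // differ only in the HIGH bits
        // Without spreading, both mask to bucket 0 and collide:
        System.out.println(((n - 1) & a) + " vs " + ((n - 1) & b)); // 0 vs 0
        // With spreading, the high bits influence the bucket:
        System.out.println(((n - 1) & (a ^ (a >>> 16))) + " vs "
                         + ((n - 1) & (b ^ (b >>> 16))));           // 1 vs 2
        System.out.println(hash(null)); // the null key hashes to 0
    }
}
```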
Next, let us look at putVal:
/**
 * Implements Map.put and related methods
 *
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode.
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        // table (HashMap's backing array) not yet initialized, or has zero capacity: initialize it
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        // the array slot for this hash is empty: place the key-value pair directly
        tab[i] = newNode(hash, key, value, null);
    else {
        // key conflict — the hash collision case
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    // key not in the chain: extend the chain with a new node at the tail
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
From the put source above: when the program tries to place a key-value pair into the HashMap, the storage slot is first determined by the key's hashCode() — two entries whose keys have the same hashCode() end up at the same slot. If those two keys also compare equal via equals, the new entry's value overwrites the existing entry's value, but the key is not replaced. If equals returns false, the new entry joins the existing entries at that slot to form a chain; in JDK 8 the new entry is appended at the tail of the chain (as the loop in putVal shows), and the chain is converted to a red-black tree once it grows past TREEIFY_THRESHOLD.
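The overwrite semantics are easy to observe from put's return value, which is the previous value for the key (or null if there was none):

```java
import java.util.HashMap;

public class PutSemanticsDemo {
    public static void main(String[] args) {
        HashMap<String, String> map = new HashMap<>();
        System.out.println(map.put("k", "v1")); // null: no previous mapping
        System.out.println(map.put("k", "v2")); // "v1": the old value is returned and replaced
        System.out.println(map.size());         // 1: the key itself is not duplicated
    }
}
```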
4.2、The implementation of get
Reading the value for a key is done through the get method; the code is as follows:
/**
 * Returns the value to which the specified key is mapped,
 * or {@code null} if this map contains no mapping for the key.
 *
 * <p>More formally, if this map contains a mapping from a key
 * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
 * key.equals(k))}, then this method returns {@code v}; otherwise
 * it returns {@code null}.  (There can be at most one such mapping.)
 *
 * <p>A return value of {@code null} does not <i>necessarily</i>
 * indicate that the map contains no mapping for the key; it's also
 * possible that the map explicitly maps the key to {@code null}.
 * The {@link #containsKey containsKey} operation may be used to
 * distinguish these two cases.
 *
 * @see #put(Object, Object)
 */
public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}
You can see that get first computes the hash of the supplied key, i.e. hash(key).
The code is as follows:
/**
 * Computes key.hashCode() and spreads (XORs) higher bits of hash
 * to lower.  Because the table uses power-of-two masking, sets of
 * hashes that vary only in bits above the current mask will
 * always collide. (Among known examples are sets of Float keys
 * holding consecutive whole numbers in small tables.)  So we
 * apply a transform that spreads the impact of higher bits
 * downward. There is a tradeoff between speed, utility, and
 * quality of bit-spreading.  Because many common sets of hashes
 * are already reasonably distributed (so don't benefit from
 * spreading), and because we use trees to handle large sets of
 * collisions in bins, we just XOR some shifted bits in the
 * cheapest possible way to reduce systematic lossage, as well as
 * to incorporate impact of the highest bits that would otherwise
 * never be used in index calculations because of table bounds.
 */
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
getNode(hash(key), key) then obtains the node; if a node is found, its value is returned, otherwise null. The code is as follows:
/**
 * Implements Map.get and related methods
 *
 * @param hash hash for key
 * @param key the key
 * @return the node, or null if none
 */
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        // (n - 1) & hash computes the bucket index from the hash value
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node
            // for buckets with no collision the first element is the one we want
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                // the bucket has been treeified (converted to a red-black tree past 8 elements)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do { // otherwise walk the linked list for a node with a matching key
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}
From the source above: when getting an element from a HashMap, the key's hash is computed first to locate the corresponding slot in the array, and then the key's equals method is used to find the required element in that slot's list (or tree).
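This lookup path is also why a key class must override hashCode and equals consistently: the hash selects the bucket, equals selects the node. A small sketch (GoodKey is a hypothetical key class of ours):

```java
import java.util.HashMap;

public class KeyContractDemo {
    static final class GoodKey {
        final int id;
        GoodKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).id == id;
        }
        @Override public int hashCode() { return Integer.hashCode(id); }
    }

    static String lookup() {
        HashMap<GoodKey, String> map = new HashMap<>();
        map.put(new GoodKey(1), "found");
        // A *different* instance with equal state still lands in the same
        // bucket (same hashCode) and matches via equals.
        return map.get(new GoodKey(1));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // "found"
    }
}
```

Drop either override and the second instance would miss (wrong bucket, or no equals match).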
5、HashMap's brain: the hash algorithm
1. One problem every hash table must solve is hash collisions, and there are two usual methods: separate chaining and open addressing. With chaining, objects sharing a hash value are organized into a linked list hanging off the corresponding slot; with open addressing, a probing algorithm keeps looking for the next usable slot when one is already occupied. java.util.HashMap uses chaining with a singly linked list, so deletion has to track the prev node itself; presumably a doubly linked list was avoided to save space. (A typical lookup was shown in getNode above.)
That HashMap uses chaining rather than open addressing is, I suspect, a practical compromise between space and time efficiency. With open addressing, both linear and quadratic probing can cause clustering, while double hashing only gives good lookup performance when the table's load is kept fairly low, which wastes space.
2. What is the load factor? The load factor a is defined as a = n / m, where n is the actual number of elements in the table and m is the table's capacity.
The load factor measures how full a hash table is: the larger it is, the more densely the table is packed, and vice versa. For a chained hash table, the average time to find an element is O(1 + a), so a larger load factor uses space more fully at the cost of slower lookups, while a load factor that is too small leaves the table sparse and seriously wastes space.
Back in the HashMap implementation, loadFactor is the maximum load factor the map object allows; once it is exceeded, the map resizes. This is checked through the threshold field, computed as threshold = capacity * loadFactor.
Combined with the load-factor formula above, threshold is thus the maximum number of elements allowed at the current loadFactor and capacity; past that number the map resizes to bring the actual load factor back down. The default load factor of 0.75 is a balanced choice between space and time efficiency. Note that each resize doubles the current capacity.
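The threshold arithmetic is simple enough to sketch directly (ThresholdDemo is our illustrative class, using the JDK's default values):

```java
public class ThresholdDemo {
    // threshold = capacity * loadFactor
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // With the defaults (capacity 16, load factor 0.75),
        // the map resizes once it holds more than 12 entries.
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(32, 0.75f)); // 24 after one doubling
    }
}
```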
3. You may also have noticed that java.util.HashMap applies an extra transformation to the key's hash value rather than using hashCode directly (see the hash method above).
The reason is that HashMap's capacity is always a power of two, and the slot index is computed as (n - 1) & hash. This operation is equivalent to taking the hash modulo the table length, i.e. h % 2^n, which keeps only the lowest n bits of h. Even if we assume the input hashes satisfy simple uniform hashing, that assumption does not carry over to their lowest n bits: those low bits may coincide with high probability, making collisions very likely. A good hash function should take all of the bits into account.
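The claimed equivalence between masking and modulo can be checked mechanically (IndexDemo is our illustrative class; Math.floorMod is used so negative hashes are handled like the mask is):

```java
public class IndexDemo {
    // For a power-of-two table length n, (n - 1) & h equals
    // the non-negative remainder h mod n, for every int h.
    static boolean equivalent(int h, int n) {
        return ((n - 1) & h) == Math.floorMod(h, n);
    }

    public static void main(String[] args) {
        System.out.println(equivalent(123456789, 16)); // true
        System.out.println(equivalent(-42, 64));       // true, even for negative h
    }
}
```

This is also why the capacity must be a power of two: for other lengths the mask and the modulo disagree.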
Therefore, to stop such "bad" hash functions from degrading performance, HashMap pre-processes the hash value so that every bit has a chance to influence the index. As the JDK comment above explains, it simply XORs the upper 16 bits into the lower 16 (h ^ (h >>> 16)), the cheapest way to fold in the high bits that would otherwise never take part in the index calculation.
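The effect of that spreading step can be seen with two hashes that differ only above bit 15 (SpreadDemo is our illustrative class; the spread method mirrors the XOR in HashMap.hash()):

```java
public class SpreadDemo {
    // Mirrors the transform in HashMap.hash(): XOR the high 16 bits downward.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int a = 0x10000, b = 0x20000; // differ only in the high bits
        // In a 16-slot table (mask 15) the raw hashes collide...
        System.out.println((15 & a) == (15 & b));                 // true
        // ...but after spreading they no longer do.
        System.out.println((15 & spread(a)) == (15 & spread(b))); // false
    }
}
```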
4. We know java.util.HashMap is not thread-safe, so if another thread modifies the map while you are iterating over it, a ConcurrentModificationException is thrown. This is the so-called fail-fast policy, implemented in the source through the modCount field: every structural modification of the HashMap increments it, and when an iterator is created, the current value is copied into the iterator's expectedModCount.
During iteration, modCount is compared with expectedModCount; if they differ, some other code has modified the map.
Note that in JDK 1.8 modCount is a plain int, not volatile (some earlier JDKs declared it volatile), so cross-thread visibility of modifications is not guaranteed; fail-fast detection is best-effort only.
The HashMap API states:
The iterators returned by all of HashMap's "collection view methods" are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator throws a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
Note that the fail-fast behavior of an iterator cannot be guaranteed; generally speaking, it is impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. It is therefore wrong to write a program that depends on this exception for its correctness: the fail-fast behavior should be used only to detect bugs.
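Even a single thread can trigger the fail-fast check by modifying the map while iterating over it without going through the iterator. A minimal sketch (FailFastDemo is our illustrative class):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class FailFastDemo {
    static boolean structuralModDuringIterationThrows() {
        Map<String, String> map = new HashMap<>();
        map.put("a", "1");
        map.put("b", "2");
        try {
            for (String key : map.keySet()) {
                // Structural modification NOT done via the iterator:
                // modCount changes, expectedModCount does not.
                map.remove(key);
            }
        } catch (ConcurrentModificationException e) {
            return true; // detected on the next iterator step
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(structuralModDuringIterationThrows()); // true
    }
}
```

Using `iterator.remove()` (or `keySet().removeIf(...)`) instead keeps expectedModCount in sync and avoids the exception.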
6、HashMap's workload: resize
The resize() method re-adjusts the size of the HashMap; newCap in it is the capacity after adjustment.
Its code is as follows:
/**
 * Initializes or doubles table size.  If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table; // keep the existing elements in a temporary array
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold (capacity is doubled above)
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor; // recompute the threshold
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr; // install the new threshold
    @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) { // move every element of the old table into newTab
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
As a HashMap accumulates more and more elements, the probability of hash collisions rises, because the array's length is fixed. So to keep lookups efficient, the HashMap's array must be enlarged. Array growth also appears in ArrayList; it is a common operation. But after the array grows, the most expensive step follows: every element of the old array must have its position in the new array recomputed, and must be moved there. That is resize.
So when does HashMap grow? When the number of elements exceeds capacity * loadFactor, the array is enlarged. loadFactor defaults to 0.75, a compromise value. That is, by default, with an array of size 16, once the HashMap holds more than 16 * 0.75 = 12 elements, the array is grown to 2 * 16 = 32, i.e. doubled, and every element's position is recomputed. Growing requires copying the array, which is an expensive operation, so if we already know how many elements the HashMap will hold, presizing it can noticeably improve performance.
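Presizing follows directly from the threshold formula: pick an initial capacity so that the expected entry count never crosses capacity * loadFactor. A sketch (PresizeDemo and initialCapacityFor are ours; the 0.75 is the default load factor):

```java
import java.util.HashMap;
import java.util.Map;

public class PresizeDemo {
    // Smallest capacity hint such that expectedEntries <= capacity * 0.75.
    // (HashMap rounds the hint up to a power of two internally.)
    static int initialCapacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / 0.75);
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = new HashMap<>(initialCapacityFor(1000));
        for (int i = 0; i < 1000; i++) map.put(i, i); // no intermediate resizes
        System.out.println(map.size()); // 1000
    }
}
```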
else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
         oldCap >= DEFAULT_INITIAL_CAPACITY)
    newThr = oldThr << 1; // double threshold
7、HashMap's efficiency: performance
(This section partly repeats earlier material.)
HashMap's performance parameters:
HashMap provides the following constructors:
HashMap(): builds a HashMap with an initial capacity of 16 and a load factor of 0.75.
HashMap(int initialCapacity): builds a HashMap with the given initial capacity and a load factor of 0.75.
HashMap(int initialCapacity, float loadFactor): creates a HashMap with the specified initial capacity and load factor.
The base constructor HashMap(int initialCapacity, float loadFactor) takes two parameters: the initial capacity initialCapacity and the load factor loadFactor.
initialCapacity: the initial capacity of the HashMap, i.e. the length of the underlying array (it is rounded up to the next power of two; it is not a maximum).
loadFactor: the load factor, defined as (actual number of elements n) / (table capacity m).
The load factor measures how full a hash table is: the larger it is, the more densely the table is packed, and vice versa. For a chained hash table, the average time to find an element is O(1 + a), so a larger load factor uses space more fully at the cost of slower lookups, while a load factor that is too small leaves the table sparse and seriously wastes space.
In the HashMap implementation, the threshold field determines the maximum number of elements the map may hold before resizing.
Combined with the load-factor definition, threshold is the maximum element count allowed at the current loadFactor and capacity; past that number the map resizes to lower the actual load factor. The default load factor of 0.75 balances space and time efficiency. When this maximum is exceeded, the resized HashMap has twice its previous capacity.
The Java code of the resize() method was shown in section 6 above.
8、Better than a plain HashMap: SynchronizedMap and ConcurrentHashMap
The Collections class's static method synchronizedMap also yields a thread-safe map:
Map map = Collections.synchronizedMap(new HashMap());
ConcurrentHashMap uses several tricks to achieve a high degree of concurrency and to avoid locking, including using multiple write locks for different hash buckets and exploiting the JMM's leeway to minimize the time locks are held, or to avoid acquiring them at all. It is optimized for the most common usage patterns, which tend to retrieve a value that most likely already exists in the map. In fact, most successful get() operations run without any locking at all. (Warning: don't try this yourself! Outsmarting the JMM is harder than it looks. The util.concurrent classes were written by concurrency experts and have been rigorously peer-reviewed for JMM safety.)
ConcurrentHashMap abandons the single map-wide lock in favor of a collection of 32 locks, each guarding a subset of the hash buckets. Locks are used mainly by mutating operations (put() and remove()). Having 32 independent locks means up to 32 threads can modify the map at once. This does not necessarily mean that writes never block when fewer than 32 threads write concurrently: 32 is the theoretical concurrency limit for writers, and in practice it may not be reached. Still, 32 is far better than 1, and it is more than enough for most applications running on the current generation of hardware. With 32 locks each guarding a bucket subset, an operation that needs exclusive access to the whole map must acquire all 32 of them. Some map-wide operations, such as size() and isEmpty(), may manage without locking the whole map at once (by suitably weakening their semantics), but others, such as rehashing (growing the bucket array and redistributing elements as the map grows), must guarantee exclusive access. The Java language offers no convenient construct for acquiring a variable-sized set of locks; in the rare cases where this is necessary, it can be done recursively. (Note: this 32-lock, segmented description applies to the pre-JDK 8 design; JDK 8 replaced segments with CAS operations plus per-bin synchronized locking, as the code later in this section shows.)
Before diving into the implementations of put(), get(), and remove(), let's take a brief look at the JMM. The JMM governs how one thread's memory actions (reads and writes) affect other threads' view of memory. Because of the performance gains from processor registers and caches, the Java Language Specification (JLS) permits some memory operations not to be immediately visible to all other threads. Two language mechanisms can guarantee cross-thread consistency of memory operations: synchronized and volatile.
In the words of the JLS, "in the absence of explicit synchronization, an implementation is free to update the main memory in an order that may be surprising." That is, without synchronization, writes made in one order by a given thread may appear in a different order to another thread, and the time it takes for an update to propagate between threads is unpredictable.
Although the most common reason to use synchronization is to guarantee atomic access to critical sections, synchronization actually provides three separate features: atomicity, visibility, and ordering. Atomicity is straightforward: synchronization enforces a reentrant mutex, preventing more than one thread from executing a block guarded by a given monitor at the same time. Unfortunately, most writing focuses on atomicity alone and ignores the other aspects. Synchronization also plays a significant role in the JMM, causing the JVM to execute memory barriers when monitors are acquired and released.
When a thread acquires a monitor, it executes a read barrier, invalidating all variables cached in thread-local memory (such as processor caches or registers) and forcing the processor to re-read the variables used in the synchronized block from main memory. Similarly, on releasing the monitor, the thread executes a write barrier, flushing modified variables back to main memory. The combination of mutual exclusion and memory barriers means that, as long as a program follows the correct synchronization rules (synchronizing whenever writing a variable that may next be read by another thread, or reading a variable that may have last been written by another thread), each thread sees the correct values of the shared variables it uses.
If shared variables are accessed without synchronization, strange things happen: some changes show through to another thread immediately, while others take time (a consequence of associative caching). As a result, without synchronization you cannot guarantee a consistent view of memory (related variables may disagree with each other) or a current one (some values may be stale). The common, recommended way to avoid these hazards is, of course, to synchronize correctly. But in some cases, such as very widely used library classes like ConcurrentHashMap, it is worth spending extra expertise and effort during development (possibly many times more than ordinary development) to achieve higher performance.
The ConcurrentHashMap implementation
As described above, ConcurrentHashMap uses a data structure similar to that of Hashtable or HashMap: a resizable array of hash buckets, each consisting of a chain of Map.Entry elements, as shown in Listing 1. Unlike Hashtable and HashMap, ConcurrentHashMap uses no single collection lock; instead, it uses a fixed pool of locks that forms a partition over the collection of buckets.
The static initializer block that runs before any instantiation is as follows:
static {
    try {
        U = sun.misc.Unsafe.getUnsafe();
        Class<?> k = ConcurrentHashMap.class;
        SIZECTL = U.objectFieldOffset
            (k.getDeclaredField("sizeCtl"));
        TRANSFERINDEX = U.objectFieldOffset
            (k.getDeclaredField("transferIndex"));
        BASECOUNT = U.objectFieldOffset
            (k.getDeclaredField("baseCount"));
        CELLSBUSY = U.objectFieldOffset
            (k.getDeclaredField("cellsBusy"));
        Class<?> ck = CounterCell.class;
        CELLVALUE = U.objectFieldOffset
            (ck.getDeclaredField("value"));
        Class<?> ak = Node[].class;
        ABASE = U.arrayBaseOffset(ak);
        int scale = U.arrayIndexScale(ak);
        if ((scale & (scale - 1)) != 0)
            throw new Error("data type scale not a power of two");
        ASHIFT = 31 - Integer.numberOfLeadingZeros(scale);
    } catch (Exception e) {
        throw new Error(e);
    }
}
Listing 1. The Map.Entry element used by ConcurrentHashMap
The Node class that implements the elements of the data array:
/**
 * Key-value entry.  This class is never exported out as a
 * user-mutable Map.Entry (i.e., one supporting setValue; see
 * MapEntry below), but can be used for read-only traversals used
 * in bulk tasks.  Subclasses of Node with a negative hash field
 * are special, and contain null keys and values (but are never
 * exported).  Otherwise, keys and vals are never null.
 */
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    volatile V val;
    volatile Node<K,V> next;

    Node(int hash, K key, V val, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.val = val;
        this.next = next;
    }

    public final K getKey()       { return key; }
    public final V getValue()     { return val; }
    public final int hashCode()   { return key.hashCode() ^ val.hashCode(); }
    public final String toString(){ return key + "=" + val; }
    public final V setValue(V value) {
        throw new UnsupportedOperationException();
    }

    public final boolean equals(Object o) {
        Object k, v, u; Map.Entry<?,?> e;
        return ((o instanceof Map.Entry) &&
                (k = (e = (Map.Entry<?,?>)o).getKey()) != null &&
                (v = e.getValue()) != null &&
                (k == key || k.equals(key)) &&
                (v == (u = val) || v.equals(u)));
    }

    /**
     * Virtualized support for map.get(); overridden in subclasses.
     */
    Node<K,V> find(int h, Object k) {
        Node<K,V> e = this;
        if (k != null) {
            do {
                K ek;
                if (e.hash == h &&
                    ((ek = e.key) == k || (ek != null && k.equals(ek))))
                    return e;
            } while ((e = e.next) != null);
        }
        return null;
    }
}
ConcurrentHashMap also implements its own TreeNode (and TreeBin), along with the corresponding conversions; the code is as follows:
1 /* ---------------- TreeNodes -------------- */ 2 3 /** 4 * Nodes for use in TreeBins 5 */ 6 static final class TreeNode<K,V> extends Node<K,V> { 7 TreeNode<K,V> parent; // red-black tree links 8 TreeNode<K,V> left; 9 TreeNode<K,V> right; 10 TreeNode<K,V> prev; // needed to unlink next upon deletion 11 boolean red; 12 13 TreeNode(int hash, K key, V val, Node<K,V> next, 14 TreeNode<K,V> parent) { 15 super(hash, key, val, next); 16 this.parent = parent; 17 } 18 19 Node<K,V> find(int h, Object k) { 20 return findTreeNode(h, k, null); 21 } 22 23 /** 24 * Returns the TreeNode (or null if not found) for the given key 25 * starting at given root. 26 */ 27 final TreeNode<K,V> findTreeNode(int h, Object k, Class<?> kc) { 28 if (k != null) { 29 TreeNode<K,V> p = this; 30 do { 31 int ph, dir; K pk; TreeNode<K,V> q; 32 TreeNode<K,V> pl = p.left, pr = p.right; 33 if ((ph = p.hash) > h) 34 p = pl; 35 else if (ph < h) 36 p = pr; 37 else if ((pk = p.key) == k || (pk != null && k.equals(pk))) 38 return p; 39 else if (pl == null) 40 p = pr; 41 else if (pr == null) 42 p = pl; 43 else if ((kc != null || 44 (kc = comparableClassFor(k)) != null) && 45 (dir = compareComparables(kc, k, pk)) != 0) 46 p = (dir < 0) ? pl : pr; 47 else if ((q = pr.findTreeNode(h, k, kc)) != null) 48 return q; 49 else 50 p = pl; 51 } while (p != null); 52 } 53 return null; 54 } 55 } 56 57 /* ---------------- TreeBins -------------- */ 58 59 /** 60 * TreeNodes used at the heads of bins. TreeBins do not hold user 61 * keys or values, but instead point to list of TreeNodes and 62 * their root. They also maintain a parasitic read-write lock 63 * forcing writers (who hold bin lock) to wait for readers (who do 64 * not) to complete before tree restructuring operations. 
65 */ 66 static final class TreeBin<K,V> extends Node<K,V> { 67 TreeNode<K,V> root; 68 volatile TreeNode<K,V> first; 69 volatile Thread waiter; 70 volatile int lockState; 71 // values for lockState 72 static final int WRITER = 1; // set while holding write lock 73 static final int WAITER = 2; // set when waiting for write lock 74 static final int READER = 4; // increment value for setting read lock 75 76 /** 77 * Tie-breaking utility for ordering insertions when equal 78 * hashCodes and non-comparable. We don't require a total 79 * order, just a consistent insertion rule to maintain 80 * equivalence across rebalancings. Tie-breaking further than 81 * necessary simplifies testing a bit. 82 */ 83 static int tieBreakOrder(Object a, Object b) { 84 int d; 85 if (a == null || b == null || 86 (d = a.getClass().getName(). 87 compareTo(b.getClass().getName())) == 0) 88 d = (System.identityHashCode(a) <= System.identityHashCode(b) ? 89 -1 : 1); 90 return d; 91 } 92 93 /** 94 * Creates bin with initial set of nodes headed by b. 95 */ 96 TreeBin(TreeNode<K,V> b) { 97 super(TREEBIN, null, null, null); 98 this.first = b; 99 TreeNode<K,V> r = null; 100 for (TreeNode<K,V> x = b, next; x != null; x = next) { 101 next = (TreeNode<K,V>)x.next; 102 x.left = x.right = null; 103 if (r == null) { 104 x.parent = null; 105 x.red = false; 106 r = x; 107 } 108 else { 109 K k = x.key; 110 int h = x.hash; 111 Class<?> kc = null; 112 for (TreeNode<K,V> p = r;;) { 113 int dir, ph; 114 K pk = p.key; 115 if ((ph = p.hash) > h) 116 dir = -1; 117 else if (ph < h) 118 dir = 1; 119 else if ((kc == null && 120 (kc = comparableClassFor(k)) == null) || 121 (dir = compareComparables(kc, k, pk)) == 0) 122 dir = tieBreakOrder(k, pk); 123 TreeNode<K,V> xp = p; 124 if ((p = (dir <= 0) ? 
p.left : p.right) == null) { 125 x.parent = xp; 126 if (dir <= 0) 127 xp.left = x; 128 else 129 xp.right = x; 130 r = balanceInsertion(r, x); 131 break; 132 } 133 } 134 } 135 } 136 this.root = r; 137 assert checkInvariants(root); 138 } 139 140 /** 141 * Acquires write lock for tree restructuring. 142 */ 143 private final void lockRoot() { 144 if (!U.compareAndSwapInt(this, LOCKSTATE, 0, WRITER)) 145 contendedLock(); // offload to separate method 146 } 147 148 /** 149 * Releases write lock for tree restructuring. 150 */ 151 private final void unlockRoot() { 152 lockState = 0; 153 } 154 155 /** 156 * Possibly blocks awaiting root lock. 157 */ 158 private final void contendedLock() { 159 boolean waiting = false; 160 for (int s;;) { 161 if (((s = lockState) & ~WAITER) == 0) { 162 if (U.compareAndSwapInt(this, LOCKSTATE, s, WRITER)) { 163 if (waiting) 164 waiter = null; 165 return; 166 } 167 } 168 else if ((s & WAITER) == 0) { 169 if (U.compareAndSwapInt(this, LOCKSTATE, s, s | WAITER)) { 170 waiting = true; 171 waiter = Thread.currentThread(); 172 } 173 } 174 else if (waiting) 175 LockSupport.park(this); 176 } 177 } 178 179 /** 180 * Returns matching node or null if none. Tries to search 181 * using tree comparisons from root, but continues linear 182 * search when lock not available. 183 */ 184 final Node<K,V> find(int h, Object k) { 185 if (k != null) { 186 for (Node<K,V> e = first; e != null; ) { 187 int s; K ek; 188 if (((s = lockState) & (WAITER|WRITER)) != 0) { 189 if (e.hash == h && 190 ((ek = e.key) == k || (ek != null && k.equals(ek)))) 191 return e; 192 e = e.next; 193 } 194 else if (U.compareAndSwapInt(this, LOCKSTATE, s, 195 s + READER)) { 196 TreeNode<K,V> r, p; 197 try { 198 p = ((r = root) == null ? 
null : 199 r.findTreeNode(h, k, null)); 200 } finally { 201 Thread w; 202 if (U.getAndAddInt(this, LOCKSTATE, -READER) == 203 (READER|WAITER) && (w = waiter) != null) 204 LockSupport.unpark(w); 205 } 206 return p; 207 } 208 } 209 } 210 return null; 211 } 212 213 /** 214 * Finds or adds a node. 215 * @return null if added 216 */ 217 final TreeNode<K,V> putTreeVal(int h, K k, V v) { 218 Class<?> kc = null; 219 boolean searched = false; 220 for (TreeNode<K,V> p = root;;) { 221 int dir, ph; K pk; 222 if (p == null) { 223 first = root = new TreeNode<K,V>(h, k, v, null, null); 224 break; 225 } 226 else if ((ph = p.hash) > h) 227 dir = -1; 228 else if (ph < h) 229 dir = 1; 230 else if ((pk = p.key) == k || (pk != null && k.equals(pk))) 231 return p; 232 else if ((kc == null && 233 (kc = comparableClassFor(k)) == null) || 234 (dir = compareComparables(kc, k, pk)) == 0) { 235 if (!searched) { 236 TreeNode<K,V> q, ch; 237 searched = true; 238 if (((ch = p.left) != null && 239 (q = ch.findTreeNode(h, k, kc)) != null) || 240 ((ch = p.right) != null && 241 (q = ch.findTreeNode(h, k, kc)) != null)) 242 return q; 243 } 244 dir = tieBreakOrder(k, pk); 245 } 246 247 TreeNode<K,V> xp = p; 248 if ((p = (dir <= 0) ? p.left : p.right) == null) { 249 TreeNode<K,V> x, f = first; 250 first = x = new TreeNode<K,V>(h, k, v, f, xp); 251 if (f != null) 252 f.prev = x; 253 if (dir <= 0) 254 xp.left = x; 255 else 256 xp.right = x; 257 if (!xp.red) 258 x.red = true; 259 else { 260 lockRoot(); 261 try { 262 root = balanceInsertion(root, x); 263 } finally { 264 unlockRoot(); 265 } 266 } 267 break; 268 } 269 } 270 assert checkInvariants(root); 271 return null; 272 } 273 274 /** 275 * Removes the given node, that must be present before this 276 * call. This is messier than typical red-black deletion code 277 * because we cannot swap the contents of an interior node 278 * with a leaf successor that is pinned by "next" pointers 279 * that are accessible independently of lock. 
So instead we 280 * swap the tree linkages. 281 * 282 * @return true if now too small, so should be untreeified 283 */ 284 final boolean removeTreeNode(TreeNode<K,V> p) { 285 TreeNode<K,V> next = (TreeNode<K,V>)p.next; 286 TreeNode<K,V> pred = p.prev; // unlink traversal pointers 287 TreeNode<K,V> r, rl; 288 if (pred == null) 289 first = next; 290 else 291 pred.next = next; 292 if (next != null) 293 next.prev = pred; 294 if (first == null) { 295 root = null; 296 return true; 297 } 298 if ((r = root) == null || r.right == null || // too small 299 (rl = r.left) == null || rl.left == null) 300 return true; 301 lockRoot(); 302 try { 303 TreeNode<K,V> replacement; 304 TreeNode<K,V> pl = p.left; 305 TreeNode<K,V> pr = p.right; 306 if (pl != null && pr != null) { 307 TreeNode<K,V> s = pr, sl; 308 while ((sl = s.left) != null) // find successor 309 s = sl; 310 boolean c = s.red; s.red = p.red; p.red = c; // swap colors 311 TreeNode<K,V> sr = s.right; 312 TreeNode<K,V> pp = p.parent; 313 if (s == pr) { // p was s's direct parent 314 p.parent = s; 315 s.right = p; 316 } 317 else { 318 TreeNode<K,V> sp = s.parent; 319 if ((p.parent = sp) != null) { 320 if (s == sp.left) 321 sp.left = p; 322 else 323 sp.right = p; 324 } 325 if ((s.right = pr) != null) 326 pr.parent = s; 327 } 328 p.left = null; 329 if ((p.right = sr) != null) 330 sr.parent = p; 331 if ((s.left = pl) != null) 332 pl.parent = s; 333 if ((s.parent = pp) == null) 334 r = s; 335 else if (p == pp.left) 336 pp.left = s; 337 else 338 pp.right = s; 339 if (sr != null) 340 replacement = sr; 341 else 342 replacement = p; 343 } 344 else if (pl != null) 345 replacement = pl; 346 else if (pr != null) 347 replacement = pr; 348 else 349 replacement = p; 350 if (replacement != p) { 351 TreeNode<K,V> pp = replacement.parent = p.parent; 352 if (pp == null) 353 r = replacement; 354 else if (p == pp.left) 355 pp.left = replacement; 356 else 357 pp.right = replacement; 358 p.left = p.right = p.parent = null; 359 } 360 361 root = 
(p.red) ? r : balanceDeletion(r, replacement); 362 363 if (p == replacement) { // detach pointers 364 TreeNode<K,V> pp; 365 if ((pp = p.parent) != null) { 366 if (p == pp.left) 367 pp.left = null; 368 else if (p == pp.right) 369 pp.right = null; 370 p.parent = null; 371 } 372 } 373 } finally { 374 unlockRoot(); 375 } 376 assert checkInvariants(root); 377 return false; 378 } 379 380 /* ------------------------------------------------------------ */ 381 // Red-black tree methods, all adapted from CLR 382 383 static <K,V> TreeNode<K,V> rotateLeft(TreeNode<K,V> root, 384 TreeNode<K,V> p) { 385 TreeNode<K,V> r, pp, rl; 386 if (p != null && (r = p.right) != null) { 387 if ((rl = p.right = r.left) != null) 388 rl.parent = p; 389 if ((pp = r.parent = p.parent) == null) 390 (root = r).red = false; 391 else if (pp.left == p) 392 pp.left = r; 393 else 394 pp.right = r; 395 r.left = p; 396 p.parent = r; 397 } 398 return root; 399 } 400 401 static <K,V> TreeNode<K,V> rotateRight(TreeNode<K,V> root, 402 TreeNode<K,V> p) { 403 TreeNode<K,V> l, pp, lr; 404 if (p != null && (l = p.left) != null) { 405 if ((lr = p.left = l.right) != null) 406 lr.parent = p; 407 if ((pp = l.parent = p.parent) == null) 408 (root = l).red = false; 409 else if (pp.right == p) 410 pp.right = l; 411 else 412 pp.left = l; 413 l.right = p; 414 p.parent = l; 415 } 416 return root; 417 } 418 419 static <K,V> TreeNode<K,V> balanceInsertion(TreeNode<K,V> root, 420 TreeNode<K,V> x) { 421 x.red = true; 422 for (TreeNode<K,V> xp, xpp, xppl, xppr;;) { 423 if ((xp = x.parent) == null) { 424 x.red = false; 425 return x; 426 } 427 else if (!xp.red || (xpp = xp.parent) == null) 428 return root; 429 if (xp == (xppl = xpp.left)) { 430 if ((xppr = xpp.right) != null && xppr.red) { 431 xppr.red = false; 432 xp.red = false; 433 xpp.red = true; 434 x = xpp; 435 } 436 else { 437 if (x == xp.right) { 438 root = rotateLeft(root, x = xp); 439 xpp = (xp = x.parent) == null ? 
null : xp.parent; 440 } 441 if (xp != null) { 442 xp.red = false; 443 if (xpp != null) { 444 xpp.red = true; 445 root = rotateRight(root, xpp); 446 } 447 } 448 } 449 } 450 else { 451 if (xppl != null && xppl.red) { 452 xppl.red = false; 453 xp.red = false; 454 xpp.red = true; 455 x = xpp; 456 } 457 else { 458 if (x == xp.left) { 459 root = rotateRight(root, x = xp); 460 xpp = (xp = x.parent) == null ? null : xp.parent; 461 } 462 if (xp != null) { 463 xp.red = false; 464 if (xpp != null) { 465 xpp.red = true; 466 root = rotateLeft(root, xpp); 467 } 468 } 469 } 470 } 471 } 472 } 473 474 static <K,V> TreeNode<K,V> balanceDeletion(TreeNode<K,V> root, 475 TreeNode<K,V> x) { 476 for (TreeNode<K,V> xp, xpl, xpr;;) { 477 if (x == null || x == root) 478 return root; 479 else if ((xp = x.parent) == null) { 480 x.red = false; 481 return x; 482 } 483 else if (x.red) { 484 x.red = false; 485 return root; 486 } 487 else if ((xpl = xp.left) == x) { 488 if ((xpr = xp.right) != null && xpr.red) { 489 xpr.red = false; 490 xp.red = true; 491 root = rotateLeft(root, xp); 492 xpr = (xp = x.parent) == null ? null : xp.right; 493 } 494 if (xpr == null) 495 x = xp; 496 else { 497 TreeNode<K,V> sl = xpr.left, sr = xpr.right; 498 if ((sr == null || !sr.red) && 499 (sl == null || !sl.red)) { 500 xpr.red = true; 501 x = xp; 502 } 503 else { 504 if (sr == null || !sr.red) { 505 if (sl != null) 506 sl.red = false; 507 xpr.red = true; 508 root = rotateRight(root, xpr); 509 xpr = (xp = x.parent) == null ? 510 null : xp.right; 511 } 512 if (xpr != null) { 513 xpr.red = (xp == null) ? false : xp.red; 514 if ((sr = xpr.right) != null) 515 sr.red = false; 516 } 517 if (xp != null) { 518 xp.red = false; 519 root = rotateLeft(root, xp); 520 } 521 x = root; 522 } 523 } 524 } 525 else { // symmetric 526 if (xpl != null && xpl.red) { 527 xpl.red = false; 528 xp.red = true; 529 root = rotateRight(root, xp); 530 xpl = (xp = x.parent) == null ? 
null : xp.left; 531 } 532 if (xpl == null) 533 x = xp; 534 else { 535 TreeNode<K,V> sl = xpl.left, sr = xpl.right; 536 if ((sl == null || !sl.red) && 537 (sr == null || !sr.red)) { 538 xpl.red = true; 539 x = xp; 540 } 541 else { 542 if (sl == null || !sl.red) { 543 if (sr != null) 544 sr.red = false; 545 xpl.red = true; 546 root = rotateLeft(root, xpl); 547 xpl = (xp = x.parent) == null ? 548 null : xp.left; 549 } 550 if (xpl != null) { 551 xpl.red = (xp == null) ? false : xp.red; 552 if ((sl = xpl.left) != null) 553 sl.red = false; 554 } 555 if (xp != null) { 556 xp.red = false; 557 root = rotateRight(root, xp); 558 } 559 x = root; 560 } 561 } 562 } 563 } 564 } 565 566 /** 567 * Recursive invariant check 568 */ 569 static <K,V> boolean checkInvariants(TreeNode<K,V> t) { 570 TreeNode<K,V> tp = t.parent, tl = t.left, tr = t.right, 571 tb = t.prev, tn = (TreeNode<K,V>)t.next; 572 if (tb != null && tb.next != t) 573 return false; 574 if (tn != null && tn.prev != t) 575 return false; 576 if (tp != null && t != tp.left && t != tp.right) 577 return false; 578 if (tl != null && (tl.parent != t || tl.hash > t.hash)) 579 return false; 580 if (tr != null && (tr.parent != t || tr.hash < t.hash)) 581 return false; 582 if (t.red && tl != null && tl.red && tr != null && tr.red) 583 return false; 584 if (tl != null && !checkInvariants(tl)) 585 return false; 586 if (tr != null && !checkInvariants(tr)) 587 return false; 588 return true; 589 } 590 591 private static final sun.misc.Unsafe U; 592 private static final long LOCKSTATE; 593 static { 594 try { 595 U = sun.misc.Unsafe.getUnsafe(); 596 Class<?> k = TreeBin.class; 597 LOCKSTATE = U.objectFieldOffset 598 (k.getDeclaredField("lockState")); 599 } catch (Exception e) { 600 throw new Error(e); 601 } 602 } 603 } 604 605 /* ----------------Table Traversal -------------- */
Traversing the data structure without locking
Unlike Hashtable or a typical lock-pool Map implementation, a ConcurrentHashMap.get() operation does not necessarily need to acquire the lock associated with the relevant bucket. In the absence of locking, the implementation must be able to cope with stale or inconsistent values of any variables it uses, such as the list head pointer and the fields of the Map.Entry elements (including the link pointers that make up each bucket's chain).
Most concurrent classes use synchronization to guarantee exclusive access to a data structure (and thereby keep it consistent). Instead of exclusion and consistency, ConcurrentHashMap uses carefully designed linked lists whose implementation can detect that a list is inconsistent or stale. If it detects an inconsistency, or simply fails to find the entry it is looking for, it synchronizes on the appropriate bucket lock and searches the chain again. This optimizes lookups in the common case, where most retrievals succeed and retrievals outnumber insertions and removals.
Exploiting immutability
One significant source of inconsistency is avoided by making the Entry elements nearly immutable: all fields are final except the value field, which is volatile. This means elements cannot be added to, or removed from, the middle or end of a hash chain; elements can only be added at the head of a chain, and removal involves cloning all or part of the chain and updating the list head pointer. So once you hold a reference into a hash chain, even though you may not know whether you hold a reference to the head of the list, you do know that the rest of the list's structure will not change. And because the value field is volatile, updates to it are visible immediately, which greatly simplifies writing a Map implementation that can deal with potentially stale memory.
The new JMM provides initialization safety for final variables; the old JMM did not, which meant another thread could see the default value of a final field rather than the value set by the object's constructor. The implementation must be able to detect this as well, which it does by ensuring that the default value of each field in Entry is not a valid value. Once the list is constructed, if any Entry field is found to hold its default value (zero or null), the search fails, prompting a synchronized get() that traverses the chain again.
Retrieval operations
A retrieval first finds the head pointer for the target bucket (done without locking, so it may be stale), then traverses the bucket chain without acquiring the bucket lock. If it cannot find the value it is looking for, it synchronizes and tries to find the entry again, as Listing 2 shows:
Listing 2. The ConcurrentHashMap.get() implementation
/**
 * Returns the value to which the specified key is mapped,
 * or {@code null} if this map contains no mapping for the key.
 *
 * <p>More formally, if this map contains a mapping from a key
 * {@code k} to a value {@code v} such that {@code key.equals(k)},
 * then this method returns {@code v}; otherwise it returns
 * {@code null}.  (There can be at most one such mapping.)
 *
 * @throws NullPointerException if the specified key is null
 */
public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    int h = spread(key.hashCode());
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) != null) {
        if ((eh = e.hash) == h) {
            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                return e.val;
        }
        else if (eh < 0)
            // negative hash marks a special node (e.g. TreeBin): delegate to its find()
            return (p = e.find(h, key)) != null ? p.val : null;
        while ((e = e.next) != null) {
            if (e.hash == h &&
                ((ek = e.key) == key || (ek != null && key.equals(ek))))
                return e.val;
        }
    }
    return null;
}
It uses the spread() method to obtain the hash code; its source is as follows:
/* ---------------- Static utilities -------------- */

/**
 * Spreads (XORs) higher bits of hash to lower and also forces top
 * bit to 0. Because the table uses power-of-two masking, sets of
 * hashes that vary only in bits above the current mask will
 * always collide. (Among known examples are sets of Float keys
 * holding consecutive whole numbers in small tables.)  So we
 * apply a transform that spreads the impact of higher bits
 * downward. There is a tradeoff between speed, utility, and
 * quality of bit-spreading.  Because many common sets of hashes
 * are already reasonably distributed (so don't benefit from
 * spreading), and because we use trees to handle large sets of
 * collisions in bins, we just XOR some shifted bits in the
 * cheapest possible way to reduce systematic lossage, as well as
 * to incorporate impact of the highest bits that would otherwise
 * never be used in index calculations because of table bounds.
 */
static final int spread(int h) {
    return (h ^ (h >>> 16)) & HASH_BITS;
}
The key method is find(h, key); its source is as follows:
// Method inside the Node class:
/**
 * Virtualized support for map.get(); overridden in subclasses.
 */
Node<K,V> find(int h, Object k) {
    Node<K,V> e = this;
    if (k != null) {
        do { // walk the chain looking for the node with a matching key
            K ek;
            if (e.hash == h &&
                ((ek = e.key) == k || (ek != null && k.equals(ek))))
                return e;
        } while ((e = e.next) != null);
    }
    return null;
}
Removal operations
因为一个线程可能看到 hash 链中链接指针的过时的值,简单地从链中删除一个元素不足以保证其他线程在进行查找的时候不继续看到被删除的值。相反,从清单3我们可以看到,删除操作分两个过程――首先找到适当的 Entry
对象并把其值字段设为 null
,然后对链中从头元素到要删除的元素的部分进行克隆,再连接到要删除的元素之后的部分。因为值字段是易变的,如果另外一个线程正在过时的链中查找那个被删除的元素,它会立即看到一个空值,并知道使用同步重新进行检索。最终,原始 hash 链中被删除的元素将会被垃圾收集。
Listing 3. The ConcurrentHashMap.remove() implementation
/**
 * Removes the key (and its corresponding value) from this map.
 * This method does nothing if the key is not in the map.
 *
 * @param key the key that needs to be removed
 * @return the previous value associated with {@code key}, or
 *         {@code null} if there was no mapping for {@code key}
 * @throws NullPointerException if the specified key is null
 */
public V remove(Object key) {
    return replaceNode(key, null, null);
}

/**
 * Implementation for the four public remove/replace methods:
 * Replaces node value with v, conditional upon match of cv if
 * non-null. If resulting value is null, delete.
 */
final V replaceNode(Object key, V value, Object cv) {
    int hash = spread(key.hashCode());
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0 ||
            (f = tabAt(tab, i = (n - 1) & hash)) == null)
            break;
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            boolean validated = false;
            synchronized (f) { // lock the bin head so the update is thread-safe
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {
                        validated = true;
                        for (Node<K,V> e = f, pred = null;;) {
                            K ek;
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {
                                V ev = e.val;
                                if (cv == null || cv == ev ||
                                    (ev != null && cv.equals(ev))) {
                                    oldVal = ev;
                                    if (value != null)
                                        e.val = value;
                                    else if (pred != null)
                                        pred.next = e.next;
                                    else
                                        setTabAt(tab, i, e.next);
                                }
                                break;
                            }
                            pred = e;
                            if ((e = e.next) == null)
                                break;
                        }
                    }
                    else if (f instanceof TreeBin) {
                        validated = true;
                        TreeBin<K,V> t = (TreeBin<K,V>)f;
                        TreeNode<K,V> r, p;
                        if ((r = t.root) != null &&
                            (p = r.findTreeNode(hash, key, null)) != null) {
                            V pv = p.val;
                            if (cv == null || cv == pv ||
                                (pv != null && cv.equals(pv))) {
                                oldVal = pv;
                                if (value != null)
                                    p.val = value;
                                else if (t.removeTreeNode(p))
                                    setTabAt(tab, i, untreeify(t.first));
                            }
                        }
                    }
                }
            }
            if (validated) {
                if (oldVal != null) {
                    if (value == null)
                        addCount(-1L, -1);
                    return oldVal;
                }
                break;
            }
        }
    }
    return null;
}
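All public removal variants funnel into replaceNode() above; the cv parameter is what drives the conditional remove(key, value) overload. A short usage sketch (class name is mine):

```java
import java.util.concurrent.ConcurrentHashMap;

public class RemoveDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        map.put("book", "java");
        // Unconditional remove (cv == null): returns the previous value,
        // or null when there was no mapping.
        System.out.println(map.remove("book"));    // java
        System.out.println(map.remove("missing")); // null
        // Conditional remove (cv != null): only unlinks on a value match.
        map.put("lang", "java");
        System.out.println(map.remove("lang", "c"));    // false, mapping kept
        System.out.println(map.remove("lang", "java")); // true
        System.out.println(map.size());                 // 0
    }
}
```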
Figure 1 shows a hash chain before an element is removed:
Figure 1. A hash chain
Figure 2 shows the chain after element 3 has been removed:
Figure 2. Removing an element
Insertion and update

The implementation of put() is straightforward. Like remove(), put() holds the bucket lock while it mutates a chain, but because a lock is only needed when a bin is actually being modified, this does not necessarily block readers (nor writers working on other buckets). put() first searches the appropriate hash chain for the key. If it is found, the volatile value field is updated directly. If it is not, a new Node describing the new mapping is created and linked into the bin; note that in JDK 8 an empty bin is filled with a single lock-free CAS, and a node added to a non-empty bin is appended at the tail of the chain (older versions inserted at the head of the bucket list).
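Alongside the plain put(), the map offers an atomic putIfAbsent() that only performs the insert branch. A short sketch contrasting the two (class name is mine):

```java
import java.util.concurrent.ConcurrentHashMap;

public class PutDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        // put() returns the previous value (null on first insert) and
        // overwrites an existing mapping in place.
        System.out.println(map.put("a", 1));         // null
        System.out.println(map.put("a", 2));         // 1
        // putIfAbsent() inserts only when no mapping exists, atomically,
        // so concurrent callers cannot clobber each other's value.
        System.out.println(map.putIfAbsent("a", 9)); // 2 (mapping unchanged)
        System.out.println(map.get("a"));            // 2
    }
}
```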
Weakly consistent iterators

The iterators returned by ConcurrentHashMap have different semantics from those of the java.util collections: they are weakly consistent rather than fail-fast (fail-fast means that if the underlying collection is modified while an iterator is in use, an exception is thrown). When a caller invokes keySet().iterator() to iterate over the hash keys, the implementation simply reads each chain's head pointer at its current value. The next() and hasNext() operations are defined in the obvious way: traverse one chain, then move on to the next until every chain has been visited. A weakly consistent iterator may or may not reflect insertions made during iteration, but it will reflect updates or removals of keys the iterator has not yet reached, and it returns each value at most once. Iterators returned by ConcurrentHashMap never throw ConcurrentModificationException.
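The contrast is easy to demonstrate by modifying each map mid-iteration (class name and keys are mine):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IteratorDemo {
    public static void main(String[] args) {
        Map<String, Integer> chm = new ConcurrentHashMap<>();
        chm.put("a", 1);
        chm.put("b", 2);
        Iterator<String> it = chm.keySet().iterator();
        it.next();
        chm.put("c", 3); // structural change mid-iteration: tolerated
        while (it.hasNext()) it.next();
        System.out.println("ConcurrentHashMap iterator finished normally");

        // A plain HashMap iterator is fail-fast on the same sequence:
        Map<String, Integer> hm = new HashMap<>(chm);
        Iterator<String> it2 = hm.keySet().iterator();
        it2.next();
        hm.put("d", 4); // bumps modCount
        try {
            it2.next();
        } catch (java.util.ConcurrentModificationException e) {
            System.out.println("HashMap iterator is fail-fast");
        }
    }
}
```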
Dynamic resizing

As the number of elements in the map grows, the hash chains get longer and retrieval time increases, so at some point it pays to increase the number of buckets and rehash the values. In a class like Hashtable this is easy, because it is possible to hold a single lock covering the whole map. ConcurrentHashMap has no such global lock. Instead, when an insertion pushes the element count past the load-factor threshold, a transfer begins: buckets are migrated one at a time to a new table of twice the size, with a migrated bucket left holding a forwarding node whose hash is MOVED. Any thread that encounters such a node joins in and helps move the remaining buckets (the helpTransfer() call visible in Listing 3). In most cases this happens automatically and is transparent to the caller.
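Resizing is transparent, but it can also be sidestepped by presizing. A small sketch (in JDK 8 the ConcurrentHashMap constructor sizes the table to hold roughly the given number of elements; the exact thresholds are implementation details, and the class name is mine):

```java
import java.util.concurrent.ConcurrentHashMap;

public class PresizeDemo {
    public static void main(String[] args) {
        // Passing an initial capacity lets the map allocate a large enough
        // table up front, so a bulk load triggers few or no transfers.
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>(100_000);
        for (int i = 0; i < 100_000; i++) {
            map.put(i, i * 2);
        }
        System.out.println(map.size());      // 100000
        System.out.println(map.get(99_999)); // 199998
    }
}
```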
References:
https://blog.csdn.net/u___u/article/details/69937891
http://www.cnblogs.com/xwdreamer/archive/2012/06/03/2532832.html
http://www.iteye.com/topic/539465
https://www.ibm.com/developerworks/cn/java/j-lo-hash/?ca=drs-tp4608
https://www.ibm.com/developerworks/cn/java/j-jtp08223/
http://www.iteye.com/topic/368087
http://zhangshixi.iteye.com/blog/672697
https://blog.csdn.net/linxdcn/article/details/71597964
https://blog.csdn.net/ns_code/article/details/36034955