Redis研究院(2)——Dict-随遇而安

Redis字典使用哈希表作为底层实现，一个哈希表里面存在多个哈希表节点，而每个哈希表节点中又保存了一对键值对，同时哈希表也会进行rehash，防止hash碰撞。

哈希表

Redis的哈希表结构如下，dict.h/dictht：

typedef struct dictht {
	// 哈希表数组
   dictEntry **table;
   // 哈希表大小
   unsigned long size;
   // 哈希表掩码
   // 用于计算索引值，size-1
   unsigned long sizemask;
   // 已经存在的节点数量
   unsigned long used;
} dictht;

具体解释如下：

table，是一个数组，数组中的每个元素都是指向dictEntry结构的指针。
size，记录了哈希表的大小，也就是table数组的大小。
sizemask，总是等于size-1，通过sizemask来定位放在table数组的哪个索引上。
used，属性用于存储哈希表中已经存储dictEntry的数量。

哈希表节点

Redis使用如下结构保存键值对，dict.h/dictEntry：

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;

具体解释如下：

key，存储了键。
val，存储了值，值可以是一个指针，或是uint64_t，或是int64_t，或是double类型的双精度浮点数。
next，属性则是提供了指向下一个dictEntry的指针，从而形成了链表。

字典

Redis字典结构如下，dict.h/dict：

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx;
    unsigned long iterators;
} dict;

具体解释如下：

type，用于表示字典的类型。
privdata，用于存储字典的私有数据。
ht[2]，代表创建一个字典时，会创建两个哈希表。
rehashidx，存储了rehash时的索引。
iterators，正在迭代的迭代器数量。

一般情况下，我们在使用hash时，只会使用ht[0]哈希表，只有当进行rehash时，才会使用ht[1]哈希表。

当字典正常工作时，rehashidx的值通常为-1，代表着没有进行rehash操作。如果字典在进行rehash操作，那么rehashidx会随着rehash的进度的变化而变化。

哈希算法

当一个新的键值对添加到字典中时，需要先对键进行运算计算出哈希值，然后根据索引值进行与运算，得出该该节点存储的索引位置，最后将节点放到指定索引上，获取Hash索引的代码如下：

idx = hash & d->ht[table].sizemask;

rehash

随着增加或者删除的操作不断地进行，哈希表的大小会逐渐的增多或减少，当节点的数量太多或太少时，Redis会对字典进行扩容或者收缩操作，这些操作是通过rehash来实现的，我们来看下关键函数，dict.c/_dictExpandIfNeeded：

static int _dictExpandIfNeeded(dict *d)
{
    if (dictIsRehashing(d)) return DICT_OK;
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
    {
        return dictExpand(d, d->ht[0].used*2);
    }
    return DICT_OK;
}

如果当前已经存储的节点数量，大于哈希表的大小，并且以下两个条件任意一个为真：

字典可以进行扩容或收缩。
字典已存储的节点的数量和哈希表的大小比值大于字典强制进行resize的值。

那么就会进行扩容或收缩操作，具体实现函数如下，dict.c/dictExpand：

int dictExpand(dict *d, unsigned long size)
{
    dictht n; 
    unsigned long realsize = _dictNextPower(size);
    if (dictIsRehashing(d) || d->ht[0].used > size)
        return DICT_ERR;
    if (realsize == d->ht[0].size) return DICT_ERR;
    n.size = realsize;
    n.sizemask = realsize-1;
    n.table = zcalloc(realsize*sizeof(dictEntry*));
    n.used = 0;
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }
    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}

我们会根据ht[0]已经存储节点数量的2倍创建一个新的哈希表，同时设置好哈希表的其他属性。

并将字典ht[1]指向这个新创建的哈希表，最后开始进行后rehash操作。

也就是将ht[0]表中的数据，重新计算索引值，放入到ht[1]表中。

渐进式rehash

Redis为了保证读写效率，rehash动作是分多次，渐进式完成的，这样做的好处是，如果字典中的键值对过多，可以保证在rehash过程中，不会影响读写请求，具体实现函数如下：

int dictRehash(dict *d, int n) {
    int empty_visits = n*10;
    if (!dictIsRehashing(d)) return 0;
    while(n-- && d->ht[0].used != 0) {
        dictEntry *de, *nextde;
        assert(d->ht[0].size > (unsigned long)d->rehashidx);
        while(d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }
        de = d->ht[0].table[d->rehashidx];
        while(de) {
            unsigned int h;
            nextde = de->next;
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);
        d->ht[0] = d->ht[1];
        _dictReset(&d->ht[1]);
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}

渐进式rehash过程如下：

将rehashidx设置为0，rehash操作正式开始。
在rehash进行期间，Redis对该字典的添加、查找、更新、删除操作，除了执行正常的逻辑外，还会顺带将rehashidx索引上的所有节点rehash到ht[1]哈希表中。当该索引的rehash工作完成之后，rehashidx的值自增1。
随着字典操作的不断进行，最后rehash将会完成，Redis这是会将rehashidx设置为-1，代表rehash工作已经完成。

在进行rehash操作时，Redis对字典的加、查找、更新、删除操作会同时在两个哈希表上进行，具体实现函数如下：

dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing)
{
    int index;
    dictEntry *entry;
    dictht *ht;
    if (dictIsRehashing(d)) _dictRehashStep(d);
    if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
        return NULL;
    ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
    entry = zmalloc(sizeof(*entry));
    entry->next = ht->table[index];
    ht->table[index] = entry;
    ht->used++;
    dictSetKey(d, entry, key);
    return entry;
}

Redis研究院(2)——Dict

哈希表

哈希表节点

字典

哈希算法

rehash

渐进式rehash

相关推荐

评论抢沙发

近期文章

我的标签

友情链接

码字不容易，路过请投币

支付宝扫一扫

微信扫一扫

哈希表

哈希表节点

字典

哈希算法

rehash

渐进式rehash

相关推荐

评论 抢沙发

近期文章

我的标签

友情链接

码字不容易，路过请投币

支付宝扫一扫

微信扫一扫

评论抢沙发