An In-Depth Look at the Redis ziplist Data Structure

捣乱小子, 2014-11-26 22:51:51, 2,740 views

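Summary

This article covers the implementation details of the ziplist, the compressed list Redis designed to save as much memory as possible. The author starts from the fact that a Redis list has two underlying implementations (an ordinary doubly linked list and a ziplist) and concentrates on the latter.

The key idea of the ziplist is that it emulates a doubly linked list inside one contiguous block of memory, which removes the per-node cost of the predecessor and successor pointers (8 bytes per pointer on a 64-bit build). The article breaks down the overall ziplist format and the TLV (type-length-value) structure of each entry, in particular how the `prelen` field records the length of the previous entry to allow backward traversal, and how the `encoding` field compactly encodes strings and integers of different sizes.

By walking through the source of `ziplistFind()`, the article shows how a ziplist looks up and compares data. It closes with where Redis actually uses ziplists (for example, as the underlying storage of a Hash while it is small) and explains the performance argument: the compact, linear memory layout not only saves space but may also make better use of the CPU cache, so that for small data sets lookup can be competitive with a hash table.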

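Overview

In redis, a list can be stored in two ways: as a doubly linked list (LinkedList) or as a compressed doubly linked list (ziplist). The doubly linked list is the ordinary textbook structure, implemented in adlist.h and adlist.c. The ziplist represents a doubly linked list in one contiguous block of memory and so saves the space of the predecessor and successor pointers in every node (8 bytes in total on a 32-bit build, 16 bytes on a 64-bit build). On a small list the saving is very noticeable. The ziplist is implemented in ziplist.h and ziplist.c.

This article focuses on the ziplist; for the ordinary doubly linked list, see other material.

Implementation of the ziplist

A ziplist does without the predecessor and successor pointers (8 bytes per node on a 32-bit build), which makes the data more compact in memory. As long as the boundary of each entry is described precisely, the position of the next entry follows directly; as long as each entry records the size of the entry before it, the previous entry can be located as well. That is exactly what redis does.

The format of a ziplist can be written as: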

<zlbytes><zltail><zllen><entry>...<entry><zlend>

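zlbytes is the number of bytes the ziplist occupies; zltail is the offset of the last entry, which makes reverse traversal convenient, as expected of a doubly linked list; zllen is the number of entries; zlend is the constant 255 and takes 1 byte. The structure of an entry is expanded below.

An entry follows the typical type-length-value (TLV) layout, as shown here: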
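The header fields described above can be read straight out of the byte buffer. The following is a minimal sketch rather than the Redis source: `zl_header`, `zl_read_le`, `zl_read_header` and `zl_has_end_marker` are hypothetical names, and the sketch assumes the field widths used by ziplist.c (4-byte zlbytes, 4-byte zltail, 2-byte zllen) stored in little-endian order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the ziplist header; not a struct from Redis. */
typedef struct {
    uint32_t zlbytes; /* total bytes occupied by the ziplist */
    uint32_t zltail;  /* offset of the last entry from the start */
    uint16_t zllen;   /* number of entries */
} zl_header;

enum { ZL_HEADER_SIZE = 10, ZL_END = 255 };

/* Read a little-endian unsigned integer of n bytes (n <= 4). */
static uint32_t zl_read_le(const unsigned char *p, size_t n) {
    uint32_t v = 0;
    for (size_t i = 0; i < n; i++) v |= (uint32_t)p[i] << (8 * i);
    return v;
}

/* Decode the three header fields from the front of the buffer. */
static zl_header zl_read_header(const unsigned char *zl) {
    zl_header h;
    h.zlbytes = zl_read_le(zl, 4);
    h.zltail = zl_read_le(zl + 4, 4);
    h.zllen = (uint16_t)zl_read_le(zl + 8, 2);
    return h;
}

/* The end marker is the last byte of the buffer. */
static int zl_has_end_marker(const unsigned char *zl) {
    zl_header h = zl_read_header(zl);
    assert(h.zlbytes > (uint32_t)ZL_HEADER_SIZE);
    return zl[h.zlbytes - 1] == ZL_END;
}
```

Under these assumptions an empty ziplist is 11 bytes long: the 10-byte header followed by the single zlend byte, with zltail pointing just past the header.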

|<prelen><<encoding+lensize><len>><data>|
|---1----------------2--------------3---|

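Field 1 is the size of the previous entry. Because it does not need to describe the previous entry's data type, it is fairly simple.

Field 2 is the type and data size of this entry. To save space, redis predefines several string and integer encodings of different lengths.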
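How a previous-entry size is enough to walk backwards can be sketched in a few lines of C. This is an illustration under stated assumptions, not the code from ziplist.c: the names `zl_put_prevlen`, `zl_get_prevlen` and `zl_prev` are hypothetical, and the sketch assumes the convention that a length below 254 is stored in one byte, while a larger length is stored as the marker byte 254 followed by a 4-byte little-endian length. It also assumes the first entry records a previous length of 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ZL_BIGLEN = 254 }; /* marker byte: a 4-byte length follows */

/* Write the prelen field at p; return how many bytes it occupies. */
static size_t zl_put_prevlen(unsigned char *p, uint32_t len) {
    if (len < ZL_BIGLEN) {
        p[0] = (unsigned char)len;
        return 1;
    }
    p[0] = ZL_BIGLEN;
    for (int i = 0; i < 4; i++) p[1 + i] = (unsigned char)(len >> (8 * i));
    return 5;
}

/* Read the prelen field at p; store the field's own size in *size. */
static uint32_t zl_get_prevlen(const unsigned char *p, size_t *size) {
    if (p[0] < ZL_BIGLEN) {
        *size = 1;
        return p[0];
    }
    uint32_t len = 0;
    for (int i = 0; i < 4; i++) len |= (uint32_t)p[1 + i] << (8 * i);
    *size = 5;
    return len;
}

/* Step from an entry back to its predecessor; NULL for the first entry. */
static const unsigned char *zl_prev(const unsigned char *entry) {
    size_t size;
    uint32_t prevlen = zl_get_prevlen(entry, &size);
    return prevlen == 0 ? NULL : entry - prevlen;
}
```

Stepping backwards is then a single subtraction, which is what lets the ziplist drop the predecessor pointer.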

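Three string encodings
#define ZIP_STR_06B (0 << 6)      /* 00pppppp: length up to 63 bytes */
#define ZIP_STR_14B (1 << 6)      /* 01pppppp + 1 byte: length up to 16383 bytes */
#define ZIP_STR_32B (2 << 6)      /* 10______ + 4 bytes: length up to 2^32-1 bytes */
 
Five integer encodings
#define ZIP_INT_16B (0xc0 | 0<<4) /* 16-bit signed integer, 2 bytes */
#define ZIP_INT_32B (0xc0 | 1<<4) /* 32-bit signed integer, 4 bytes */
#define ZIP_INT_64B (0xc0 | 2<<4) /* 64-bit signed integer, 8 bytes */
#define ZIP_INT_24B (0xc0 | 3<<4) /* 24-bit signed integer, 3 bytes */
#define ZIP_INT_8B 0xfe           /* 8-bit signed integer, 1 byte */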

Field 3 is the actual data.

The lookup function ziplistFind() is a good way to get familiar with the entry data format:
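To make the role of the encoding field concrete, here is a small sketch that classifies an entry from its encoding byte and reports how many bytes the encoding field and the data take. It reuses the macros listed above; `zl_encoding_info` and `zl_decode` are hypothetical names, not functions from ziplist.c. It assumes the multi-byte string lengths are stored most significant byte first, and it rejects every byte that is not one of the eight encodings listed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ZIP_STR_06B (0 << 6)
#define ZIP_STR_14B (1 << 6)
#define ZIP_STR_32B (2 << 6)
#define ZIP_INT_16B (0xc0 | 0<<4)
#define ZIP_INT_32B (0xc0 | 1<<4)
#define ZIP_INT_64B (0xc0 | 2<<4)
#define ZIP_INT_24B (0xc0 | 3<<4)
#define ZIP_INT_8B 0xfe

typedef struct {
    int is_string;  /* 1 for a string entry, 0 for an integer entry */
    size_t lensize; /* bytes used by the encoding/length field */
    size_t len;     /* bytes used by the data that follows */
} zl_encoding_info;

/* Decode the field at p. Returns 1 on success, 0 for an unknown encoding. */
static int zl_decode(const unsigned char *p, zl_encoding_info *out) {
    unsigned char b = p[0];
    if ((b & 0xc0) != 0xc0) { /* top two bits are not 11: a string */
        out->is_string = 1;
        switch (b & 0xc0) {
        case ZIP_STR_06B:
            out->lensize = 1;
            out->len = b & 0x3f;
            return 1;
        case ZIP_STR_14B:
            out->lensize = 2;
            out->len = ((size_t)(b & 0x3f) << 8) | p[1];
            return 1;
        default: /* ZIP_STR_32B */
            out->lensize = 5;
            out->len = ((size_t)p[1] << 24) | ((size_t)p[2] << 16) |
                       ((size_t)p[3] << 8) | p[4];
            return 1;
        }
    }
    out->is_string = 0; /* top two bits are 11: an integer */
    out->lensize = 1;
    switch (b) {
    case ZIP_INT_8B:  out->len = 1; return 1;
    case ZIP_INT_16B: out->len = 2; return 1;
    case ZIP_INT_24B: out->len = 3; return 1;
    case ZIP_INT_32B: out->len = 4; return 1;
    case ZIP_INT_64B: out->len = 8; return 1;
    default: return 0;
    }
}
```

The two highest bits decide between string and integer, so a short string costs a single header byte on top of its data.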

// Look up an entry in a ziplist
/* Find pointer to the entry equal to the specified entry. Skip 'skip' entries
* between every comparison. Returns NULL when the field could not be found. */
unsigned char *ziplistFind(unsigned char *p, unsigned char *vstr, unsigned int vlen, unsigned int skip) {
    int skipcnt = 0;
    unsigned char vencoding = 0;
    long long vll = 0;
 
    while (p[0] != ZIP_END) {
        unsigned int prevlensize, encoding, lensize, len;
        unsigned char *q;
 
        ZIP_DECODE_PREVLENSIZE(p, prevlensize);
 
        // Skip the previous-entry size field, then decode this entry's size:
        // len is the size of the data,
        // lensize is the number of bytes used to store len
        ZIP_DECODE_LENGTH(p + prevlensize, encoding, lensize, len);
 
        // q points to the data
        q = p + prevlensize + lensize;
 
        if (skipcnt == 0) {
            /* Compare current entry with specified entry */
            if (ZIP_IS_STR(encoding)) {
                // String comparison
                if (len == vlen && memcmp(q, vstr, vlen) == 0) {
                    return p;
                }
            } else {
                // Integer comparison
                /* Find out if the searched field can be encoded. Note that
                 * we do it only the first time, once done vencoding is set
                 * to non-zero and vll is set to the integer value. */
                if (vencoding == 0) {
                    // Try to parse vstr as an integer
                    if (!zipTryEncoding(vstr, vlen, &vll, &vencoding)) {
                        /* If the entry can't be encoded we set it to
                         * UCHAR_MAX so that we don't retry again the next
                         * time. */
                        // vstr cannot be encoded as an integer, so it can never
                        // equal an integer entry; such entries are passed over
                        vencoding = UCHAR_MAX;
                    }
                    /* Must be non-zero by now */
                    assert(vencoding);
                }
 
                /* Compare current entry with specified entry, do it only
                 * if vencoding != UCHAR_MAX because if there is no encoding
                 * possible for the field it can't be a valid integer. */
                if (vencoding != UCHAR_MAX) {
                    // Load the stored integer
                    long long ll = zipLoadInteger(q, encoding);
                    if (ll == vll) {
                        return p;
                    }
                }
            }
 
            /* Reset skip count */
            skipcnt = skip;
        } else {
            /* Skip entry */
            skipcnt--;
        }
 
        /* Move to next entry */
        p = q + len;
    }
 
    // Not found
    return NULL;
}
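The `skip` parameter is easier to see in isolation. The sketch below copies only the `skipcnt` logic of ziplistFind() onto a plain array of C strings; `find_with_skip` is a hypothetical helper, not part of Redis. With `skip` set to 1 and items laid out as field, value, field, value, only the fields are compared, which is presumably how a hash stored in a ziplist avoids matching a value by mistake.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the first compared item equal to needle, or -1.
 * After each comparison the next `skip` items are passed over. */
static int find_with_skip(const char *const *items, size_t n,
                          const char *needle, unsigned int skip) {
    unsigned int skipcnt = 0;
    for (size_t i = 0; i < n; i++) {
        if (skipcnt == 0) {
            if (strcmp(items[i], needle) == 0) return (int)i;
            skipcnt = skip; /* reset skip count */
        } else {
            skipcnt--; /* skip this item */
        }
    }
    return -1;
}
```

For the items "name", "age", "age", "30", a search for "age" with `skip` 1 returns index 2 (the field), while the same search with `skip` 0 stops at index 1 (a value).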

Note that every insertion of new data into a ziplist requires a realloc.

Why use a ziplist

The official redis documentation describes the HSET command as follows:
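The cost of that realloc is easiest to see on a stripped-down buffer. The following sketch is not the ziplist insert routine: `packed`, `packed_init` and `packed_push` are hypothetical names, and the buffer holds only raw payload bytes followed by one end byte. It only illustrates why a structure that lives in one contiguous block has to be resized on every append.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { PACKED_END = 255 };

/* A hypothetical packed buffer: payload bytes followed by one end byte. */
typedef struct {
    unsigned char *buf;
    size_t bytes; /* total size including the end byte */
} packed;

/* Create an empty buffer that holds only the end byte. */
static int packed_init(packed *p) {
    p->buf = malloc(1);
    if (p->buf == NULL) return 0;
    p->buf[0] = PACKED_END;
    p->bytes = 1;
    return 1;
}

/* Append len bytes; the buffer is resized on every call. */
static int packed_push(packed *p, const void *data, size_t len) {
    unsigned char *grown = realloc(p->buf, p->bytes + len);
    if (grown == NULL) return 0; /* the old buffer is still valid */
    memcpy(grown + p->bytes - 1, data, len); /* overwrite the old end byte */
    grown[p->bytes + len - 1] = PACKED_END;  /* write the new end byte */
    p->buf = grown;
    p->bytes += len;
    return 1;
}
```

Because realloc may move the whole block, any pointer into the old buffer must be treated as invalid after an append; this is the price paid for the compact layout.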

Sets field in the hash stored at key to value. If key does not exist, a new key holding a hash is created. If field already exists in the hash, it is overwritten.

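In fact, for a small hash the data structure underneath HSET is the ziplist described above, not the hashtable one would normally expect. Redis switches to a real hashtable once the hash grows past the configured limits (hash-max-ziplist-entries and hash-max-ziplist-value in Redis versions of this era).

So why use a ziplist at all, given the obvious objection that lookup is O(N) in a ziplist versus O(1) in a hashtable? Redis goes to great lengths to save memory. First, a ziplist uses less memory than a hashtable. Second, if a compact ziplist fits in the CPU cache (which is hard for a hashtable, because its memory layout is not linear), a lookup can even be faster than in a hashtable. For small data, then, the ziplist has the advantage in both performance and memory footprint.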
