
Implementation principles of the Redis compressed list (ziplist), skip list, and bitmap

2022-05-15 07:50:03 · Trouvailless

ziplist data structure

A ziplist lives in one large block of contiguous memory, allocated up front.

  • zlbytes

A 32-bit field holding the total number of bytes the ziplist occupies.

  • zltail

A 32-bit field holding the byte offset of the last entry in the ziplist. Through zltail the last entry can be located directly, which makes push and pop operations at the tail of the ziplist fast.

  • zllen

A 16-bit field holding the number of data items (entries) in the ziplist.

  • zlend: 255

The last byte of the ziplist, an end marker whose value is fixed at 255.

  • entry

A data item that actually stores data. Entries are variable-length and carry the fields below.

  • prerawlen

The length of the previous entry. If the previous entry is 254 bytes or longer, one byte is used as a marker and a further 4 bytes describe the previous entry's length.

An entry's metadata therefore takes at most 5 bytes, whereas a doubly linked list spends 2 pointers (16 bytes) per element, so the ziplist saves a great deal of memory.

  • len

The length of the data stored in this entry.

  • data

The actual payload of the current entry. The whole thing is a very compact binary layout for the stored data.
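As a rough illustration of the prerawlen rule above, here is a minimal Python sketch (the function names are invented for illustration; the real encoding lives in Redis's C ziplist code):

```python
def encode_prevlen(prev_len: int) -> bytes:
    """Encode the previous entry's length ziplist-style: lengths below
    254 fit in one byte; longer lengths use a one-byte 0xFE marker
    followed by a 4-byte little-endian length (5 bytes total)."""
    if prev_len < 254:
        return bytes([prev_len])
    return bytes([0xFE]) + prev_len.to_bytes(4, "little")

def decode_prevlen(buf: bytes) -> tuple:
    """Return (prev_len, bytes_consumed)."""
    if buf[0] < 254:
        return buf[0], 1
    return int.from_bytes(buf[1:5], "little"), 5
```

This is why the text says "at most 5 bytes": short neighbors cost 1 byte of metadata, long ones cost 5.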

Does contiguous memory have drawbacks too?

When a ziplist already holds n elements and more are added, a new block of memory must be allocated and all of the data copied over. With few elements this is fine, but with many elements it costs real performance.

The fix is to use a doubly linked list to cap the size of each ziplist: once a ziplist grows past a certain size, it is split in two, ensuring no single ziplist gets too large.

This is the quicklist: its backbone is a chain of quicklistNode structs, and each quicklistNode points to a ziplist.

Adding data to one ziplist no longer shifts the whole data set, because the data has been spread across different ziplists; modifying data only touches one ziplist.

This optimization is built on a doubly linked list, so it can be traversed front to back as well as back to front.
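The splitting idea above can be sketched as a toy model (ToyQuicklist and its methods are invented for illustration; the real quicklist is C code whose nodes hold actual ziplists):

```python
class ToyQuicklist:
    """A list-of-small-lists sketch: each node holds at most
    max_node_size elements, so an insert only rewrites one node."""
    def __init__(self, max_node_size=4):
        self.max_node_size = max_node_size
        self.nodes = []          # each inner list stands in for one ziplist

    def push_head(self, value):
        # Split: start a fresh node when the head node is full.
        if not self.nodes or len(self.nodes[0]) >= self.max_node_size:
            self.nodes.insert(0, [])
        self.nodes[0].insert(0, value)

    def to_list(self):
        return [v for node in self.nodes for v in node]
```

Pushing to the head only ever reallocates the first small node, never the whole structure.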

Now let's look at the lpush source code.

First the data is fetched from the Redis database.

  • c->db

db is the Redis database being operated on.

  • c->argv[1]

Take the lpush command as an example:

lpush alist a b c

argv[0] is lpush
argv[1] is alist

The key is used to look the data up in the db.

  • key->ptr

The actual string content of the key.

Performing the rehash

A rehashidx of -1 means no rehash is in progress; any other value means a rehash is underway.

The rehash method

Skipping empty hash slots

Sometimes a hash slot holds no data, because after hashing some slots of the hashtable are empty. If the loop encounters empty_visits = 10 empty hash slots, it stops rehashing for now.

After this while loop ends, the current slot is a non-empty hash slot.

Get the head node of the linked list on that non-empty slot,

then traverse the list, recomputing the hash value for each entry.

A bitwise AND is used here instead of a modulo operation, which performs better.

  • size

The table size is always a power of two, 2^n.

  • sizemask

sizemask equals size - 1 and is used as a bit mask.

The formula:

x % 2^n  <=>  x & (2^n - 1)

That is, instead of computing the remainder by repeated division, a single bit operation produces the result, which is far more efficient.
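The equivalence is easy to verify directly:

```python
# For any non-negative x and a power-of-two size = 2**n,
# x % size equals x & (size - 1): the low n bits ARE the remainder.
size = 16                 # 2**4, like a hashtable with 16 slots
sizemask = size - 1       # 0b1111
for x in range(1000):
    assert x % size == x & sizemask
```

This is why Redis keeps table sizes at powers of two: the slot index is just `hash & sizemask`.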

After the key's position in the new hashtable is computed, the entry is pointed at the head of the new table's bucket, and the linked list on the old hash slot is migrated to the new hashtable, completing that entry's migration.

After each migration, the old hashtable's element count is decremented by 1 and the new one's is incremented by 1.

Data is migrated one hash slot at a time.

When the old hashtable has no elements left, it is freed.

ht[0] is then pointed at the new hashtable (ht[1]), and ht[1] is reset.

At this point the rehash process is complete.

With the rehash done, the next step is looking up the key.

The lookup starts in the old hashtable (ht[0]); if found, the entry is returned. Otherwise the new hashtable (ht[1]) is checked; if still not found, NULL is returned.
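The two-table flow can be sketched in Python (a simplified toy: real Redis migrates whole buckets and tracks progress with rehashidx; the class and method names here are invented):

```python
class ToyDict:
    """Two-table sketch of incremental rehash: while rehashing,
    new keys go into ht[1]; lookups check ht[0] first, then ht[1]."""
    def __init__(self):
        self.ht = [{}, None]     # ht[1] is non-None only while rehashing
        self.rehashidx = -1      # -1 means no rehash in progress

    def start_rehash(self):
        self.ht[1] = {}
        self.rehashidx = 0

    def rehash_step(self):
        """Migrate one entry from ht[0] to ht[1]; when ht[0] empties,
        swap the tables and end the rehash."""
        if self.rehashidx == -1:
            return
        if self.ht[0]:
            k, v = self.ht[0].popitem()
            self.ht[1][k] = v
        if not self.ht[0]:
            self.ht[0], self.ht[1] = self.ht[1], None
            self.rehashidx = -1

    def set(self, k, v):
        target = self.ht[1] if self.rehashidx != -1 else self.ht[0]
        target[k] = v

    def get(self, k):
        if k in self.ht[0]:
            return self.ht[0][k]
        if self.ht[1] is not None and k in self.ht[1]:
            return self.ht[1][k]
        return None
```

Note how `get` mirrors the text: old table first, then the new one, then NULL.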

Next, the type of the value that was found is checked:

is it the List data type?

lpush alist a b c

If alist is not a List but, say, a string, an error is raised indicating that the type does not match.

If no data is found in the db, a quicklist is created:

for the list data type, the underlying implementation is a quicklist.

The newly created value is then added to the dictionary.


Two parameters set each ziplist's maximum capacity and the quicklist's compression range, improving data access efficiency.

  • list-max-ziplist-size -2

-2 means a single ziplist node stores at most 8 KB; beyond that it splits, and new data is stored in a new ziplist node.

By default, then, the data is split across the chain in 8 KB ziplists.

-5 (64 KB) is not recommended: with large nodes, adding elements involves copying more data in memory, which hurts performance.

  • list-compress-depth 1

0 means no node is compressed; 1 means one node at the head and one at the tail are left uncompressed and everything in between is compressed; 2, 3, 4 and so on extend the uncompressed region accordingly.

A doubly linked list's memory is not contiguous, which causes fragmentation. Compression packs the data into whole blocks of memory, making it more regular and saving space.


Each access to the hashtable performs one step of rehashing.

After the rehash step, the index value is computed.

Storing the element: if a rehash is in progress, the new element is put into the new hashtable.

Head insertion is used, since the most recently added entries tend to be accessed most often.

The hash slot's index is pointed at the newly created element; entry is the newly created node.

The hashtable bucket thus points to the head node of the linked list.

Next the added data is appended to the list:

a quicklistNode and ziplist are created.

The Set data type

A set is semantically unordered.

Small sets of integers, as above, are nevertheless stored in an ordered structure.

Adding one string element to the integer-only set above returns 1, because only one element was added.

At that point the encoding becomes hashtable, because a string element cannot be encoded as an integer value, so the set is converted to a hashtable.

The set also becomes unordered at that point, because a hashtable is itself an unordered data structure.

Set is an unordered data type that deduplicates automatically. Its underlying structure is a dictionary (dict) whose values are NULL; when all members can be represented as integers, the set is encoded as an intset. If either of the following conditions holds, the set stores its data in a hashtable instead:

  •  the number of elements exceeds set-max-intset-entries

set-max-intset-entries 512 is the maximum number of elements an intset may hold; beyond that, the set switches to hashtable encoding.

  •  an element cannot be represented as an integer

An intset is an ordered array containing only integer elements. Whether an element exists is determined by binary search, and using an integer array saves space.
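A Python sketch of that ordered-array behavior (ToyIntset is invented for illustration and omits the element-width upgrade step described below):

```python
import bisect

class ToyIntset:
    """Sorted array of unique integers; membership via binary search."""
    def __init__(self):
        self.elems = []

    def contains(self, v):
        i = bisect.bisect_left(self.elems, v)   # binary search, O(log n)
        return i < len(self.elems) and self.elems[i] == v

    def add(self, v):
        i = bisect.bisect_left(self.elems, v)
        if i < len(self.elems) and self.elems[i] == v:
            return False                        # already present: dedup
        self.elems.insert(i, v)                 # keep the array sorted
        return True
```

The binary search both answers membership and yields the insert position, just as the sadd walkthrough below describes.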

The sadd source code

In server.c, the redisCommand table maps the sadd command to its implementation.

Query the Redis db for the set.

If it does not exist, create the set:

determine whether the string can be converted to a long; if so, create an intset.

Creating the intset:

the data type is set, and the encoding is intset.

Otherwise a hashtable is created.

Once the set has been created, it is added to the db.

If the encoding is hashtable,

the value is set to NULL.

If the element is a number, it is added to the intset.



Determining the data range: arrays come in 3 widths, 16-bit, 32-bit, and 64-bit. Which width is used depends on the range the added value falls into. This saves memory: arrays are created at different widths according to the range of the data, and a wider array is used only when a large value demands it.

If a value exceeds the current range, the intset is upgraded: a value that needs more than 32 bits moves the array to 64-bit, and one that needs more than 16 bits moves it to 32-bit.

If the value fits within the current encoding width:

query whether the value already exists in the intset. The intset also deduplicates automatically, so if the value is already present, the call simply returns.

Binary search is used here; the loop breaks when the range is exhausted, and then the code checks whether the value was found.

Shifting right by one bit divides by 2.

Equality means the value exists; otherwise it does not, and in that case the search records the position, so the code knows where the new value should be placed.

Since an element is being added, the array must be grown,

and after the expansion the new element is placed at the recorded position.

Different storage widths are used depending on the magnitude of the data.

The Set data type's underlying intset encoding

An integer set is an ordered structure for storing integer data. In Redis it can hold integers of type int16_t, int32_t, or int64_t, and it guarantees the set contains no duplicate data.

The Hash data type

The Hash data structure is implemented on a dictionary (dict), the same K-V structure RedisDB itself uses to store data. When the amount of data is small, or the individual elements are small, the underlying storage is a ziplist. The size and element thresholds can be set with the following parameters:

  • hash-max-ziplist-entries 512

When the number of ziplist elements exceeds 512, the hash switches to hashtable encoding.

  • hash-max-ziplist-value 64

When a single element exceeds 64 bytes, the hash switches to hashtable encoding.

With ziplist encoding, fields are accessed in the order they were inserted.

The Hash data type's ziplist encoding

When a hash stores K-V pairs such as name and age, each key-value pair is split into 2 consecutive elements in one ziplist, which uses memory very efficiently.

Once a single element exceeds 64 bytes, the hash switches to the unordered hashtable encoding.
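A toy Python model of this flattened field/value layout (ToyZiplistHash is invented for illustration):

```python
class ToyZiplistHash:
    """A small hash stored as a flat list: field and value occupy two
    consecutive entries, and fields keep their insertion order."""
    def __init__(self):
        self.entries = []        # [field1, value1, field2, value2, ...]

    def hset(self, field, value):
        for i in range(0, len(self.entries), 2):
            if self.entries[i] == field:
                self.entries[i + 1] = value     # overwrite existing field
                return
        self.entries += [field, value]          # append a new pair

    def hget(self, field):
        for i in range(0, len(self.entries), 2):
            if self.entries[i] == field:        # linear scan: fine when small
                return self.entries[i + 1]
        return None
```

The linear scan is why the ziplist encoding is only used while the hash stays small.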

The ordered ZSet

The zset/sorted_set data type

A ZSet is an ordered data type that deduplicates automatically. Its underlying structure is a dictionary (dict) plus a skip list (skiplist); when the amount of data is small, a ziplist encoding is used instead.

  • zset-max-ziplist-entries 128

When the number of elements exceeds 128, the zset switches to skiplist encoding.

  • zset-max-ziplist-value 64

When a single element exceeds 64 bytes, the zset switches to skiplist encoding.

In the example command, 0 and 2 are the rank range; withscores prints the scores as well.

When the amount of data is small, the encoding is a compact contiguous space: memory utilization is high and deduplication is automatic.

When the amount of data grows, the underlying encoding becomes a skip list.

The skiplist data structure

Consider an ordered linked list. To find the element 79 we must traverse node by node, so the time complexity is O(n).

Now add an index layer above the data layer, indexing every other element, and use the index to help the query: walk the index layer until reaching a value larger than 79 and go no further. If the index layer reaches 78, sink down to 78 in the data layer, and the very next element is 79. This skips many elements and improves query efficiency.

Adding a second index layer,

and then a third, skips even more.

Time complexity analysis

data layer: n

index layer 1: n/2

index layer 2: n/2^2

index layer 3: n/2^3

index layer k: n/2^k

Each added index layer halves the number of elements: layer 3 holds half of layer 2, layer 2 half of layer 1, and layer 1 half of the data layer.

Each layer needs only a constant number of steps, so the total number of steps is proportional to the height.

Suppose the top index layer holds 2 elements: n/2^k = 2 => 2^k = n/2 => k = log2(n) - 1.

So the height, and hence the query cost, is the base-2 logarithm of n: O(log n) time complexity.
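The layered search and insert described above can be sketched in Python (a simplified toy skip list without spans or backward pointers; all names are invented for illustration):

```python
import random

class Node:
    def __init__(self, score, level):
        self.score = score
        self.forward = [None] * level   # one forward pointer per layer

class ToySkiplist:
    MAX_LEVEL = 8
    P = 0.25                            # promotion probability, as in Redis

    def __init__(self):
        self.head = Node(float("-inf"), self.MAX_LEVEL)
        self.level = 1

    def random_level(self):
        lvl = 1
        while random.random() < self.P and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, score):
        # Record, per layer, the last node strictly before the new score.
        update = [self.head] * self.MAX_LEVEL
        x = self.head
        for i in range(self.level - 1, -1, -1):
            while x.forward[i] and x.forward[i].score < score:
                x = x.forward[i]
            update[i] = x
        lvl = self.random_level()
        self.level = max(self.level, lvl)
        node = Node(score, lvl)
        for i in range(lvl):            # splice into each of its layers
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def contains(self, score):
        x = self.head
        for i in range(self.level - 1, -1, -1):   # top layer down
            while x.forward[i] and x.forward[i].score < score:
                x = x.forward[i]
        x = x.forward[0]                # candidate at the data layer
        return x is not None and x.score == score
```

`contains` is exactly the walk in the text: run right along a layer until the next score is too big, then drop down one layer.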

The zadd source code

Query for the zset; if it does not exist, create one, choosing between two encodings:

  •  create a skip list, zskiplist

  •  create a ziplist

A zset is a composite data structure: a dictionary plus a skip list.

The dictionary is the fastest way to index a record; the skip list stores the data in order.

zscore has O(1) time complexity, fetching the score for an element; that efficiency comes from the zset's dict data structure.

The dict holds only an index, not a second copy of the value: there is a single copy of each value, shared by the dict and the zskiplist, improving access speed while minimizing storage.
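The dict-plus-ordered-structure pairing can be sketched in Python, with a sorted list standing in for the skip list (ToyZset is invented for illustration):

```python
import bisect

class ToyZset:
    """The dict gives O(1) member->score lookup (zscore); the sorted
    list keeps (score, member) pairs ordered for rank queries."""
    def __init__(self):
        self.scores = {}         # member -> score
        self.sorted = []         # [(score, member), ...] kept sorted

    def zadd(self, score, member):
        if member in self.scores:
            old = self.scores[member]
            self.sorted.remove((old, member))   # delete, then re-insert
        self.scores[member] = score
        bisect.insort(self.sorted, (score, member))

    def zscore(self, member):
        return self.scores.get(member)          # O(1) via the dict

    def zrank(self, member):
        score = self.scores.get(member)
        if score is None:
            return None
        return self.sorted.index((score, member))
```

The delete-then-reinsert in `zadd` mirrors how Redis updates a score that changes an element's position, as described later.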


  • skiplistNode *header, *tail

Bidirectional pointers:

1. the list can be traversed front to back,

2. and also back to front,

i.e. in reverse order.

  • long length

The number of elements.

  • int level

The maximum height of the index layers. In Redis the index layers are not built at perfectly regular intervals; the heights are generated with random numbers.

  • sds ele — the element

  • double score — the score

  • zskiplistNode *backward — a pointer of the same node type, pointing to the previous node

  • zskiplistLevel level[] — the index layers, stored as an array

  • zskiplistLevel.forward — points to the next node at that layer

  • long span — the number of nodes spanned between this node and the next at that layer

Supplementary notes on the zskiplist structure

  • header points to the head node; the head node's height is always 64 layers

  • level holds the highest layer count among the nodes in the skip list, not counting the head node

  • length holds the number of nodes in the skip list, not including the head node

  • tail points to the tail node

An example:

suppose we want to find the element with score 5.

(1) The search begins from the top of the head node (layer 64), but since the skip list records its current highest level, it actually starts from the level layer. If a zskiplistLevel's forward is NULL, drop down a layer.

(2) Continue until layer 5 of the head node, whose zskiplistLevel forward is not NULL; it points to layer 5 of node 6, but that node's score is 6, greater than 5, so the search must drop from there to a lower layer.

(3) Layer 4 of the head node's zskiplistLevel forward points to layer 4 of node 3, whose score is 3, less than 5, so continue forward from that node. Its forward points to layer 4 of node 6, whose score is 6, greater than 5, so go back to the node with score 3 and descend another layer.

(4) Layer 3 of node 3's zskiplistLevel forward points to layer 3 of node 5, whose score is exactly 5: found.

To find the rank of element 79, accumulate the span of every node passed along the way; the sum gives its rank.

The generated zset is put into the db:

first a rehash step, then it is stored into the hashtable.

Then the score is computed and the element is added to the zset:

  •  add the element to the ziplist, or

  •  add the element to the skip list.

Looking up the record from the skip list's dictionary

has O(1) time complexity.

If de is not NULL, the element already existed. If the NX flag was given, an error is returned directly; otherwise the value may be changed.

If the two scores are not equal, the old entry is deleted before the new data is inserted.

Redis's approach here is fairly simple: find the node to be modified, delete it outright, then insert the modified node as a new one.

Finding the new element's position means traversing the index layers to locate its place.

If the new score leaves the element in its original position, no element needs to move; only the score is updated.

For example:

changing element b's score from 200 to 201, with no other element affected and no change in ranking, only updates the score.

Changing b from 200 to 301 changes the ranking,

so the original element is deleted and a new one is inserted.

The delete logic, zslDeleteNode

(1) Find the node to delete, remove it, and update the previous nodes' spans and the skip list's maximum height.

(2) Update the forward and backward pointers around the removed node at every layer.

The insert logic, zslInsert

(1) First, when a node is inserted, it is randomly assigned a layer count less than 64; smaller layer counts are more probable and higher ones less so.

(2) Find the node with the largest score less than the score to be inserted; if scores are equal, compare by value.

(3) Insert the new node after the node found, updating the previous nodes' spans and the skip list's maximum height.

(4) Set the forward and backward pointers of the new node at every layer.

Inserting a node: the key is creating the index layers and wiring up the pointer relationships.

Step 1: find the insertion position.

Step 2: create the index layers.

Before the node is created, its index layers are generated by a randomized algorithm:

the while loop continues while a random number is less than P = 1/4, so each iteration ends the loop with probability 1 - P.


level = 1: the first-layer index is generated, and the loop ends immediately, with probability 1 - P = 3/4.

level = 2: the second-layer index is generated with probability P * (1 - P):

  •  the first draw promotes past level 1 with probability 1/4,

  •  the second draw stops at level 2 with probability 3/4.

level = 3: the third-layer index is generated with probability P^2 * (1 - P):

  •  the first draw promotes past level 1 with probability 1/4,

  •  the second draw promotes past level 2 with probability 1/4,

  •  the third draw stops at level 3 with probability 3/4.

In summary: the higher the layer, the lower the probability of reaching it, so higher layers have fewer indexes and lower layers have more.

The higher the layer, the rarer it is and the fewer nodes it holds, so each step there skips over more elements at once.
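That geometric distribution can be checked empirically (seeded so the run is reproducible; P = 0.25 matches Redis's default promotion probability, and the function name is invented):

```python
import random

P = 0.25                 # promotion probability

def random_level(max_level=32):
    """Redis-style level generation: keep promoting while a random
    draw is below P, so P(level = k) = P**(k-1) * (1 - P)."""
    level = 1
    while random.random() < P and level < max_level:
        level += 1
    return level

random.seed(42)
n = 100_000
levels = [random_level() for _ in range(n)]
# Expected: P(level == 1) = 1 - P = 0.75
#           P(level == 2) = P * (1 - P) = 0.1875
freq1 = levels.count(1) / n
freq2 = levels.count(2) / n
```

With 100,000 samples, freq1 and freq2 land very close to the analytic values 0.75 and 0.1875.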

After the index layers are created, each layer's parameters are initialized, including resetting span.

Step 3: create the node.

The index layer count is passed in as a parameter.

After the node is created, its relationships are updated: the link from the head node, the links to the nodes on either side (which must be rebuilt), and the recalculation of span.

Finally the relevant counters are incremented, since a new element has been added.

The advanced Redis data type BitMap (bitmap)

A bit is the smallest unit of storage in a computer, with value 0 or 1.

A bitmap provides commands for setting, operating on, and counting bits. It is typically used to represent state in minimal space, combined with bitwise operations (AND/OR/XOR/NOT). Common scenarios:

  •  daily/weekly active-user statistics for systems with huge user bases

  •  consecutive-login statistics

  •  membership checks over huge user sets

  •  recording binary state, such as whether a user has checked in

The binary data occupies contiguous space in memory,

so randomly indexing to the bit to be operated on is very efficient: O(1) time complexity.

  •  a bitmap can tell whether a given bit is 0 or 1

  •  it can count how many bits in a bit stream are 1

  •  two bit streams can be combined with bitwise operations (& AND, | OR, ^ XOR)

Basic bitmap operations

These operate directly on a binary sequence.

The key maps to a binary bit sequence in memory; each position in the sequence has an index, its order in the sequence, which is in fact the offset. The value of each bit is 0 or 1.

The bitmap is implemented on the string type underneath; a bitmap is itself a string.

help @string

setbit returning 0 means the old value of that bit was 0;

all other bits default to 0.

bitcount counts how many bits are 1 in the binary sequence for the key.
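A toy Python model of SETBIT/GETBIT/BITCOUNT over a byte array (ToyBitmap is invented for illustration; offset 0 maps to the most significant bit of the first byte, as in Redis):

```python
class ToyBitmap:
    """String-backed bitmap: a bit offset maps to a byte index plus a
    bit index within that byte (most significant bit first)."""
    def __init__(self):
        self.buf = bytearray()

    def setbit(self, offset, value):
        byte, bit = divmod(offset, 8)
        if byte >= len(self.buf):
            # Grow the string; new bytes default to all-zero bits.
            self.buf.extend(b"\x00" * (byte - len(self.buf) + 1))
        mask = 1 << (7 - bit)
        old = (self.buf[byte] >> (7 - bit)) & 1
        if value:
            self.buf[byte] |= mask
        else:
            self.buf[byte] &= ~mask
        return old                      # SETBIT returns the old bit

    def getbit(self, offset):
        byte, bit = divmod(offset, 8)
        if byte >= len(self.buf):
            return 0                    # out-of-range bits read as 0
        return (self.buf[byte] >> (7 - bit)) & 1

    def bitcount(self):
        return sum(bin(b).count("1") for b in self.buf)
```

The byte/bit arithmetic is why a random bit access is O(1): the offset computes the target location directly.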

Copyright notice

author: Trouvailless. Please include a link to the original when reprinting, thank you.
