Wow! He actually wrote down the answers to the Redis interview questions that big companies often ask!

2022-06-24 12:41:59 Java program ape

Preface

If you work on the back end, Redis should be no stranger to you. Most projects now use Redis to build their cache layer, and interviewers ask more and more about it. Today, let's talk about the Redis knowledge that interviewers love to test and that everyone needs to know.

1、What is Redis, and what are its characteristics?

Redis stands for Remote Dictionary Server. It is a key-value storage system that supports multiple data structures. It can be used for caching, event publish/subscribe, high-speed queues and other scenarios. It is accessed over the network, gives direct access to strings, hashes, lists, queues and sets, is memory based, and supports persistence.

Feature 1: Rich data types

Many databases can only handle one kind of data structure:

  • Traditional SQL databases handle two-dimensional relational data;
  • In Memcached, keys and values are both strings;
  • Document databases (MongoDB) store documents made of JSON/BSON.

This is not to say these databases are bad, but once the data structure a database provides does not fit the task, the program becomes cumbersome and unnatural to write.

Redis is also a key-value database, but unlike Memcached, a Redis value does not have to be a string: it can be any of the five data structures listed in section 2. By choosing different data structures, users can solve all kinds of problems with Redis. When you face a problem, you first think about which data structure fits it, and having several to choose from makes the problem easier to solve.

Feature 2: Memory storage

There are two kinds of databases: disk-based databases and in-memory databases.

A disk-based database stores values on disk and keeps only the index in memory. To access a value, it first finds the index entry in memory and then reads the value from disk. The problem is that under heavy read and write load, disk I/O becomes the bottleneck.

An in-memory database keeps all data in memory, so reads and writes are very fast.

Feature 3: Persistence

Data held in memory can be saved to disk, which keeps the data safe and makes backup and recovery convenient.

2、What data structures does Redis have?

Redis is a key-value database. A key can only be a String, but the value types are richer. There are five main ones:

  • String
  • Hash
  • List
  • Set
  • Sorted Set

(1) String

SET KEY_NAME VALUE

The string type is binary safe, which means a Redis string can contain any data, such as a JPG image or a serialized object. String is the most basic Redis data type, and a single string value can hold at most 512 MB.

(2) Hash

HSET KEY_NAME FIELD VALUE

A Redis hash is a collection of key-value (key => value) pairs. It is a mapping table between string fields and string values, which makes hashes ideal for storing objects.

(3) List

// Add string elements to the head of the list stored at key
LPUSH KEY_NAME VALUE1.. VALUEN
// Add string elements to the tail of the list stored at key
RPUSH KEY_NAME VALUE1..VALUEN
// Remove elements equal to value from the list stored at key; count limits how many are removed
LREM KEY_NAME COUNT VALUE
// Return the length of the list stored at key
LLEN KEY_NAME

A Redis list is a simple list of strings, sorted by insertion order. You can add an element to the head (left) or the tail (right) of the list.

(4) Set

SADD KEY_NAME VALUE1...VALUEn

A Redis Set is an unordered collection of strings. Sets are implemented with hash tables, so adding, deleting and looking up a member are all O(1).

(5) Sorted Set

ZADD KEY_NAME SCORE1 VALUE1.. SCOREN VALUEN

Like a set, a Redis zset is a collection of string elements in which duplicate members are not allowed. The difference is that each element is associated with a score of type double.

Redis uses these scores to sort the members of the collection from smallest to largest.

Members of a zset are unique, but scores may repeat.

3、What is the maximum size of a string value?

The official documentation (https://redis.io/topics/data-types) states that the maximum length of a String value is 512 MB, so the correct answer is 512 MB.

4、Can you describe the usage scenarios of each Redis data structure?

(1) Usage scenarios of String

Typical uses of the string type: information caching, counters, distributed locks and so on.

Common commands: get/set/del/incr/decr/incrby/decrby

Practical scenario 1: Record the number of visits per user, or the number of views per product

Solution:

Common key names: userid:pageview or pageview:userid. If a user's id is 123, the corresponding Redis key is pageview:123 and the value is that user's visit count. To increase the count, use the incr command.

Why use it: per-user visit counts and product view counts change frequently. Updating them that often in a disk-based store such as MySQL puts pressure on MySQL and is inefficient. Redis has two advantages here: it works in memory, so it is fast; and it executes commands on a single thread, so there is no contention and the counts do not get corrupted.

Practical scenario 2: Cache information that is read frequently but modified rarely, such as user information or video information

Solution:

Business logic: read from Redis first. If the value is there, return it; if not, read from MySQL, write a copy to Redis as a cache, and remember to set an expiration time.

Key and value design:

Serialize the user's MySQL record (usually as JSON) and use it as the value, with userInfo:userid as the key. For example, the key is userInfo:123 and the value is the JSON string of that user's information, such as {"name":"leijia","age":18}.
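The read path described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `TTLCache` is a hypothetical in-memory stand-in for Redis GET and SET-with-expiry, `load_from_db` stands in for the MySQL query, and the 300-second TTL is an arbitrary choice.

```python
import json
import time


class TTLCache:
    """Tiny in-memory stand-in for Redis GET / SET with expiry (hypothetical helper)."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            del self._data[key]  # expired entries behave like missing keys
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, time.time() + ttl)


def get_user_info(user_id, cache, load_from_db, ttl=300):
    """Read from the cache first; on a miss, read the database and cache a JSON copy."""
    key = f"userInfo:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    record = load_from_db(user_id)  # assumed to return a dict or None
    if record is not None:
        cache.set(key, json.dumps(record), ttl)
    return record
```

The second call for the same user id is served from the cache, so the database loader runs only once until the entry expires.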

Practical scenario 3: Limit the number of visits from a given IP within a specific period of time

Solution:

Use the IP as the key and the visit count as the value, and set the key's expiration time to 60 seconds. If the key has expired, the count starts again from zero; otherwise check the count, and when it exceeds 100 visits within one minute, deny access.
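A minimal sketch of this fixed-window counter, kept in plain Python so it runs without a Redis server. The class name `FixedWindowLimiter` and the `now` parameter are assumptions made for illustration; with Redis the counter and its expiry would live in a key per IP.

```python
import time


class FixedWindowLimiter:
    """Counts visits per IP; the counter resets once its window has expired."""

    def __init__(self, limit=100, window=60):
        self.limit = limit
        self.window = window
        self._counters = {}  # ip -> (count, window_expires_at)

    def allow(self, ip, now=None):
        now = time.time() if now is None else now
        count, expires_at = self._counters.get(ip, (0, 0.0))
        if now >= expires_at:
            # The "key" has expired (or never existed): start a new window.
            count, expires_at = 0, now + self.window
        count += 1
        self._counters[ip] = (count, expires_at)
        return count <= self.limit
```

Passing `now` explicitly makes the behavior easy to check: visits inside the same window accumulate, and the first visit after the window ends starts from one again.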

Practical scenario 4: Distributed session

By default, a session is stored on the server as a file. If your application is load balanced across multiple servers, then when a user logs in on server A, the session file is written on server A. When the user navigates to another page and the request is routed to server B, the session file cannot be found there and the user has to log in again.

If you want multiple servers to share one session, store the session in Redis. Redis can run on a machine separate from all the load-balanced servers, or on one of them; either way, all application instances connect to the same Redis server.

(2) Usage scenarios of Hash

Take a shopping cart as an example. The user id is the key, and all the goods in the cart are the value for that user's key. Each item has an id and a purchase quantity, so in the hash the product id is the field and the quantity is the value.

If the product ids and quantities are serialized into a JSON string, they can also be stored with the string type mentioned above. Let's compare the two data structures:

| Comparison item | string (json) | hash |

To sum up:

When an attribute of an object needs to be modified frequently, string+json is not a good fit because it is not flexible enough: every change requires serializing and writing the whole object again. With the hash type, you can modify a single attribute on its own, without serialization and without touching the rest of the object. Attributes that change frequently, such as a product's price, sales, follower count and review count, are well suited to the hash type.
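The difference can be illustrated with plain Python values. This is a sketch, not Redis code: the JSON string plays the role of a string value, the dict plays the role of a hash, and both function names are hypothetical.

```python
import json


def bump_sales_json(serialized_product):
    """string+json: decode the whole object, change one field, encode it all again."""
    product = json.loads(serialized_product)
    product["sales"] += 1
    return json.dumps(product)


def bump_sales_hash(product_hash):
    """hash: touch only the one field (roughly what HINCRBY does on the server)."""
    product_hash["sales"] = product_hash.get("sales", 0) + 1
    return product_hash
```

With the string form, the cost of every update grows with the size of the whole object; with the hash form, it depends only on the field being changed.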

(3) Usage scenarios of List

A list is essentially an ordered queue whose elements may repeat.

Practical scenario: Periodically computed charts

The lrange command of the list type lets you page through the data in the queue. You can store a chart that is recomputed at fixed intervals in a list. For example, the QQ Music mainland chart is computed once a week and stored in a list, and the access interface converts the page and size parameters into an lrange call to fetch the chart data.

However, not every chart can be implemented with a list. Only periodically computed charts are suitable; the counterpart is the real-time chart, which the list type does not support. The real-time chart implementation is covered in the sorted set section below.
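Converting page and size into lrange arguments is simple arithmetic. The sketch below assumes 1-based page numbers and relies on the fact that the LRANGE stop index is inclusive; the function name `page_to_lrange` is hypothetical.

```python
def page_to_lrange(page, size):
    """Return the (start, stop) indexes to pass to LRANGE for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    start = (page - 1) * size
    stop = start + size - 1  # LRANGE includes the element at the stop index
    return start, stop


# Usage against a plain Python list standing in for the stored chart:
chart = [f"song:{i}" for i in range(1, 26)]
start, stop = page_to_lrange(2, 10)
page_two = chart[start:stop + 1]  # a Python slice excludes its end, hence the +1
```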

(4) Usage scenarios of Set

A set is unordered and its members are unique (no duplicates).

Practical scenario: Favorites

For example, when you like a song in QQ Music and tap 『like』, the song is added to your favorites. Each user has one favorites set, and each set stores the ids of the songs that user has collected.

The key is the user id, and the value is the set of song ids.

(5) Usage scenarios of Sorted Set

A sorted set is ordered and contains no duplicate values. Unlike a set, each element of a sorted set is associated with a score, and Redis uses the score to sort the members from smallest to largest.

Practical scenario: Real-time leaderboard

QQ Music has many real-time charts, such as the fast-rising chart, the hot song chart and the new song chart. The Redis key identifies the chart type, the score is the number of plays, and the value is the song id. Every time a user plays a song, the Redis data is updated, and the sorted set orders the song ids by score, that is, by number of plays.
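The behavior of such a chart can be mimicked in plain Python. This is a rough sketch, assuming one chart per object: `hit` corresponds to incrementing a member's score (ZINCRBY in Redis) and `top` to reading the highest scores first (for example with ZREVRANGE). The class name and the tie-breaking rule are choices made for this sketch only.

```python
from collections import defaultdict


class Leaderboard:
    """In-memory imitation of one sorted-set chart: member -> score."""

    def __init__(self):
        self._scores = defaultdict(int)

    def hit(self, song_id, amount=1):
        """Record plays for a song and return its new score."""
        self._scores[song_id] += amount
        return self._scores[song_id]

    def top(self, n):
        """Return the n highest-scoring (song_id, score) pairs, best first."""
        ranked = sorted(self._scores.items(), key=lambda item: (-item[1], item[0]))
        return ranked[:n]
```

Unlike this sketch, which sorts on every read, a Redis sorted set keeps its members ordered as scores change, which is what makes it suitable for real-time charts.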

5、How does Redis implement persistence? Can you explain how RDB and AOF work?

What is persistence?

Persistence means saving data (such as objects in memory) to a permanent storage device (such as a disk). Its main application is storing in-memory objects in a database, or in disk files, XML data files and so on.

Persistence can also be understood at two levels:

  • Application level: if you shut down your application and restart it, the previous data still exists.
  • System level: if you shut down your system (the computer) and restart it, the previous data still exists.

Why does Redis need persistence?

Redis is an in-memory database; for efficiency, all operations are performed in memory. Because the data lives in memory, it is lost when the system restarts or shuts down and cannot be retrieved. To avoid this, Redis needs persistence to save the in-memory data to disk.

How does Redis achieve persistence?

Redis officially offers several levels of persistence:

  • RDB persistence: takes a snapshot of your data at specified intervals.
  • AOF persistence: logs every write operation received by the server. When the server restarts, these commands are re-executed to restore the original data. AOF appends each write operation to the end of the file in the Redis protocol format. Redis can also rewrite the AOF file in the background so that it does not grow too large.
  • No persistence: if you only need your data to exist while the server is running, you can choose not to use any persistence.
  • RDB and AOF together: you can enable both at once. In that case, when Redis restarts it loads the AOF file first to restore the data, because the dataset saved in the AOF file is usually more complete than the one in the RDB file.

With so many options, how should we choose? Before choosing, we need to understand how the persistence methods differ and what their advantages and disadvantages are.

RDB persistence

RDB (Redis Database) persistence generates a snapshot of the current in-memory data and saves it to disk. The RDB persistence process can be triggered manually or automatically.

(1) Manual trigger

A manual trigger corresponds to the save command. It blocks the current Redis server until the RDB process completes. For instances with a lot of memory this causes long blocking, so it is not recommended in production.

(2) Automatic trigger

An automatic trigger corresponds to the bgsave command (which can also be run by hand). The Redis process performs a fork to create a child process; the child process is responsible for RDB persistence and exits automatically when it finishes. Blocking only happens during the fork, which is usually very short.

You can configure this in the redis.conf configuration file:

save <seconds> <changes>

This means that if at least <changes> changes occur within <seconds> seconds, bgsave is triggered automatically. To turn off automatic triggering, put an empty string after the save directive:

save ""

Other common situations also trigger bgsave:

  • When a replica performs full replication, the master automatically runs bgsave to generate an RDB file and sends it to the replica.
  • When the shutdown command is executed, bgsave runs automatically by default if AOF persistence is not enabled.

How bgsave works

(1) When bgsave is executed, the Redis parent process checks whether a child process, such as an RDB or AOF child process, is already running. If so, bgsave returns immediately.

(2) The parent process performs a fork to create a child process. The parent is blocked during the fork. The latest_fork_usec field of the info stats command shows how long the most recent fork took, in microseconds.

(3) After the fork completes, bgsave returns "Background saving started" and no longer blocks the parent process, which can continue to respond to other commands.

(4) The child process creates the RDB file: it generates a temporary snapshot file from the parent process's memory and, once done, atomically replaces the original file. The lastsave command returns the time the last RDB file was generated, which corresponds to the rdb_last_save_time field in info.

(5) The child process signals the parent that it is done, and the parent updates its statistics. See the rdb_* fields under info Persistence for details.

-- End of RDB persistence --

AOF persistence

AOF (append only file) persistence records every write command in a separate log and re-executes the commands in the AOF file on restart to restore the data. Its main purpose is to solve the real-time aspect of data persistence, and it is now the mainstream Redis persistence method.

How AOF persistence works

To enable AOF, set appendonly yes; it is off by default.

The AOF file name is set with the appendfilename option, and the default is appendonly.aof. The save path is the same as for RDB persistence and is set with the dir option.

The AOF workflow consists of: command write (append), file sync (sync), file rewrite (rewrite) and load on restart (load).

(1) All write commands are appended to aof_buf (a buffer).

(2) The AOF buffer is synced to disk according to the configured policy.

Why append commands to aof_buf? Redis responds to commands on a single thread. If every write command were appended directly to the file on disk, performance would depend entirely on the current disk load. Writing to the aof_buf buffer first has another benefit: Redis can offer several buffer-to-disk sync policies that trade off performance against safety.

(3) As the AOF file grows, it needs to be rewritten periodically to compact it.

(4) When the Redis server restarts, it can load the AOF file to recover the data.

AOF rewrite mechanism

Purpose of rewriting:

  • Reduce the space taken by the AOF file;
  • A smaller AOF file can be loaded faster by Redis during recovery.

AOF rewriting can be triggered manually or automatically:

  • Manual trigger: call the bgrewriteaof command directly.
  • Automatic trigger: the trigger time is determined by the auto-aof-rewrite-min-size and auto-aof-rewrite-percentage parameters.

auto-aof-rewrite-min-size: the minimum file size at which an AOF rewrite runs; the default is 64MB.

auto-aof-rewrite-percentage: the growth of the current AOF file size (aof_current_size) relative to the AOF file size after the last rewrite (aof_base_size).

Automatic trigger condition

aof_current_size > auto-aof-rewrite-min-size and (aof_current_size - aof_base_size) / aof_base_size >= auto-aof-rewrite-percentage.

aof_current_size and aof_base_size can be seen in the info Persistence statistics.
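The trigger condition can be written out as a small check. This is a sketch of the rule as stated in this article, not Redis source code: the function name is hypothetical, sizes are in bytes, and the percentage is taken as a whole number such as 100 (meaning the file has doubled since the last rewrite).

```python
MB = 1024 * 1024


def should_auto_rewrite(aof_current_size, aof_base_size,
                        min_size=64 * MB, percentage=100):
    """Return True when both automatic AOF rewrite conditions hold."""
    if aof_current_size <= min_size:
        return False  # the file is still too small to be worth rewriting
    growth = aof_current_size - aof_base_size
    # Same as growth / aof_base_size >= percentage / 100, without dividing by zero.
    return growth * 100 >= percentage * aof_base_size
```

For example, with these values a 128 MB file whose size after the last rewrite was 64 MB qualifies, while a 100 MB file with the same base does not, because it has grown by only about 56%.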

Why does rewriting make the AOF file smaller?

(1) The old AOF file contains commands that are no longer needed, such as del key1 and hdel key2. The rewrite keeps only the write commands for the final data.

(2) Multiple commands can be merged. For example, lpush list a, lpush list b and lpush list c can be turned directly into lpush list a b c.
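Both effects can be seen in a toy model: replay the old log into an in-memory state, then emit one command per surviving key. This is only a sketch that handles set, del and lpush on well-typed keys; the function name and the tuple representation of commands are assumptions, and real Redis generates the new file from its in-memory data rather than by re-reading the old log.

```python
def rewrite_log(commands):
    """Replay (command, key, *args) tuples and return a minimal equivalent log."""
    state = {}
    for command, key, *args in commands:
        if command == "set":
            state[key] = args[0]
        elif command == "del":
            state.pop(key, None)
        elif command == "lpush":
            items = state.setdefault(key, [])
            for value in args:
                items.insert(0, value)  # each pushed value becomes the new head
    rewritten = []
    for key, value in state.items():
        if isinstance(value, list):
            # Pushing the elements tail-first in one command rebuilds the same order.
            rewritten.append(("lpush", key, *reversed(value)))
        else:
            rewritten.append(("set", key, value))
    return rewritten
```

Given a set followed by a del of the same key and three separate lpush commands, the function returns a single lpush with three values, which is the merge described in point (2).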

Data recovery process:

(1) When AOF persistence is enabled and an AOF file exists, the AOF file is loaded first.

(2) When AOF is disabled or the AOF file does not exist, the RDB file is loaded.

(3) After the AOF/RDB file loads successfully, Redis starts successfully.

(4) If the AOF/RDB file contains an error, Redis fails to start and prints an error message.
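Steps (1) and (2) amount to a small decision rule, sketched below. The function name and its return values are hypothetical and only restate the order described above.

```python
def choose_startup_file(aof_enabled, aof_exists, rdb_exists):
    """Return which file would be loaded on restart: "aof", "rdb" or None."""
    if aof_enabled and aof_exists:
        return "aof"  # AOF takes priority when it is enabled and present
    if rdb_exists:
        return "rdb"
    return None  # nothing to load: start with an empty dataset
```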

-- End of AOF persistence --

Advantages and disadvantages of RDB and AOF

RDB advantages

  • An RDB file is very compact and holds the dataset at a point in time, which makes it well suited to backups. For example, you can keep hourly snapshots for the past 24 hours and daily snapshots for the past 30 days, so that if something goes wrong you can restore whichever version of the dataset you need.
  • RDB is a single compact file that is easy to transfer to a remote data center, which is ideal for disaster recovery.
  • When saving an RDB file, the only thing the parent process has to do is fork a child process; the child does all the remaining work and the parent performs no other disk I/O. RDB persistence therefore maximizes Redis performance.
  • Compared with AOF, RDB restores large datasets faster.

AOF advantages

  • You can choose among fsync policies: no fsync, fsync every second, or fsync on every write. With the default policy of fsync every second, Redis still performs very well (fsync is handled by a background thread, and the main thread keeps serving client requests), and in a failure you lose at most 1 second of data.
  • The AOF file is an append-only log, so writes need no seek. Even if a command is only partially written for some reason (the disk is full, the server goes down mid-write, and so on), the redis-check-aof tool can fix the problem.
  • When the AOF file becomes too large, Redis can automatically rewrite it in the background. The rewritten AOF file contains the minimum set of commands needed to rebuild the current dataset. The whole rewrite is safe: while creating the new AOF file, Redis keeps appending commands to the existing one, so even if the server goes down during the rewrite, the existing AOF file is not lost. Once the new AOF file is complete, Redis switches from the old file to the new one and starts appending to it.
  • The AOF file stores all writes to the database in order, in the Redis protocol format, so its contents are easy to read and easy to parse. Exporting an AOF file is also simple. For example, if you accidentally run FLUSHALL, then as long as the AOF file has not been rewritten you can stop the server, remove the FLUSHALL command at the end of the AOF file, and restart Redis to restore the dataset to its state before FLUSHALL.

RDB disadvantages

  • Saving the whole dataset is a heavy operation, so you typically do a full save every 5 minutes or more. If Redis goes down unexpectedly, you may lose several minutes of data.
  • RDB needs to fork a child process often in order to save the dataset to disk. When the dataset is large, the fork can be time consuming and may leave Redis unable to respond to client requests for some milliseconds.

AOF disadvantages

  • For the same dataset, an AOF file is usually larger than the RDB file.
  • Loading data from AOF is slower than from RDB, and RDB can usually give better guarantees on maximum latency.

A brief comparison of RDB and AOF

RDB advantages:

  • RDB is a compact binary file, well suited to backups, full replication and similar scenarios;
  • RDB recovers data much faster than AOF.

RDB disadvantages:

  • RDB cannot provide real-time or second-level persistence;
  • RDB formats are not always compatible between old and new Redis versions.

AOF advantages:

  • Better protection against data loss;
  • Append-only writes perform relatively well;
  • Suitable for emergency recovery after a catastrophic accidental deletion.

AOF disadvantages:

  • For the same dataset, the AOF file is larger than the RDB snapshot;
  • Enabling AOF affects write QPS, which is lower than with RDB;
  • Database recovery is slow, so it is not suitable for cold standby.

Redis internally uses a file event handler. This file event handler is single threaded, which is why Redis is said to use a single-threaded model. It uses an I/O multiplexing mechanism to listen on multiple sockets at the same time and selects the corresponding event handler according to the event on each socket.

If the interviewer goes on to ask why the Redis single-threaded model can still be so efficient:

  • Pure in-memory operations
  • The core is based on a non-blocking I/O multiplexing mechanism
  • A single thread avoids the frequent context switching of multithreading

In real production environments you sometimes run into cache penetration, cache breakdown, cache avalanche and other abnormal scenarios. To avoid the heavy losses they can cause, we need to understand the cause of each and its solution, which helps improve system reliability and availability.

(1) Cache penetration

What is cache penetration?

Cache penetration means the requested data is not in the cache (a miss) and does not exist in the database either, so every request for that data has to query the database and then returns empty.

If a malicious attacker keeps requesting data that does not exist in the system, a large number of requests hit the database in a short time, putting it under heavy pressure or even bringing it down.

Common solutions for cache penetration

(1) Bloom filter (recommended)

The Bloom filter (BF) was proposed by Burton Howard Bloom in 1970. It is a space-efficient probabilistic data structure.

A Bloom filter is designed to test whether a specific element is present in a set.

To judge whether an element is in a set, we usually search and compare. The search efficiency of different data structures is as follows:

  • Linear list: search time complexity is O(N)
  • Balanced binary search tree (AVL tree, red-black tree): search time complexity is O(logN)
  • Hash table: taking hash collisions into account, the overall time complexity is O(log(n/m))

When you need to determine whether an element exists in a massive dataset, the search is not only slow but also takes a lot of storage space. Let's see how the Bloom filter solves this problem.

Design of the Bloom filter

A Bloom filter is a data structure made up of a bit array of m bits and k hash functions. All bits are initialized to 0, and each hash function should spread its input as evenly as possible.

To insert an element, the element is run through the k hash functions to produce k hash values. Each hash value is used as an index into the bit array, and all k corresponding bits are set from 0 to 1.

To query an element, the same hash functions compute the hash values, and the k corresponding bits are checked. If any of them is 0, the element is definitely not in the set. If all of them are 1, the element may be in the set. Why only "may"? Because different elements can produce the same hash values, so collisions occur, and the bits for an element that was never inserted may all be 1. This is called a "false positive". A "false negative", in contrast, never occurs in a Bloom filter.

To sum up: if the Bloom filter says an element is absent, it is definitely not in the set; if the Bloom filter says it is present, it may or may not be in the set.

For example, take a Bloom filter with 18 bits and 3 hash functions. The three elements x, y and z in the set are each hashed by the three hash functions to different bits, and those bits are set to 1. When querying element w, the three hash functions show that one of its bits is 0, so w is definitely not in the set.
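The insert and query steps can be sketched in a few lines of Python. This is a minimal illustration, not a production filter: deriving the k hash functions from SHA-256 with a different prefix per function is an assumption of this sketch, and the class and method names are hypothetical.

```python
import hashlib


class BloomFilter:
    """Bit array of m bits plus k hash functions derived from SHA-256."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = [0] * m  # every bit starts at 0

    def _positions(self, item):
        data = item.encode("utf-8")
        for i in range(self.k):
            # Prefixing the input with i gives k different hash functions.
            digest = hashlib.sha256(i.to_bytes(4, "big") + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for position in self._positions(item):
            self.bits[position] = 1

    def might_contain(self, item):
        """False means definitely absent; True means possibly present."""
        return all(self.bits[position] for position in self._positions(item))
```

An element that was added always reports True, so there are no false negatives; an element that was never added usually reports False, but can report True when its k bits happen to have been set by other elements.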

Advantages and disadvantages of the Bloom filter

Advantages:

  • Space saving: there is no need to store the data itself, only the bits its hashes map to
  • Low time complexity: insertion and lookup are both O(k), where k is the number of hash functions

Disadvantages:

  • False positives: when the Bloom filter reports an element as present, the element may not actually be in the set; the accuracy depends on the number of hash functions
  • Elements cannot be deleted: if an element is removed from the set, it cannot be removed from the Bloom filter, which is another source of false positives

Application scenarios of the Bloom filter

  • URL deduplication in crawler systems
  • Spam filtering
  • Blacklists

(2) Return an empty object

When the cache misses and the query against the persistence layer also comes back empty, the empty result can be written to the cache. The next request for that key is answered directly from the cache with the empty object and does not reach the persistence-layer database. To avoid storing too many empty objects, an expiration time is usually set on them.

This approach has two problems:

  • If a large number of keys penetrate, caching empty objects takes up valuable memory.
  • Because the empty object's key has an expiration time, the cache and the persistence layer may be inconsistent during that period.
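A minimal sketch of caching empty results, using a plain dict as the cache so it runs on its own. The names `lookup`, `_EMPTY` and `load_from_db`, and the two TTL values, are assumptions for illustration; the short `empty_ttl` reflects the advice above to expire empty objects quickly.

```python
import time

_EMPTY = object()  # sentinel meaning "the database has no row for this key"


def lookup(key, cache, load_from_db, ttl=300, empty_ttl=30, now=None):
    """Return the value for key, caching misses briefly so they skip the database."""
    now = time.time() if now is None else now
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if now < expires_at:
            return None if value is _EMPTY else value
    value = load_from_db(key)  # assumed to return None when the row does not exist
    stored = _EMPTY if value is None else value
    cache[key] = (stored, now + (empty_ttl if value is None else ttl))
    return value
```

Repeated requests for a nonexistent key within `empty_ttl` seconds reach the database only once; after that window the database is consulted again, which limits how long the cache and the database can disagree.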

(2) Cache breakdown

What is cache breakdown?

Cache breakdown means that a very hot key is constantly carrying heavy concurrent access. At the moment this key expires, the sustained concurrent traffic breaks through the cache and hits the database directly, like punching a hole in a barrier.

Harm of cache breakdown

The instantaneous pressure on the database surges, causing a large number of requests to block.

How to solve it?

Option one: Use a mutex (mutex key)

The idea is fairly simple: let one thread write back to the cache while the other threads wait for it to finish and then re-read the cache.

At any moment only one thread reads the database and writes back to the cache; all other threads are blocked. In a high-concurrency scenario, so much blocking inevitably reduces throughput. How would you deal with that? Feel free to discuss it in the comments.

For a distributed application, you need a distributed lock.
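Within a single process, the mutex idea can be sketched with `threading.Lock` and a second cache check inside the lock. The class name `HotKeyCache` and the `load_from_db` callback are hypothetical, and this in-process lock is not a substitute for the distributed lock mentioned above.

```python
import threading


class HotKeyCache:
    """Lets only one thread rebuild a missing entry; the others wait and re-read."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        value = self._data.get(key)
        if value is not None:
            return value  # fast path: no locking on a cache hit
        with self._lock:
            # Re-check: another thread may have rebuilt the entry while we waited.
            value = self._data.get(key)
            if value is None:
                value = self._load_from_db(key)
                self._data[key] = value
        return value
```

A single lock for all keys keeps the sketch short; a per-key lock would avoid blocking requests for unrelated keys.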

Option two: Hot data never expires

"Never expires" actually has two meanings:

  • No physical expiration: do not set an expiration time on hot keys
  • Logical expiration: store the expiration time in the value for the key; if the value is found to be expired, rebuild the cache in a background asynchronous thread

In practice this approach is very performance friendly. The only drawback is that while the cache is being rebuilt, the other threads (those not rebuilding the cache) may read old data, which is acceptable for systems that do not require strict strong consistency.

(3) Cache avalanche

What is a cache avalanche?

A cache avalanche means a large batch of cached data expires at the same time while query volume is huge, so requests go straight to the database, putting it under heavy pressure or even bringing it down. It differs from cache breakdown: breakdown is concurrent queries for the same piece of data, while an avalanche is many different pieces of data expiring, so many lookups miss the cache and fall through to the database.

Cache avalanche solutions

Common solutions are:

  • Even expiration
  • Add a mutex lock
  • Cache never expires
  • Double-layer cache strategy

(1) Even expiration

Set different expiration times so that cache expirations are spread as evenly as possible. This is usually done by adding a random value to the validity period or by planning validity periods centrally.
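Adding a random value to the validity period takes one line. In this sketch the function name and the example values (a one-hour base with up to five minutes of jitter) are assumptions, not recommendations.

```python
import random


def ttl_with_jitter(base_ttl, max_jitter, rng=random):
    """Return base_ttl plus a random number of seconds in [0, max_jitter]."""
    return base_ttl + rng.randint(0, max_jitter)


# Keys written in the same batch now expire at slightly different times.
ttls = [ttl_with_jitter(3600, 300) for _ in range(1000)]
```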

(2) Add a mutex lock

This is the same as the cache breakdown solution: only one thread builds the cache at a time, and the other threads block and queue.

(3) Cache never expires

This is the same as the cache breakdown solution: the cache never physically expires, and an asynchronous thread updates it.

(4) Double-layer cache strategy

Use a primary and a backup cache:

Primary cache: the validity period is set from experience. It is the cache read by default, and after it expires the latest value is loaded from the database.

Backup cache: it has a long validity period and is read when acquiring the lock fails. When the primary cache is updated, the backup cache must be updated at the same time.

(4) Cache preheating

What is cache preheating?

Cache preheating means that after the system goes online, the relevant data is loaded directly into the cache system in advance. This avoids the situation where a user's request first has to query the database and then write the data back to the cache.

Without preheating, Redis starts out empty, so in the early period after launch the high-concurrency traffic all hits the database and puts it under pressure.

How to warm up the cache

  • When the amount of data is small, load the cache when the project starts;
  • When the amount of data is large, set up a scheduled task script to refresh the cache;
  • When the amount of data is too large, give priority to loading the hot data into the cache in advance.

(5) Cache degradation

Cache degradation means that when the cache fails or the cache server goes down, the service does not access the database; instead it directly returns default data or reads from data held in the service's own memory.

In practice, some hot data is usually also cached in the service's memory. That way, if the cache becomes abnormal, the in-memory data can be used directly, which keeps huge pressure off the database.

Degradation is generally a lossy operation, so try to minimize the impact of degradation on the business.

8、Redis memory eviction mechanism

The Redis memory eviction policy determines, when cache memory is insufficient, which old data is evicted to make room for new data.

How to configure maximum memory?

(1) Configure through the configuration file

Modify the redis.conf configuration file:

# Set the maximum memory Redis may use to 1024 MB
maxmemory 1024mb

Note: maxmemory defaults to 0. On a 64-bit operating system this means no memory limit, so Redis can use whatever memory the operating system has available; on a 32-bit operating system the maximum is 3GB.

(2) Configure through dynamic commands

Redis supports changing the memory limit at runtime with commands. The first command below sets the maximum memory Redis may use to 200 MB, and the second reads the setting back (the value is reported in bytes):

127.0.0.1:6379> config set maxmemory 200mb
127.0.0.1:6379> config get maxmemory
1) "maxmemory"
2) "209715200"
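The byte value in the reply follows from the unit rules documented in redis.conf, where mb means 1024*1024 bytes while m means 1,000,000 bytes. The small Python check below reproduces that arithmetic; the helper name to_bytes is an illustrative assumption and handles only the suffixes listed.

```python
# Unit multipliers as described in the redis.conf comments.
UNITS = {
    "k": 1000, "kb": 1024,
    "m": 1000 ** 2, "mb": 1024 ** 2,
    "g": 1000 ** 3, "gb": 1024 ** 3,
}


def to_bytes(size):
    """Convert a size such as '200mb' into a number of bytes."""
    text = size.strip().lower()
    # Try the two-letter suffixes first so 'mb' is not mistaken for 'b' + 'm'.
    for suffix in sorted(UNITS, key=len, reverse=True):
        if text.endswith(suffix):
            return int(text[:-len(suffix)]) * UNITS[suffix]
    return int(text)


print(to_bytes("200mb"))  # 209715200, matching the config get reply
```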

Eviction policies

What happens if you keep adding data after Redis has run out of memory? Redis officially defines eight policies for handling this situation:

noeviction

The default policy. Write requests get an error in return, and nothing is evicted.

allkeys-lru

LRU (Least Recently Used). Evicts from all keys using the approximate LRU algorithm.

volatile-lru

LRU (Least Recently Used). Evicts from the keys that have an expiration time set, using the approximate LRU algorithm.

allkeys-random

Evicts randomly from all keys.

volatile-random

Evicts randomly from the keys that have an expiration time set.

volatile-ttl

TTL (Time To Live). Among the keys that have an expiration time set, evicts according to expiration time: the sooner a key expires, the earlier it is evicted.

allkeys-lfu

LFU (Least Frequently Used). Evicts from all keys using the approximate LFU algorithm. Supported since Redis 4.0.

volatile-lfu

LFU (Least Frequently Used). Evicts from the keys that have an expiration time set, using the approximate LFU algorithm. Supported since Redis 4.0.

Note: with the volatile-lru, volatile-random, volatile-ttl and volatile-lfu policies, if there is no key with an expiration time set that can be evicted, Redis returns an error just as with noeviction.

LRU algorithm

LRU (Least Recently Used) is a cache replacement algorithm. When memory is used as a cache, the cache size is generally fixed. When the cache is full and more data keeps being added, some old data has to be evicted to free memory for the new data, and the LRU algorithm can be used for this. Its central idea is: if a piece of data has not been used recently, it is unlikely to be used in the future, so it can be evicted.

LRU implementation in Redis

Redis uses an approximate LRU algorithm, which is not quite the same as the regular LRU algorithm. The approximate algorithm evicts data by random sampling: each time it randomly samples 5 keys (by default) and evicts the least recently used one among them.

The number of samples can be changed with the maxmemory-samples parameter, for example: maxmemory-samples 10

The larger maxmemory-samples is, the closer the eviction result is to strict LRU, but the higher the CPU cost.

To implement approximate LRU, Redis adds an extra 24-bit field to each key that stores the time the key was last accessed.

Redis 3.0 optimization of approximate LRU

Redis 3.0 optimized the approximate LRU algorithm. The new algorithm maintains a candidate pool (of size 16) whose entries are sorted by access time. The first batch of randomly sampled keys all go into the pool; after that, a randomly sampled key is put into the pool only if its last access time is earlier than the smallest one in the pool, until the pool is full. Once the pool is full, if a new key needs to be put in, the key in the pool with the largest last access time (the most recently accessed) is removed.

When eviction is needed, the key with the smallest last access time (the one that has gone longest without access) is simply picked from the pool and evicted.
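The sampling idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Redis implementation: last_access is a plain dict mapping each key to a timestamp, the function name evict_sampled_lru is made up for illustration, and the candidate pool of Redis 3.0 is left out.

```python
import random


def evict_sampled_lru(last_access, samples=5, rng=random):
    """Evict one key: sample up to `samples` keys, drop the least recently used."""
    population = sorted(last_access)
    candidates = rng.sample(population, min(samples, len(population)))
    # Among the sampled keys only, pick the one with the oldest access time.
    victim = min(candidates, key=last_access.__getitem__)
    del last_access[victim]
    return victim


# 100 keys where k0 is the oldest and k99 the most recently used.
access_times = {"k%d" % i: i for i in range(100)}
evicted = evict_sampled_lru(access_times, samples=5)
```

With samples=5 the evicted key is only the oldest of the five sampled keys, which is not necessarily k0. Raising the sample count toward the total number of keys makes the result converge on strict LRU, at the cost of more work per eviction, which mirrors the maxmemory-samples trade-off described above.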

LFU algorithm

LFU (Least Frequently Used) is an eviction policy added in Redis 4.0. Its core idea is to evict based on how frequently a key has been accessed recently: keys that are rarely accessed are evicted first, and keys that are accessed often are kept.

LFU better represents how hot a key is. With LRU, a key that has not been accessed for a long time but happens to be accessed once is treated as hot data and is not evicted, while some keys that are likely to be accessed in the future get evicted instead. LFU does not have this problem, because a single access does not make a key hot.
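The difference can be seen in a small Python comparison. It uses exact hit counts and timestamps purely for illustration; Redis itself keeps an approximate frequency counter, and the key names and numbers below are invented.

```python
def lru_victim(stats):
    """Pick the key whose last access is the oldest."""
    return min(stats, key=lambda k: stats[k]["last_access"])


def lfu_victim(stats):
    """Pick the key with the fewest accesses."""
    return min(stats, key=lambda k: stats[k]["hits"])


stats = {
    # Accessed 500 times, but not in the last few moments.
    "hot": {"last_access": 90, "hits": 500},
    # Accessed only once, and that single access happened just now.
    "rare": {"last_access": 100, "hits": 1},
}

print(lru_victim(stats))  # hot  (LRU is fooled by the one recent access)
print(lfu_victim(stats))  # rare (LFU keeps the frequently used key)
```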

9、Does Redis have a transaction mechanism?

Yes, it does. The lifecycle of a Redis transaction is:

  • Open the transaction: use MULTI to start a transaction
  • Queue the commands: the command for each operation is added to a queue, but is not actually executed at this point
  • Commit the transaction: use EXEC to commit the transaction, which starts executing the queued commands in order

10、Are Redis transactions atomic?

Let's first look at the definition of atomicity in ACID for relational databases. Atomicity: all operations in a transaction either complete in full or do not happen at all; the transaction never stops partway through. If an error occurs during execution, the transaction is rolled back to the state before it began, as if it had never been executed.

The official documentation describes transactions as follows:

  • A transaction is a single isolated operation: all commands in the transaction are serialized and executed in order. While the transaction is executing, it is not interrupted by command requests from other clients.
  • A transaction is an atomic operation: the commands in the transaction are either all executed or none of them are. The EXEC command is responsible for triggering and executing all commands in the transaction. If the client opens a transaction with MULTI but fails to execute EXEC because the connection drops, none of the commands in the transaction are executed. If, on the other hand, the client successfully executes EXEC after opening the transaction, all the commands in the transaction are executed.

The official view is that a Redis transaction is an atomic operation, which is true in the sense of whether the commands get executed or not. Measured against the ACID definition of atomicity, however, Redis transactions are strictly speaking not atomic, because if a command fails while the command sequence is executing, Redis neither stops execution nor rolls back the data.
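The queue-then-execute behavior, including the lack of rollback, can be imitated with a toy in-memory store in Python. This is a sketch for illustration only: TinyStore is a hypothetical class, it is not a Redis client, and it supports just two made-up commands that loosely resemble SET and INCR.

```python
class TinyStore:
    """Toy in-memory store mimicking MULTI/EXEC queuing (not a Redis client)."""

    def __init__(self):
        self.data = {}
        self._queue = None

    def multi(self):
        self._queue = []                 # open the transaction
        return "OK"

    def call(self, name, *args):
        if self._queue is not None:
            self._queue.append((name, args))
            return "QUEUED"              # queued, not executed yet
        return self._run(name, args)

    def exec(self):
        queue, self._queue = self._queue, None
        results = []
        for name, args in queue:
            try:
                results.append(self._run(name, args))
            except TypeError as err:
                # Record the failure and keep going: no rollback happens.
                results.append(err)
        return results

    def _run(self, name, args):
        if name == "SET":
            key, value = args
            self.data[key] = value
            return "OK"
        if name == "INCR":
            (key,) = args
            current = self.data.get(key, 0)
            if not isinstance(current, int):
                raise TypeError("value is not an integer")
            self.data[key] = current + 1
            return self.data[key]
        raise ValueError("unknown command %s" % name)


store = TinyStore()
store.multi()
store.call("SET", "a", "text")
store.call("INCR", "a")      # will fail at EXEC time: wrong type
store.call("SET", "b", 1)
results = store.exec()
```

After exec, results holds "OK", a TypeError and "OK", and both SET commands have taken effect even though the command between them failed.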

11、Why doesn't Redis support rollback?

Redis commands may fail while a transaction is running, but Redis still executes the remaining commands in the transaction and does not roll back. If you are familiar with transactions in relational databases such as MySQL, you will find this confusing. The official reasons are as follows:

  • A Redis command fails only when it is called with a syntax error (a problem Redis can detect while the command is being put on the transaction queue), or when it performs an operation on a key that does not match the key's data type. In practice this means that only programming errors cause Redis commands to fail, and such errors are likely to be found during development and rarely show up in production.
  • Supporting transaction rollback would complicate the design, which goes against the original intent of Redis: its design goal is simplified functionality and faster execution.

There is a common objection to this official argument: what if the program has a bug? But rollback cannot actually fix a bug in the program. For example, a careless programmer intends to update key A but ends up updating key B; a rollback mechanism cannot remedy this kind of human error. Because such errors are unlikely to reach production, Redis was designed to take the simpler and faster approach and does not implement rollback.

12、What are the Redis transaction-related commands?

(1)WATCH

WATCH gives Redis transactions check-and-set (CAS) behavior. Keys passed to WATCH are monitored so that any change to them is detected. If at least one watched key is modified before EXEC is executed, the whole transaction is cancelled and EXEC returns a nil reply to indicate that the transaction failed.
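The check-and-set behavior can be illustrated with a toy Python model that keeps a version number per key. Everything here is a hypothetical stand-in (the class WatchedStore, its methods and the stock example); how Redis tracks watched keys internally is not shown in this article and is not what the sketch claims to reproduce.

```python
class WatchedStore:
    """Toy store tracking a version per key to imitate WATCH (not a Redis client)."""

    def __init__(self):
        self.data = {}
        self.version = {}

    def set(self, key, value):
        self.data[key] = value
        self.version[key] = self.version.get(key, 0) + 1

    def watch(self, *keys):
        # Snapshot the versions seen at WATCH time.
        return {k: self.version.get(k, 0) for k in keys}

    def exec(self, watched, writes):
        # Abort (return None, like a nil reply) if any watched key changed.
        if any(self.version.get(k, 0) != v for k, v in watched.items()):
            return None
        for key, value in writes:
            self.set(key, value)
        return ["OK"] * len(writes)


shop = WatchedStore()
shop.set("stock", 10)
snapshot = shop.watch("stock")
shop.set("stock", 9)                            # another client modifies the key
first_try = shop.exec(snapshot, [("stock", 5)])   # None: transaction cancelled
second_try = shop.exec(shop.watch("stock"), [("stock", 8)])
```

A client that gets the nil reply would normally read the key again and retry, which is what the second attempt imitates.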

(2)MULTI

Starts a transaction and always returns OK. After MULTI is executed, the client can keep sending any number of commands to the server. These commands are not executed immediately; they are placed in a queue, and when EXEC is called all the commands in the queue are executed.

(3)UNWATCH

Cancels the WATCH command's monitoring of all keys. It is generally used before the DISCARD and EXEC commands. If EXEC or DISCARD has already been executed after WATCH, there is no need to execute UNWATCH: EXEC executes the transaction, so WATCH has already had its effect, and DISCARD cancels the transaction and also cancels the monitoring of all keys. After either of these two commands, UNWATCH is therefore unnecessary.

(4)DISCARD

When DISCARD is executed, the transaction is abandoned, the transaction queue is emptied, and the client exits the transaction state.

(5)EXEC

Responsible for triggering and executing all commands in the transaction:

If the client opens a transaction and successfully executes EXEC, all the commands in the transaction are executed.

If the client opens a transaction with MULTI but fails to execute EXEC because the connection drops, none of the commands in the transaction are executed. One thing to watch out for: even if one or more commands in the transaction fail, the other commands in the transaction queue still continue to execute. Redis does not stop executing the commands in a transaction, and it does not roll back the way the relational databases we usually use do.

13、What is Redis master-slave replication?

Master-slave replication means copying the data of one Redis server to other Redis servers. The former is called the master node (master) and the latter the slave nodes (slave). Data replication is one-way, from master to slave only.

The role of master-slave replication

  • Data redundancy: master-slave replication provides a hot backup of the data, a form of data redundancy in addition to persistence.
  • Fault recovery: when the master node has a problem, a slave node can provide service, allowing fast fault recovery; this is effectively a redundancy of services.
  • Load balancing: on top of master-slave replication, combined with read/write splitting, the master node can provide the write service and the slave nodes the read service, sharing the server load. Especially in write-light, read-heavy scenarios, spreading the read load across multiple slave nodes can greatly increase the concurrency a Redis deployment can handle.
  • Cornerstone of high availability: master-slave replication is also the foundation on which Sentinel and Cluster are implemented, so it is the basis of Redis high availability.

Principle of master-slave replication

The master-slave replication process can be divided into 3 stages: the connection establishment stage, the data synchronization stage, and the command propagation stage.

Connection establishment stage

The main purpose of this stage is to establish a connection between the master and slave nodes in preparation for data synchronization.

Step 1: Save the master node information

The slaveof command is asynchronous: when slaveof is executed on the slave node, the slave node immediately returns ok to the client. The slave node server maintains two fields internally, masterhost and masterport, which store the master node's ip and port.

Step 2: Establish the socket connection

The slave node calls the replication timer function replicationCron() once per second. If it finds a master node that can be connected to, it creates a socket connection based on the master node's ip and port.

The slave node sets up a file event handler for this socket dedicated to replication, which takes care of the subsequent replication work, such as receiving the RDB file and receiving propagated commands.

After the master node accepts the slave node's socket connection, it creates the corresponding client state for the socket and treats the slave node as a client connected to the master node. The following steps all take the form of the slave node sending command requests to the master node.

Step 3: Send the ping command

After the slave node becomes a client of the master node, it sends the ping command as its first request. The purpose is to check whether the socket connection is usable and whether the master node can currently handle requests.

After the slave node sends the ping command, 3 outcomes are possible:

(1) pong is returned: the socket connection is normal and the master node can currently handle requests, so the replication process continues.

(2) Timeout: if the slave node has not received the master node's reply after a certain period of time, the socket connection is unusable, so the slave node closes the socket connection and reconnects.

(3) A result other than pong is returned: if the master node returns some other result, for example because it is busy processing a script that has run past its time limit, the master node currently cannot handle commands, so the slave node closes the socket connection and reconnects.

Step 4: Authentication

If the masterauth option is set on the slave node, the slave node must authenticate with the master node; if the option is not set, no authentication is needed. The slave node authenticates by sending the auth command to the master node, with the masterauth value from the configuration file as the command's parameter.

If the master node's password setting and the slave node's masterauth are consistent (meaning either both are set and the passwords are the same, or neither is set), authentication passes and the replication process continues. If they are not consistent, the slave node closes the socket connection and reconnects.

Step 5: Send the slave node's port information

After authentication, the slave node sends its listening port number (6380 in the earlier example) to the master node, and the master node saves it in the slave_listening_port field of the client that corresponds to this slave node. Apart from being displayed when info Replication is run on the master node, this port information serves no other purpose.

Data synchronization stage

Once the connection between the master and slave nodes is established, data synchronization can begin; this stage can be understood as initializing the slave node's data. Concretely, the slave node sends the psync command to the master node (the sync command before Redis 2.8) to start synchronizing.

The data synchronization stage is the core of master-slave replication. Depending on the current state of the master and slave nodes, it is either a full replication or a partial replication. These two kinds of replication and the execution of the psync command are a topic of their own and are not detailed here.

Command propagation stage

After the data synchronization stage, the master and slave nodes enter the command propagation stage. In this stage the master node sends the write commands it executes to the slave node, and the slave node receives and executes them, which keeps the data on the master and slave nodes consistent.

Note that command propagation is asynchronous: the master node does not wait for the slave node's reply after sending a write command. It is therefore difficult to keep the master and slave nodes consistent in real time, and some delay is unavoidable. The degree of inconsistency depends on the network conditions between the master and slave nodes, how frequently the master node executes write commands, and the repl-disable-tcp-nodelay setting on the master node, among other factors.

14、Can you tell me something about Sentinel (sentinel mode)?

In Redis master-slave replication mode, once the master node fails and can no longer provide service, a slave node has to be promoted to master manually, and the clients have to be told to update the master node address. This way of handling failures is, to some extent, unacceptable.

Since Redis 2.8, the Redis Sentinel mechanism has been provided to solve this problem.

Redis Sentinel is the Redis high-availability implementation. Sentinel is a tool for managing multiple Redis instances; it provides monitoring, notification and automatic failover for Redis.

The Redis Sentinel architecture is as follows:

The principle of sentinel mode

The main function of sentinel mode is to complete fault detection and failover automatically and to notify the clients, thereby achieving high availability. Sentinel mode usually consists of a group of Sentinel nodes and one (or more) groups of master-slave replication nodes.

Heartbeat

(1)Sentinel and Redis nodes

Redis Sentinel is a special kind of Redis node. When sentinel mode is set up, the relationship between Sentinel and the Redis master node is specified through configuration. Sentinel then obtains information about all the slave nodes from the master node, and afterwards periodically sends the info command to the master and slave nodes to obtain their topology and state information.

(2)Sentinel and Sentinel

Based on the Redis publish/subscribe feature, each Sentinel node publishes, on the master node's __sentinel__:hello channel, its judgment of the master node together with its own node information. Each Sentinel node also subscribes to this channel to obtain the information of the other Sentinel nodes and their judgments of the master node.

Through the two steps above, all the Sentinel nodes, as well as the Sentinel nodes and all the Redis nodes, become aware of one another. After that, each Sentinel node periodically sends the ping command to the master node, the slave nodes and the other Sentinel nodes as a heartbeat check to confirm whether these nodes are reachable.

Failover

Each Sentinel node performs the heartbeat check periodically. When the heartbeat to the master node times out, that Sentinel considers the master node unavailable; this judgment is called subjective offline.

The Sentinel node then uses the sentinel is-master-down-by-addr command to ask the other Sentinel nodes for their judgment of the master node. When quorum (the required number of) Sentinel nodes all consider the node to have failed, it is marked objectively offline, that is, the node is confirmed to be unavailable. This also explains why a group of Sentinel nodes is needed: a single Sentinel node can easily misjudge the failure state.

The value of quorum here is specified when setting up sentinel mode, as explained later. It is usually the total number of Sentinel nodes / 2 + 1, so that more than half of the Sentinel nodes must agree before the node is judged objectively offline.
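The usual quorum arithmetic is easy to check with a couple of Python lines. The function names majority_quorum and objectively_down are illustrative assumptions; the formula is the total / 2 + 1 rule quoted above, using integer division.

```python
def majority_quorum(sentinel_count):
    """Usual quorum choice: more than half of the Sentinel nodes."""
    return sentinel_count // 2 + 1


def objectively_down(down_votes, quorum):
    """The master is objectively offline once enough Sentinels agree."""
    return down_votes >= quorum


for total in (3, 5, 7):
    print(total, "Sentinels -> quorum", majority_quorum(total))
```

With three Sentinel nodes the quorum is 2, so one Sentinel that has lost its own network link cannot trigger a failover on its own.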

Because only one Sentinel node is needed to carry out the failover, the Sentinel nodes hold an election among themselves, using the Raft algorithm to elect one Sentinel leader to perform the failover.

The specific steps the elected Sentinel leader takes to fail over are as follows:

(1) Select one node from the list of slave nodes as the new master node

  • Filter out unhealthy or unsuitable nodes;
  • Select the slave node with the highest priority according to slave-priority (in Redis a lower slave-priority value means a higher priority); return it if one exists, otherwise continue;
  • Select the slave node with the largest replication offset; return it if one exists, otherwise continue;
  • Select the slave node with the smallest runid.

(2) The Sentinel leader executes the slaveof no one command on the selected slave node to make it the master node.

(3) The Sentinel leader sends commands to the remaining slave nodes so that they replicate data from the new master node.

(4) The Sentinel leader updates the original master node to be a slave node and keeps monitoring it; when it recovers, it is ordered to replicate from the new master node.
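The selection rules in step (1) amount to a filter followed by a three-level ordering, which can be sketched in Python as below. The Slave record and the function pick_new_master are hypothetical names for illustration. The sketch assumes the Redis convention that a lower slave-priority number is preferred and that a priority of 0 means the node is never promoted; the healthy flag stands in for whatever health checks Sentinel applies.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    runid: str
    priority: int         # slave-priority: lower number is preferred, 0 = never promote
    offset: int           # replication offset
    healthy: bool = True  # stand-in for Sentinel's health checks


def pick_new_master(slaves):
    """Apply the filter, then order by priority, offset and runid."""
    candidates = [s for s in slaves if s.healthy and s.priority > 0]
    if not candidates:
        return None
    # Lowest priority number first, then largest offset, then smallest runid.
    return min(candidates, key=lambda s: (s.priority, -s.offset, s.runid))


slaves = [
    Slave("c", 100, 500),
    Slave("b", 100, 900),
    Slave("a", 100, 900),
    Slave("z", 10, 0, healthy=False),
]
chosen = pick_new_master(slaves)
```

Here chosen is the node with runid "a": the unhealthy node is filtered out, the three remaining nodes tie on priority, "a" and "b" tie on the largest offset, and "a" has the smaller runid.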

15、Can you tell me something about Cluster (cluster mode)?

Why Cluster mode was introduced: in both sentinel mode and master-slave mode, data can only be written through the master. With massive data and high concurrency, a single node writing data easily becomes a bottleneck; Cluster mode allows multiple nodes to write data at the same time.

Redis Cluster uses a decentralized structure: every node holds data, and the nodes are connected to one another so that each knows the state of the whole cluster.

As shown in the figure, Cluster mode is effectively a combination of several master-slave replication structures. Each master-slave replication structure can be regarded as one node, so the Cluster shown above has three nodes.

16、What are the differences between Memcache and Redis?

Storage

Memcache stores all data in memory, so the data is lost on power failure, and the data cannot exceed the memory size.

Redis stores part of its data on the hard disk, which provides data persistence.

Supported data types

Memcache's support for data types is relatively simple.

Redis has rich data types.

Different underlying models

Their underlying implementations, and the application protocols they use to communicate with clients, are different.

Redis built its own VM mechanism (in early versions), because going through ordinary system calls wastes a certain amount of time on moving data and making requests.

17、Suppose Redis holds 100 million keys, of which 100,000 start with a fixed, known prefix. How do you find them all?

Use the keys command to scan out the list of matching keys:
keys pre*

18、If this Redis instance is serving online business, what is the problem with using the keys command?

Redis is single-threaded. The keys command blocks that thread for a while, so the online service stalls until the command finishes. In this case you can use the scan command instead: scan extracts the list of keys matching a pattern without blocking, but it may return duplicates with some probability, so deduplicate on the client side. The overall time taken, however, is longer than using the keys command directly.
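A cursor loop with client-side deduplication might look like the Python sketch below. It assumes a client object whose scan method takes cursor, match and count and returns a (next_cursor, keys) pair, which is how the redis-py client exposes the command; the function name scan_keys and the count value are illustrative.

```python
def scan_keys(client, pattern, count=1000):
    """Collect all keys matching `pattern` using cursor-based SCAN."""
    seen = set()      # a set removes the duplicates SCAN may return
    cursor = 0
    while True:
        cursor, batch = client.scan(cursor=cursor, match=pattern, count=count)
        seen.update(batch)
        if cursor == 0:   # a cursor of 0 means the iteration is complete
            return seen
```

Assuming client is a connected redis.Redis instance, scan_keys(client, "pre*") returns the same set of keys that keys pre* would list, but each round trip only blocks the server briefly.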

19、If a large number of keys need their expiration set at the same time, what should you generally pay attention to?

If the expiration times of a large number of keys are too concentrated, Redis may stall briefly at the moment they expire (because Redis is single-threaded). In serious cases this can lead to a server avalanche, so we usually add a random value to the expiration time to spread the expirations out as much as possible.

20、What are the common Redis clients?

Jedis: a long-established Java client for Redis that provides fairly comprehensive support for Redis commands.

Redisson: implements distributed and scalable Java data structures.

Lettuce: an advanced Redis client for thread-safe synchronous, asynchronous and reactive usage, with support for Cluster, Sentinel, pipelining and codecs.

Advantages:

Jedis: fairly comprehensive coverage of Redis operations.

Redisson: lets users separate their concerns from Redis itself and offers many distributed services, such as distributed locks and distributed collections; it can also support delay queues through Redis.

Lettuce: its communication layer is event-driven and based on the Netty framework, and its method calls are asynchronous. Lettuce's API is thread-safe, so a single Lettuce connection can be used to carry out all kinds of operations.

Summary

That is the end of the article! If the Redis interview questions above are not enough practice for you, you can try the author's summary of real Redis interview questions.

Copyright notice
Author: Java program ape. Please include the original link when reprinting, thank you.
https://en.chowdera.com/2022/175/20210526184508072X.html
