What does the term ‘Redis’ actually mean? It means REmote DIctionary Server.
Redis is an open source, in-memory data structure store. It is a very popular choice among developers for caching, as a message broker, and above all as a NoSQL key-value database for many different use cases.
Additionally, I want to highlight that the Redis documentation is very informative, and it is the go-to place if you need any further clarification on Redis.
Moving on to Redis basics: Redis is an in-memory database, which simply means Redis keeps its data in RAM. You also need to know that Redis supports several data structures such as strings, hashes, lists, sets, sorted sets with range queries, and bitmaps. Furthermore, it supports atomic operations such as appending to a string, incrementing the value in a hash, and pushing an element to a list.
Well, let's get started with Redis High Availability.
How does Redis offer High Availability and Automatic Failover?
Redis Sentinel is the high availability solution offered by Redis. In case of a failure in your Redis cluster, Sentinel will automatically detect the point of failure and bring the cluster back to a stable state without any human intervention.
What really happens inside Redis Sentinel?
Sentinel constantly checks the MASTER and SLAVE instances in the Redis cluster to verify that they are working as expected. If Sentinel detects a failure in the MASTER node of a given cluster, it starts a failover process. As a result, Sentinel picks a SLAVE instance and promotes it to MASTER. Finally, the remaining SLAVE instances are automatically reconfigured to use the new MASTER instance.
Sentinel also acts as a configuration provider, or source of authority, for client service discovery.
What does that mean? Simply put, application clients connect to the Sentinels, and the Sentinels provide the latest Redis MASTER address to them.
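As a rough illustration of that discovery step, here is a minimal Python sketch that asks each Sentinel in turn for the current master address, using the SENTINEL get-master-addr-by-name command over a plain socket. The helper names (encode_command, parse_master_reply, get_master_address) are made up for this example, and it assumes the short reply arrives in one read. A real application would normally use a Sentinel-aware client library instead.

```python
import socket


def encode_command(*parts):
    """Encode a command as a RESP array of bulk strings."""
    chunks = [b"*" + str(len(parts)).encode() + b"\r\n"]
    for part in parts:
        data = str(part).encode()
        chunks.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(chunks)


def parse_master_reply(reply):
    """Return (host, port) from a two-element array reply, or None otherwise."""
    lines = reply.split(b"\r\n")
    # Expected layout: *2, $<len>, host, $<len>, port
    if len(lines) < 5 or lines[0] != b"*2":
        return None
    return lines[2].decode(), int(lines[4])


def get_master_address(sentinels, master_name="mymaster", timeout=1.0):
    """Ask each Sentinel in turn; return the first master address reported."""
    for host, port in sentinels:
        try:
            with socket.create_connection((host, port), timeout=timeout) as conn:
                conn.sendall(
                    encode_command("SENTINEL", "get-master-addr-by-name", master_name)
                )
                # Assumption: the short reply arrives in a single recv() call.
                address = parse_master_reply(conn.recv(4096))
        except OSError:
            continue  # this Sentinel is unreachable, try the next one
        if address is not None:
            return address
    return None
```

With the setup used later in this post, a call could look like get_master_address([("10.52.209.46", 5000), ("10.52.209.47", 5000), ("10.52.209.49", 5000)]). Trying every Sentinel in the list is the point: the client keeps working even when the server hosting the old master, and its Sentinel, is down.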
Furthermore, Sentinel is a robust distributed system in which multiple Sentinels need to agree that a given master is no longer available. Only then does the failover process start and select a new MASTER node. This agreement between Sentinels is governed by the quorum value.
What is quorum?
The quorum value is the number of Sentinels that need to agree that the master is not reachable. However, the quorum is only used to detect the failure. In order to actually perform a failover, one of the Sentinels needs to be elected leader for the failover and be authorized to proceed. This only happens with the vote of the majority of the Sentinel processes.
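To make the difference between the two checks concrete, here is a small Python sketch. It only models the two rules described above (quorum for detection, majority for authorization). The function names are invented for illustration and are not part of Redis.

```python
def can_mark_master_down(sentinels_reporting_down, quorum):
    """Failure detection: enough Sentinels agree the master is unreachable."""
    return sentinels_reporting_down >= quorum


def can_authorize_failover(votes_for_leader, total_sentinels):
    """Failover authorization: the leader needs votes from a majority of all Sentinels."""
    majority = total_sentinels // 2 + 1
    return votes_for_leader >= majority


# 3 Sentinels with quorum 2: the server hosting the master (and its Sentinel) dies,
# leaving 2 Sentinels that both report the master as down and vote for a leader.
print(can_mark_master_down(2, quorum=2))                # True
print(can_authorize_failover(2, total_sentinels=3))     # True

# If only 1 Sentinel is left, neither condition holds and no failover starts.
print(can_mark_master_down(1, quorum=2))                # False
print(can_authorize_failover(1, total_sentinels=3))     # False
```

Note that the majority is counted against all configured Sentinels, not just the ones that are still reachable. That is why a 3 Sentinel setup can survive the loss of one server but not two.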
Let’s get our hands dirty with Redis Sentinel.
Here, we’ll stick to the basic setup with 3 server instances.
Please refer to the above diagram, which illustrates the basic setup with 3 server instances. First of all, make sure your Ubuntu instances are up to date with the relevant build dependencies.
The following steps install Redis on your server instances.
sudo apt-get update
sudo apt-get install build-essential tcl
sudo apt-get install libjemalloc-dev   # optional
curl -O http://download.redis.io/redis-stable.tar.gz
tar xzvf redis-stable.tar.gz
cd redis-stable
make
make test
sudo make install
Now, in the Redis directory (redis-stable), you should be able to see both the redis.conf and sentinel.conf configuration files.
Before we run Redis, let's do the basic configuration needed to get a Redis cluster up and running. Following are the IP addresses used in this setup.
10.52.209.46 (Initial Master Node)
10.52.209.47 (Initial Slave Node)
10.52.209.49 (Initial Slave Node)
In our case, the port for the Redis server is 6379 and the port for Sentinel is 5000. Hence make sure these ports are open on all three servers.
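Once the ports are opened, a quick reachability check from one server to the others can save some debugging later. The following is a minimal Python sketch using only the standard library. The function names are made up for this example, and it only tests whether a TCP connection succeeds, not whether Redis itself answers.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def closed_ports(hosts, ports, timeout=1.0):
    """List the (host, port) pairs that could not be reached."""
    return [
        (host, port)
        for host in hosts
        for port in ports
        if not port_is_open(host, port, timeout)
    ]
```

For this setup you could call closed_ports(["10.52.209.46", "10.52.209.47", "10.52.209.49"], [6379, 5000]), where 6379 is the Redis port used in the slaveof and sentinel monitor lines of this post and 5000 is the Sentinel port. An empty list means every port was reachable. Keep in mind the check only passes once the Redis and Sentinel processes are actually listening.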
The Redis configuration (in both redis.conf and sentinel.conf) on the above servers should look as follows (example).
For the basic setup, the above configuration will be enough, but for production please consider the tips mentioned in the latter part of this post. The only difference between the redis.conf files on the 3 servers is that all the slaves must have the following config, where 10.52.209.46 is the master IP address.
slaveof 10.52.209.46 6379
slaveof tells Redis to make this particular server instance a SLAVE instance of the given MASTER node (10.52.209.46).
In sentinel.conf, the following config tells the Sentinels to start monitoring the cluster with these initial settings. Afterwards, this config may be updated automatically.
sentinel monitor mymaster 10.52.209.46 6379 2
(This tells Sentinel to monitor the master node. The last argument, 2, is the quorum value.)
Further, sentinel.conf includes the following config values as well.
sentinel down-after-milliseconds mymaster 5000
(This means the server must be unresponsive for 5 seconds before it is classified as +sdown, which can then trigger a +vote to elect a new master node.)
sentinel failover-timeout mymaster 10000
(Specifies the failover timeout in milliseconds.)
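If you want to double-check these three settings on each server, the following Python sketch reads them back out of the text of a sentinel.conf file. It is a minimal parser written for this post, not something shipped with Redis, and it deliberately ignores every directive other than the three shown above.

```python
def parse_sentinel_conf(text):
    """Collect monitor, down-after-milliseconds and failover-timeout per master name."""
    masters = {}
    for raw_line in text.splitlines():
        parts = raw_line.split()
        if len(parts) < 4 or parts[0] != "sentinel":
            continue  # blank lines, comments and unrelated directives
        directive, name = parts[1], parts[2]
        entry = masters.setdefault(name, {})
        if directive == "monitor" and len(parts) == 6:
            # sentinel monitor <name> <ip> <port> <quorum>
            entry.update(host=parts[3], port=int(parts[4]), quorum=int(parts[5]))
        elif directive in ("down-after-milliseconds", "failover-timeout"):
            # sentinel <directive> <name> <milliseconds>
            entry[directive] = int(parts[3])
    return masters


conf_text = """\
sentinel monitor mymaster 10.52.209.46 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 10000
"""
settings = parse_sentinel_conf(conf_text)["mymaster"]
print(settings["quorum"], settings["down-after-milliseconds"], settings["failover-timeout"])
```

Since Sentinel rewrites sentinel.conf after a failover, running the same check later is one way to confirm which master address each Sentinel currently has on record.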
Okie Dokie! Now… Let’s run Redis.
service redis start
service redis-sentinel start
After that, you can simply check the Redis processes via the ps -ef | grep redis command. Each server instance should be running both a Redis process and a Sentinel process. If all goes according to plan, there should be 2 processes running as follows.
Now connect with the Redis client via one of the following commands and test whether Redis is working fine.
redis-cli ping
or
redis-cli -h IP_Address ping
You should get an output of PONG.
Awesome! Now you have Redis up and running. Let’s focus on the Master/Slave replication.
Now open the Redis config file and set the replication timeout as follows:
nano /etc/redis.conf
repl-timeout 120
Now if you check the redis.log (located in the place we defined in redis.conf) of each instance, you can see that the master-slave synchronization occurred.
Master node — redis.log
Slave node — redis.log
Checking Replication Status
You can check the replication information via the info replication command in the Redis CLI as well. The role attribute shows whether that particular node is a MASTER or a SLAVE (yellow box). In addition, on the master node it displays the details of all the connected slaves (green box).
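If you would rather check this from a script than by eye, the output of info replication is easy to parse because it is a list of key:value lines. Here is a minimal Python sketch. The function names are my own, and it assumes you already captured the command output as text (for example from redis-cli info replication).

```python
def parse_replication_info(text):
    """Turn 'key:value' lines into a dict; 'a=b,c=d' values become nested dicts."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue  # skip blanks and the '# Replication' section header
        key, value = line.split(":", 1)
        if "=" in value:
            # e.g. slave0:ip=...,port=...,state=...
            value = dict(item.split("=", 1) for item in value.split(","))
        info[key] = value
    return info


def connected_replicas(info):
    """Return (ip, port) for every slaveN entry reported by a master."""
    return [
        (value["ip"], int(value["port"]))
        for key, value in sorted(info.items())
        if key.startswith("slave") and isinstance(value, dict)
    ]
```

On the master, parse_replication_info(output)["role"] should be "master" and connected_replicas(...) should list both slaves. On a slave, the role is "slave" and the list is empty.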
Now let's examine what sentinel.log indicates (it is located in the place we defined in sentinel.conf).
Furthermore, if you check the sentinel.conf file, you can see that it has been automatically updated with the latest configs, including the sentinel known-slave and sentinel known-sentinel values.
Cool! Now let's create a sample value and check it on all nodes.
127.0.0.1:6379> set demokey "Amila"
As you can see in the above diagram, SLAVES are READ ONLY, hence you can only write data to the master. Since Redis replicates asynchronously to all the slaves, you can retrieve the inserted value from any Redis slave using the same key. In addition, via KEYS * you can list all the keys inserted. The above diagram clearly shows what we just talked about.
Now let’s check how Redis Sentinel Automatic Failover works.
Redis Sentinel Automatic Failover
Okiee! Let's simulate an automatic failover scenario. To do so, you can simply stop the Redis server or kill the Redis process on the MASTER instance. You can even put the Redis process to sleep. Choose whichever way you prefer.
kill -9 <process id>
or
redis-cli -p 6379 DEBUG sleep 30
or
redis-cli -p 6379 DEBUG SEGFAULT
As illustrated in the above diagram, in a failover scenario, if the MASTER node fails, the 2 remaining Sentinels will assess the failure, and if both agree (since the quorum value is 2), a new MASTER will be elected from the 2 remaining nodes.
The following shows the log tail for this failover scenario.
redis.log of Slave nodes.
sentinel.log of Slave nodes
Now let's check the replication status via the info replication command.
To elaborate further on the log tail:
Each Sentinel detects that the master is down with a +sdown event. (+sdown means the specified instance is now in the Subjectively Down state.)
+new-epoch means the current epoch was updated.
The +sdown event is later escalated to +odown, which means that multiple Sentinels agree that the master is not reachable. (+odown means that the specified instance is now in the Objectively Down state.)
The Sentinels +vote for a Sentinel that will start the first failover attempt.
The failover happens.
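To follow that sequence in your own sentinel.log, you can pull out just the event names. The Python sketch below is a simple filter written for this post. It assumes each log line contains at most one event token that starts with + or - followed by a letter (such as +sdown), and the helper names are hypothetical.

```python
def extract_events(log_lines):
    """Return Sentinel event names (tokens like +sdown or -sdown) in log order."""
    events = []
    for line in log_lines:
        for token in line.split():
            if len(token) > 1 and token[0] in "+-" and token[1].isalpha():
                events.append(token)
                break  # keep only the first event token on each line
    return events


def odown_follows_sdown(events):
    """True if a +sdown event is seen and a +odown event appears after it."""
    if "+sdown" not in events:
        return False
    first_sdown = events.index("+sdown")
    return "+odown" in events[first_sdown + 1:]
```

For example, with open("sentinel.log") as f: print(extract_events(f)) prints the events in order, so you can see whether +sdown was escalated to +odown before the failover started. The file name here is a placeholder. Use the log path you configured in sentinel.conf.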
Further, the following shows the upstart jobs.
Upstart for Redis
description "Redis Server"
start on runlevel [2345]
stop on runlevel [!2345]
script
echo $$ > /var/run/redis.pid
su - amila -c "cd /home/amila/redis-stable/src/; redis-server ../redis.conf"
end script
post-stop script
rm -f /var/run/redis.pid
end script
Upstart for Sentinel
description "Redis Sentinel Server"
start on runlevel [2345]
stop on runlevel [!2345]
script
echo $$ > /var/run/redis-sentinel.pid
su - amila -c "cd /home/amila/redis-stable/src/; redis-server ../sentinel.conf --sentinel"
end script
post-stop script
rm -f /var/run/redis-sentinel.pid
end script
OK. So thank you so much for reading this article, and I look forward to coming up with another interesting article soon, sharing my experience. Till then, Cheers! and Happy Coding!