Ruby on Rails is one of the strongest frameworks for building web applications. As an application grows, it will inevitably face scaling challenges. In this article I will show how Redis, an in-memory data structure store, can be used to improve the performance and scalability of a Rails application. First we need to install Redis, for example via brew, apt-get, or Docker. We also need an existing Ruby on Rails application. For the examples below, we will build an online event management application. Let's look at the following model relations:
```ruby
class User < ApplicationRecord
  has_many :tickets
end

class Event < ApplicationRecord
  has_many :tickets
end

class Ticket < ApplicationRecord
  belongs_to :user
  belongs_to :event
end
```
Redis as a cache
The first requirement of the application is to show how many tickets have been sold and the total proceeds from them. We will have the following methods:
```ruby
class Event < ApplicationRecord
  def tickets_count
    tickets.count
  end

  def tickets_sum
    tickets.sum(:amount)
  end
end
```
The code above triggers SQL statements that hit the database to retrieve the data. The problem is that as the application grows and the data set gets larger, these statements can become slow. To improve this, we can cache the results of these methods. First of all, we need to enable Redis caching for the application: add the redis-rails gem to the Gemfile and run bundle install. Then configure the environment:
```ruby
# config/environments/development.rb
config.cache_store = :redis_store, {
  expires_in: 1.hour,
  namespace: 'cache',
  redis: { host: 'localhost', port: 6379, db: 0 },
}
```
The cache namespace is optional. The configuration above sets the default expiration to 1 hour; once a key's time-to-live (TTL) elapses, the stale data is purged. Now we can wrap our methods in cache blocks.
```ruby
class Event < ApplicationRecord
  def tickets_count
    Rails.cache.fetch([cache_key, __method__], expires_in: 30.minutes) do
      tickets.count
    end
  end

  def tickets_sum
    Rails.cache.fetch([cache_key, __method__]) do
      tickets.sum(:amount)
    end
  end
end
```
Rails.cache.fetch checks whether the given key already exists in Redis. If it does, the associated value is returned to the application; otherwise the block is executed and its result is stored in Redis. cache_key is a method Rails provides that combines the model name, the primary key, and the last-updated timestamp to generate a unique Redis key. We add __method__ so that the method's name also becomes part of the key, keeping it unique per method. It is of course possible to specify a different expiration for each method. Data stored in Redis will have the following form:
```
{"db":0,"key":"cache:events/1-20180322035927682000000/tickets_count:","ttl":1415,"type":"string","value":"9",...}
{"db":0,"key":"cache:events/1-20180322035927682000000/tickets_sum:","ttl":3415,"type":"string","value":"127",...}
{"db":0,"key":"cache:events/2-20180322045827173000000/tickets_count:","ttl":1423,"type":"string","value":"16",...}
{"db":0,"key":"cache:events/2-20180322045827173000000/tickets_sum:","ttl":3423,"type":"string","value":"211",...}
```
In this result, event 1 has sold 9 tickets for a total of $127, and event 2 has sold 16 tickets for a total of $211.
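The fetch-or-compute behavior itself is easy to picture without Rails. Below is a minimal sketch of the read-through pattern that Rails.cache.fetch implements, using a plain Hash in place of Redis; the MiniCache class is purely illustrative and has no TTL handling:

```ruby
# Minimal read-through cache sketch: a Hash stands in for Redis.
class MiniCache
  def initialize
    @store = {}
  end

  # Return the cached value if the key exists; otherwise run the
  # block, store its result under the key, and return it.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end
end

cache = MiniCache.new
calls = 0

cache.fetch("events/1/tickets_count") { calls += 1; 9 } # miss: block runs
cache.fetch("events/1/tickets_count") { calls += 1; 9 } # hit: block skipped
# calls == 1 — the expensive computation ran only once
```

The key point is that the block (the slow SQL query in our case) runs only on a cache miss; every later read within the TTL is a cheap key lookup.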
Cache busting
So what happens if another ticket is sold right after we cache the data? The site will keep displaying the cached content until the Redis key expires. That may be fine in some cases, but here we want to show current data in real time. This is where the last-updated timestamp mentioned above comes in. We add a touch: true option to the association on the child model (Ticket) pointing at the parent model (Event). Whenever a ticket changes, Rails will bump the event's updated_at timestamp, thereby generating a new cache_key for the event model.
```ruby
class Ticket < ApplicationRecord
  belongs_to :event, touch: true
end
```

```
# data in Redis
{"db":0,"key":"cache:events/1-20180322035927682000000/tickets_count:","ttl":1799,"type":"string","value":"9",...}
{"db":0,"key":"cache:events/1-20180322035928682000000/tickets_count:","ttl":1800,"type":"string","value":"10",...}
...
```
The pattern is: once a cache key and its content are created, we never change them. New content gets a new key, and the previously cached data in Redis is simply left to expire. This costs some extra RAM, but it keeps the code simple: we do not need to write callbacks to remove and regenerate cache entries.
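The key-versioning idea can be sketched in plain Ruby. Here a Struct stands in for an ActiveRecord model, and the cache_key method mimics the shape of Rails's convention (model name, id, timestamp); the exact timestamp format is an assumption for illustration:

```ruby
# A stand-in for an ActiveRecord model: cache_key combines the model
# name, id, and updated_at timestamp, mimicking Rails's convention.
Event = Struct.new(:id, :updated_at) do
  def cache_key
    "events/#{id}-#{updated_at.strftime('%Y%m%d%H%M%S%6N')}"
  end
end

event = Event.new(1, Time.utc(2018, 3, 22, 3, 59, 27))
old_key = event.cache_key

# Simulate `touch: true`: a sold ticket bumps the parent's updated_at,
# which yields a brand-new cache key; the old entry just ages out via TTL.
event.updated_at = Time.utc(2018, 3, 22, 3, 59, 28)
new_key = event.cache_key

old_key == new_key # => false
```

Because the key changes rather than the value, there is never a window where a reader sees a half-updated cache entry.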
Also pay attention to the TTL (time-to-live) settings: if the data changes frequently and the TTL is large, a lot of stale data will pile up unnecessarily. On the other hand, if the data rarely changes but the TTL is too short, the cache is regenerated wastefully even though nothing has changed.
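To make the TTL trade-off concrete, here is a tiny expiring-cache sketch: a Hash with per-key deadlines stands in for Redis's EXPIRE, and an injectable clock lets us fast-forward time. This is illustrative only, not how Redis implements expiry:

```ruby
# Expiring cache sketch: each key stores a value plus a deadline.
class ExpiringCache
  def initialize(clock = -> { Time.now.to_f })
    @store = {}
    @clock = clock
  end

  # Return a live cached value, or run the block and cache its
  # result with a fresh deadline `expires_in` seconds from now.
  def fetch(key, expires_in:)
    entry = @store[key]
    return entry[:value] if entry && @clock.call < entry[:deadline]
    value = yield
    @store[key] = { value: value, deadline: @clock.call + expires_in }
    value
  end
end

now = 0.0
cache = ExpiringCache.new(-> { now })

cache.fetch("tickets_count", expires_in: 60) { 9 }  # computed and cached
now = 30.0
cache.fetch("tickets_count", expires_in: 60) { 10 } # still cached => 9
now = 61.0
cache.fetch("tickets_count", expires_in: 60) { 10 } # expired, recomputed => 10
```

With a long TTL the second read would have stayed stale much longer; with a very short one, the block would rerun on almost every read.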
A note of caution: caching is not a cure-all. You should first look for ways to optimize the code itself and the database indexes. Sometimes, however, caching is a necessary quick win compared with spending a lot of time on a complex restructuring.
Redis as a queue
The next requirement is to generate reports for one or more events, for example statistics on the proceeds of each ticket along with buyer information.
```ruby
class ReportGenerator
  def initialize(event_ids)
    @event_ids = event_ids
  end

  def perform
    # query the DB and output the data to XLSX
  end
end
```
Collecting data from multiple tables can be slow. Instead of making users wait for the spreadsheet to download, we can move the work into a background job and email the attachment when it is done.

Ruby on Rails ships with the Active Job framework, which supports many different queue backends. In the example below we will use the Sidekiq library, which stores its data in Redis.
Add gem 'sidekiq' to the Gemfile and run bundle install. We will also use the sidekiq-cron gem to schedule recurring jobs.
```ruby
# in config/environments/development.rb
config.active_job.queue_adapter = :sidekiq

# in config/initializers/sidekiq.rb
schedule = [
  { 'name' => 'MyName', 'class' => 'MyJob', 'cron' => '1 * * * *',
    'queue' => 'default', 'active_job' => true }
]

Sidekiq.configure_server do |config|
  config.redis = { host: 'localhost', port: 6379, db: 1 }
  Sidekiq::Cron::Job.load_from_array! schedule
end

Sidekiq.configure_client do |config|
  config.redis = { host: 'localhost', port: 6379, db: 1 }
end
```
Note that we are using a separate Redis database for Sidekiq. This is not required, but it is useful to keep the cache in its own Redis database (or even on another server) in case we ever need to flush it.
It is also possible to create a separate config file specifying which queues Sidekiq should watch. We should not have too many queues, but a single queue can become a bottleneck, with low-priority jobs delaying high-priority ones. To see how this works, consider the following configuration:
```yaml
---
:queues:
  - [high, 3]
  - [default, 2]
  - [low, 1]
```
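With these weights, Sidekiq polls the high queue roughly three times as often as low. The idea behind weighted polling can be sketched in plain Ruby: expand each queue name by its weight and sample from the result. The real Sidekiq implementation differs in detail; this only illustrates the distribution:

```ruby
# Weighted queue selection sketch. Over many draws, "high" is chosen
# about 3x as often as "low" and 1.5x as often as "default".
WEIGHTS = { "high" => 3, "default" => 2, "low" => 1 }

# Candidate list: ["high", "high", "high", "default", "default", "low"]
CANDIDATES = WEIGHTS.flat_map { |name, weight| [name] * weight }

def next_queue
  CANDIDATES.sample
end

# Rough frequency check over many draws
counts = Hash.new(0)
10_000.times { counts[next_queue] += 1 }
# counts["high"] comes out close to 3x counts["low"]
```

Low-priority jobs still get picked regularly, so they cannot starve, but high-priority work is dequeued sooner on average.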
Now we can create jobs and assign them to priority queues:
```ruby
class ReportGeneratorJob < ApplicationJob
  queue_as :low
  self.queue_adapter = :sidekiq

  def perform(event_ids)
    # either call ReportGenerator here or move its code into the job
  end
end
```
You can optionally set a different queue adapter per job; Active Job allows different jobs in the same application to use different backends. Some jobs may need to run millions of times a day. Redis can handle that, but at such scale it is worth considering other services, such as the AWS Simple Queue Service (SQS).
Sidekiq makes use of several Redis data types. It uses Lists to store jobs, Sorted Sets to delay job execution, and Hashes to store statistics about how many jobs were executed and how long they took; recurring jobs are also stored in Hashes.
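The List usage is easy to picture: enqueuing a job pushes a serialized payload onto one end of a per-queue list, and a worker pops from the other end, giving first-in, first-out order. A toy sketch with a Ruby Array standing in for the Redis list (the payload shape here is simplified, not Sidekiq's exact format):

```ruby
require "json"

# Toy job queue: an Array stands in for a Redis list such as "queue:default".
queue = []

# Serialize the job as JSON and push it onto the head, like LPUSH.
def push_job(queue, klass, args)
  queue.unshift(JSON.generate("class" => klass, "args" => args))
end

# Pop from the tail, like (B)RPOP, so the oldest job comes out first.
def pop_job(queue)
  payload = queue.pop
  payload && JSON.parse(payload)
end

push_job(queue, "ReportGeneratorJob", [[1, 2]])
push_job(queue, "ReportGeneratorJob", [[3]])

job = pop_job(queue)
# The first job pushed is the first one dequeued: its args are [[1, 2]]
```

Because LPUSH and BRPOP operate on opposite ends of the list, many workers can pop concurrently and each job is delivered to exactly one of them.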
Redis as a database
The final requirement is to track how many visits each event page receives, indicating its popularity. For this we can use a Sorted Set. You can create a REDIS_CLIENT to call native Redis commands directly, or use the leaderboard gem:
```ruby
# config/initializers/redis.rb
REDIS_CLIENT = Redis.new(host: 'localhost', port: 6379, db: 1)

# config/initializers/leaderboard.rb
redis_options = { :host => 'localhost', :port => 6379, :db => 1 }
EVENT_VISITS = Leaderboard.new('event_visits', Leaderboard::DEFAULT_OPTIONS, redis_options)
```
Now we can call it from the controller's show action:
```ruby
class EventsController < ApplicationController
  def show
    ...
    REDIS_CLIENT.zincrby('events_visits', 1, @event.id)
    # or
    EVENT_VISITS.change_score_for(@event.id, 1)
  end
end
```

```
# data in Redis
{"db":1,"key":"events_visits","ttl":-1,"type":"zset","value":[["1",1.0],...,["2",4.0],["7",22.0]],...}
```
We can use the Sorted Set to determine the rank and score of each event, or display the top 10 events with REDIS_CLIENT.zrevrange('events_visits', 0, 9) (zrevrange returns members ordered from highest score to lowest).
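Rank and score fall directly out of the sorted-set semantics. Here is a plain-Ruby sketch of the three operations the leaderboard relies on, with a Hash of member-to-score pairs standing in for the Redis zset; the function names mirror the Redis commands but these are local stand-ins, not the redis gem's API:

```ruby
# A Hash of member => score stands in for the Redis sorted set.
visits = Hash.new(0.0)

# Increment a member's score, like ZINCRBY.
def zincrby(set, increment, member)
  set[member] += increment
end

# Members ordered by score, highest first, like ZREVRANGE 0 -1.
def zrevrange(set)
  set.sort_by { |_member, score| -score }.map(&:first)
end

# 0-based rank of a member, highest score = rank 0, like ZREVRANK.
def zrevrank(set, member)
  zrevrange(set).index(member)
end

zincrby(visits, 22.0, "7")
zincrby(visits, 4.0,  "2")
zincrby(visits, 1.0,  "1")

zrevrange(visits).first(10) # top 10 events by visits => ["7", "2", "1"]
zrevrank(visits, "7")       # => 0 (the most popular event)
```

Redis keeps the set sorted on every ZINCRBY, so rank and range queries are cheap even with many members, which is exactly why a Sorted Set fits a page-view leaderboard.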
When using Redis to store several kinds of data (cache, jobs, ...), we need to keep an eye on their combined RAM usage.
This article has introduced several ways Redis can be put to use in a Ruby on Rails application. I wish you a happy working day.
References: How to scale Ruby on Rails with Redis