Cache Management in Rails – Static Pages
Introduction
As mentioned in our About page, mtbguru.com started its early life as a solution looking for a problem, the solution being Ruby on Rails (RoR), a neatly designed web development framework that makes it a pleasure to program highly dynamic websites such as mtbguru.com.
After countless hours of quality time with RoR, I still think it offers very elegant solutions to a lot of problems, but that doesn’t mean it covers every possible situation right out of the box.
In the Tech Corner posts of this blog, I would like to cover some of the problems we ran into and how they were solved. This is a way of giving back to the community that created RoR. It will also help me remember why particular techniques were used when I have to change them later. And finally, nothing kicks other developers into action like an article that explains how to do something incorrectly. So if you notice that I missed some obvious or not-so-obvious technique to get around a problem, it’d be really nice if you corrected me.
Built-in caches_page support
MTBguru.com is running on a server with a rather weak CPU. Each trip page requires quite a few database lookups, and a fair amount of XML has to be generated to display maps and the like. All of this costs a lot of CPU cycles when every page is created on the fly. Disk space is cheap, so we try to cache as much as possible.
The most efficient way is to let the web server serve a cached page. In this case, when the web server detects that a particular file is present at a specific path under your ./public directory, it will simply transmit this file to the user. Your Rails process won’t even kick in, so no CPU cycles are wasted to run your Ruby program.
This technique is intended to be used for all content that is identical for all users. A significant restriction, but that’s the price to pay for efficiency and, as we will see later, there are ways around that.
To use server-based page caching, you first have to enable caching by setting
config.action_controller.perform_caching = true
in <rails_root>/config/environments/production.rb.
Then you specify which pages to cache by putting the following at the top of your controller:
caches_page :show
This will add an after_filter to the controller that triggers at the end of the show method, as can be seen in the Rails source code:
# File vendor/rails/actionpack/lib/action_controller/caching.rb, line 95
def caches_page(*actions)
  return unless perform_caching
  actions.each do |action|
    class_eval "after_filter { |c| c.cache_page if c.action_name == '#{action}' }"
  end
end
The cache_page method (without the ‘s’!) takes the rendered content of the page and saves it to a file whose path is the path of the ./public directory plus the path of the URL.
For example, the HTML rendered for the page http://www.mtbguru.com/trip/show/1 will be stored in the file <rails_root>/public/trip/show/1.html.
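The mapping from URL to cache file can be sketched in a few lines of plain Ruby. This is only an illustration of the idea, not the Rails implementation: the helper name page_cache_file_for and the handling of the root path and of existing extensions are assumptions made for this example.

```ruby
# Hypothetical helper, not part of Rails: maps a request path to the file
# that page caching would write under the public directory.
def page_cache_file_for(public_dir, request_path)
  # Drop a trailing slash; the root path is assumed to map to index.html.
  path = request_path.chomp("/")
  path = "/index" if path.empty?
  # Append .html unless the path already carries an extension.
  path += ".html" if File.extname(path).empty?
  File.join(public_dir, path)
end

puts page_cache_file_for("public", "/trip/show/1")  # prints public/trip/show/1.html
```

Because the file name is derived from the URL alone, whatever gets written there is what every later visitor of that URL will receive.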
Only Cache Correctly Rendered Pages
Most of the URL checking for MTBGuru.com was initially done in the controller action methods themselves. This resulted in a number of interesting problems, because Rails will create a cache file for anything that reaches the action method.
This was highly annoying: you only want to cache pages that are completely correct. Once the wrong page has been cached, you cannot easily undo it, because the web server won’t invoke your Rails program as long as it keeps finding the file in the cache.
Consider the following situation: http://www.mtbguru.com/trip/show/3.
When trip number 3 doesn’t exist, you will have to redirect the browser to a different page. If you issue the redirect_to command inside the action method, Rails will store the redirection message in <rails_root>/public/trip/show/3.html.
This is most definitely not what you want, because when the same link is accessed in the future, the web server will pick up the cached redirect .html file but won’t have Rails around to do the actual redirect!
Solution 1: after_filter – The Clunky Way
To work around this problem, I abandoned the caches_page method and replaced it with the following piece of code:
after_filter(:only => :show) { |c| c.cache_page if c.response.headers["Status"] == "200 OK" }
This basically mimics the original code of caches_page above, with the added condition that it only calls cache_page when the response header contains the 200 OK status, which won’t be the case for a redirect.
This worked very well and can be used to do all kinds of additional magic, if ever needed, but it feels too clunky to be the Right Way.
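To make the difference concrete, here is a small plain-Ruby model of the two behaviours that runs without Rails. FakeResponse and PageCache are hypothetical stand-ins invented for this sketch, and the "302 Found" status string is an assumption about what a redirect response carries.

```ruby
# Plain-Ruby model; no Rails required. FakeResponse and PageCache are
# hypothetical stand-ins, not Rails classes.
FakeResponse = Struct.new(:status, :body)

class PageCache
  attr_reader :files

  def initialize
    @files = {}
  end

  # Like caches_page: stores whatever the action produced.
  def store_always(path, response)
    @files[path] = response.body
  end

  # Like the after_filter with the status check: stores only "200 OK".
  def store_if_ok(path, response)
    @files[path] = response.body if response.status == "200 OK"
  end
end

# Assumed shape of the response produced by a redirect for a missing trip.
redirect = FakeResponse.new("302 Found", "You are being redirected.")

naive = PageCache.new
naive.store_always("/trip/show/3.html", redirect)

guarded = PageCache.new
guarded.store_if_ok("/trip/show/3.html", redirect)

puts naive.files.key?("/trip/show/3.html")    # true: the redirect body was cached
puts guarded.files.key?("/trip/show/3.html")  # false: nothing was written
```

The check sits in the caching step itself, which is why it works but also why it feels bolted on: the action still runs with bad input, and the filter has to clean up afterwards.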
Solution 2: before_filter – The Rails Way
Eventually, it dawned on me that the real solution is much simpler: I already had a number of methods in the controller to do parameter validation, but I called them from within the action methods. I was using a before_filter to do user validation, but it took a while before I realized that all parameter validation should be done in before filters.
That suddenly made the code much cleaner and more orthogonal: action methods are limited to rendering a page once the parameters have been verified, before_filter handles everything else, and caches_page is used as the Rails designers originally intended.
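The division of labour can be modelled in plain Ruby as well. This is a sketch of the control flow only, under the assumption that a before filter which redirects stops the chain, so that neither the action nor the caching after filter runs. MiniController, the TRIPS table and the /trip/list redirect target are all hypothetical names made up for this example.

```ruby
# Plain-Ruby model of a filter chain; no Rails required. MiniController,
# TRIPS and the "/trip/list" target are hypothetical.
class MiniController
  TRIPS = { 1 => "<html>trip 1</html>" }.freeze

  attr_reader :cache, :redirected_to

  def initialize
    @cache = {}
  end

  def process(id)
    @redirected_to = nil
    return nil unless validate_trip(id)     # before filter: halts on bad input
    body = show(id)                         # action: only renders
    @cache["/trip/show/#{id}.html"] = body  # after filter added by caches_page
    body
  end

  private

  # Parameter validation lives here, not in the action.
  def validate_trip(id)
    return true if TRIPS.key?(id)
    @redirected_to = "/trip/list"
    false
  end

  def show(id)
    TRIPS.fetch(id)
  end
end

controller = MiniController.new
controller.process(1)   # valid trip: rendered and cached
controller.process(3)   # missing trip: redirected, never cached
puts controller.cache.keys.inspect   # prints ["/trip/show/1.html"]
```

Because the invalid request never reaches the action, the caching step needs no special cases.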
Tom
Comments? Email them to tom@mtbguru.com
