The Caching Gap(tm) ... mixin' it with Cron

August 10th, 2007

So the plugin I mentioned in my last post is alive and kicking, and it has totally solved the problem we were having.

A new problem arose today: now that the pages are being re-cached as part of the sweeping process, it was taking ages. Ages being up to 60 seconds to re-cache them all, and also ages with the boss sitting there waiting for things to happen. Not so good from a usability perspective.

Again, I was throwing around the idea of using DRb on IRC, and Lachie Cox suggested good old cron. We are already using cron to expire the front page on the hour to keep the info fresh, so I thought I’d expand it a little.

I created a new model called CacheQueue that had two columns, cache_type and cache_data. The two types of cache I hose now are the regular expressions that get handed to an expire_fragment method, and the URLs that get the special param handed in and get re-cached.

The sweeper now looks like this:

## The Sweeper
class TournamentSweeper < ActionController::Caching::Sweeper
  observe ...models...

  def after_save(record)
    ...
    # queue the work instead of expiring inline; cron picks it up later
    CacheQueue.create({:cache_type => "url", :cache_data => '/?special_param_to_hose_cache=1'})
    CacheQueue.create({:cache_type => "regex", :cache_data => "tournaments/#{tournament.id}-*"})
    ...
  end
end

So we’re creating lots of entries in the cache_queues table, each with an expired column that defaults to 0.
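One thing worth checking about the regex entry queued above: the string is later handed to Regexp.new, so the trailing `-*` means "zero or more hyphens", not a glob-style wildcard. Here is a small sketch of what that does, assuming fragment keys shaped like `tournaments/12-summary` (these key names are made up for illustration, the real key format isn't shown in this post):

```ruby
# Hypothetical fragment keys, for illustration only.
KEYS = [
  "tournaments/12-summary",
  "tournaments/12-results",
  "tournaments/123-summary"
]

# Returns the keys that a queued pattern would expire once it is
# turned into a Regexp, the same way the queue processor does.
def matching_keys(keys, pattern)
  regex = Regexp.new(pattern)
  keys.select { |key| key =~ regex }
end

# "-*" allows zero hyphens, so tournament 123 is caught as well.
loose = matching_keys(KEYS, "tournaments/12-*")

# Requiring the literal hyphen keeps the match to tournament 12.
strict = matching_keys(KEYS, "tournaments/12-")
```

If your keys look anything like the ones assumed here, the looser pattern just expires a few extra fragments, which is harmless but wasteful. Dropping the `*` would be one way to tighten it.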

The next step was to write a method, callable from a rake task, that expires the caches. Here it is:

## The CacheQueue Class
class CacheQueue < ActiveRecord::Base
  def self.expire
    require 'action_controller/integration'
    # write a disk lock so overlapping cron runs don't run over each other
    lockfile = RAILS_ROOT + "/log/cache.expire"
    if File.exist? lockfile
      $stderr.puts "Locked since " + File.atime(lockfile).to_s
      return
    end
    FileUtils.touch lockfile
    # set up the counters before the begin block so the summary below
    # still works if something raises part way through
    hosed_url_caches = []
    hosed_regex_caches = []
    skipped = 0
    begin
      # "url" sorts after "regex", so DESC handles the urls first, oldest first
      caches_to_be_hosed = CacheQueue.find(:all, :conditions => {:expired => false}, :order => "cache_type DESC, id ASC")
      sess = ActionController::Integration::Session.new
      sess.host! LIVE_HOST
      c = ActionController::Base.new
      caches_to_be_hosed.each do |cache|
        # update the db
        cache.expired = true
        cache.save!
        case cache.cache_type
          when "url"
            if hosed_url_caches.include? cache.cache_data
              skipped += 1
              next
            end
            sess.get cache.cache_data
            hosed_url_caches << cache.cache_data
          when "regex"
            if hosed_regex_caches.include? cache.cache_data
              skipped += 1
              next
            end
            c.expire_fragment(Regexp.new(cache.cache_data))
            hosed_regex_caches << cache.cache_data
        end
      end
    rescue => e
      # report the failure instead of swallowing it silently
      $stderr.puts "Cache expiry failed: " + e.message
    ensure
      # always release the lock, whether or not anything went wrong
      FileUtils.rm_f lockfile
    end
    removed = hosed_regex_caches.size + hosed_url_caches.size
    $stderr.puts removed.to_s + " caches removed" if removed > 0
    $stderr.puts skipped.to_s + " caches skipped" if skipped > 0
  end
end

## The Rake task
namespace :cache do
  task :expire =>  :environment do
    CacheQueue.expire
  end
end

This writes to $stderr, so when it actually does something I get an email from cron. It also implements locking so it doesn’t run over itself.
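The lock in CacheQueue.expire checks for the file and then touches it, which is two separate steps, so two runs starting at the same instant could both get through. That is unlikely with a cron job every few minutes, but if it worries you, here is a rough sketch of a lock that is taken in a single step. The `with_lock` helper is my own name for illustration, not something from Rails:

```ruby
require 'fileutils'
require 'tmpdir'

# Runs the block only if the lock file can be created.
# Returns false without running the block when the lock is already held.
def with_lock(lockfile)
  begin
    # CREAT | EXCL raises Errno::EEXIST if the file is already there,
    # so testing for the lock and taking it happen in one call.
    File.open(lockfile, File::WRONLY | File::CREAT | File::EXCL) do |f|
      f.puts Process.pid
    end
  rescue Errno::EEXIST
    $stderr.puts "Locked since #{File.mtime(lockfile)}"
    return false
  end

  begin
    yield
  ensure
    # Release the lock even if the block raises.
    FileUtils.rm_f(lockfile)
  end
  true
end

# Example path only; the post keeps its lock under RAILS_ROOT/log.
lockfile = File.join(Dir.tmpdir, "cache.expire.#{Process.pid}")
ran = with_lock(lockfile) { :expire_caches_here }
```

The ensure block means a crash inside the block can't leave a stale lock behind, although a killed process still could.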

One of the cool parts of this is that it’ll actually skip over a cache if it has already hosed it. So if, in the 4 minutes since the last run, the boss had edited the same article twice (which would schedule the re-caching of the front page and other busy pages), this method would only do each one once. Your machine will thank you.
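To see the skipping on its own, here is a stripped-down sketch in plain Ruby with no Rails involved. `QueueEntry` and `process_once` are stand-ins I made up for this example. The real method tracks urls and regexes in two arrays, whereas this version uses one hash keyed on both columns:

```ruby
# Stand-in for a row in the cache_queues table (hypothetical struct,
# the real model is an ActiveRecord class).
QueueEntry = Struct.new(:cache_type, :cache_data)

# Yields each distinct (cache_type, cache_data) pair once, in queue order,
# and returns how many entries were processed and how many were skipped.
def process_once(entries)
  seen = {}
  processed = 0
  skipped = 0
  entries.each do |entry|
    key = [entry.cache_type, entry.cache_data]
    if seen[key]
      skipped += 1
      next
    end
    seen[key] = true
    yield entry
    processed += 1
  end
  { :processed => processed, :skipped => skipped }
end

# The boss saves the same article twice, so the front page is queued twice.
entries = [
  QueueEntry.new("url", "/?special_param_to_hose_cache=1"),
  QueueEntry.new("regex", "tournaments/12-*"),
  QueueEntry.new("url", "/?special_param_to_hose_cache=1")
]

handled = []
counts = process_once(entries) { |entry| handled << entry.cache_data }
```

With the three entries above, the front page is only handled once and the duplicate is counted as skipped.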
