
Rails tip: Get Sneaky and cache pages with many parameters

So you're the coolest cat in town. You've been using Rails for like, at least a month now. Heck, you've even started getting caching working on your coolest new site. But wait a second. Your zeitgeist page has 4 fields which are passed around as parameters and you need to start caching the results badly, because Slashdot just posted about how brilliant your new site is. So what do you do?

You watch as your site goes down in flames and weep uncontrollably.


No! Well, OK, maybe the first time. But after reading this you may be one step closer to avoiding the shame of sobbing like you just watched Dancer in the Dark and thought no one would be home for at least another hour.

We'll use Typo and my own site, pervwatch.org, as examples to see how it's been tackled so far. Let's start with Pervwatch since that implementation is a little simpler. First, some context. Pervwatch is a Google Maps mashup. It combines Google Maps, geocoder.us data, and state sex offender registries to plot the location of sex offenders on a map. It lets you see them all at a glance, where most sex offender registry websites show a single offender at a time on a map. The problem I had to tackle was that the pages for a number of the most populated counties in large states like California had too many results coming back. The pages became huge and the map was filled with too many markers. It slowed down users' browsers, ate up browser memory, and took a long time to render on the server side. I could cache the results to eliminate the server-side slowdown, but clients would still have problems. The easy solution is to paginate the results, right?

Yes. But then what happens? If you paginate the results with Rails' handy pagination methods and combine that with simple page caching, the first page rendered (and cached) will always be shown, no matter what. Why? Well, Rails caches based on the URL path and ignores the query string, which is exactly where pagination puts its extra parameter (for now!), so requests for each of the following will resolve to the same cached file:

http://www.pervwatch.org/counties/map/152
http://www.pervwatch.org/counties/map/152?page=2
http://www.pervwatch.org/counties/map/152?page=12523
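
To see the collision for yourself, here's a minimal plain-Ruby sketch. It is not the Rails internals; `page_cache_file` is a made-up helper that assumes the cache file name is just the URL path plus ".html", with the query string thrown away.

```ruby
require 'uri'

# Hypothetical helper: roughly how page caching picks a file name.
# Only the path part of the URL is used; the query string is dropped.
def page_cache_file(url)
  URI.parse(url).path + '.html'
end

urls = [
  'http://www.pervwatch.org/counties/map/152',
  'http://www.pervwatch.org/counties/map/152?page=2',
  'http://www.pervwatch.org/counties/map/152?page=12523'
]

# All three URLs end up pointing at the same cache file.
urls.each { |url| puts "#{url} -> #{page_cache_file(url)}" }
```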

Not so great. So how can I get around this? Easy, use custom Rails routes! The default route assumes URLs of the form:

'/:controller/:action/:id'

We can add a new route which is smarter and realizes that my county map pages may or may not be paginated. To accomplish this, we add the following to our RAILS_ROOT/config/routes.rb file, above the default route so it gets matched first:

  map.connect 'counties/map/:id/:page', 
    :controller => 'counties', 
    :action => 'map', 
    :requirements => { :page => /\d+/},
    :page => nil,
    :id => /\d+/

Fancy, huh? Here's what it means. Any URL that matches the format http://www.pervwatch.org/counties/map/id/page gets resolved to the controller CountiesController (in counties_controller.rb) and the action 'map'. The third part of the URL must be made up of digits only, and at least one of them (so it has to be a non-negative integer); it gets assigned to the :id key in the params hash the 'map' method receives. The fourth part of the URL is optional (:page => nil) and, if present, must also be made up only of digits; it gets assigned to the :page key in the params hash passed to 'map'. Phew. That was a lot. Let's take a peek at the 'map' method to see what that means.
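
Before we do, here's a rough plain-Ruby sketch of what that route boils down to. This is not how Rails implements routing; `COUNTY_MAP_ROUTE` and `recognize` are names made up for illustration, and a single regex stands in for the route and its requirements.

```ruby
# Hypothetical stand-in for the route 'counties/map/:id/:page'.
# :id is required and all digits; :page is optional and all digits.
COUNTY_MAP_ROUTE = %r{\A/counties/map/(\d+)(?:/(\d+))?\z}

# Returns a params-style hash, or nil when this route does not match.
def recognize(path)
  match = COUNTY_MAP_ROUTE.match(path)
  return nil unless match
  { :controller => 'counties', :action => 'map', :id => match[1], :page => match[2] }
end

p recognize('/counties/map/152')      # :page is nil
p recognize('/counties/map/152/2')    # :page is "2"
p recognize('/counties/map/152/abc')  # nil, this route does not match
```

Note that the values come back as strings, which is also what shows up in params inside the action.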

  def map
    @county = County.find(params[:id])
    @pages, @offenders = paginate_collection(@county.offenders.find(:all, :include => :address, :order => 'addresses.latitude DESC'), :page => params[:page], :per_page => 250)
    # ... some more code
  end

Counties don't have any easy unique field to identify them, so we stick with passing around the auto-generated primary key id (in params[:id]), as you can see in the first line. That seems to be the same as we've always operated. Now check the next line. Ignore the :include, :order stuff. See what we're doing? We're explicitly passing the params[:page] value into our paginate method. So now you see how the route looks, and how the action looks. But how does this help us any? We just want to cache all our pages uniquely.

Take another look at the routes. You see, now when we pass in a 'page' parameter, Rails will compare against our routes and automatically match our new rule, expanding the page parameter value right into the URL if it exists! Let's do a side-by-side comparison of what the URLs used to look like and how they'll look now.

Resolution of parameters to URLs with different routes.

params => {:controller => 'counties', :action => 'map', :id => 152}

  • Old: http://www.pervwatch.org/counties/map/152
  • New: http://www.pervwatch.org/counties/map/152

params => {:controller => 'counties', :action => 'map', :id => 152, :page => 2}

  • Old: http://www.pervwatch.org/counties/map/152?page=2
  • New: http://www.pervwatch.org/counties/map/152/2

params => {:controller => 'counties', :action => 'map', :id => 152, :page => 12523}

  • Old: http://www.pervwatch.org/counties/map/152?page=12523
  • New: http://www.pervwatch.org/counties/map/152/12523
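
The comparison above can be mimicked in a few lines of plain Ruby. Treat this as a sketch under assumptions: `old_url` and `new_url` are hypothetical helpers, not Rails' URL generation, and they only cover the two routes discussed here.

```ruby
# Hypothetical URL builders, one per routing style (not Rails internals).

# Default '/:controller/:action/:id' route: anything extra becomes a query string.
def old_url(params)
  extras = params.reject { |key, _| [:controller, :action, :id].include?(key) }
  path = "/#{params[:controller]}/#{params[:action]}/#{params[:id]}"
  return path if extras.empty?
  path + '?' + extras.map { |key, value| "#{key}=#{value}" }.join('&')
end

# 'counties/map/:id/:page' route: :page becomes a path segment when present.
def new_url(params)
  path = "/counties/map/#{params[:id]}"
  params[:page] ? "#{path}/#{params[:page]}" : path
end

params = { :controller => 'counties', :action => 'map', :id => 152, :page => 2 }
puts old_url(params)  # /counties/map/152?page=2
puts new_url(params)  # /counties/map/152/2
```

Since the page number is now part of the path, each page gets its own cache file.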

Aha! Nirvana. Now when Rails caches all the different pages for a given county, they won't all resolve and refer to one file! So hopefully by now you get the idea behind this. To be able to cache pages with more than the normal :controller / :action / :id parameters, you need to push the parameters into the URL. Now, let's look at a slightly more complicated example from Typo.

Typo is blogging software built on Rails. In fact, this blog is running Typo. It's great, I love it. Typo had to tackle the pagination issue itself with routes, but they've also done some neater, more complex things, such as finding all articles for a given date and caching the resulting page. They can do this at the year level, the month level, and down to the exact date. So we could find everything in 2005, or everything in August 2005, or everything on August 18th, 2005. Neat idea, but we want to cache the results. This means we need to push those params into the URL. The side benefit is neat-looking URLs that users could guess if they wanted to poke around.

Here's what Typo stuck in their routes for this:

  map.connect 'articles/:year/:month/:day',
    :controller => 'articles', :action => 'find_by_date',
    :year => /\d{4}/, :month => /\d{1,2}/, :day => /\d{1,2}/
  map.connect 'articles/:year/:month',
    :controller => 'articles', :action => 'find_by_date',
    :year => /\d{4}/, :month => /\d{1,2}/
  map.connect 'articles/:year',
    :controller => 'articles', :action => 'find_by_date',
    :year => /\d{4}/

This is a truncated version; they have more routes dealing with paginated results for URLs like the above, as well as many other types of routes. In their scheme, the first route lets someone specify year, month, and day (in that order) in the URL, and checks that the year is four digits, the month one or two digits, and the day one or two digits. The second route lets the user drop the day parameter from the URL and still enforces the four-digit year and one-or-two-digit month. The last allows just a four-digit year in the URL.
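
To make the matching order concrete, here's a small plain-Ruby sketch of those three routes. It is an illustration only, not Typo's or Rails' code: `DATE_ROUTES` and `recognize_date` are made-up names, and each route is approximated by one regex, tried from most specific to least.

```ruby
# Hypothetical stand-ins for the three routes, most specific first.
DATE_ROUTES = [
  [%r{\A/articles/(\d{4})/(\d{1,2})/(\d{1,2})\z}, [:year, :month, :day]],
  [%r{\A/articles/(\d{4})/(\d{1,2})\z},           [:year, :month]],
  [%r{\A/articles/(\d{4})\z},                     [:year]]
]

# Returns a params-style hash for the first matching route, or nil.
def recognize_date(path)
  DATE_ROUTES.each do |pattern, keys|
    match = pattern.match(path)
    next unless match
    params = { :controller => 'articles', :action => 'find_by_date' }
    keys.each_with_index { |key, i| params[key] = match[i + 1] }
    return params
  end
  nil
end

p recognize_date('/articles/2005/08/18')  # year, month and day
p recognize_date('/articles/2005/08')     # year and month only
p recognize_date('/articles/2005')        # year only
p recognize_date('/articles/05')          # nil, the year must be four digits
```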

I'll spare you the implementation code in the action and model. The idea behind it remains the same as with our pagination example. When you want to cache the results of a page which has important parameters that aren't :controller, :action or :id, then make a nice new route for it and specify the parameters inside the route. It's not only the easy way to get caching working right again, it also tends to lead you to better looking URLs for your users.

How does this affect url_for, link_to, etc?

Good question. Now when you write your link_to's you can switch your style just a little. Before, you'd probably write it like so:

<%= link_to 'Page 2', url_for(:controller => 'counties', :action => 'map', :id => @county.id) + '?page=2' %>

Now that can become:

<%= link_to 'Page 2', :controller => 'counties', :action => 'map', :id => @county.id, :page => 2 %>

Think of it as a parameter getting "promoted" to URL member. It gets mentioned in the URL when we define the route, it gets grouped in with the other big guys like :action in our link_to, and it gets reflected in our cache. The parameter is getting a promotion.

Posted at 9pm on 11/02/05 | 2 responses

Breaking the Golden Rule

I know, I'm breaking the golden rule and blogging about blogging. So sue me. Anyhow, I just wanted to let the feed readers know that today they should actually view the page in their browser. The Typo theme contest that Geoffrey announced earlier has started receiving some nice submissions; so far they have 4 new themes for Typo. I've swapped over to using the theme titled Laughing at You (with some minor edits).

For other themes that aren't contest entries you can always check Typo's Trac Tickets. Last count I saw 7 themes there...

And for people who want to make their own Typo themes, contest or no, there's a nice tutorial up on Nuby on Rails about creating Typo themes.

Posted at 3pm on 10/27/05 | no responses

Web Design for Dummies

I recently stumbled across a few pages that are handy for choosing and designing websites in terms of color schemes.

First up is the Web Color Schemes from Return of Design. Nothing too fancy, but it has a set of color schemes that a beginning designer can pick up and use without too much thought. Not exactly rocket science…

But when you combine these color schemes with the new Stock Repository Experimental Colr Pickr, you should be able to find yourself some flickr pictures roughly by color. It’s almost a beginner’s website in a box!

Posted at 2am on 06/30/05 | no responses