Recent Projects

Random Records in Rails

There are a number of different ways to retrieve random items from a database in Rails, most of which have been discussed at length on the Rails wiki. According to this article, there are 3 preferred methods:

  1. Select a Record by Random Offset
  2. Randomize with the Database
  3. Randomize with Ruby

For an in-depth look into these options, you can peruse this discussion page, which details some of the pros and cons of different strategies for randomization.

After trying out a few of the techniques that abound in this area, I stumbled across an article from Jamis Buck, where he discusses a RESTful way to approach the creation of custom finders. Although randomization isn’t the focus of the article, he does provide a bit of guidance in that regard. His strategy uses Ruby for the randomization, and employs 2 light-weight queries. The first query gathers a list of valid ids. The second simply selects a single item using that (randomized) id.

So, I did a bit of cargo-culting and repurposing to achieve an efficient, database agnostic way to retrieve random items from a database using ActiveRecord. Simply add the following to the model from which you’d like to be able to pull random records:

class Widget < ActiveRecord::Base

  # ...

  def self.random
    ids = connection.select_all("SELECT id FROM widgets")
    find(ids[rand(ids.length)]["id"].to_i) unless ids.blank?
  end

end

Then, you can use the following bit of code in a controller like so:

class SomeController < ApplicationController

  # ...

  def some_action
    @widget = Widget.random
  end

end

And you've got a small and efficient "random finder" for use throughout your app. Lovely.

Update I just came across a comment from Rails Core member Koz that provides an alternative that seems compelling at fist glance:

19 Responses to “Random Records in Rails”

  1. It could be me, but isn’t retrieving _all_ ID’s from the database and only using one the opposite of efficient?

    I have no clue how to do this in a more efficient way, but that’s just my 0.02$ :)

  2. Trevor says:

    I’ve heard that this may not be the most efficient way if you have lot of data, but it seems to work well enough for me, and it’s database agnostic, which I like.

  3. Preston Lee says:

    That’s definitely not going to fly for large data sets.

  4. Trevor says:

    I’ve heard that before, but I have yet to see a better solution :)

  5. Karl says:

    Not to add to the fracas (I agree large sets could be unwieldy), but this is a little more agnostic:

    def self.random_50
    all_ids = find(:all, :select => :id)
    random_ids = 50.times.collect{|t| rand(all_ids.length)}
    find(random_ids)
    end

    You could end up with repeats, but so can your method.

  6. Karl says:

    Opps, sorry, copied an older version. This is much better:

    def self.random_50
    all_ids =find( :all, :select => :id )
    random_ids = 50.times.collect{ |t| all_ids[rand(all_ids.length)].id }
    find(random_ids)
    end

  7. Yogi Patel says:

    When trying to get a random value from a large table, grabbing all the IDs like that is a _bad idea_. Instead, use something like this:

    Widget.first(:conditions => “id >= floor(rand() * (select max(id) from widgets))”, :limit => 1)

    That will execute much, much faster for large tables.

  8. Gordy says:

    What is wrong with:

    def self.random
    self.find(:first, :offset => rand(self.all.size-1))
    end

    or am i missing something?

  9. Filipe Abreu says:

    Yogi Patel’s solution can return deleted records.

    Trevor’s solution is surely heavier, but it only find recorded ids.

    I have pretty much searched the web on this matter, and this is still a difficulty to many programmers. It seems there is no way of doing this without searching the whole table (but in only one column of course).

    Trevor’s solution is one of the best I have found.

  10. Sean says:

    I know this is quite awhile after the original post, but I came across this discussion while searching for a good, database agnostic method of retrieving a random record.

    My solution is an adaptation of Gordy’s:

    def self.random
    self.find(:first, :offset => rand(self.count()))
    end

    It still takes two database queries, but the first one only gets the total number of records instead of actually retrieving the id’s for all of them, which should take less time.

  11. drailskid says:

    for a more universal and safer way to implement this, I’d suggest using

    class ActiveRecord::Base
    def self.random
    ids = connection.select_all(“SELECT #{self.primary_key} FROM #{self.table_name}”)
    find(ids[rand(ids.size)][self.primary_key].to_i) unless ids.blank?
    end
    end

    The original solution will only work if the primary key on the table is “id”, but this will work no matter what it is.

  12. Benson says:

    Found this in another forum:

    def self.random
    find(:first, :order => “RAND()”)
    end

  13. itchy says:

    To add to Benson’s short solution, I needed to scope by “active = 1″, so:

    named_scope :active, :conditions => {:active => true}

    def self.random
    self.active.find(:first, :order => ‘RAND()’)
    end

  14. Dan says:

    I use a named_scope:

    named_scope :random, lambda { |random| {:order => “RAND()”, :limit => random }}

    Article.random(4) would return 4 random objects

  15. Glenn says:

    Dan has a cool use of named_scopes, but unfortunately I think the caching system might come up to bite you. If you ask for Article.random(1) twice for instance, I believe it returns the same record.

  16. [...] from a baseline of the code found here at almosteffortless.com I’ve extended a ‘random record grabber’ to get a specific number of unique [...]

  17. Damir says:

    How about this:

    @posts = Post.find(:all).sort_by{rand}

  18. Chris says:

    If you are wooried about caching couldn’t you add a simple date into the lookup?

    named_scope :random, lambda { |*args| {:order => “RANDOM()”, :limit => (args.first || 1), :conditions => ["created_at < ?", Time.now.utc] }}

    changes:
    1) (postgres uses random() not rand(), so I changed it to random())
    2) conditions created_at RANDOM(). that really helped me finish this in my project. :)

  19. Chris says:

    The bottom of my message looks foobar-ed.
    1) RANDOM() for postges. I prefer postges
    2) conditions created_at < Time.now SHOULD make sure DB caching will not mess with you, (the SQL will be different on every call)
    3) *args used so that the value it optional (for people not using Ruby 1.9 yet.)

Leave a Reply