When an association is being referred in some complex nested loops, a query executed on calling an association can cause performance issues. This is called as N+1 query problem.
What is N+1 query problem?
Let's take an example model Post
as given below.
class Post < ApplicationRecord
belongs_to :user
end
Consider the code given below to print title
of the post and name
of the user for all the posts.
Post.all.each do |post|
puts post.title
puts post.user.name
end
If we see the query log for the code above, we can see that there are many queries executed.
User Load (3.6ms) SELECT "posts".* FROM "posts"
SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 2], ["LIMIT", 1]]
SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 3], ["LIMIT", 1]]
SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 2], ["LIMIT", 1]]
As we can see there is 1 query on posts table and many queries on the users table.
The query in the loop is happening because of the line:
puts post.user.name
When we call post.user.name, it fires a query on the users table to get the user row corresponding to user_id for the post being iterated.
Assuming user_id for one of the post is 2. The query would be something like:
SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 2], ["LIMIT", 1]]
This query is being executed for each post. The problem of multiple queries being fired is called N+1 query problem, where
- N is the number of posts in this case
- N queries are fired for getting the users + 1 query to get all posts
Use of includes OR associated_cached?
This can be avoided by checking if the association is cached in Rails.
Suppose, we have a model Post
and a model Attachment
as given below.
The relation between the two models is defined as Post ->
has_many -> attachments.
class Post < ApplicationRecord
has_many :attachments, dependent: :destroy
end
Suppose, we have a method that receiveds posts argument and returns some metadata.
def posts_metadata(posts)
posts.map do |post|
attachments = post.attachments.order(:created_at)
{
id: post.id,
title: post.title,
attachments: attachments.map do |attachment|
id: attachment.id,
url: attachment.url
end
}
end
end
As the method received posts argument, we are not sure whether attachments are eagerly loaded or not. Ideally, attachments if eagery loaded, won't cause N+1 query problem.
But, the order(:created_at) and then map would cause a query in the loop for each iteration.
Here we can check if the attachments association is already cached using method association_cached?.
if post.association_cached?(:attachments)
attachments = post.attachments.sort_by(&:created_at)
else
attachments = post.attachments.order(:created_at)
end
The method association_cached? will check if the attachments association is already cached. If it is, the code above will return the attachments in sorted order. If it is not, it will perform a database query to fetch attachments of the post in an ordered manner.
Conclusion
The optimization discussed in this article will only work for has_many associations. Moreover, make sure this is a performance concern of your application before optimizing. It may not have a significant impact on your application if the data iterated is not large.