Rails: Polymorphic Associations With Referential Integrity
On Instagram, both posts and stories have comments. A comment belongs to exactly one of them: a post or a story, never both and never neither. There are several ways to model such a relationship.
1. Separate associations/foreign keys
One way to model the relationship is separate optional associations.
class Comment < ApplicationRecord
  belongs_to(:post, optional: true)
  belongs_to(:story, optional: true)
end
class Post < ApplicationRecord
  has_many(:comments, dependent: :destroy)
end
class Story < ApplicationRecord
  has_many(:comments, dependent: :destroy)
end
And nullable foreign keys:
t.bigint(:story_id, null: true)
t.bigint(:post_id, null: true)
add_foreign_key("comments", "stories")
add_foreign_key("comments", "posts")
There is one problem with this: both columns are nullable, so nothing stops a comment from having neither parent or both, yet a comment must belong to exactly one of a story or a post. This rule can be enforced with a custom model validation:
class Comment < ApplicationRecord
  belongs_to(:story, optional: true)
  belongs_to(:post, optional: true)
  validate { validate_exactly_one_present(:story, :post) }
end
class ApplicationRecord < ActiveRecord::Base
  private
  # Reusable "exactly one present" validator
  def validate_exactly_one_present(*attributes)
    attributes.count { |attribute| public_send(attribute) }.then do |presence_count|
      unless presence_count == 1
        errors.add(:base, "Exactly one of #{attributes.to_sentence} must be set")
      end
    end
  end
end
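The counting logic can be checked in isolation. Below is a minimal plain-Ruby sketch (no Rails involved); CommentRow and exactly_one_present? are hypothetical names used only for illustration, not part of the code above.

```ruby
# Hypothetical stand-in for a Comment record; not an ActiveRecord model.
CommentRow = Struct.new(:story, :post, keyword_init: true)

# Returns true when exactly one of the given attributes is truthy,
# mirroring the count performed by validate_exactly_one_present.
def exactly_one_present?(record, *attributes)
  attributes.count { |attribute| record.public_send(attribute) } == 1
end

neither   = CommentRow.new
both      = CommentRow.new(story: :a_story, post: :a_post)
only_post = CommentRow.new(post: :a_post)

puts exactly_one_present?(neither, :story, :post)   # false
puts exactly_one_present?(both, :story, :post)      # false
puts exactly_one_present?(only_post, :story, :post) # true
```

The "neither" and "both" cases are the two states the validation is meant to reject.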
Model validations enforce rules in the application layer. They help ensure that data entering the database through the application is valid, but they can be bypassed (e.g., with comment.save(validate: false)), and other applications writing to the same database may not have the validation at all. So validations alone don't guarantee that the database stays free of invalid data; only a database constraint applies to every write. We can use a check constraint to address this:
t.check_constraint(
  # num_nonnulls is a PostgreSQL function, so it might not be available in other databases.
  "num_nonnulls(story_id, post_id) = 1",
  name: "belongs_to_exactly_one_of_post_or_story"
)
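For databases without num_nonnulls, an equivalent expression can be built from CASE, which is standard SQL. The sketch below is one possible way to generate it; exactly_one_non_null_sql is a hypothetical helper, not a Rails API, and it assumes the column names are trusted identifiers because they are interpolated directly into the SQL string.

```ruby
# Hypothetical helper: builds a CHECK expression that counts non-NULL columns
# with CASE instead of PostgreSQL's num_nonnulls.
# Column names are interpolated as-is, so only pass trusted identifiers.
def exactly_one_non_null_sql(*columns)
  terms = columns.map do |column|
    "(CASE WHEN #{column} IS NOT NULL THEN 1 ELSE 0 END)"
  end
  "#{terms.join(' + ')} = 1"
end

puts exactly_one_non_null_sql(:story_id, :post_id)
```

The resulting string could be passed to t.check_constraint in place of the num_nonnulls expression. Whether the constraint is actually enforced still depends on the database and version in use.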
The overall approach accurately establishes the relationships and guarantees both data correctness and consistency, but it has several flaws:
1. It'll require separate queries for each association. E.g.,
Comment.where.not(post: nil) # Fetch all post comments
Comment.where.not(story: nil) # Fetch all story comments
Comment.where(post: post) # Fetch comments for a specific post
Comment.where(story: story) # Fetch comments for a specific story
Comment.where(user: user).where.not(story: nil) # Fetch story comments from a specific user
Comment.where(user: user).where.not(post: nil) # Fetch post comments from a specific user
2. As a consequence of the previous point, you might also need separate indexes to optimize various query patterns for each association. And more indexes increase the cost of writes.
3. REST API resources typically map to models. Having separate mutually exclusive foreign keys may require mutually exclusive fields in the request body (exactly one must be set, the rest left null). This can be both awkward for API consumers and tricky to document clearly; while OpenAPI supports oneOf natively, most API reference renderers don't surface the constraint in a way that's immediately obvious to consumers.
These flaws grow as more parent associations are added.
2. Standard Polymorphic Association
Polymorphic associations let a model belong to more than one other model through a single association. Rather than maintaining multiple mutually exclusive foreign keys, you get one association, represented by a type column and an id column, that handles all the relationships.
Rails has native support for polymorphic relationships. Setting it up is simple.
Declare the polymorphic relationship in the models:
class Comment < ApplicationRecord
  belongs_to(:commentable, polymorphic: true)
end
class Post < ApplicationRecord
  has_many(:comments, as: :commentable, dependent: :destroy)
end
class Story < ApplicationRecord
  has_many(:comments, as: :commentable, dependent: :destroy)
end
And add supporting columns (<association>_type and <association>_id) in the database:
t.string(:commentable_type, null: false)
t.bigint(:commentable_id, null: false)
The polymorphic belongs_to declaration gives every comment a single interface for accessing its parent record, commentable, which could be an instance of any model. The has_many declarations give each parent an interface for accessing its collection of comments.
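Conceptually, the parent is resolved by turning the stored type string into a class and looking up the id within that class. The plain-Ruby sketch below imitates that idea with in-memory records; FakeModel, find_by_id, and resolve_commentable are hypothetical names for illustration, and this is a rough model of the mechanism rather than Rails' actual implementation.

```ruby
# Hypothetical in-memory "models"; each class keeps its own records keyed by id.
class FakeModel
  def self.records
    @records ||= {}
  end

  def self.create(id)
    records[id] = new(id)
  end

  def self.find_by_id(id)
    records[id]
  end

  attr_reader :id

  def initialize(id)
    @id = id
  end
end

class Post < FakeModel; end
class Story < FakeModel; end

# The type column names the class; the id column identifies the record in it.
def resolve_commentable(commentable_type, commentable_id)
  Object.const_get(commentable_type).find_by_id(commentable_id)
end

Post.create(1)
Story.create(1)

puts resolve_commentable("Post", 1).class  # Post
puts resolve_commentable("Story", 1).class # Story
```

Because the same id can exist in both tables, the id alone is ambiguous; the type column is what disambiguates it. Resolving a class from a stored string is also one reason to restrict which type values are accepted.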
We might want to limit the accepted parent models to Post and Story. We could do that with an inclusion validation on commentable_type:
class Comment < ApplicationRecord
  TYPE_POST = "Post"
  TYPE_STORY = "Story"
  TYPES = [TYPE_POST, TYPE_STORY].freeze
  belongs_to(:commentable, polymorphic: true)
  validates(:commentable_type, inclusion: { in: TYPES })
end
Rails' polymorphic association addresses all the flaws of the "separate associations" approach:
1. There is a consistent, parent-agnostic query interface; it doesn't require separate queries for each parent type.
Comment.where(commentable: post) # Fetch comments for a post; pass a story the same way
2. As a result, indexing is simpler. You only need composite indexes that include commentable_type and commentable_id.
3. A simpler REST API: `commentable_type` and `commentable_id` are both required fields for creating a comment, and there is no mutual exclusivity among fields. This makes the API easier to use and document.
But it has a flaw:
It's not possible to define a foreign key for the association, because a foreign key in a relational database references a single table (one foreign key can't reference both posts and stories). And without foreign keys, data consistency isn't guaranteed. If a Post or Story is deleted through the standard ActiveRecord interface (post.destroy), the associated comments are deleted too, thanks to the dependent: :destroy option in the has_many declaration. But it's possible to skip callbacks (with post.delete, for example), and it's also possible to run an SQL command directly: DELETE FROM posts WHERE id = 1. Deleting with either of those methods leaves the associated comments in place, orphaned.
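To make "orphaned" concrete, here is a small plain-Ruby sketch over in-memory rows. CommentRow and orphaned_comments are hypothetical names; in a real application the same check would more likely be an SQL anti-join per parent table.

```ruby
# Hypothetical in-memory rows standing in for the comments table.
CommentRow = Struct.new(:id, :commentable_type, :commentable_id, keyword_init: true)

# Returns the comments whose parent id no longer exists for their type.
# existing_ids maps a type name to the ids still present in that parent table.
def orphaned_comments(comments, existing_ids)
  comments.reject do |comment|
    existing_ids.fetch(comment.commentable_type, []).include?(comment.commentable_id)
  end
end

comments = [
  CommentRow.new(id: 1, commentable_type: "Post", commentable_id: 1),
  CommentRow.new(id: 2, commentable_type: "Post", commentable_id: 2),
  CommentRow.new(id: 3, commentable_type: "Story", commentable_id: 1)
]

# Simulate `DELETE FROM posts WHERE id = 1`: the post row is gone,
# but nothing removed the comment that points at it.
existing_ids = { "Post" => [2], "Story" => [1] }

puts orphaned_comments(comments, existing_ids).map(&:id).inspect # [1]
```

Nothing in the database prevents this state when the only link between the tables is the type and id pair.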
3. Hybrid: Separate Foreign Keys + Polymorphic Association
This approach aims to address the flaws of both the separate foreign keys and the polymorphic association approach. The idea is simple:
- Have Rails polymorphic associations to provide a simple query interface.
- Have foreign keys for each parent model solely for data consistency. The foreign key columns shouldn't be used in queries.
- Synchronize the two: derive the foreign key columns from the polymorphic columns on every save.
We can write a macro that implements this:
class ApplicationRecord < ActiveRecord::Base
  private
  def self.belongs_to_one_of(*associations, as:, optional: false, dependent: nil, foreign_keys: true)
    associations = associations.map(&:to_s)
    # Standard polymorphic declaration:
    belongs_to(as, polymorphic: true, optional: optional, dependent: dependent)
    # Validation for accepted types:
    validates(:"#{as}_type", inclusion: { in: associations.map(&:classify), allow_nil: optional })
    # A class method to return all the supported types, e.g: Comment.commentable_types => ["Post", "Story"]
    define_singleton_method(:"#{as}_types") do
      associations.map(&:classify)
    end
    return unless foreign_keys
    # before_save callback that sets the foreign keys (commentable_post_id | commentable_story_id)
    # based on the polymorphic columns (commentable_type and commentable_id)
    assign_foreign_keys_method_name = :"assign_#{as}_foreign_keys"
    define_method(assign_foreign_keys_method_name) do
      associations.each do |association|
        if send(:"#{as}_type") == association.classify
          send(:"#{as}_#{association}_id=", send(:"#{as}_id"))
        else
          send(:"#{as}_#{association}_id=", nil)
        end
      end
    end
    before_save(assign_foreign_keys_method_name)
  end
end
Then use the macro in the model like so:
class Comment < ApplicationRecord
  belongs_to_one_of(:post, :story, as: :commentable)
end
# Relationship declarations in the parent models remain the same:
class Post < ApplicationRecord
  has_many(:comments, as: :commentable, dependent: :destroy)
end
class Story < ApplicationRecord
  has_many(:comments, as: :commentable, dependent: :destroy)
end
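The before_save callback defined by the macro boils down to a small mapping from the polymorphic columns to the per-parent foreign key columns. The plain-Ruby sketch below shows that mapping on its own; foreign_key_values is a hypothetical helper, and the types hash is spelled out by hand because classify is not available outside Rails.

```ruby
# Hypothetical helper mirroring the before_save callback: given the polymorphic
# columns, compute the value of every per-parent foreign key column.
# `types` maps an association name to its class name, e.g. "post" => "Post".
def foreign_key_values(as, types, type_value, id_value)
  types.to_h do |association, class_name|
    ["#{as}_#{association}_id", type_value == class_name ? id_value : nil]
  end
end

types = { "post" => "Post", "story" => "Story" }

# commentable_post_id becomes 42, commentable_story_id becomes nil
p foreign_key_values("commentable", types, "Post", 42)

# commentable_story_id becomes 7, commentable_post_id becomes nil
p foreign_key_values("commentable", types, "Story", 7)
```

Exactly one foreign key column ends up set, and it always carries the same id as the polymorphic id column.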
This requires the following:
- Like the basic Rails polymorphic association expects, a <as>_type and <as>_id column MUST exist in the database table, e.g., commentable_type and commentable_id.
- A composite index on the <as>_type and <as>_id columns SHOULD be defined.
- If foreign keys are enabled (the default), foreign key columns MUST exist for the different types in this form: <as>_<type>_id, e.g., commentable_post_id and commentable_story_id.
- If foreign keys are enabled, foreign key constraints MUST be defined for the columns.
- If foreign keys are enabled, there SHOULD be indexes for each foreign key column. The foreign key indexes ensure efficient destruction of parent records. To reduce index size, each foreign key index SHOULD be a partial index that covers only rows where the column is NOT NULL.
The database schema for this setup would look like this:
t.string(:commentable_type, null: false)
t.bigint(:commentable_id, null: false)
t.bigint(:commentable_post_id, null: true)
t.bigint(:commentable_story_id, null: true)
t.check_constraint(
  # num_nonnulls is a PostgreSQL function, so it might not be available in other databases.
  "num_nonnulls(commentable_story_id, commentable_post_id) = 1",
  name: "belongs_to_exactly_one_of_post_or_story"
)
t.index([:commentable_type, :commentable_id], name: "index_comments_commentable")
t.index(:commentable_post_id, name: "index_comments_commentable_post_id", where: "commentable_post_id IS NOT NULL")
t.index(:commentable_story_id, name: "index_comments_commentable_story_id", where: "commentable_story_id IS NOT NULL")
add_foreign_key("comments", "posts", column: "commentable_post_id")
add_foreign_key("comments", "stories", column: "commentable_story_id")
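The check constraint in this schema requires exactly one foreign key to be set, but on its own it doesn't require that key to agree with commentable_type and commentable_id; the before_save callback is what keeps them aligned. If that agreement should also hold for writes that bypass the callback, one option is a stricter check expression with one branch per parent type. The sketch below generates such an expression; polymorphic_sync_check_sql is a hypothetical helper, it is an optional addition rather than part of the setup described here, and it assumes trusted identifiers and class names since they are interpolated into SQL.

```ruby
# Hypothetical helper: builds a CHECK expression with one branch per parent type.
# Each branch requires the matching foreign key to equal <as>_id and every
# other foreign key to be NULL.
def polymorphic_sync_check_sql(as, types)
  branches = types.map do |association, class_name|
    others = types.keys - [association]
    conditions = [
      "#{as}_type = '#{class_name}'",
      "#{as}_#{association}_id = #{as}_id"
    ]
    conditions += others.map { |other| "#{as}_#{other}_id IS NULL" }
    "(#{conditions.join(' AND ')})"
  end
  branches.join(" OR ")
end

puts polymorphic_sync_check_sql("commentable", { "post" => "Post", "story" => "Story" })
```

The generated string could be passed to t.check_constraint alongside, or instead of, the num_nonnulls expression.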
The foreign key columns exist solely for referential integrity enforcement; they MUST NOT be used in application queries. All queries MUST use the polymorphic interface. Using the individual foreign key columns in queries might require additional composite indexes when the query also filters other columns, and we want to limit the number of indexes since more indexes affect writes.
# avoid
Comment.where(commentable_post_id: 1)
Comment.where(commentable_story_id: 1)
# good
Comment.where(commentable: post)
Comment.where(commentable: story)
This approach isn't flawless; it results in more columns in the table and more indexes than the basic polymorphic association approach. However, it addresses most of the flaws with the previous approaches:
- It provides a simple, consistent, and parent-agnostic query interface.
- It results in simpler REST APIs.
- It'll potentially need fewer indexes compared to the "separate associations" approach, especially when there are queries that filter by both the parent and other columns.
- The foreign keys guarantee data consistency at the database level.
NOTE: This approach is unsuitable for polymorphic associations with an open-ended set of parent model types. It's only suitable when the list of accepted parent types is fixed and known, as in the Comment -> Post | Story example.