5 Things to Avoid in Rails Migrations

Wade Winningham
Tech at Power
Published in
6 min readApr 16, 2021

--

Brand new projects and most small projects don’t typically have many headaches in regards to database migrations.

Once your database has grown, either in size or volume of transactions, migrations can create panic. Your database can stop responding because it’s more focused on executing a migration than responding to production requests.

That’s not good.

At Power Home Remodeling, our database is now well over one terabyte in size. For some that may seem large, but there are other Rails projects with much larger databases.

Percona Toolkit / Departure gem

We have a five-server MySQL database cluster. We use the departure gem which leverages Percona Toolkit to perform most migration operations in a non-blocking way. Github’s gh-ost gem is similar so it’s worth comparing tools to see what may be best for you.

While the list below comes from our experience working with MySQL servers and clusters, they should apply to most relational databases.

1. Process data in small batches

Rails has update_all and delete_all methods to easily assign values for many rows of data.

These wrap the entire operation in a transaction, which will lock a table and freeze all other interaction with it until the migration is complete. With a database cluster, it can interfere with flow control which could shut down your entire cluster!

Severalnines’ Top Mistakes to Avoid in MySQL Replication and Percona’s Galera Flow Control in Percona XtraDB Cluster for MySQL articles have quite a bit of detail on how and why this occurs.

It’s safer to update each row individually. This does take much longer, but the safety of only locking a single row at a time allows for normal application behavior to continue uninterrupted.

Photo by Gabriella Clare Marino on Unsplash

Another option is to perform an update_all or delete_all in small batches of rows using .in_batches or similar method. This will still lock the related table, but for much shorter amounts of time.

2. Avoid foreign key constraints

Foreign key constraints are generally a good thing. They can prevent bad data from getting into your database. Since it’s at the database level, it’s better than Rails’ own validates_associated method which only exists within your application.

They start to make life difficult for you, however, as your database grows, particularly if you have a database cluster.

I mentioned that we use Percona Toolkit with MySQL. Its pt-online-schema-change tool is what performs non-blocking schema changes. Without it, your database would completely block interaction while your change is in progress. The tool works like this:

  1. Create a temporary copy of the table and alter it.
  2. Capture changes via triggers to keep data in sync from the original table to the temporary table.
  3. Copy all data from the original table to the temporary table.
  4. Swap the original with the temporary table, making the temporary table the live table.
  5. Clean up the now-stale original version of the table.

Since the schema change is occurring on the temporarily table, interaction with the original one isn’t blocked.

Foreign key constraints do not work well with this process

MySQL is smart enough to re-point foreign key constraints if a table is renamed.

Photo by Markus Spiske on Unsplash

If we alter the users table, the process would be to create a temporary _users_new table. Then, when ready, rename users to _users_old prior to deleting it.

In this context, MySQL will keep any foreign keys pointing to the original table with the renamed table. Not the new one we’d like it to be pointing to. The cleanup would fail, aborting the entire process.

Github’s gh-ost tool explicitly does not support foreign key constraints. By default, it will abort if it senses them with a table before even starting.

All this working around foreign keys will mean you’ll need to do a lot more manual work. Possibly forcing you into some downtime since you cannot rely on tools like pt-online-schema-change.

It’s worth mentioning that within Rails migrations, the references method can add foreign key constraints when you may not know it would.

.references can create a foreign key constraint — be careful with it

3. Avoid long-running migrations

Photo by Agê Barros on Unsplash

Our deployment process will abort if it’s taking longer than we anticipate a normal deployment should take. If a migration is expected to take a long time, we make alternative plans for it so it doesn’t hold up our deployment pipeline:

  • Have our database administrator perform the task using percona-toolkit. We'll opt for this usually if we prefer to have things closely monitored during execution.
  • Make the expensive operation a background job and trigger it within a migration. This way the job is started, but the deployment itself doesn’t wait on it to finish.
  • Make it a rake task that we can execute manually during a convenient time.

4. Don’t use external classes in migrations

It’s common to use existing models when updating data within migrations. This can sometimes have unintentional side effects. A model could potentially get updated with a validation or change which could cause the update to break.

You could fall back to using execute and use raw SQL. But, you can also define a version of the model solely for use within the migration. This strips out any application-level restrictions and validations resetting the class to the basic ActiveRecord functionality.

Sometimes you do want your model’s logic in play, but this is a technique which can be useful in many cases to avoid unintentional side effects.

5. When renaming a column, add the new name, then delete the old later

Photo by Ashim D’Silva on Unsplash

Deployments do not refresh code in every server in your environment simultaneously. Unless you shut your entire environment down during deployments, a web or transaction server will very likely be handling requests with the old version of your code for a short period of time.

Your app will break if that old code accesses a column that’s now been renamed or deleted.

First, create a new column with the name you want to use.

Even if you want to prevent NULL values eventually, it's safer for now to exclude null: false in the initial migration to avoid potential issue while we're moving data over. You can alter the table later accordingly.

The migration addresses existing data, but we probably need to allow for a graceful cutover. You should consider updating your code to refer to the new field exclusively or use something like asset.unassigned_at || asset.ended_at to handle issues where you want to defer to the old value if the new one hasn't been set yet.

Before you’re ready to actually delete the new column, you can remove all references to the old column name and use ignored_columns to make sure Rails is intentionally excluding it, as well.

Once that's all done and you've confirmed that things are working as expected you can clean up. Finally, remove the old column and don't forget to remove the ignored_columns line since it won't be needed further.

--

--