Reducing downtime in Django with django-safemigrate

A common hurdle to zero-downtime deployments is migrating the database. If you desire zero-downtime, your deployment strategy will likely involve swapping the new app servers in for the old ones. During this time, you will have servers running your old code and servers running the new code. Both sets will be interacting with your other services, such as your database. Often, this doesn’t pose a problem. However, changing the columns on a table can.

This is because your new/old Django application code may have to work with your old/new table schema. See the following graphic for a better explanation:

[Diagram: a request being routed to new/old Django apps and querying a database]

When a migration is applied as part of a deployment, we need to consider the possibilities shown in the tables below. The tables are split by whether the migrations run before or after the application servers are swapped. It’s possible to run them while the servers are being swapped, but then we’d have to deal with both sets of scenarios.

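| SQL Operation | Django Model Version | Table Version | Compatible? |
|---------------|----------------------|---------------|-------------|
| Select        | Old                  | New           |             |
| Delete        | Old                  | New           |             |
| Update        | Old                  | New           |             |
| Insert        | Old                  | New           |             |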

Compatibility table for running migrations before deployment

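| SQL Operation | Django Model Version | Table Version | Compatible? |
|---------------|----------------------|---------------|-------------|
| Select        | New                  | Old           |             |
| Delete        | New                  | Old           |             |
| Update        | New                  | Old           |             |
| Insert        | New                  | Old           |             |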

Compatibility table for running migrations after deployment

When working through these tables, consider what would happen if the SQL operation was performed for the given model and table versions. Would it succeed or would it error?

Keep in mind that Django’s SELECT, UPDATE and INSERT queries will specifically list the columns that the model is aware of. DELETE queries aren’t concerned with the columns.

Example scenarios

Let’s focus on two scenarios that are common, but also show the difficulties of zero-downtime deployments.

  1. Adding a field
  2. Removing a field

Let’s start with adding a field.

Adding a field with migrations running before deployment

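| SQL Operation | Django Model Version | Table Version | Compatible?        |
|---------------|----------------------|---------------|--------------------|
| Select        | Old                  | New           | ✅ Yes             |
| Delete        | Old                  | New           | ✅ Yes             |
| Update        | Old                  | New           | ✅ Yes             |
| Insert        | Old                  | New           | ❓ Depends on NULL |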

Compatibility table for running migrations to add a field before deployment

These generally all work because the old version of your model will list a subset of the columns on the table. That works fine for all operations, except for INSERT queries. For INSERT queries, one of two things must be true:

  1. The column is nullable, so the default is NULL (the preferred option [1])
  2. The column on the table has a default value

A default, NULL or otherwise, is necessary because the table needs to know what to use for that column; Django won’t specify that column in the INSERT query because the model doesn’t have that field.
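The requirement above can be checked with a minimal sketch that uses Python’s built-in sqlite3 module rather than Django itself. The table name, the column names, and the old_model_insert helper are hypothetical; the INSERT mimics what the old model would send, listing only the columns it knows about.

```python
import sqlite3


def old_model_insert(new_column_ddl: str) -> str:
    """Create the *new* table schema, then insert the way the *old* model would.

    new_column_ddl is the definition of the freshly added column.
    Returns "ok" if the insert succeeds and "error" if the database rejects it.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE mymodel ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
            f"{new_column_ddl})"
        )
        # The old model does not know about new_field, so it is not listed.
        conn.execute("INSERT INTO mymodel (name) VALUES (?)", ("example",))
        return "ok"
    except sqlite3.IntegrityError:
        # Raised when a NOT NULL column receives no value and has no default.
        return "error"
    finally:
        conn.close()


outcomes = {
    "nullable": old_model_insert("new_field BOOLEAN"),
    "with_default": old_model_insert("new_field BOOLEAN NOT NULL DEFAULT 0"),
    "required_no_default": old_model_insert("new_field BOOLEAN NOT NULL"),
}
```

Only the third variant is rejected, which is the case the "Depends on NULL" entry in the table warns about. Other databases report the failure differently, but the underlying constraint is the same.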

As shown in the compatibility table, when adding a field, it’s a good idea to run migrations before deployment, as long as your field is marked as nullable or adds a default to the table’s column.

Adding a field with migrations running after deployment

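| SQL Operation | Django Model Version | Table Version | Compatible? |
|---------------|----------------------|---------------|-------------|
| Select        | New                  | Old           | ✖ No        |
| Delete        | New                  | Old           | ✅ Yes      |
| Update        | New                  | Old           | ✖ No        |
| Insert        | New                  | Old           | ✖ No        |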

Compatibility table for running migrations to add a field after deployment

If we consider that Django specifies all columns it knows about, it makes sense why using the new version of the model with the old version of the table won’t work. It would be possible to use .only() and .save(update_fields=[...]) to limit which columns are used in the queries, but neither is the default behavior. And it only takes one Model.objects.get() call in your codebase to cause an error during deployment.
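To make the failure concrete, here is a small sketch using Python’s built-in sqlite3 module instead of Django. The SQL strings are hand-written approximations of the queries a model with a new_field column would issue, and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Old table schema: the migration that adds new_field has not run yet.
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO mymodel (name) VALUES ('example')")

# Approximations of what the *new* model would send: every known column is listed.
NEW_MODEL_QUERIES = {
    "select": "SELECT id, name, new_field FROM mymodel",
    "update": "UPDATE mymodel SET name = 'x', new_field = 1 WHERE id = 1",
    "insert": "INSERT INTO mymodel (name, new_field) VALUES ('y', 1)",
    "delete": "DELETE FROM mymodel WHERE id = 1",
}


def succeeds(sql: str) -> bool:
    """Return True if the query runs, False if the column is unknown to the table."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.OperationalError:
        return False


compatible = {op: succeeds(sql) for op, sql in NEW_MODEL_QUERIES.items()}
```

Only the DELETE goes through, because it is the only query that never names the missing column.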

Removing a field with migrations running before deployment

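| SQL Operation | Django Model Version | Table Version | Compatible? |
|---------------|----------------------|---------------|-------------|
| Select        | Old                  | New           | ✖ No        |
| Delete        | Old                  | New           | ✅ Yes      |
| Update        | Old                  | New           | ✖ No        |
| Insert        | Old                  | New           | ✖ No        |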

Compatibility table for running migrations to remove a field before deployment

This matches adding a field when migrating the database after deployment. In short, if your table has fewer columns than what your model says it has, you’re going to run into problems.

Removing a field with migrations running after deployment

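| SQL Operation | Django Model Version | Table Version | Compatible?        |
|---------------|----------------------|---------------|--------------------|
| Select        | New                  | Old           | ✅ Yes             |
| Delete        | New                  | Old           | ✅ Yes             |
| Update        | New                  | Old           | ✅ Yes             |
| Insert        | New                  | Old           | ❓ Depends on NULL |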

Compatibility table for running migrations to remove a field after deployment

Again, this matches adding a field when migrating the database before deployment. Removing the field shrinks the model, so its fields are a subset of the columns on the table.

Additionally, when removing a field from a model, you should first create a migration that sets null=True on the field. Marking the column as nullable is what allows inserts into that table to continue to succeed. Otherwise, the table will expect that column to be specified in the INSERT query, but the model won’t include it, leading to an error. Plus, if you need to migrate backwards, the column will be re-created as nullable [1].

General approach for adding/removing fields

Reviewing the compatibility tables, if you add a field, you should run the migration before the updated application is deployed. If the table you’re changing is large, you should add the field as nullable.

If you remove a field, you should run the migration after the updated application is deployed.
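The pattern behind all four compatibility tables can be condensed into one rule: a query works when the model’s columns are a subset of the table’s columns, and an INSERT additionally needs every column the model omits to be nullable or to have a default. The sketch below encodes that rule in plain Python. The is_compatible function and its parameters are hypothetical and written for illustration; they are not part of Django or django-safemigrate.

```python
def is_compatible(operation, model_columns, table_columns, required_columns=()):
    """Apply the subset rule from the compatibility tables.

    required_columns: table columns that are NOT NULL and have no default.
    """
    model_columns = set(model_columns)
    table_columns = set(table_columns)
    if operation == "delete":
        # DELETE queries do not list columns, so they always work.
        return True
    if not model_columns <= table_columns:
        # The model names a column the table does not have.
        return False
    if operation == "insert":
        # Every required column must be supplied by the model.
        return set(required_columns) <= model_columns
    return True


old_schema = {"id", "name"}
new_schema = {"id", "name", "new_field"}

# Adding a field, migrating before deployment: old model, new table.
add_before = {
    op: is_compatible(op, old_schema, new_schema)
    for op in ("select", "delete", "update", "insert")
}
# Same scenario, but the new column is NOT NULL without a default.
add_before_required_insert = is_compatible(
    "insert", old_schema, new_schema, required_columns={"new_field"}
)
# Adding a field, migrating after deployment: new model, old table.
add_after = {
    op: is_compatible(op, new_schema, old_schema)
    for op in ("select", "delete", "update", "insert")
}
```

Swapping the two schemas gives the results for removing a field, which is why those tables mirror the ones for adding a field.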

Simplifying with django-safemigrate

The django-safemigrate package allows you to define your migrations and run them safely during your deployment process.

Before you use this package, your deployment process must allow a Django management command to be run before the application is deployed. Most PaaS providers support this [2].

This pre-deployment/release command will need to run python manage.py safemigrate.

Defining migrations for django-safemigrate

For our previous cases, adding a field and removing a field, we would end up with the following migrations:

# Migration that adds a field (run before deployment)
from django.db import migrations, models
from django_safemigrate import Safe


class Migration(migrations.Migration):
    # A future version of django-safemigrate will
    # change this to Safe.before_deploy()
    safe = Safe.before_deploy
    operations = [
        migrations.AddField(
            model_name='mymodel',
            name='new_field',
            field=models.BooleanField(null=True),
        ),
    ]

# Migration that removes a field (run after deployment)
from django.db import migrations
from django_safemigrate import Safe


class Migration(migrations.Migration):
    # A future version of django-safemigrate will
    # change this to Safe.after_deploy()
    safe = Safe.after_deploy
    operations = [
        migrations.RemoveField(
            model_name='mymodel',
            name='old_field',
        ),
    ]

Running migrations with django-safemigrate

When you run python manage.py safemigrate, it decides which migrations to apply based on each migration’s safe attribute. Run as the pre-deployment command, it applies the migrations marked as Safe.before_deploy and holds back those marked as Safe.after_deploy, which are meant to be applied once the new code is live.

By running python manage.py safemigrate in the pre-deployment/release command, you can control what migrations will be applied before and after the deployment process. This allows you to include more migrations in a single deployment and avoid structuring your commits around deployments [3].

One sticking point with the library is that it doesn’t automatically run migrations marked as Safe.after_deploy; those must be applied manually. When that step is forgotten, the next deployment that includes a Safe.before_deploy migration can fail, because safemigrate will refuse to apply a Safe.after_deploy migration even if a Safe.before_deploy migration depends on it. However, there has been some movement to improve the developer experience. You can follow the feature here.
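The failure mode described above can be illustrated with a toy model in plain Python. This is not django-safemigrate’s implementation: the Migration dataclass, the plan_pre_deploy function, and the migration names are hypothetical, and the sketch only reports blocked migrations rather than aborting the command.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Migration:
    name: str
    safe: str  # "before_deploy" or "after_deploy"
    depends_on: tuple = ()


def plan_pre_deploy(migrations, applied):
    """Split unapplied migrations into (runnable, blocked) for the pre-deploy step.

    Assumes `migrations` is already listed in dependency order.
    """
    done = set(applied)
    runnable, blocked = [], []
    for migration in migrations:
        if migration.name in done:
            continue
        deps_met = all(dep in done for dep in migration.depends_on)
        if migration.safe == "before_deploy" and deps_met:
            runnable.append(migration.name)
            done.add(migration.name)
        else:
            # Either marked after_deploy, or waiting on one that was never applied.
            blocked.append(migration.name)
    return runnable, blocked


history = [
    Migration("0001_initial", "before_deploy"),
    Migration("0002_remove_old_field", "after_deploy", ("0001_initial",)),
    Migration("0003_add_new_field", "before_deploy", ("0002_remove_old_field",)),
]

# 0002 was never applied by hand, so 0003 cannot run before the next deployment.
forgotten = plan_pre_deploy(history, applied={"0001_initial"})
# Once 0002 has been applied manually, 0003 is free to run.
caught_up = plan_pre_deploy(
    history, applied={"0001_initial", "0002_remove_old_field"}
)
```

In the first plan nothing is runnable, which mirrors the failed deployment described above. The second plan shows that applying the forgotten migration by hand unblocks the next one.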

Conclusion

Zero-downtime deployments aren’t always possible, even with django-safemigrate, but the package makes them easier to achieve. By understanding what each migration does to the database and considering when it should be applied, you can significantly reduce deployment downtime.

If you have thoughts, I’d love to hear them. You can find me on the Fediverse, Django Discord server, or reach me via email.

  1. A nullable column is preferred because it doesn’t require the database to lock the entire table to add a new value to every row. If you have large tables, adding a column that is nullable is practically a requirement.

  2. The Essential Django Deployment Guide provides a nice explanation of this: https://www.saaspegasus.com/guides/django-deployment/#build-and-release-commands

  3. Well, not entirely. But it does reduce it significantly.