Safe Migration Patterns¶

This guide shows safe patterns for common migration operations that can cause downtime or data loss if done incorrectly.

Adding a NOT NULL Column¶

The Problem¶

Adding a NOT NULL column without a default requires the database to:

Lock the entire table
Add the column to every row
Verify no NULL values exist

On large tables, this can lock the table for minutes or hours.

Unsafe Pattern¶

# This will lock the table and fail if rows exist
migrations.AddField(
    model_name='user',
    name='email',
    field=models.CharField(max_length=255),  # NOT NULL, no default!
)

Safe Pattern¶

Split into three migrations:

# Migration 1: Add nullable field
migrations.AddField(
    model_name='user',
    name='email',
    field=models.CharField(max_length=255, null=True),
)

# Migration 2: Backfill data
def backfill_emails(apps, schema_editor):
    User = apps.get_model('myapp', 'User')
    # Batch update to avoid memory issues
    batch_size = 1000
    while User.objects.filter(email__isnull=True).exists():
        ids = list(
            User.objects.filter(email__isnull=True)
            .values_list('id', flat=True)[:batch_size]
        )
        User.objects.filter(id__in=ids).update(email='unknown@example.com')

migrations.RunPython(backfill_emails, migrations.RunPython.noop)

# Migration 3: Add NOT NULL constraint
migrations.AlterField(
    model_name='user',
    name='email',
    field=models.CharField(max_length=255),  # Now NOT NULL is safe
)

Creating Indexes (PostgreSQL)¶

The Problem¶

Standard index creation locks the table for writes during the entire operation.

Unsafe Pattern¶

# Locks table for writes
migrations.AddIndex(
    model_name='user',
    index=models.Index(fields=['email'], name='user_email_idx'),
)

Safe Pattern¶

Use AddIndexConcurrently with atomic = False:

from django.contrib.postgres.operations import AddIndexConcurrently

class Migration(migrations.Migration):
    atomic = False  # Required for CONCURRENTLY

    operations = [
        AddIndexConcurrently(
            model_name='user',
            index=models.Index(fields=['email'], name='user_email_idx'),
        ),
    ]

Note

AddIndexConcurrently is PostgreSQL-specific. For other databases, consider creating indexes during low-traffic periods.

Adding Unique Constraints (PostgreSQL)¶

The Problem¶

Adding a unique constraint requires a full table scan and locks the table.

Unsafe Pattern¶

# Locks table, scans all rows
migrations.AddConstraint(
    model_name='user',
    constraint=models.UniqueConstraint(fields=['email'], name='unique_email'),
)

Safe Pattern¶

First create a unique index concurrently, then add the constraint using that index:

from django.contrib.postgres.operations import AddIndexConcurrently

class Migration(migrations.Migration):
    atomic = False

    operations = [
        # Step 1: Create unique index concurrently
        AddIndexConcurrently(
            model_name='user',
            index=models.Index(
                fields=['email'],
                name='user_email_unique_idx',
            ),
        ),
    ]

# Step 2: Add constraint using the index (in a separate migration)
migrations.RunSQL(
    sql='ALTER TABLE myapp_user ADD CONSTRAINT unique_email UNIQUE USING INDEX user_email_unique_idx;',
    reverse_sql='ALTER TABLE myapp_user DROP CONSTRAINT unique_email;',
)

Adding Foreign Keys¶

The Problem¶

Adding a foreign key validates all existing rows, which can be slow on large tables.

Unsafe Pattern¶

# Validates all existing rows
migrations.AddField(
    model_name='order',
    name='user',
    field=models.ForeignKey('auth.User', on_delete=models.CASCADE),
)

Safe Pattern (PostgreSQL)¶

Add the FK without validation, then validate separately:

# Migration 1: Add FK without validation
migrations.RunSQL(
    sql='''
        ALTER TABLE myapp_order
        ADD CONSTRAINT order_user_fk
        FOREIGN KEY (user_id) REFERENCES auth_user(id)
        NOT VALID;
    ''',
    reverse_sql='ALTER TABLE myapp_order DROP CONSTRAINT order_user_fk;',
)

# Migration 2: Validate FK (can run concurrently)
migrations.RunSQL(
    sql='ALTER TABLE myapp_order VALIDATE CONSTRAINT order_user_fk;',
    reverse_sql=migrations.RunSQL.noop,
)

Removing Columns¶

The Problem¶

During rolling deployments, old code may still reference the column.

Unsafe Pattern¶

# Old code will crash trying to SELECT this column
migrations.RemoveField(
    model_name='user',
    name='legacy_field',
)

Safe Pattern¶

First: Remove all code references to the field
Deploy: Wait for all servers to have the new code
Then: Remove the field in a migration

# Only after code is deployed everywhere
migrations.RemoveField(
    model_name='user',
    name='legacy_field',
)

Tip

Consider using a two-phase approach:

Migration 1: Make field nullable
Wait for full deployment
Migration 2: Remove the field

Renaming Columns¶

The Problem¶

Renaming breaks all existing code referencing the old name.

Unsafe Pattern¶

# Old code will crash
migrations.RenameField(
    model_name='user',
    old_name='name',
    new_name='full_name',
)

Safe Pattern¶

Add the new column
Sync data between columns
Update code to use new column
Deploy everywhere
Remove old column

# Migration 1: Add new column
migrations.AddField(
    model_name='user',
    name='full_name',
    field=models.CharField(max_length=255, null=True),
)

# Trigger or application code syncs data

# Migration 2: After code deployed, remove old column
migrations.RemoveField(
    model_name='user',
    name='name',
)

Changing Column Types¶

The Problem¶

Changing a column type often requires a full table rewrite.

Unsafe Changes¶

VARCHAR(100) → VARCHAR(50) (truncation)
INTEGER → VARCHAR (table rewrite)
VARCHAR → INTEGER (validation + rewrite)

Safe Changes¶

VARCHAR(100) → VARCHAR(200) (increasing size is usually safe)
VARCHAR → TEXT (safe on PostgreSQL)

Safe Pattern for Type Changes¶

Add new column with new type
Backfill data in batches
Update code to use new column
Remove old column

Adding CHECK Constraints¶

The Problem¶

Adding a CHECK constraint validates all existing rows.

Unsafe Pattern¶

# Scans and locks table
migrations.AddConstraint(
    model_name='order',
    constraint=models.CheckConstraint(
        check=models.Q(amount__gte=0),
        name='positive_amount',
    ),
)

Safe Pattern (PostgreSQL)¶

# Add constraint as NOT VALID, then validate
migrations.RunSQL(
    sql='''
        ALTER TABLE myapp_order
        ADD CONSTRAINT positive_amount
        CHECK (amount >= 0)
        NOT VALID;
    ''',
    reverse_sql='ALTER TABLE myapp_order DROP CONSTRAINT positive_amount;',
)

# Validate in separate migration
migrations.RunSQL(
    sql='ALTER TABLE myapp_order VALIDATE CONSTRAINT positive_amount;',
    reverse_sql=migrations.RunSQL.noop,
)

RunSQL Best Practices¶

Always Provide Reverse SQL¶

# Reversible
migrations.RunSQL(
    sql='CREATE INDEX user_email_idx ON myapp_user(email);',
    reverse_sql='DROP INDEX user_email_idx;',
)

# Not reversible - migration cannot be rolled back
migrations.RunSQL(
    sql='CREATE INDEX user_email_idx ON myapp_user(email);',
)

Use State Operations¶

When using RunSQL, tell Django about the schema change:

migrations.RunSQL(
    sql='ALTER TABLE myapp_user ADD COLUMN temp_field VARCHAR(100);',
    reverse_sql='ALTER TABLE myapp_user DROP COLUMN temp_field;',
    state_operations=[
        migrations.AddField(
            model_name='user',
            name='temp_field',
            field=models.CharField(max_length=100, null=True),
        ),
    ],
)

RunPython Best Practices¶

Always Provide Reverse Code¶

def forward(apps, schema_editor):
    User = apps.get_model('myapp', 'User')
    User.objects.filter(status='').update(status='active')

def reverse(apps, schema_editor):
    # Can't truly reverse, but provide a no-op or best effort
    pass

migrations.RunPython(forward, reverse)

Batch Large Operations¶

def backfill_in_batches(apps, schema_editor):
    User = apps.get_model('myapp', 'User')
    batch_size = 1000

    while True:
        # Get a batch of IDs
        ids = list(
            User.objects.filter(new_field__isnull=True)
            .values_list('id', flat=True)[:batch_size]
        )
        if not ids:
            break

        # Update batch
        User.objects.filter(id__in=ids).update(new_field='default')

migrations.RunPython(backfill_in_batches, migrations.RunPython.noop)

PostgreSQL-Specific: Enum Values¶

The Problem¶

Adding enum values inside a transaction fails on PostgreSQL.

Unsafe Pattern¶

# Fails: cannot add enum value inside transaction
migrations.RunSQL("ALTER TYPE myenum ADD VALUE 'new_value';")

Safe Pattern¶

class Migration(migrations.Migration):
    atomic = False  # Required!

    operations = [
        migrations.RunSQL(
            sql="ALTER TYPE myenum ADD VALUE 'new_value';",
            reverse_sql=migrations.RunSQL.noop,
        ),
    ]

Warning

You cannot remove enum values in PostgreSQL. Plan your enums carefully.

Changing null=True to null=False¶

The Problem¶

Changing a field from nullable to NOT NULL requires PostgreSQL to scan all rows to verify no NULL values exist. If NULL values exist, the migration fails.

Unsafe Pattern¶

# Will fail if any NULL values exist
migrations.AlterField(
    model_name='user',
    name='nickname',
    field=models.CharField(max_length=100, null=False),  # Was null=True
)

Safe Pattern¶

# Migration 1: Backfill NULL values
def backfill_nicknames(apps, schema_editor):
    User = apps.get_model('myapp', 'User')
    User.objects.filter(nickname__isnull=True).update(nickname='')

migrations.RunPython(backfill_nicknames, migrations.RunPython.noop)

# Migration 2: Now safe to add NOT NULL
migrations.AlterField(
    model_name='user',
    name='nickname',
    field=models.CharField(max_length=100, null=False, default=''),
)

For large tables on PostgreSQL, use the NOT VALID pattern:

# Add constraint as NOT VALID
migrations.RunSQL(
    sql='''
        ALTER TABLE myapp_user
        ADD CONSTRAINT nickname_not_null
        CHECK (nickname IS NOT NULL)
        NOT VALID;
    ''',
    reverse_sql='ALTER TABLE myapp_user DROP CONSTRAINT nickname_not_null;',
)

# Validate constraint (can run while table is in use)
migrations.RunSQL(
    sql='ALTER TABLE myapp_user VALIDATE CONSTRAINT nickname_not_null;',
    reverse_sql=migrations.RunSQL.noop,
)

# Now add the actual NOT NULL (instant since constraint already validated)
migrations.RunSQL(
    sql='ALTER TABLE myapp_user ALTER COLUMN nickname SET NOT NULL;',
    reverse_sql='ALTER TABLE myapp_user ALTER COLUMN nickname DROP NOT NULL;',
)

Adding unique=True via AlterField¶

The Problem¶

Adding unique=True via AlterField creates a unique index while holding a lock, blocking writes for the duration of index creation.

Unsafe Pattern¶

# Locks table during index creation
migrations.AlterField(
    model_name='user',
    name='email',
    field=models.EmailField(unique=True),
)

Safe Pattern (PostgreSQL)¶

from django.contrib.postgres.operations import AddIndexConcurrently

class Migration(migrations.Migration):
    atomic = False  # Required for CONCURRENTLY

    operations = [
        # Step 1: Create unique index concurrently
        AddIndexConcurrently(
            model_name='user',
            index=models.Index(
                fields=['email'],
                name='user_email_uniq_idx',
            ),
        ),
    ]

# Step 2: Add unique constraint using the index
migrations.RunSQL(
    sql='''
        ALTER TABLE myapp_user
        ADD CONSTRAINT user_email_unique
        UNIQUE USING INDEX user_email_uniq_idx;
    ''',
    reverse_sql='ALTER TABLE myapp_user DROP CONSTRAINT user_email_unique;',
)

Adding ManyToMany Fields¶

The Problem¶

Adding a ManyToManyField creates a new junction table. While generally safe, it's good to be aware of what's happening.

Pattern¶

# This creates a new table: myapp_article_tags
migrations.AddField(
    model_name='article',
    name='tags',
    field=models.ManyToManyField('tags.Tag', blank=True),
)

Considerations¶

Junction table is empty initially (safe)
No locks on existing tables
May want to add indexes on the junction table for performance

# Optional: Add index for reverse lookups
migrations.RunSQL(
    sql='''
        CREATE INDEX CONCURRENTLY myapp_article_tags_tag_idx
        ON myapp_article_tags(tag_id);
    ''',
    reverse_sql='DROP INDEX myapp_article_tags_tag_idx;',
)

ForeignKey Without Index¶

The Problem¶

By default, Django creates an index on ForeignKey fields. If you disable this with db_index=False, JOIN queries can become very slow on large tables.

Pattern to Avoid¶

# No index means slow JOINs
migrations.AddField(
    model_name='order',
    name='customer',
    field=models.ForeignKey(
        'customers.Customer',
        on_delete=models.CASCADE,
        db_index=False,  # Don't do this unless you have a good reason
    ),
)

When db_index=False is OK¶

The table will always be small
You're creating a covering index that includes this field
The field is never used in WHERE or JOIN clauses

# OK if you have a covering index
migrations.AddField(
    model_name='order',
    name='customer',
    field=models.ForeignKey(
        'customers.Customer',
        on_delete=models.CASCADE,
        db_index=False,
    ),
)

# Add a covering index instead
migrations.AddIndex(
    model_name='order',
    index=models.Index(
        fields=['customer', 'created_at'],
        name='order_customer_created_idx',
    ),
)

Expensive Default Callables¶

The Problem¶

Using callables like timezone.now as defaults means Django calls the function once per row during backfill, which can be slow for large tables.

Pattern to Consider¶

# This is fine for new tables
migrations.AddField(
    model_name='user',
    name='created_at',
    field=models.DateTimeField(default=timezone.now),
)

For Large Existing Tables¶

# Migration 1: Add with static default
migrations.AddField(
    model_name='user',
    name='last_login',
    field=models.DateTimeField(null=True),  # Nullable initially
)

# Migration 2: Backfill in batches with explicit value
def backfill_last_login(apps, schema_editor):
    from django.utils import timezone
    User = apps.get_model('myapp', 'User')
    now = timezone.now()  # Single value for all rows
    batch_size = 10000

    while User.objects.filter(last_login__isnull=True).exists():
        ids = list(
            User.objects.filter(last_login__isnull=True)
            .values_list('id', flat=True)[:batch_size]
        )
        User.objects.filter(id__in=ids).update(last_login=now)

migrations.RunPython(backfill_last_login, migrations.RunPython.noop)

Data Migrations with Large Tables¶

The Problem¶

Using .all() without .iterator() loads the entire table into memory.

Unsafe Pattern¶

# Loads all rows into memory at once
def migrate_data(apps, schema_editor):
    User = apps.get_model('myapp', 'User')
    for user in User.objects.all():  # OOM on large tables!
        user.name = user.name.title()
        user.save()

Safe Pattern¶

# Process in batches
def migrate_data(apps, schema_editor):
    User = apps.get_model('myapp', 'User')
    batch_size = 1000

    while True:
        batch = list(
            User.objects.filter(name_migrated=False)[:batch_size]
        )
        if not batch:
            break

        for user in batch:
            user.name = user.name.title()
            user.name_migrated = True

        User.objects.bulk_update(batch, ['name', 'name_migrated'])

Or use .iterator():

# Uses server-side cursor
def migrate_data(apps, schema_editor):
    User = apps.get_model('myapp', 'User')
    for user in User.objects.all().iterator(chunk_size=1000):
        # Process one at a time, memory efficient
        user.name = user.name.title()
        user.save(update_fields=['name'])

Summary Table¶

Operation	Risk	Safe Pattern
Add NOT NULL column	Table lock	Add nullable → backfill → add NOT NULL
Create index	Write lock	`AddIndexConcurrently` (PostgreSQL)
Add unique constraint	Table scan + lock	Create unique index first
Add foreign key	Validates all rows	`NOT VALID` then `VALIDATE`
Remove column	Code breaks	Remove code first, then column
Rename column	Code breaks	Add new → migrate data → remove old
Change column type	Table rewrite	Add new column → migrate → remove old
Add CHECK constraint	Validates all rows	`NOT VALID` then `VALIDATE`
Add enum value	Transaction fails	`atomic = False`
null=True → null=False	Scan + possible fail	Backfill NULLs first
Add unique via Alter	Table lock	Create index concurrently first
Add ManyToMany	Creates new table	Generally safe, consider indexes
FK without index	Slow JOINs	Keep default index or add covering
Expensive default	Slow backfill	Use static default, backfill manually
Large data migration	Memory issues	Use `.iterator()` or batch processing