14

I want to move the field honk and its data from one model to another using South:

class Foo(models.Model):
    foofield = models.CharField()
    honk = models.PositiveIntegerField()

class Bar(models.Model):
    barfield = models.CharField()

I've done this before, using 3 separate migrations:

  1. A schema migration, adding honk to Bar
  2. A data migration, copying all Foo.honk data to Bar.honk
  3. Another schema migration, dropping honk from Foo

Can I do these three steps in a single migration?

I've already learnt that there isn't much of a difference between schema and data migrations in South, so I figured perhaps something like this might work (which is the three migrations above just munged into one):

class Migration(DataMigration):
    def forwards(self, orm):
        # add column
        db.add_column('myapp_bar', 'honk', self.gf('django.db.models.fields.PositiveIntegerField')(default='0'), keep_default=False)

        # copy data
        for foo in Foo.objects.all():
            # find the right bar here and then ...
            bar.honk = foo.honk
            bar.save()

        # remove old column
        db.delete_column('myapp_foo', 'honk')

Will this work or will it fail because my (South frozen) orm doesn't know about Bar.honk yet? Or am I doing it wrong and there's a nicer way to do this sort of thing in a single migration?

Ingmar Hupp

4 Answers

15

Since this question earned me a Tumbleweed badge, I've dug in and tried it myself. Here's what I've found out.

No, you can't merge these migrations

The ORM freeze only contains the schema you are migrating to. So in the above example, foo.honk wouldn't be accessible during the data migration (the for loop): it is deleted during the schema migration, so it isn't in the frozen ORM. Additionally, you'll get DatabaseError exceptions if you try to access data while the columns in the database don't yet match those of the model (i.e. if you try to access anything before a db.add_column).

Looks like there's no simple shortcut and doing something like this does take the 3 migrations mentioned above.
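
The column mismatch can be reproduced outside South. The sketch below is only an illustration: it uses Python's stdlib sqlite3 as a stand-in for the real database and for what an ORM would issue, not South or Django themselves. The table name follows the question; the `select_honk` helper is hypothetical.

```python
import sqlite3

# In-memory stand-in for the real database; Bar has no honk column yet.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myapp_bar (id INTEGER PRIMARY KEY, barfield TEXT)")
conn.execute("INSERT INTO myapp_bar (barfield) VALUES ('a')")


def select_honk(connection):
    """Hypothetical helper: roughly the query an ORM would issue for a
    model definition that already declares a honk field."""
    return connection.execute("SELECT id, honk FROM myapp_bar").fetchall()


# The "model" expects honk, but the table doesn't have it yet.
try:
    select_honk(conn)
    failed_before = False
except sqlite3.OperationalError:
    failed_before = True

# Once the column exists (the equivalent of db.add_column), the same
# query succeeds.
conn.execute("ALTER TABLE myapp_bar ADD COLUMN honk INTEGER NOT NULL DEFAULT 0")
rows_after = select_honk(conn)
```

Here the failure is sqlite3's own OperationalError; through Django the same kind of mismatch would surface as a DatabaseError, as described above.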

Ingmar Hupp
  • 2
    I would add that this is what `symmetrical = True` is for in a data migration. The assumption is that a data migration has a consistent schema before and after, and so you can use `orm` to access your frozen models in your `forwards` and `backwards` migrations. Because this isn't the case in a schema migration, no data should be moved in schema migrations. Also, *do not* import model names in data migrations; instead, assign to these names from the frozen models from `orm`. The manual isn't super clear about this, so I thought I'd include the info here. – acjay Jan 28 '13 at 01:48
  • I have just merged migrations like those by using South's db.execute method for data manipulation. You also need to use db.start_transaction and db.commit_transaction to divide schema and data changes into separate transactions. – clime Apr 15 '13 at 13:40
  • @clime It seems like it might be dangerous even with transactions; isn't there the possibility that half the migration is committed if one of them errors out? If so, the "migration stack" could be left in an inconsistent state, and you wouldn't be able to cleanly reverse the migration. – mpontillo Jul 22 '15 at 02:08
1

The documentation is lacking on this point, but if you modify the frozen part of the ORM in the migration and add the missing field yourself, it will be accessible. During a South migration you must use the frozen ORM, because by the time you run the migration in the future, the Foo model may have lost the honk field.

I think that if you modify the frozen ORM declaration like the following

models = {
    'app.foo': {
        'Meta': {'object_name': 'Foo'},
        'id': ('django.db.models.fields.AutoField', [], {'primary_key': 'True'}),
        'foofield': ('django.db.models.fields.CharField', [],{'max_length':666}),
        'honk': ('django.db.models.fields.PositiveIntegerField', [], {}),
    },
    'app.bar': {
        'Meta': {'object_name': 'Bar'},
        'id': ('django.db.models.fields.AutoField', [], {'primary_key': 'True'}),
        'barfield': ('django.db.models.fields.CharField', [],{'max_length':666}),
        'honk': ('django.db.models.fields.PositiveIntegerField', [], {}),
    },
}

complete_apps = ['app']
symmetrical = True

everything should work :)

The trick is the definition of the field honk in each model; obviously, the column must be present in the database.

from south.db import db
from south.v2 import DataMigration

class Migration(DataMigration):
    def forwards(self, orm):
        # add column
        db.add_column('myapp_bar', 'honk', self.gf('django.db.models.fields.PositiveIntegerField')(default='0'), keep_default=False)

        # copy data, going through the frozen ORM rather than an imported model
        for foo in orm.Foo.objects.all():
            # find the right bar here
            bar = orm.Bar.objects.get(**whatever)
            bar.honk = foo.honk
            bar.save()

        # remove old column
        db.delete_column('myapp_foo', 'honk')

PS: as pointed out by @acjohnson55, symmetrical = True is really important.

gipi
0

This works for me:

def migratedata(orm):
   # copy data, need to lookup model through orm.
   for foo in orm['myapp.foo'].objects.all():
      # find the right bar here and then ...
      bar.honk = foo.honk
      bar.save()

class Migration(SchemaMigration):
   def forwards(self, orm):
      # add column
      db.add_column('myapp_bar', 'honk', self.gf('django.db.models.fields.PositiveIntegerField')(default='0'), keep_default=False)
      # migrate data
      if not db.dry_run:
         migratedata(orm)
      # remove old column
      db.delete_column('myapp_foo', 'honk')

However, I don't recommend it because it is easy to mess up. Special care needs to be taken with uniqueness and the order of operations (in other words, don't migrate data after you've deleted the relevant fields (: )

dnozay
0

As Ingmar mentioned, the South ORM gets frozen at a specific point in time, which prevents you from accessing columns the ORM does not know about. However, there is a way around this: you do not have to use the frozen ORM, or any ORM at all; instead, you can execute raw SQL queries.

So for example, instead of

for foo in Foo.objects.all():
    print foo.honk

you can do something like:

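from django.db import connection
cursor = connection.cursor()  # raw cursor, bypasses the frozen ORM
cursor.execute('SELECT "honk" FROM "myapp_foo"')
for honk, in cursor.fetchall():
    print honk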
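
To try the raw-SQL approach end to end without a Django project, here is a minimal sketch that uses the stdlib sqlite3 module in place of the Django connection. Table and column names follow the question. The `foo_id` column linking each bar to its foo is an assumption made for illustration only, since the question doesn't say how the right bar is found. Dropping the old column is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Schema as in the question; foo_id on myapp_bar is a hypothetical link,
# assumed here only so there is a way to "find the right bar".
cursor.execute(
    "CREATE TABLE myapp_foo (id INTEGER PRIMARY KEY, foofield TEXT, honk INTEGER)"
)
cursor.execute(
    "CREATE TABLE myapp_bar (id INTEGER PRIMARY KEY, barfield TEXT, foo_id INTEGER)"
)
cursor.executemany(
    "INSERT INTO myapp_foo (id, foofield, honk) VALUES (?, ?, ?)",
    [(1, "x", 7), (2, "y", 9)],
)
cursor.executemany(
    "INSERT INTO myapp_bar (id, barfield, foo_id) VALUES (?, ?, ?)",
    [(10, "p", 2), (11, "q", 1)],
)

# Add the new column (what db.add_column does in the migration).
cursor.execute("ALTER TABLE myapp_bar ADD COLUMN honk INTEGER NOT NULL DEFAULT 0")

# Copy the data with plain SQL; no model definitions are involved, so it
# doesn't matter which fields a frozen ORM would know about.
cursor.execute("SELECT id, honk FROM myapp_foo")
for foo_id, honk in cursor.fetchall():
    cursor.execute(
        "UPDATE myapp_bar SET honk = ? WHERE foo_id = ?", (honk, foo_id)
    )
conn.commit()

cursor.execute("SELECT id, honk FROM myapp_bar ORDER BY id")
copied = cursor.fetchall()
```

Inside a real migration the SQL would go through Django's cursor or South's db.execute instead, and the placeholder style depends on the database backend.
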
Jian