Data migrations à la Lino¶
As the maintainer of a database application that is being used on one or several production sites, you will care about how these production sites will migrate their data.
Data migration is a complex topic. It took Django until version 1.7 to adopt a default method for automating these tasks (see Migrations). Django migrations on a Lino site describes how to use Django migrations on a Lino site.
But Lino also offers a very different approach for doing database migrations.
Advantages of migrations à la Lino:
They make the process of deploying applications and upgrading production sites simpler and more transparent. As a site maintainer you will simply write a Python dump before upgrading (using the old version), and then load that dump after upgrading (with the new version). See Upgrading a production site for details.
They can help in situations where you would need a magician. For example your users accidentally deleted a bunch of data from their database and they don't have a recent backup. See Repairing data for an example.
Despite these advantages you might still want to use the Django approach, because Lino migrations have one disadvantage: they are slower than Django migrations, and users cannot use the site while the migration is running. There are systems where half an hour of downtime for an upgrade is not acceptable.
Rule of thumb: if your application uses the inject_field or BabelField features (or a plugin that uses them), then Django migrations won't work. Conversely, if your site does need Django migrations, then you cannot use inject_field or BabelField.
General strategy for managing data migrations¶
There are two ways of managing data migrations: either by locally modifying the restore.py script or by writing a migrator.

Locally modifying a restore.py script is the natural way when there is only one production site that needs to migrate and when the application developer is also the site maintainer. This is a common situation when a new customer project has gone into production but is being used only on that site.
Certain schema changes will migrate automatically: new models, new fields (when they have a default value), unique constraints, ...
If there were unhandled schema changes, you will get error messages during the restore. You can then change the restore.py script and try again, running it as often as needed until there are no more errors.
The code of the restore.py script is optimized for easily applying most database schema changes. For example, if a model or field has been removed, you can just comment out one line in that script.
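To illustrate the idea, here is a hypothetical miniature of such a generated script (not actual Lino output; the create_* names and the dict-based rows are invented for this sketch). Each row of the database becomes one line, so removing a model from the schema means commenting out its lines:

```python
# Hypothetical miniature of a generated restore.py script.
# Each create_* call rebuilds one database row; a loop at the end
# would save them all. If a model was removed in the new version,
# commenting out its lines is all the migration you need.

rows = []

def create_contacts_person(pk, first_name, last_name):
    # A real script would instantiate a Django model instance here.
    return dict(model="contacts.Person", pk=pk,
                first_name=first_name, last_name=last_name)

rows.append(create_contacts_person(1, "Jean", "Dupont"))
rows.append(create_contacts_person(2, "Marie", "Martin"))
# rows.append(create_countries_city(1, "Tallinn"))  # model removed: commented out

for row in rows:
    print("restoring", row["model"], row["pk"])
```

The point of this layout is that schema changes map to purely mechanical, line-local edits of the script.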
TODO: write detailed docs
Designing data migrations for your application¶
Designing data migrations for your application is easy but not yet well documented.
This means that the script itself will call a method of your application before actually starting to load any database objects. And it passes its globals() dict, which means that you can potentially change everything.
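The following sketch shows why passing globals() is so powerful (the function names are hypothetical, not Lino's actual API): the migration hook can rebind any name in the restore script's namespace, for example redirecting a create_* function for a removed model to the factory of its replacement.

```python
# Hypothetical sketch of the globals() trick: the restore script hands
# its global namespace to a migration hook, which may rebind anything
# before the dumped objects are loaded.

def migrate(globals_dict):
    # The old dump still calls create_countries_city(); redirect it to
    # the new model's factory so that the restore does not fail.
    def create_countries_city(pk, name):
        return globals_dict["create_contacts_place"](pk, name)
    globals_dict["create_countries_city"] = create_countries_city

def create_contacts_place(pk, name):
    return dict(model="contacts.Place", pk=pk, name=name)

migrate(globals())
row = create_countries_city(7, "Tallinn")  # resolved via the hook
```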
To see a real-life example, look at the source code of
A magical before_dumpy_save attribute may contain custom code to apply inside the try...except block. If that code fails, the deserializer simply defers the save operation and tries it again later.
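The deferral mechanism can be sketched as follows (a simplified model of the behaviour, not Lino's deserializer; the save callback and dict-based objects are invented for illustration). Objects whose save fails, for example because a foreign-key target has not been loaded yet, go back on the queue for another pass:

```python
# Hypothetical sketch of deferred saving: try to save every object;
# any object whose save raises is put back on the queue and retried
# in a later pass, once its dependencies have been loaded.

def load_all(objects, save):
    pending = list(objects)
    while pending:
        deferred = []
        for obj in pending:
            try:
                save(obj)
            except KeyError:
                deferred.append(obj)  # retry in the next pass
        if len(deferred) == len(pending):
            raise RuntimeError("cannot resolve: %r" % deferred)
        pending = deferred

saved = {}

def save(obj):
    dep = obj.get("needs")
    if dep is not None and dep not in saved:
        raise KeyError(dep)  # dependency not loaded yet
    saved[obj["pk"]] = obj

# Object 2 depends on object 1, which arrives later in the dump:
load_all([{"pk": 2, "needs": 1}, {"pk": 1, "needs": None}], save)
```

Note the guard against a pass that makes no progress: without it, a truly unresolvable object would loop forever.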
Models that get special handling¶
ContentType objects aren't stored in a dump because they can always be recreated.
Site and Permission objects must be stored and must not be re-created.
Session objects may get lost in a dump, so they are not stored.
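The special handling above amounts to a skip list applied while dumping. A minimal sketch, assuming dict-based rows and the standard Django labels for these models (the dump function itself is invented for illustration):

```python
# Hypothetical sketch: a dump routine that skips models whose rows can
# always be recreated (ContentType) or safely lost (Session), while
# keeping everything else, including Site and Permission objects.

SKIP = {"contenttypes.ContentType", "sessions.Session"}

def dump(rows):
    return [r for r in rows if r["model"] not in SKIP]

kept = dump([
    {"model": "contenttypes.ContentType", "pk": 1},
    {"model": "sessions.Session", "pk": "abc"},
    {"model": "sites.Site", "pk": 1},
])
```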
Writing a migrator¶
When your application runs on more than one production site, you will prefer writing a migrator.
TODO: write detailed docs
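In the meantime, the general shape of a migrator can be sketched as a class with one method per source version, applied step by step until the data reaches the current version. This is an illustrative pattern only; the class and method names below are hypothetical, not Lino's actual migrator API:

```python
# Hypothetical migrator sketch: each migrate_from_* method upgrades
# the loaded data by one version step and returns the version it
# migrated to; migrate() chains the steps until to_version is reached.

class Migrator:
    def migrate(self, data, from_version, to_version):
        version = from_version
        while version != to_version:
            step = getattr(self, "migrate_from_" + version.replace(".", "_"))
            version = step(data)
        return data

class MyAppMigrator(Migrator):
    def migrate_from_1_0(self, data):
        # 1.0 -> 1.1: a new "street" field was added without a value
        # in old dumps, so give every row a default.
        for row in data:
            row.setdefault("street", "")
        return "1.1"

data = [{"pk": 1, "address": "Main 1"}]
MyAppMigrator().migrate(data, "1.0", "1.1")
```

Because every step returns the version it produced, one migrator class can carry a production site across several releases in a single restore.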