Database Deployment Automation

Confession

I'm no good with databases, not really anyway. I have a single Oracle PL/SQL certificate from years ago, and all I really remember from it is the term 'pig's ear join', which is just a fancy nickname for a recursive join.

Most of the databases I design are built and managed in a GUI, like HeidiSQL, DBeaver or SSMS. My queries are usually forged via trial and error and then unit tested to death.

But databases are an integral part of any system. You likely have your code and your tests in Git, but without the data they work against, you don't really have anything. If you lost your database, you could lose everything.

So I need to improve my database project, but all I have right now is a deployed 'production' database. How can I recreate it and test changes without breaking anything?

An Easy Fix

The solution here is easy right? Export the database schema as one big SQL script and put it under source control, done!

Except it's not, because the schema alone isn't enough. A database recreated from it is empty, and the code relies on some default settings being present.

You could add some SQL scripts to your repo and execute them whenever you need data, but this seems clunky and error-prone.

What if a script runs twice? What if a script is missed? What if you hit transient errors during deployment? What if the schema changes and the change isn't fed back into source control?

Enter: Migrations

Databases are tricky: they need incremental changes applied to existing, stateful objects, unlike web services or containers, which can be blown away and redeployed without a second thought.

But there's a proven solution to this problem: migrations.

Migrations are small, self-contained changes to your database. Typically you define the migration script alongside a rollback script, which can be run if something goes wrong, undoing the change and leaving you where you started. A table is also created in your database to track the migration history, ensuring each migration is executed exactly once.
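As a rough sketch, a single migration is just a forward change paired with its inverse. The table and column names here are made up, but the shape is always the same:

-- migration: add an email column to a hypothetical users table
ALTER TABLE users ADD COLUMN email VARCHAR(255);

-- rollback: put the schema back exactly as it was
ALTER TABLE users DROP COLUMN email;

Because each applied migration is recorded in the tracking table, running the same set twice is a no-op, which answers the "what if a script runs twice?" worry above.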

There's a multitude of migration tools and frameworks out there.

I chose to use dbmate, an open-source, framework-agnostic tool that can run from a Docker image, meaning it doesn't even have to be installed locally to work.

Setting up dbmate

First up, I want to be able to run dbmate in Docker via Composer. Thankfully the documentation makes this pretty easy, and I ended up with this in my composer.json scripts block:

"scripts": {
    ...
    "dbmate": "docker run --env-file ./.env --rm --network=host -v \"$(PWD)/db:/db\" amacneil/dbmate",
    ...
},

The above command does the following:

- Runs the dbmate Docker image
- Feeds in a local .env file with configuration (example below)
- Binds the container's network to the host machine, so dbmate can reach a database listening on localhost
- Mounts the host's ./db directory into the container, so migrations are read from and written to the project
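For reference, dbmate reads its connection string from the DATABASE_URL environment variable, so the .env file above needs to contain at least that. I'm assuming a MySQL database here, and the credentials, host and database name are placeholders for whatever your own setup uses:

DATABASE_URL="mysql://appuser:secret@127.0.0.1:3306/myapp"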

Now I can execute dbmate on any Docker-enabled machine by simply running composer dbmate.
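Composer also forwards anything placed after a -- separator straight to the script, so the rest of dbmate's subcommands are already reachable without any extra configuration, for example:

composer dbmate -- --help      # show available dbmate commands
composer dbmate -- status      # list applied and pending migrations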

This also means I can add further dbmate commands with much less code, as Composer scripts can reference other scripts using the @ prefix:

"scripts": {
    ...
    "dbmate": "docker run --env-file ./.env --rm --network=host -v \"$(PWD)/db:/db\" amacneil/dbmate",
    "dbmate:upgrade": "@dbmate up",
    "dbmate:newmigration": "@dbmate new",
    "dbmate:dump": "@dbmate dump",
    ...
},

Now that I've added dbmate to my project, it's time to get to work adding my first migrations.

Usage

Since I'm retrofitting an existing schema into migrations, I started by running composer dbmate:dump against a local copy of my database, which dumped my deployed database schema to disk.

From there, for each table in the dump, I manually ran composer dbmate:newmigration tableName (where tableName is the name of the table) and moved that table's SQL from the dump into the newly created file.
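Each generated file gets a timestamp-prefixed name and contains dbmate's -- migrate:up and -- migrate:down markers: the dumped CREATE TABLE statement goes under up, with a matching DROP TABLE under down. For a hypothetical customers table the result looks roughly like this (filename illustrative):

-- db/migrations/20240101120000_customers.sql

-- migrate:up
CREATE TABLE customers (
    id   INT PRIMARY KEY,
    name VARCHAR(255) NOT NULL
);

-- migrate:down
DROP TABLE customers;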

Once the schema dump was empty I deleted it, leaving me with a number of new SQL files under myProject/db/migrations. Now for the real test: do they work?

I deleted the deployed database, ran composer dbmate:upgrade and... success! My database schema was now under proper source control, with support for both upgrades and downgrades!

As one last step I added this to my app's "serve" script, meaning I always have the data at hand when I need it:

{
    ...
    "serve": [
        "composer install",     # Install composer packages
        "@docker-compose",      # build out web+sql server with empty database
        "sleep 10",             # this could probably be a dbmate:wait operation
        "@dbmate:upgrade",      # deploy latest database schema
        "@dbmate:testdata"      # deploy database testdata (see below!)
    ]
    ...
}
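As the comment in the serve script suggests, the sleep 10 is just a workaround for the database container not being ready yet. dbmate has a wait command that blocks until the database accepts connections, so a tidier (untested) option would be one more script:

"dbmate:wait": "@dbmate wait",

With that in place, the "sleep 10" step in "serve" could be swapped for "@dbmate:wait".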

A single command now provisions a whole dev/test environment.

Bonus: test data

So, at the end of all this, I still need some basic data in place when running my tests. I'm working on moving this into the tests themselves, so they create the data they need, but until then it's still a hard requirement.

I don't want this test data in my normal databases, so I can't ship it as migrations, right?

But then I found dbmate's --migrations-dir option, and now I have two migration directories: one for real migrations and one for simple test data. Perfect.

"scripts": {
    ...
    "dbmate": "docker run --env-file ./.env --rm --network=host -v \"$(PWD)/db:/db\" amacneil/dbmate",
    "dbmate:upgrade": "@dbmate up",
    "dbmate:testdata": "@dbmate --migrations-dir ./db/testdata up",
    "dbmate:newmigration": "@dbmate new",
    "dbmate:newtestdata": "@dbmate --migrations-dir ./db/testdata new",
    "dbmate:dump": "@dbmate dump",
    ...
},
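For completeness, a test data 'migration' is just a normal dbmate file that inserts rows instead of changing the schema. A made-up example under db/testdata, reusing the hypothetical customers table from earlier:

-- migrate:up
INSERT INTO customers (id, name) VALUES
    (1, 'Test Customer A'),
    (2, 'Test Customer B');

-- migrate:down
DELETE FROM customers WHERE id IN (1, 2);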

Thanks for reading.

Categories: devops

Tags: sql, webdev, php, dbmate