One common reason for downtime is the application of bad RDBMS migrations during releases.
This issue is related to RDBMS architecture, which needs to know the schema for each query you perform and uses a locking mechanism to ensure only one valid schema is in use.
A common scenario is when a migration query locks an actively used table, causing business logic queries such as SELECT
, INSERT
, UPDATE
, and DELETE
to wait until the migration is finished.
Even if no popular table is migrated, queries to it keep the connection active.
If you have a significant number of stuck connections, no other queries to any other table can be performed in the meantime.
This can take minutes, hours, or even longer, resulting in considerable downtime.
Yes, unsafe migrations can be prevented. Let’s consider two common approaches.
The safest approach is to create a new table, start writing to both the old and new tables using triggers or application logic, and then switch to the new table.
This method works perfectly but has a few drawbacks:
Safe SQL alternatives allow using an exclusive
lock for a short period and less restrictive locks for heavy operations like index creation or constraint validation.
For example:
CONCURRENTLY
.NOT VALID
and then validate them.Let’s consider some cases.
If you use ORM, it tries to provide unified usage for different databases, which can use different syntax for migrations. Adding safe SQL to ORM increases implementation complexity.
Another reason is that different use cases exist, and applying migrations in a non-safe way can be simpler and faster.
If you write migrations manually, you need to consider safe alternatives and locking operations. However, only a small amount of engineers have this knowledge and can apply it without human error.
If you use linters to find unsafe SQL, they can identify common issues, but you still have to correct your SQL manually according to the process, which can still involve human error.
Moreover, for safe migration, you sometimes need to split some operations into a few steps that combine migrations and releases.