Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rethinking -allow-missing (out-of-order) migrations #523

Open
mfridman opened this issue May 20, 2023 · 6 comments
Open

Rethinking -allow-missing (out-of-order) migrations #523

mfridman opened this issue May 20, 2023 · 6 comments

Comments

@mfridman
Copy link
Collaborator

mfridman commented May 20, 2023

This issue aims to document how the out-of-order (-allow-missing) migrations feature currently works (v3.11.2), its limitations, and a proposal to improve. Much of this is based on excellent feedback from the community and other various discussions.

What is it?

Best illustrated with an example. Suppose migrations 1,4 are applied and then 2,3,5 are introduced. Prior to v3.3.0 goose would ignore migrations 2,3 and apply only 5. However, many users were not satisfied with this behaviour, summarized:

  • migrations 2,3 are "silently" ignored
  • unable to apply migrations 2,3 if newer (read higher number) versions already applied

Given that goose maintains a full history of applied migrations, we are able to resolve the database state against the on-disk state. And now, if missing (out-of-order) migrations are detected users can opt-in to apply them.

Shortcomings of the current implementation

As soon as we start supporting out-of-order migrations, there are a number of edge cases to account for. Namely,

  1. What happens when commands are NOT run consistently? In other words, some commands are applied with the -allow-missing flag, and then other commands are run without this flag.
  2. What should the order of the down migrations be?
  3. When running goose version, should it return the latest applied version or the max applied version?
  4. When running goose status, should it list migrations as they were applied, or by version number?

For many of these, the current implementation assumes users care about maintaining the applied version in both directions. But, this appears to not be the case.

If migration 1,4,2,3,5 are applied, users expect the rollback to be based on the migration versions 5,4,3,2,1 and NOT the applied versions 5,3,2,4,1.

Future improvements

This should not affect most users, because I suspect most are consistently running up commands with the -allow-missing flag.

Furthermore, down migrations are fairly infrequent, and when applying out-of-order migration chances the down ordering doesn't matter in most cases. However, as @callaingit pointed out in #402 this may result in unexpected behaviour.

So, I propose the following:

  • Down migrations are based on the migration versions and not the applied versions. Fix #402
  • Fix up migrations so they are consistent if all migrations have been applied with -allow-missing and subsequently without. Fix #521
  • goose version returns the max version from the db (not the last applied version)
  • goose status lists migrations by version

All of this is pretty much how sequential migrations already work, which means out-of-order migrations and applied versions are just implementation details not surfaced to users.

Lastly, I do wonder if there are folks out there that care to maintain the applied versions in both directions, but so far this doesn't seem to be the case. So I may have gotten that wrong in the initial implementation, and apologies if this has caused some confusion.

@mfridman
Copy link
Collaborator Author

Added a Twitter poll to get feedback from various folks familiar with databases.

Also dropped a comment in the Gophers Slack channel, in the #databases channel.

(Here is an invite link to the Gophers Slack workspace)

This is specifically in relation to applying down migrations (#402) when operating in an environment where up migrations are allowed to be applied out-of-order.

@jonseymour
Copy link

I am ok with the proposed approach (down follows migration order) as the default but I can see that it would be handy if there was an option to allow reversal according to the applied order simply because goose does have enough information to do this sensibly and there is no easy way to achieve this behaviour without support in goose itself.

The mathematician in me probably has a preference for reversing the application order but to be honest, I don't really care which one is the default, I would like the option to do either and then I can select the one that makes most sense for the kinds of migrations at hand.

Part of the reason I don't care is that we are not always religious about the rollback scripts always working in part because of the complexities of properly handling data loss across some kinds of schema change.

@jonseymour
Copy link

Actually, partly because rollbacks don't always work in our case, I might have a situation like this:

apply order

1,4,5,2,3

suppose 5 is one of those migrations that can't be rolled back, but 2 and 3 are migrations that can be rolled back.

In this case, I might be happy to rollback 3 and 2 but not 5. I could do this if goose allowed rollbacks in application order but I wouldn't be able to do this if I was forced to use migration order because the rollback for 5 is going to fail.

Of course, this still leaves me to rollback 2 and 3 in the case that I apply the sequence in the intended order 1,2,3,4,5 but so be it - that's the cost of not having fully reversible migrations.

I am not saying this case makes the ability to rollback in application order necessary, but just illustrates a use case that would occasionally be useful to me.

@mfridman
Copy link
Collaborator Author

Afaics there is no single "right" way to do down migrations when subscribing to out-of-order up migrations. And the consensus appears to be "it depends".

I was really hoping there would be one way to do things because it keeps the implementation consistent across all commands, not just the down command.


The most sensible solution would be to keep the current default but allow modifying the default behaviour. Granted this is a non-trivial amount of work.

The current default is applying down migrations in the order they were originally applied. I.e., migrations applied last are rolled back first regardless of their version number.

@callaingit
Copy link

Hi guys, long time no see (just upgraded from goose 3.6.1 to 3.15.0 - thanks for all the new stuff and fixes). I have not seen any actual changes except for the somewhat related #521 (from reading the release notes only I must admin) regarding the original #402.

Now, trying to put myself into context after a few months, when I read #402, I was mentioning down-to 0 was not actually working the same as reset. Has this changed so far or not? If so, in which version has it changed? If not so, shall the comment you added on Jun 21 say that the current default is somewhat different when using down vs using reset (unless I am lost in our conversations or have missed something important here)?

@catinapoke
Copy link

Will feature #402 be implemented? A year has passed since opening of this issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants