Workaround for the json migration caused by #407 #490
Vote for a management command and a two-step upgrade process:

Step 1:
Step 2:

Step 1 should be considered a breaking change. Step 2 should not, because it only affects the implementation of the model, which is not public API. Step 2 should be safeguarded by a conf variable.
@alieh-rymasheuski in your suggestion, would the steps be separate versions, or one version controlled by a conf variable?
@alieh-rymasheuski
import json

def default_get_changes(self):
    return self.changes

def get_changes_or_text(self):
    if self.changes:
        return self.changes
    elif self._text_changes:
        return json.dumps(self._text_changes)
    return None

if conf.AUDITLOG_READ_FROM_TEXT_CHANGES:
    get_changes = get_changes_or_text
else:
    get_changes = default_get_changes
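For illustration, here is one way such an accessor could be wired onto a model-like class. `LogEntrySketch` and `resolved_changes` are hypothetical names, and unlike the snippet above this sketch parses the legacy text so that both paths return a dict (an assumption about the desired return type), which makes the fallback transparent to callers:

```python
import json

class LogEntrySketch:
    """Illustrative stand-in for auditlog's LogEntry mid-migration."""

    def __init__(self, changes=None, text_changes=None):
        self.changes = changes              # new JSON column (dict or None)
        self._text_changes = text_changes   # legacy text column (JSON string or None)

    @property
    def resolved_changes(self):
        # Prefer the new JSON column; fall back to parsing the legacy text column.
        if self.changes is not None:
            return self.changes
        if self._text_changes:
            return json.loads(self._text_changes)
        return None
```

With this shape, code that reads `resolved_changes` behaves identically before, during, and after the asynchronous data copy.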
Separate versions.
@aqeelat Is it possible for you to create a copy/replica of your current database and try running the migration on it, to see how long it takes?
@aqeelat, IMO the main concern is availability of the database and/or the application for the duration of the JSON migration. A bigger service in my herd has over 100M auditlog entries, so running this migration synchronously won't be viable for that service. Last time I had to add a new index to the auditlog table on this service, that operation took multiple hours, and the table was substantially smaller back then. The setting I propose will guard an inattentive user from dropping the column without running the (asynchronous) data migration script. I'm not at all concerned about reading the logs while the migration is partially done; the accessor property I described should protect users from the moment they deploy the Step 1 version until they have a suitable moment to run the data migration script. I don't insist that the project follow my suggestion. After all, I'm using a fork of the project.
@hramezani I told my SRE team to create a replica and run it, and it timed out after 1 minute. So it does take longer than we're comfortable with.
@aqeelat How did you run the migration? In a separate pod? Where did you see the timeout: on the request to the DB? IMO, the main problem here is database availability.
We ran it directly from the DBMS.
I think we can add a separate documentation page for the migration. BTW, I am still open to accepting a PR if it is good enough, for example one adding a management command to do the migration.
@hramezani I'll propose something this week. |
@hramezani @alieh-rymasheuski this is what I have in mind: #495. So, if developers want to opt in to the two-step migration (we can call it async migration), they need to set both new confs to True; then, after the migration, they can turn them off again. The next major version of auditlog should remove the old column and the confs. This way, new users and users who don't opt in to this process won't notice anything other than the extra column in their db.
If you find this approach acceptable, I'll install MySQL and Oracle locally, get the SQL migration commands, and add them to the PR.
We currently have more than 71,490,242 log entry records. This migration is going to be really slow for us and could break our application.
Usually, when we have a slow non-blocking migration script, we either run it as SQL directly on the DB or create a python script and run it in a Django shell on a separate pod.
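As a sketch of what such a separate-pod script could look like, the batching logic can be kept free of ORM details. `convert_rows` and the queryset mentioned in its docstring are illustrative assumptions, not auditlog's actual API:

```python
import json

def convert_rows(rows, batch_size=1000):
    """Parse legacy text-column rows into JSON dicts, in fixed-size batches.

    `rows` is any iterable of (pk, text_changes) pairs. In a real Django
    shell you might feed it something like
    LogEntry.objects.values_list("pk", "_text_changes").iterator()
    (the field names here are assumptions, not auditlog's actual schema).
    """
    batch = []
    for pk, text in rows:
        if not text:  # nothing to migrate for this row
            continue
        batch.append((pk, json.loads(text)))
        if len(batch) >= batch_size:
            yield batch  # caller bulk-updates and commits one batch at a time
            batch = []
    if batch:
        yield batch
```

Yielding in batches lets the caller commit incrementally, so a long-running copy never holds a transaction (or a lock) across the whole table.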
I'm currently thinking of a way we can apply the same approach to the JSON migration.
The flow I'm thinking of is:
1. Run the pre-migration scripts on the old column (#407 (comment)).
2. Change the field to be a json field.
3. Run the post-migration script.

To make this transition smooth, maybe we can update the migration file (0015) by prepending it with a migrations.RunPython that only runs the pre-migration script if the conf AUDITLOG_ASYNC_JSON_MIGRATION is set to True. Then the dev can be responsible for running step 3 (or, optionally, we can add it as a management command).
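The guarded RunPython body described here could be sketched as a pure function for clarity. `guarded_copy` and its wiring are assumptions; a real implementation would live inside the 0015 migration, fetch rows via `apps.get_model("auditlog", "LogEntry")`, and read the setting from `django.conf.settings`:

```python
import json

def guarded_copy(rows, async_migration_enabled):
    """Sketch of the prepended data-copy step (all names hypothetical).

    Following the proposal above, the copy only runs when the
    AUDITLOG_ASYNC_JSON_MIGRATION conf is True; otherwise the step is a
    no-op and the developer runs the remaining work manually (or via a
    management command).
    """
    if not async_migration_enabled:
        return {}  # conf off: skip the prepended data copy entirely
    # conf on: parse each legacy text value into the dict the JSON column needs
    return {pk: json.loads(text) for pk, text in rows if text}
```

Keeping the guard in one small function makes it easy to unit-test the opt-in behaviour separately from the migration machinery.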
What do you think?
@hramezani @alieh-rymasheuski