
Delete S3 bucket data, once PR is closed #304

Open
vishnoianil opened this issue Apr 26, 2024 · 5 comments
Labels: enhancement (New feature or request), gobot, stability (Issues related to bot and worker stability)

Comments

@vishnoianil (Collaborator)

Once the PR is closed by triager, delete the precheck/generate data from the S3 bucket, to reclaim the resources.
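
For concreteness, here is a minimal sketch of what the deletion itself could look like from the bot side, using the AWS SDK for Go v2. The per-PR key prefix (pr-<number>/) and the bucket name are assumptions for illustration only; the bot's actual key layout may differ.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
	"github.com/aws/aws-sdk-go-v2/service/s3/types"
)

// deletePRData removes every object stored under a per-PR prefix.
// The "pr-<number>/" key layout is an assumption for illustration;
// the bot's real layout may differ.
func deletePRData(ctx context.Context, client *s3.Client, bucket string, prNumber int) error {
	prefix := fmt.Sprintf("pr-%d/", prNumber)

	paginator := s3.NewListObjectsV2Paginator(client, &s3.ListObjectsV2Input{
		Bucket: aws.String(bucket),
		Prefix: aws.String(prefix),
	})
	for paginator.HasMorePages() {
		page, err := paginator.NextPage(ctx)
		if err != nil {
			return err
		}
		if len(page.Contents) == 0 {
			continue
		}
		// DeleteObjects accepts up to 1000 keys per call, which matches
		// the default page size of ListObjectsV2, so one batch per page.
		ids := make([]types.ObjectIdentifier, 0, len(page.Contents))
		for _, obj := range page.Contents {
			ids = append(ids, types.ObjectIdentifier{Key: obj.Key})
		}
		if _, err := client.DeleteObjects(ctx, &s3.DeleteObjectsInput{
			Bucket: aws.String(bucket),
			Delete: &types.Delete{Objects: ids},
		}); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	// Bucket name here is hypothetical.
	if err := deletePRData(ctx, s3.NewFromConfig(cfg), "instructlab-bot-data", 304); err != nil {
		log.Fatal(err)
	}
}
```

Whatever ends up triggering this (the close webhook directly, or a delayed job as discussed below), the delete itself is just a paginated list plus batched DeleteObjects calls.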

@vishnoianil added the enhancement and stability labels on Apr 26, 2024
@vishnoianil (Collaborator, Author)

@jjasghar @bjhargrave @russellb I'd like your thoughts on this. Deletion is a disruptive operation, so I want to make sure we only pull the trigger when we think it's safe. Deleting on pull-request close is probably the easiest, but we could also implement a mechanism where we wait for a certain duration after the PR is closed, to make sure it wasn't closed accidentally (as we saw with the migration of the taxonomy to public, etc.).

@bjhargrave

We may want some delay: when a triager closes a PR, it may take time for the submitter to review the closed PR, and they may want to see the data to learn from it, protest the closure, etc. So I think we want a 2-4 week delay to allow time for such review. If we delete the data immediately upon close, the submitter can never review it.

@russellb (Member)

The storage here is also pretty cheap for the amount of data we're talking about. If we start doing ilab train for people, I would feel differently.

I definitely agree a delay is necessary before cleanup. Since we want a delay anyway, a bulk cleanup process that runs occasionally would work. I just think it's pretty low priority for now.
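
Building on the sketch above, here is a hedged outline of what such an occasional bulk cleanup could look like. It assumes, purely for illustration, that the bot writes an empty pr-<number>/closed marker object when a PR is closed (and removes it on re-open); any PR whose marker is older than the retention window gets its whole prefix swept via deletePRData from the earlier sketch. In addition to the imports above, this uses strings and time.

```go
// Retention window before closed-PR data is reclaimed; 28 days here is
// a placeholder for whatever duration we settle on.
const retention = 28 * 24 * time.Hour

// sweepClosedPRs scans the bucket for hypothetical "pr-<n>/closed"
// marker objects and deletes the whole per-PR prefix once the marker
// is older than the retention window.
func sweepClosedPRs(ctx context.Context, client *s3.Client, bucket string) error {
	cutoff := time.Now().Add(-retention)
	paginator := s3.NewListObjectsV2Paginator(client, &s3.ListObjectsV2Input{
		Bucket: aws.String(bucket),
	})
	for paginator.HasMorePages() {
		page, err := paginator.NextPage(ctx)
		if err != nil {
			return err
		}
		for _, obj := range page.Contents {
			key := aws.ToString(obj.Key)
			if !strings.HasSuffix(key, "/closed") {
				continue // not a closure marker
			}
			if obj.LastModified == nil || obj.LastModified.After(cutoff) {
				continue // closed too recently; still inside the review window
			}
			var pr int
			if _, err := fmt.Sscanf(key, "pr-%d/closed", &pr); err != nil {
				continue
			}
			// deletePRData also removes the marker object itself.
			if err := deletePRData(ctx, client, bucket, pr); err != nil {
				return err
			}
		}
	}
	return nil
}
```

An S3 lifecycle rule keyed on an object tag the bot sets at closure would be an alternative way to offload the delayed expiry to S3 itself.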

@jjasghar (Member)

I'm on the side of caution here; four weeks is probably a good number, if we do this at all. These are small text files, so I'd imagine it can't be much data, and in turn not much cost.

@vishnoianil (Collaborator, Author)

At this point I'm not considering cost as a decision factor. My thinking is more "this is what we need, and whatever it costs, we should pay it" 😄. I think we can set whatever relaxed policy we need (even three months after closure would be fine); I just want some mechanism in place to clean up the garbage, because if a PR is never going to be reopened, we have unused data sitting in the S3 bucket.
