Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removal of generated files from version control #942

Open
jordy2254 opened this issue Apr 30, 2024 · 6 comments
Open

Removal of generated files from version control #942

jordy2254 opened this issue Apr 30, 2024 · 6 comments
Labels
API Related to the backend api server written in Go discussion Raises questions that are up for discussion pending response A issue or PR which is waiting for clarifications from OP UI Related to the frontend web ui written in Javascript

Comments

@jordy2254
Copy link
Member

There's a lot of generated files in version control both from the api (gql gen) and ui (apollo client) Both of these cause several issues which could be alievated from removing them from tracking

  • Bloating of the code base and reduced searchability via the basic tools available on github
  • High likley hood/guarenteed conflicts when users are working on api changes or similar front end features

To mitigate this utilising husky (Which we already have) And implementing a hook to generate the generated files would provide some benefit. Development flow, docker builds & tests would all have to change a little to factor in this change but overall I believe this would be beneficial to reduce the number of conflicts as work progresses.
@viktorstrate Can you think of any reason this shouldn't be done?

@jordy2254 jordy2254 added discussion Raises questions that are up for discussion API Related to the backend api server written in Go UI Related to the frontend web ui written in Javascript pending response A issue or PR which is waiting for clarifications from OP labels Apr 30, 2024
@viktorstrate
Copy link
Member

I think one reason for committing generated files is simplicity.

I haven't really had merge conflicts because of them, since they only change when the Graphql API changes.

Not having to worry about generating them is another reason. Eg. if I checkout another branch will just work.

Last reason is that the generated files are source code ie. Go and Typescript, and actually even quite readable. I see it as a plus to be able to read them and search in them. I often look at the generated Typescript types to understand the code. I actually think GitHub can use them to analyse the code and give type hints.

But I also see that there are benefits for removing them.

@jordy2254
Copy link
Member Author

@viktorstrate Sounds like you find it useful then, I haven't personally looked at the generated source on github at all only using it locally to check some interfaces. However, if you find it useful and feel it should stay we can close this issue. I personally have already experienced merge conflicts on the 2 issues I've done but that's because they both touch the api. We could do the below if we keep the generated files in vcs, so I think this issue can just change scope rather than content.

  • Ensure generated code is committed in a separate commit (This is only to make the rebuild process easy)
  • Run code generation as part of the code build to ensure all generated code is correct

Let me know your thoughts for going forward.

@kkovaletp
Copy link
Contributor

  • Run code generation as part of the code build to ensure all generated code is correct

@jordy2254, if you could provide me the sequence of commands for this and point me to the right place in the building backend stage of Dockerfile, I could test it and include it in the new Dockerfile in the #863. To test both cases (when the code was fine before the rebuild is called, and when it was outdated, so the rebuild updates the generated code), I'd need from you also a patch for the API code to force the outdated state of the generated code. It should be something simple, but easily verifiable from the UI after the image is built and running (to make sure that the patch was applied correctly).

I expect that running this code generation process is completely safe and couldn't break anything if the code was already in sync. If not, the process of code regeneration should also be safe and lead to overwriting some outdated code, while still the build should go on and finish correctly with no negative impact on the resulting image.

I'd like to add this step to the backend building process of the Dockerfile, as I believe that it will be an additional self-healing precaution, which contributes to the overall product quality, as it covers some types of mistakes.

@jordy2254
Copy link
Member Author

@kkovaletp
It needs to be part of the test suite and not part of the docker file, any PR's which have a diff from generation should fail the test suite. And this will only need to part of the test suite in the event that we decide to keep generated files which will depend on what Viktor says.

@viktorstrate
Copy link
Member

@jordy2254 If you have experienced merge conflicts because of them multiple times, then I certainly think that we should consider removing them from Git.

I think being able to read them locally when developing is by far the most important aspect, but that would not go away by not committing the files.

If we made sure that generated files was automatically built when running the build scripts, then no extra steps are required for the developer and changing branches would also not be much of a problem, since a rebuild would get the generated files up to date.

I am a little worried about the added build times. I'm not sure caching is possible, and I remember it to be a little slow, at least if it had to be run everytime the builds a started.


Regarding your other proposal of keeping the files and changing scope. Can you please elaborate a little as I'm not sure that I fully understand it.

Ensure generated code is committed in a separate commit (This is only to make the rebuild process easy)

Why is it more beneficial to commit the generated files in a separate commit?

Run code generation as part of the code build to ensure all generated code is correct

I think this is a really good idea, this could be added as a part of the automatic tests that run in CI.

@jordy2254
Copy link
Member Author

@viktorstrate

I am a little worried about the added build times. I'm not sure caching is possible, and I remember it to be a little slow, at least if it had to be run every time the builds a started.

We can do some analysis on the impact of build times, on my main dev machine I don't recall it taking very long at all. For local dev it will be minimal unless regularly changing branches.

Regarding your other proposal of keeping the files and changing scope. Can you please elaborate a little as I'm not sure that I fully understand it.

A good way to handle conflicts with generated files is to put them in their own separate commit, when it then comes down to resolving them you can;

  • Perform an interactive rebase git rebase -i master (replace master with the relevant branch)
  • Define the generated commit to be dropped (this will then resolve conflicts related to generated files
  • After rebasing, regenerate the files again and commit the changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Related to the backend api server written in Go discussion Raises questions that are up for discussion pending response A issue or PR which is waiting for clarifications from OP UI Related to the frontend web ui written in Javascript
Projects
None yet
Development

No branches or pull requests

3 participants