Optimize repository size #142

Open
pReya opened this issue Feb 8, 2022 · 9 comments
Assignees
Labels
technical Improving underlying technology

Comments

@pReya
Contributor

pReya commented Feb 8, 2022

The repo is currently 520.80 MiB (without submodules) and takes a considerable amount of time to clone.

My suggestion would be:

  • Identify large files, and see if they are still used
  • Move large binary files to Git LFS
  • Optimize all images in the project to sane default sizes
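For the first point, a minimal sketch of how the largest files in the history could be listed. It assumes `git` is on the PATH and relies on `git rev-list --objects --all` piped into `git cat-file --batch-check`; the helper names `largest_blobs` and `scan_repository` are made up for this example. The sizes it reports are uncompressed blob sizes, so they only approximate what each file costs in the packed repository.

```python
import subprocess
from typing import Iterable, List, Tuple


def largest_blobs(batch_check_lines: Iterable[str], top: int = 10) -> List[Tuple[int, str]]:
    """Return the `top` largest blobs as (size_in_bytes, path) pairs.

    Each line is expected to look like "<type> <sha> <size> <path>", which is
    what `git cat-file --batch-check` prints for the format string used in
    scan_repository() below.
    """
    blobs = []
    for line in batch_check_lines:
        parts = line.rstrip("\n").split(" ", 3)
        if len(parts) < 4 or parts[0] != "blob":
            continue  # skip commits, trees, and malformed lines
        blobs.append((int(parts[2]), parts[3]))
    return sorted(blobs, reverse=True)[:top]


def scan_repository(repo_dir: str = ".", top: int = 10) -> List[Tuple[int, str]]:
    """Run the two git commands in `repo_dir` and rank all blobs by size."""
    objects = subprocess.run(
        ["git", "rev-list", "--objects", "--all"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    sizes = subprocess.run(
        ["git", "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        cwd=repo_dir, check=True, capture_output=True, text=True, input=objects,
    ).stdout
    return largest_blobs(sizes.splitlines(), top)
```

Calling `scan_repository()` inside a local clone should return the ten largest blobs across all branches, including files that were deleted later but still live in the history.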
SubOptimal self-assigned this Feb 15, 2022
SubOptimal added the "technical" (Improving underlying technology) label Feb 15, 2022
@SubOptimal
Member

The repository's history contains an orphan branch with the old web page, which we decided to keep in a separate repository.
Splitting it off will be the first step in reducing the size of this repository.

@SubOptimal
Member

Archive of the old history has been stored at https://github.com/openbikesensor/archive.openbikesensor.github.io.

Cleanup of the current repository will follow.

@SubOptimal
Member

All images in content/docs/hardware/v00.02/build-instructions/images have a resolution of 4128x2322 and a total size of around 370 MB. I would suggest resizing them to 1720x968 (about 42 % of the original width and height, or roughly 17 % of the pixel count), which would save about 300 MB.
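A quick check of the resize numbers, as a sketch only. The helpers `scaled_size` and `pixel_ratio` are made up for this example, and the link between pixel count and file size is an assumption: JPEG size does not scale exactly with the number of pixels.

```python
from typing import Tuple

Size = Tuple[int, int]


def scaled_size(width: int, height: int, factor: float) -> Size:
    """Scale both edges by the same factor, rounding to whole pixels."""
    return round(width * factor), round(height * factor)


def pixel_ratio(original: Size, resized: Size) -> float:
    """Fraction of the original pixel count that remains after resizing."""
    return (resized[0] * resized[1]) / (original[0] * original[1])


original = (4128, 2322)  # resolution quoted in the comment
proposed = (1720, 968)   # target resolution quoted in the comment

print(scaled_size(*original, 0.30))               # (1238, 697)
print(round(proposed[0] / original[0], 3))        # 0.417
print(round(pixel_ratio(original, proposed), 3))  # 0.174
```

If file size roughly tracks pixel count, 17 % of 370 MB is about 64 MB, which is in line with the estimated saving of about 300 MB.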

To effectively reduce the repository size, we need to replace the images at the initial commit of the main branch with their resized version. This action would rewrite the whole history and invalidate all local clones.

For safety, I would first push the rewritten repository to a new temporary one, and only push the changes to this repository after cross-checking.

@opatut Any objections to this procedure? Or other suggestions?

@pReya
Contributor Author

pReya commented Feb 18, 2022

@SubOptimal Have you considered moving all photos to Git LFS? Photos are binary files after all, so they do not need to be in the Git history at all (unlike text-based formats such as SVG). We could move them to LFS and remove them from the history altogether. With LFS, the repository only stores small pointer files, while the actual image data lives on the LFS server. https://notiz.dev/blog/migrate-git-repo-to-git-lfs

Good explanation of LFS:
https://www.youtube.com/watch?v=9gaTargV5BY

@gluap
Collaborator

gluap commented Feb 19, 2022

If LFS is seamless for users, I'd be in favour. It seems we could automatically keep all JPGs in LFS, which would keep the repo small in the future.

@SubOptimal
Member

@pReya As long as git-lfs is not part of vanilla Git, there might be some issues:

  • users first need to install it to be able to fetch files from the LFS server; otherwise, they only get a placeholder file such as:
      version https://git-lfs.github.com/spec/v1
      oid sha256:23dc...
      size 65318
  • for users with a package manager, I believe git-lfs can be installed with something like apt-get install git-lfs, but what about Git GUI clients or embedded Git implementations?
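To make the placeholder problem concrete, here is a rough sketch of how a script could tell such a pointer file apart from a real image. The function names are made up for this example; the check only relies on the `version` line shown above and on pointer files being small text files (the 1024-byte read limit reflects my understanding of the pointer spec).

```python
from typing import Dict, Optional

LFS_SPEC_LINE = "version https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(text: str) -> Optional[Dict[str, str]]:
    """Return the pointer's key/value pairs, or None if `text` is not a pointer."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != LFS_SPEC_LINE:
        return None
    fields: Dict[str, str] = {}
    for line in lines:
        key, _, value = line.strip().partition(" ")
        if not value:
            return None  # every pointer line is "<key> <value>"
        fields[key] = value
    if "oid" not in fields or not fields.get("size", "").isdigit():
        return None
    return fields


def is_lfs_pointer_file(path: str) -> bool:
    """Check whether the file at `path` is an LFS placeholder, not real content."""
    with open(path, "rb") as handle:
        head = handle.read(1024)
    try:
        return parse_lfs_pointer(head.decode("utf-8")) is not None
    except UnicodeDecodeError:
        return False  # real binary content such as a JPEG
```

A build script could run `is_lfs_pointer_file` over the image directory and print a hint to install git-lfs when it finds placeholders instead of images.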

If we agree that we can handle the above, I will push the LFS-migrated repo.

@pReya
Contributor Author

pReya commented Feb 21, 2022

LFS is included in:

  • Git for Windows
  • GitHub Desktop
  • Tower
  • GitKraken

But yes, for Linux and Mac command line users, it will mean an additional installation step via their package managers.

I still think LFS is the right tool for this job. The number of images in the repo is only gonna get bigger in the future, and every time they are changed, the Git history will grow rapidly.

@opatut
Member

opatut commented Feb 21, 2022

The GitHub LFS offer is pretty bad IMO: the quota is tiny, and it seems they make no exception even for open source projects.

https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-storage-and-bandwidth-usage

There is 1 GB of storage and 1 GB of traffic per month. This is for the whole organization. You can pay to get more, but 1 GB is not nearly enough for us. It looks like we're already using half of it somewhere:

[image]

You get 50 GB storage and 50 GB traffic for 5 USD/mo. That's rather expensive, if you ask me. I'd stick with the plain repo: its size is no more than an inconvenience, and it'll be much better once we clean up the history. GitHub is at least known not to flag open source repos if they grow big ;)
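A back-of-the-envelope sketch of why 1 GB of traffic is tight. It assumes that every fresh clone or CI checkout downloads all LFS images once and that 1 GB means 1000 MB. The 370 MB figure is the current image size mentioned earlier in this thread, and 70 MB is what would remain after the proposed resize; the helper name is made up.

```python
def full_downloads_per_month(quota_mb: float, payload_mb: float) -> int:
    """How many complete downloads of the LFS payload fit into the monthly quota."""
    return int(quota_mb // payload_mb)


QUOTA_MB = 1000  # 1 GB of LFS traffic per month, assuming 1 GB = 1000 MB

print(full_downloads_per_month(QUOTA_MB, 370))  # 2  (images at their current size)
print(full_downloads_per_month(QUOTA_MB, 70))   # 14 (images after resizing)
```

Even with resized images, a handful of contributors plus automated builds could use up that budget within a month.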

@pReya
Contributor Author

pReya commented Feb 21, 2022

> The Github LFS offer is pretty bad IMO, the quota is just tiny, even for open source projects it seems they have no exception.
>
> docs.github.com/en/repositories/working-with-files/managing-large-files/about-storage-and-bandwidth-usage
>
> There is 1 GB of storage and 1 GB of traffic per month. This is for the whole organization. You can pay to get more, but 1 GB is not nearly enough for us. It looks like we're already using half of it somewhere:
>
> [image]
>
> You get 50 GB storage and 50 GB traffic for 5 USD/mo. That's rather expensive, if you ask me. I'd stick with the repo, it is not more than an inconvenience, and if we clean up the history it'll be much better too. Github is at least known not to flag open source repos if they grow big ;)

Now that OBS is an "Eingetragener Verein" (registered association), we could apply for non-profit status at GitHub, which would give us a free "Team" license with more liberal usage limits: https://support.github.com/contact/nonprofit

EDIT: Turns out, even the "Team" plan only offers 1 GB of data, so your point is absolutely correct. This is indeed a very stupid limitation. So let's proceed without LFS and just add images to the repo as before?

4 participants