Composer cache makes developers life hard #1496

Closed
markushausammann opened this issue Jan 18, 2013 · 47 comments

@markushausammann

We're constantly annoyed by the composer cache. We keep running into issues where, one way or another, some repo doesn't update or install properly until we delete the cache. This is on Windows 7.

@Seldaek
Member

Seldaek commented Jan 18, 2013

Do you have any details on that? Which part of the cache fails, and how? I never had any issue with the cache as far as I know, and I work on win7 too.

@markushausammann
Author

I'm sorry for the very generic issue; it's not very useful, I know. It's just that we haven't seen any clear pattern yet. One thing that happens often is that there are clearly new commits in a dev-master, but the old code is just taken from the cache. Composer says "Installing from cache", but we know that's not what we want. The only way to get the new code is to delete the cache. It would make life much easier to have a composer clear-cache command. Maybe there's something we're not doing right... I wouldn't know what, though.

@markushausammann
Author

Oh, and we've had the same issue on CentOS too, btw.

@Seldaek
Member

Seldaek commented Jan 18, 2013

IMO a clear-cache command would be just giving up and admitting failure. I'd rather fix the root cause. Cache should be transparent or it becomes a painful source of uncertainty. That's why I am surprised you have problems because I do all I can to make the caches dependent on the exact versions of things so that if anything changes it is invalidated.

So you are saying that when updating a package, it outputs that it updates and fetches from the cache. Whenever this happens again, could you please do the following:

  • Collect the output and paste it here (especially interested in the from/to versions of the package).
  • Run this in cmd.exe and copy the output: dir %LOCALAPPDATA%\Composer\files\<package name>, to see exactly which versions/files are in the cache (if on CentOS: ls -l ~/.composer/cache/files/<package name>).

@markushausammann
Author

ok, will do that.

@aimfeld

aimfeld commented Jan 18, 2013

I work with Markus and I've had this issue more than once (on Windows 8 and CentOS) e.g. when I needed to add some missing files to a git tag. I did the following:

  • added files to local repo
  • recreated the tag (force overwrite)
  • pushed tag to github
  • rebuilt the composer/satis packages
  • ran php composer.phar update in my project

Even after manually deleting all dependencies before running php composer.phar update, composer installs the tag from cache instead of fetching the updated tag with the added files. After clearing the cache manually, composer installs the updated tag.

@stof
Contributor

stof commented Jan 18, 2013

@aimfeld Git tags are not meant to be changed. When you overwrite a tag, every guy who already fetched the old tag has to delete it in their clone and fetch again, otherwise they will still use the old tag.
So this can be an issue with the cache (which already contains the old tag) but even without cache, it can break things if you already fetched the old tag in the vendor folder (with an install from source) as you will install the old tag later when using the tag as git reference.

@markushausammann
Author

@stof That is true in most cases, but here we're talking about a private mirror repo of a third party library which needs to maintain its tag but got some fixes.

@stof
Contributor

stof commented Jan 18, 2013

@markushausammann even for private repos. If you delete a git tag and recreate it, any guy who already cloned the repo with the old tag has to delete the tag and fetch again. If you got some fixes, you should do a new bugfix release rather than erasing the previous release.

@markushausammann
Author

Well, in the case you want/need to mirror the true version number of that library that's just not gonna be possible. I get your point but it doesn't apply to that use case.

@stof
Contributor

stof commented Jan 18, 2013

Reusing the same version number for a release with different content looks wrong to me.

@Seldaek
Member

Seldaek commented Jan 18, 2013

@markushausammann @aimfeld indeed retagging is the problem. This is a really bad practice. If 1.0.0 is broken you can push the fixes and release 1.0.0.1 or 1.0.0p1 which are both accepted by composer as minor patch releases/hotfix.

The problem is it's not easily fixable because composer stores only the tag as the commit reference for tagged packages, and not the exact sha1 of the commit. Because of that, if the tag references a new commit, the change won't be detected, so it's not possible to make the cache dependent on the sha1 of the commit for tags.
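
To illustrate the point, here is a minimal Python sketch of a download cache. It is a toy model, not Composer's code: the function names, the package name acme/lib and the commit hashes are all made up for the example. It shows why a cache key built from the tag name alone cannot notice a moved tag. The sha-keyed variant is there only for contrast, since the comment above explains that the sha1 was not available for tagged packages.

```python
# Toy model of a download cache. All names and hashes are hypothetical.

def key_by_tag(package: str, tag: str, sha: str) -> str:
    """Cache key built from the tag name only; the commit hash is ignored."""
    return f"{package}/{tag}"


def key_by_sha(package: str, tag: str, sha: str) -> str:
    """Cache key built from the commit hash the tag currently points to."""
    return f"{package}/{sha}"


def install(cache: dict, make_key, package: str, tag: str, sha: str, remote: dict) -> str:
    """Return the content installed for a tag, filling the cache on a miss."""
    key = make_key(package, tag, sha)
    if key not in cache:
        cache[key] = remote[sha]  # simulate downloading that commit
    return cache[key]


# Two commits on the (simulated) remote.
remote = {"aaa111": "original files", "bbb222": "original files + missing files"}

tag_cache, sha_cache = {}, {}

# First install: tag 1.0.0 points to commit aaa111.
install(tag_cache, key_by_tag, "acme/lib", "1.0.0", "aaa111", remote)
install(sha_cache, key_by_sha, "acme/lib", "1.0.0", "aaa111", remote)

# The tag is force-moved to commit bbb222, then installed again.
stale = install(tag_cache, key_by_tag, "acme/lib", "1.0.0", "bbb222", remote)
fresh = install(sha_cache, key_by_sha, "acme/lib", "1.0.0", "bbb222", remote)

print(stale)
print(fresh)
```

With the tag-only key, the second install is a cache hit and returns the content of the old commit, which matches the retagging behaviour reported earlier in this thread.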

@markushausammann
Author

Yeah, you guys are right... it was a temporary solution to make a third party library composer installable but we should probably still work with minor patch tags! I wasn't aware that composer doesn't look at the commit. Thanks for the heads up.

@markushausammann
Author

btw. I am also under the impression that I had some cache problems outside this tag story but I may be wrong. If so, I'll open a new issue.

@Seldaek
Member

Seldaek commented Jan 18, 2013

I'll just keep this open as a reminder for now. I can try to look at whether it's fixable at some point; even though it's a bad practice, I know everyone does it every now and then.

@Seldaek Seldaek reopened this Jan 18, 2013
@markushausammann
Author

You're so constructive :)

@andrerom
Contributor

Just got hit by this bug. We are in a situation where we need to update a tag when a version is rebuilt just before release. It is wrong, but that does not change the fact that the composer cache is currently blocking our builds, and we lost quite some time figuring out that it was the cause.

So maybe composer should double-check its cached checksum against packagist for forced updates to tags and packages.

@tomaszdurka

I had a similar issue when moving a package from packagist to a satis server. The tag stayed the same as the repo didn't change, but the md5 checksum failed (I am not sure what is included in it). Can the packagist/satis checksums of the same package vary, or am I missing something?

Maybe the cache should be cleared for a package if its checksum fails?

@njam
Contributor

njam commented Jun 15, 2013

@tomaszdurka I think satis re-packages stuff.
Is the checksum calculated of the files before or after packaging?
Maybe the re-packaging is also affected by this .gitignore emulation bug.

@tomaszdurka

These are zips of the same repo, same ref/tag:

  • MD5 (/tmp/satis.zip) = 0b719b5685591286446476dbc6789698
  • MD5 (/tmp/packagist.zip) = 82aa4716f2523a9039507d01bbc709a2

Actually:
It seems that packagist wraps the package in a (<project-name>-<version>) directory, and satis does not. This is probably more of an issue for the https://github.com/composer/satis project.

@killerwolf

We're facing both issues here

  • different SHASUMs on 2 satis instances (that share the same config)
  • the cache annoying us when updating or installing via composer.

@tomaszdurka

@Seldaek Did you get notifications about this?

@sergehardy

Hi,

I used to have checksum errors while using Satis; I no longer do since I have disabled the cache on both sides:

"config": {
    "cache-files-ttl": 0
}

@njam
Contributor

njam commented Sep 4, 2013

For a few weeks now, I've been seeing cached ZIP files with different checksums from the original files on satis.

Composer then says something like:

 - Installing easybook/geshi (1.0.8.11)
    Loading from cache

  [UnexpectedValueException]                                                                                                                                                  
  The checksum verification of the file failed (downloaded from http://satis.cargomedia.ch/dist/dist/easybook-geshi-2582e8134c8af41d7f367e421ee0b8b111e89f36-zip-f3a77f.zip)  

The cached file really is different from the one residing on satis:

$ md5sum ~/.composer/cache/files/easybook/geshi/1.0.8.11-1.0.8.11.zip 
f26396c9ca9f2e89e3c99edfa4a5401e  /root/.composer/cache/files/easybook/geshi/1.0.8.11-1.0.8.11.zip
$ curl -s http://satis.cargomedia.ch/dist/dist/easybook-geshi-2582e8134c8af41d7f367e421ee0b8b111e89f36-zip-f3a77f.zip | md5sum
f99a691ad94190be33f6321cb769c8e7  -

If I unpack both ZIP-files they are identical (diff -r), but the ZIP-files themselves are different:

$ diff easybook-geshi-2582e8134c8af41d7f367e421ee0b8b111e89f36-zip-f3a77f.zip 1.0.8.11-1.0.8.11.zip 
Binary files easybook-geshi-2582e8134c8af41d7f367e421ee0b8b111e89f36-zip-f3a77f.zip and 1.0.8.11-1.0.8.11.zip differ

With the difference looking like some different encoding (?):

$ diff -a easybook-geshi-2582e8134c8af41d7f367e421ee0b8b111e89f36-zip-f3a77f.zip 1.0.8.11-1.0.8.11.zip 
1c1
composer.jsonnuW+A??{
---
composer.jsonnuW+A??{
24c24
< }PKBC??n??    README.mdnuW+A??# GeSHi - Generic Syntax Highlighter #
---
> }PKy??B??n??  README.mdnuW+A??# GeSHi - Generic Syntax Highlighter #
[.....]

Any ideas what could be the cause or how to further debug?

@fauvel

fauvel commented Sep 5, 2013

It rather looks like a Satis issue, related to the fact that ZIP archive creation is not deterministic. Composer doesn't change anything in the ZIP file downloaded from the Satis server; the problem only arises if:

  • Composer downloads a package from Satis, puts it into the cache and writes its check sum into composer.lock
  • The Satis server generates the ZIP file for the same package again, although the package itself and its version number haven't changed. This new ZIP file differs from the original one, though – that is the actual bug in my opinion.
  • The Composer cache is flushed, or Composer is executed on another machine where the package hasn't been downloaded yet
  • composer install downloads the new ZIP file from the Satis server

The check sum validation then fails because the check sum found in composer.lock corresponds to the original ZIP file, while the check sum of the downloaded file corresponds to the new ZIP file created by Satis in the meantime.

This issue isn't limited to Satis, in principle; any package manager which "refreshes" package archives although their version numbers haven't changed would give rise to the same problem.
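
As a rough illustration of how a rebuilt archive can differ byte for byte while holding the same files, here is a small Python sketch. It assumes that per-entry modification times are what changes between builds; that is only one possible cause and has not been confirmed for Satis in this thread. The file names and timestamps are made up.

```python
import hashlib
import io
import zipfile


def build_zip(files: dict, timestamp: tuple) -> bytes:
    """Build an in-memory ZIP whose entries all carry the given modification time."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in sorted(files):
            info = zipfile.ZipInfo(name, date_time=timestamp)
            archive.writestr(info, files[name])
    return buffer.getvalue()


def read_all(archive_bytes: bytes) -> dict:
    """Return a mapping of entry name to uncompressed content."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


files = {"composer.json": b"{}", "README.md": b"# GeSHi\n"}

# Same files, packaged on two different days (hypothetical timestamps).
first_build = build_zip(files, (2013, 9, 4, 10, 0, 0))
rebuild = build_zip(files, (2013, 9, 5, 10, 0, 0))

archive_hashes_match = (
    hashlib.sha1(first_build).hexdigest() == hashlib.sha1(rebuild).hexdigest()
)
contents_match = read_all(first_build) == read_all(rebuild)

print("archive hashes match:", archive_hashes_match)
print("extracted contents match:", contents_match)
```

The archive hashes differ even though the extracted contents are identical, which is the same pattern as the diff -r result reported above.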

As a workaround, Composer could inflate every downloaded package archive and calculate the check sum of the directory instead of the check sum of the archive itself, for instance with:

find . -type f -exec md5sum {} + | LC_ALL=C sort -k 2 | awk '{print $1}' | md5sum

(inspired by this answer on StackOverflow)
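
The same idea can be sketched in Python without extracting to disk. This is only an illustration of the proposal, not something Composer does; the function name inflated_checksum is made up. Unlike the shell pipeline, this sketch also feeds the entry names into the hash and sorts them, so that the result does not depend on the order of entries in the archive.

```python
import hashlib
import io
import zipfile


def inflated_checksum(archive_bytes: bytes) -> str:
    """Hash entry names and uncompressed contents in sorted order.

    ZIP metadata such as timestamps and compression settings is ignored.
    """
    digest = hashlib.sha1()
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if name.endswith("/"):
                continue  # skip directory entries, they carry no content
            data = archive.read(name)
            # Length-prefix each field so that different name/content
            # splits cannot produce the same byte stream.
            digest.update(str(len(name.encode("utf-8"))).encode("ascii") + b":")
            digest.update(name.encode("utf-8"))
            digest.update(str(len(data)).encode("ascii") + b":")
            digest.update(data)
    return digest.hexdigest()
```

Two archives that unpack to the same tree get the same value, whatever their timestamps. The cost is that every entry has to be decompressed, which is the performance concern raised later in this thread.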

@fauvel

fauvel commented Sep 5, 2013

@killerwolf It's annoying, but I think it's clear that you cannot work with two Satis instances. Both will re-package the original packages and generate ZIP files with different check sums, although the inflated contents are the same.

@fauvel

fauvel commented Sep 5, 2013

I've taken a look into Composer\Satis\Command\BuildCommand::dumpDownloads(), and it seems that Satis won't overwrite existing package archives when the server is being built again, since we are using:

$archiveManager->setOverwriteFiles(false);

So the problem would only arise if you delete package archives from the Satis server manually, and then re-build it? Could anyone confirm or refute this theory?

@fauvel

fauvel commented Sep 5, 2013

@Seldaek what about my proposal? Would it be a good idea to add the check sum of the inflated archives, maybe via a new entry shasum-inflated in composer.lock? We could then use it as a fallback in the case where the check sum of the archive itself doesn't match.

@njam
Contributor

njam commented Sep 5, 2013

It seems the problem for us was an older composer.lock which used filenames from satis like ruflin-elastica-v0.19.8.0-v0.19.8.0-93e4c9.zip instead of what seems to be now ruflin-elastica-34a7e62a257febd5295efeacfa0209712e0ceb65-zip-f92c51.zip.
Those two files contain the same version of the library, and thus were treated as the same file by the composer cache. Since they were different ZIPs on satis the shasums differed.
Now satis correctly uses only the latter filename, so re-creating the composer.lock should solve these problems.

@Seldaek
Member

Seldaek commented Sep 5, 2013

So it seems there are two issues at hand really:

  • If the local cache has a zip archive and satis was rebuilt, the archives may not match anymore.
  • If satis is rebuilt then the hash in composer.lock can be out of date and not match.

The first is fixable by just re-downloading the file instead of failing hard, the second however I'm not sure what we can do about it.

Ideally the best fix would be to make sure the archives are created in a predictable manner so that rebuilding them would not create hash mismatches. Alternatively someone suggested hashing the extracted contents rather than the archive itself. That is possible but will be way slower I imagine, especially on large archives and on Windows, where filesystem ops aren't very fast.

Any better idea?

@njam
Contributor

njam commented Sep 5, 2013

I've just had to re-build all our satis package-caches and re-create composer.locks in all projects. The reason I think was a change either in satis or composer (that the filenames now contain shasums instead of version numbers).

To prevent this the safest would probably be to calculate the shasum from the ZIP's contents.
The downsides are, as you write, slower performance and the need to re-create lots of composer.lock files, so I'm not sure.

@SvenRtbg
Contributor

SvenRtbg commented Sep 5, 2013

The problem really is in how Composer handles its cache. It's a good thing checksums are checked, but whenever the cached file seems to be wrong, it should be disregarded, deleted, and downloaded again from the source, without aborting.
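
A minimal Python sketch of that behaviour, for illustration only: fetch_with_cache and the download callable are hypothetical names, not Composer internals. A cached file whose SHA-1 does not match the expected one is deleted and the archive is fetched again instead of aborting.

```python
import hashlib
import os
import tempfile


def fetch_with_cache(cache_path: str, expected_sha1: str, download) -> bytes:
    """Return archive bytes, discarding a cached copy whose SHA-1 does not match."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as handle:
            cached = handle.read()
        if hashlib.sha1(cached).hexdigest() == expected_sha1:
            return cached
        os.remove(cache_path)  # stale or corrupted entry: drop it, do not abort

    data = download()  # hypothetical callable that returns the remote archive
    if hashlib.sha1(data).hexdigest() != expected_sha1:
        # The remote file itself does not match; the cache cannot help here.
        raise ValueError("checksum verification failed for the downloaded file")
    with open(cache_path, "wb") as handle:
        handle.write(data)
    return data


# Usage: the cache holds an outdated archive, the remote serves the expected one.
with tempfile.TemporaryDirectory() as cache_dir:
    path = os.path.join(cache_dir, "package.zip")
    with open(path, "wb") as handle:
        handle.write(b"old archive bytes")

    new_bytes = b"new archive bytes"
    result = fetch_with_cache(path, hashlib.sha1(new_bytes).hexdigest(), lambda: new_bytes)

    with open(path, "rb") as handle:
        cache_now = handle.read()
```

Note that this only covers a cache that disagrees with the expected checksum. If the remote archive no longer matches the hash recorded in composer.lock, the download still fails, which is the second of the two issues listed above.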

@fauvel

fauvel commented Sep 9, 2013

How about inflating package archives in-memory to calculate the inflated check sum we'd store in composer.lock?

@Seldaek
Member

Seldaek commented Sep 9, 2013

That would not work on systems without the zip extension, and even on those that have it, it's still a lot slower than just hashing the whole archive.

@fauvel

fauvel commented Sep 9, 2013

That's right... As far as performance is concerned, in the "normal" case, composer install wouldn't be affected, since we would only calculate the inflated checksum in composer update in order to store it into composer.lock. We'd only calculate it on composer install too if the deflated checksum doesn't match – as a fallback. And we could update the deflated checksum in composer.lock in that case, so that it matches next time. (I just hope doing so will not open a new security issue – we'll have to be careful...)

Since this is only a convenience feature, we could also make it optional, I guess?
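
The fallback described in this comment could look roughly like the Python sketch below. It is an illustration of the proposal only: the shasum-inflated key is the name suggested earlier in this thread, while verify and compute_inflated are made-up names. The contents hash is computed only when the archive hash does not match, so the normal install path pays no extra cost.

```python
import hashlib


def verify(archive_bytes: bytes, lock_entry: dict, compute_inflated):
    """Return (accepted, lock_entry).

    The archive hash is tried first. The contents hash is computed only on a
    mismatch, and a successful fallback returns a copy of the lock entry with
    the archive hash refreshed.
    """
    archive_sha1 = hashlib.sha1(archive_bytes).hexdigest()
    if archive_sha1 == lock_entry["shasum"]:
        return True, lock_entry  # normal case: contents are never inspected

    inflated = lock_entry.get("shasum-inflated")
    if inflated is not None and compute_inflated(archive_bytes) == inflated:
        return True, dict(lock_entry, shasum=archive_sha1)

    return False, lock_entry
```

Here compute_inflated stands for any function that hashes the extracted contents of an archive. Whether silently refreshing the stored archive hash is acceptable from a security point of view is exactly the open question raised in the comment above.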

@SvenRtbg
Contributor

SvenRtbg commented Sep 9, 2013

My current workaround is export COMPOSER_CACHE_DIR=/dev/null to avoid the interruption in the automated install process, but all files are available on the local network, so the additional re-download is not an issue.

@oliver-graetz

I am experiencing the issue of the ZIP algorithm being non-deterministic in this way: I use a Linux server at the company that builds a repository with Satis. On my workstation at home I want to provide a second Satis repository of the same packages, which my Composer installation should preferably use because it is faster to access locally than over the internet.

The problem is that my local workstation is running Windows 7 and the Zips created by its Satis installation differ from the ones created on the Linux server. This leads to checksum problems whenever a Zip from one server is in the cache and Composer expects the checksum from the other.

I think that this is a real problem because providing the same packages from different sources is a core concept of redundant data sources. There are two ways this could be handled:

  1. Ensure that package hashes do not rely on the final Zip archive but on a deterministic representation of the data in uncompressed form.
  2. Store packages in the cache not only by name, but also by origin server. This could be an interim solution for cross-server ZIP algorithm inconsistencies.
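
A small Python sketch of the second option, under the assumption that the cache is addressed by a relative path. The function name cache_key and the path layout are made up for illustration and are not Composer's actual cache layout. The host of the dist URL becomes part of the key, so archives of the same package version from two servers never share a cache entry.

```python
from urllib.parse import urlparse


def cache_key(package: str, version: str, dist_url: str) -> str:
    """Build a cache path that separates archives by the host that served them."""
    origin = urlparse(dist_url).netloc or "unknown-origin"
    origin = origin.replace(":", "_")  # keep host:port usable as a directory name
    return f"{origin}/{package}/{version}.zip"


# Hypothetical hosts: one company server and one local mirror.
company = cache_key("easybook/geshi", "1.0.8.11", "http://satis.example.com/dist/geshi.zip")
local = cache_key("easybook/geshi", "1.0.8.11", "http://localhost:8080/dist/geshi.zip")

print(company)
print(local)
```

This avoids cross-server mismatches in the cache, but it does nothing for a composer.lock whose stored hash came from the other server.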

@oliver-graetz

Today I experienced a slightly different variation of the problem that Composer calculates the hash on the compressed version rather than the original data. This time clearing the cache wasn't enough because the corresponding project had the composer.lock file checked in and so Composer insisted on getting an archive with a hash that wasn't available anymore because the compressed version of the package had changed in the meantime while the original data remained unchanged. This was verified by comparing the results in the vendor directories of two installations after forcing the install by removing the composer.lock file. Only the shasum entry of two packages was changed, everything else remained the same.

@AlexanderAllen-zz

Adding cache ttl 0 insta-worked for me, thanks.

@giggsey
Contributor

giggsey commented Nov 25, 2013

+1

Edit after @markushausammann comment below:
This causes an issue at least once a day for me. I do not want to have to set the cache-ttl in each of my projects' composer.json files.

My satis runs every 5 minutes with a complete build, and after every push on certain projects.

@peikk0

peikk0 commented Dec 17, 2013

👍

@markushausammann
Author

I'm really not sure what you guys are thumbing up. So many things have been said in this issue that it could be anything from "yeah, it makes my life hard too" to "cache ttl 0 works". If you +1 or thumb-up, please explain what you support.

@peikk0

peikk0 commented Dec 17, 2013

Well, yes, this is a PITA, and "cache-files-ttl": 0 "fixes" it for now, but that's not optimal at all. It would be better to replace the previous cache when the checksum differs. Composer can't tell which of the cached or the remote archive is the good one anyway, so let's assume the new one is ok and use it; that would avoid breakage on deployments. This is a cache, not a repository consistency check; it is meant to be transparent, and no one should have to care about it.

@oliver-graetz

@peikk0 : +1 on replacing the cached file on checksum difference.

This will be by far the simplest solution to the current annoyances.
Setting "cache-files-ttl": 0 is not a solution, this means just completely giving up on the idea of caching.

@Seldaek
Member

Seldaek commented Dec 31, 2013

I fixed what's fixable now, i.e. "corrupted" caches that don't match the sha-1 will not be used anymore and a re-download will be forced instead. The rest of the issue has been moved to #2540.

@cxj

cxj commented Sep 13, 2016

Cache still broken when using the @dev version of GitHub-hosted packages:

Composer version 1.2.1 2016-09-12 11:27:19

$ composer install
Loading composer repositories with package information
Updating dependencies (including require-dev)
  - Installing level-2/transphporm (dev-master 2110304)
    Cloning 2110304c512958797508ea08e0d78c4336494b41 from cache

Writing lock file
Generating autoload files
master
$ git log origin
commit 0ada294c2ecd1698782985a23f3c90c142fab7b4 (refs/remotes/origin/master, refs/remotes/origin/HEAD, refs/remotes/composer/master)
Author: Tom Butler <tom@r.je>
Date:   Tue Sep 13 15:51:36 2016 +0100

    #124 - fixed bug with empty value

commit 2110304c512958797508ea08e0d78c4336494b41 (HEAD -> refs/heads/master)
Author: Richard <solleer@hotmail.com>
Date:   Fri Aug 12 08:17:02 2016 -0400

    Removed unused class properties

Note how Composer pulled the old version 2110304 from Aug 12 from the cache, instead of grabbing version 0ada294c from GitHub.

@Seldaek
Member

Seldaek commented Sep 13, 2016

@cxj please report a new issue rather than resurrecting an old one and spamming 15 people. I don't think this has any relation to the cache, so we'll need more details and repro steps. Locking this thread now.

@composer composer locked and limited conversation to collaborators Sep 13, 2016