On 11/12/2007, Todd Zullinger wrote:

> Tom Horsley wrote:
> > Just for curiosity, why is it so difficult to automatically check
> > elementary things like rpm signatures and metadata checksums before
> > publishing new info in the master repositories? This seems to happen
> > with extreme frequency.
>
> It takes many hours to run a repoclosure and check that everything is
> sane. It's not that the folks doing the updates don't want to, it's
> just that no good way to do this has been implemented yet, AFAIK. :(

Tom Horsley suggests checking "elementary things" and gives two
examples.

Verifying package signatures, especially when packages are re-signed
while being moved from updates-testing to updates, is something that
would not take "many hours".

The infamous metadata checksum errors are not due to mistakes made on
the master server. They come from a mirroring system that breaks in
conjunction with Yum's caching implementation. You can observe how
mirrors are out-of-sync for several days, offering data from several
days ago. Every time a Yum session with a mirror is interrupted and you
are assigned a different mirror, the already downloaded repomd.xml must
match all mirrors which are contacted until the repodata is downloaded.
You can notice how the cache and the remote repodata get out-of-sync,
with Yum not downloading a fresh repomd.xml until it has the big
repodata files copied. I have a repomd.xml from a few days ago, a
cachecookie from a few minutes ago, but the mirrors I'm offered by the
mirrorlist carry newer repodata which don't match the cached
repomd.xml. Just give it a try and copy the remote repodata directory
manually to verify it. "yum clean metadata" is a cure only if Yum
succeeds in copying a good/matching set of repodata files. That works
best if you stick to your favourite mirror.

Todd Zullinger comments on different checks. Running repoclosure
doesn't take "many hours".
Fedora Extras repoclosure has been run frequently, processing Fedora
Core and Fedora Extras, at least prior to every push but sometimes in
between to help the packagers. It doesn't take more than 15-20 minutes
for the initial and unattended run. The parts of it that take much more
time are:

 - setting up temporary repositories before they can be checked
   (including multilib resolving),
 - taking action when broken deps are found (exclude pkgs, run again
   and hope that no further broken deps are found then, possibly add
   automation to assist with that).

For Extras, the modified repoclosure has only been run for Core+Extras
plus the needsign repositories -- thousands of pkgs, but without doing
the multilib-compose dance, which would add to the processing time.

It takes additional time to complete the update repositories (multilib
resolving and updating, repoview, ...) when packages are pushed.
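
For anyone who wants to try the manual check of a copied repodata
directory against a repomd.xml, here is a rough sketch in Python. It is
not what Yum does internally. The function name find_mismatches and the
repo_root argument are made up for illustration. It assumes the
directory holds repodata/repomd.xml next to the files it references,
and that a checksum type of "sha" (as I believe older createrepo
versions write it) means SHA-1.

```python
import hashlib
import os
import xml.etree.ElementTree as ET

# XML namespace used by repomd.xml files written by createrepo.
REPO_NS = "{http://linux.duke.edu/metadata/repo}"


def find_mismatches(repo_root):
    """Return the <data> types whose file is missing or fails its checksum.

    repo_root is a hypothetical local directory that contains
    repodata/repomd.xml plus the files that repomd.xml points at.
    """
    tree = ET.parse(os.path.join(repo_root, "repodata", "repomd.xml"))
    mismatches = []
    for data in tree.getroot().findall(REPO_NS + "data"):
        location = data.find(REPO_NS + "location")
        checksum = data.find(REPO_NS + "checksum")
        if location is None or checksum is None:
            continue
        algo = checksum.get("type")
        if algo == "sha":
            # Assumption: the old "sha" label stands for SHA-1.
            algo = "sha1"
        path = os.path.join(repo_root, location.get("href"))
        if not os.path.exists(path):
            mismatches.append(data.get("type"))
            continue
        digest = hashlib.new(algo)
        with open(path, "rb") as fh:
            # Read in chunks so large primary/filelists files are fine.
            for chunk in iter(lambda: fh.read(65536), b""):
                digest.update(chunk)
        if digest.hexdigest() != checksum.text.strip():
            mismatches.append(data.get("type"))
    return mismatches
```

Point it at a directory where you have placed your cached repomd.xml
and a mirror's current repodata files. An empty list means the set
matches. Anything else names the metadata types (primary, filelists,
...) that belong to a different repomd.xml than the one you have.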