Sam Varshavchik:
>>> With a large repository, like Fedora, even a compressed XML file is
>>> going to end up being rather huge. Then, you have to uncompress it and
>>> parse it. And, XML parsing is also not exactly a light task.

Tim:
>> I thought there was supposed to be a move towards using a proper
>> database scheme, a long time ago, that would have sped up all of that?

Jim Cornette:
> I recall something discussed awhile back regarding implementing a more
> efficient database scheme. Current discussions on development seem to be
> concentrating on removing features like rpm -ivh url and instead
> requiring the use of wget to download the rpm and secondarily installing
> the rpm from local directory. No mention regarding changing to the more
> efficient database format.

I was under the impression that part of the reason for using something
SQL based (see listing, below) was to do with it being faster to parse
than the rather free-for-all structure of an XML file. Supposedly being
able to use a pre-existing databasing technique, rather than a custom job
on this special XML?

[root@bigblack ~]# ll /var/cache/yum/updates/
total 38356
-rw-r--r-- 1 root root        0 2007-08-01 12:34 cachecookie
-rw-r--r-- 1 root root 13102080 2007-07-27 23:43 filelists.sqlite
drwxr-xr-x 2 root root     4096 2007-06-22 16:23 headers
-rw-r--r-- 1 root root 21680128 2007-07-27 23:51 other.sqlite
drwxr-xr-x 2 root root    20480 2007-08-01 12:42 packages
-rw-r--r-- 1 root root  4373504 2007-08-01 12:34 primary.sqlite
-rw-r--r-- 1 root root     1953 2007-08-01 12:34 repomd.xml

The sqlite file didn't exist in the (much) older releases. And I seem to
recall there was a tgz in there, though maybe the archive is now
discarded once unpacked?

I think whatever method you use for working out what packages are
available is going to be a fair bit of data to transfer, though.
Unless there's going to be some way of simply getting "new information
since {date}" from the server (date being supplied by your software, as
the last date it checked). Heck, it could use an NNTP server for that. ;-)

There's a thought, I wonder how practical it would be to have two or
three newsgroups on a dedicated YUM NNTP network as the computer's way of
working out what was available as an update. With NNTP, you have a
history, the ability to get things since a certain date or an ID number,
and servers can easily expunge old stuff so a complete fetch doesn't
include the last ten versions of the same thing.

fc7.updates.headers for the smallest information the system needed to
know about each package to work out what to do. Your updater would drag
in the new stuff, and put what it needed into its own local database.
Each run of something like yum update would only drag in a few messages,
rather than a new 2 meg package list for each change to the list.

fc7.updates.human for people to read details per package, if they
wanted, like we have the packages announce mailing list. The other
update list would provide the message ID for your client to fetch
particular info, if you wanted (e.g. you saw updates available for
"this", "that", and "the other", and you decided you wanted to know what
"that" was before proceeding, on an interactive update).

fc7.updates.something.else if there needed to be more information...

-- 
[tim@bigblack ~]$ uname -ipr
2.6.22.1-33.fc7 i686 i386

Using FC 4, 5, 6 & 7, plus CentOS 5. Today, it's FC7.

Don't send private replies to my address, the mailbox is ignored.
I read messages from the public lists.
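
P.S. For what it's worth, here's a rough sketch of the "new information
since {date}" idea, just to show the shape of it. Everything in it is
made up for illustration: the Article record, the FEED list standing in
for a group like fc7.updates.headers, the fetch_since and sync helpers,
and the sample package names and versions. It doesn't talk to a real
NNTP server or use anything from yum; it only shows a client remembering
the last article number it saw and pulling in whatever is newer.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    """Hypothetical record: one small 'header' message per package update."""
    number: int   # increasing article number, in the style of a news server
    name: str     # package name (made-up sample data)
    version: str  # package version (made-up sample data)


# In-memory stand-in for the server side of a headers group.
FEED = [
    Article(1, "foo", "1.0-1"),
    Article(2, "bar", "1.0-1"),
    Article(3, "foo", "1.0-2"),
]


def fetch_since(feed, last_seen):
    """Return only the articles numbered above last_seen."""
    return [a for a in feed if a.number > last_seen]


def sync(db, feed):
    """Pull new articles into the local database; return how many arrived."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS packages "
        "(name TEXT PRIMARY KEY, version TEXT)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS state "
        "(key TEXT PRIMARY KEY, value INTEGER)"
    )
    row = db.execute(
        "SELECT value FROM state WHERE key = 'last_seen'"
    ).fetchone()
    last_seen = row[0] if row else 0

    new_articles = fetch_since(feed, last_seen)
    for article in new_articles:
        # A later article for the same package replaces the earlier one.
        db.execute(
            "INSERT OR REPLACE INTO packages (name, version) VALUES (?, ?)",
            (article.name, article.version),
        )
        last_seen = max(last_seen, article.number)

    # Remember where we got to, so the next run only fetches newer articles.
    db.execute(
        "INSERT OR REPLACE INTO state (key, value) VALUES ('last_seen', ?)",
        (last_seen,),
    )
    db.commit()
    return len(new_articles)
```

The first sync against an empty database pulls in everything; running it
again straight away pulls in nothing, and a later run only picks up
whatever was posted in between. A real client would still need some way
to cope with the server expiring articles it never saw.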