Re: wget lot of files

On Wed, Apr 07, 2004 at 11:03:05AM +0800, Ow Mun Heng wrote:
> > [mailto:fedora-list-bounces@xxxxxxxxxx]On Behalf Of Tom 'Needs A Hat'
> > On Tue, Apr 06, 2004 at 09:49:15PM +0200, hicham wrote:
> > > hello
> > > 
> > > tryin to download lot of rpm files from an ftp site
> > > how to get all the files with a  wget once for all ?
> > 
> > Cannot do it.
> > Wget respects the Robot Exclusion Standard.
> > RTFM.
> > 
> > The Robot Exclusion Standard exists to avoid abuse of the services.
> > (sort of like hijacking threads is bad style ;-).
> 
> For Real?? I've not tried it.. but what alexander suggested
> 
> wget -ri ftp://ip-address/path/rpm/*.rpm
> 
> should work??

Well, it depends on the site: some have a robots file and some do not.
http://download.fedora.redhat.com/pub/fedora/linux/core/updates/1/i386/
does not at this time.

Since wget will fetch individual files, it is nearly trivial to fetch
the index, parse out the individual file names in it, and script the
downloads.  yum and up2date are advanced examples of such scripts.
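
As a rough sketch of that index-and-parse idea (not what yum or up2date
actually do), the Python below pulls the .rpm links out of an HTML
directory index that has already been saved, for example with wget.
The names RpmLinkParser and rpm_urls are made up for illustration, and
it assumes the index is plain HTML with one <a href> per file.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RpmLinkParser(HTMLParser):
    """Collect href values ending in .rpm from an HTML directory index."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags carry the file links in a typical index page.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.endswith(".rpm"):
                self.links.append(value)


def rpm_urls(index_html, base_url):
    """Return absolute URLs for every .rpm link found in index_html."""
    parser = RpmLinkParser()
    parser.feed(index_html)
    # Relative links are resolved against the directory the index came from.
    return [urljoin(base_url, link) for link in parser.links]
```

Writing the resulting URLs one per line to a file and passing that file
to wget with -i would then fetch them one at a time.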

The original question was how to get it to work.  I do believe that if
he looks, he will find a robots file on the mirror site he picked.  That
is what keeps wget from doing what he expects.  Once you have found a
robots file, scripting around it is not polite.  Example:

       http://mirrors.kernel.org/robots.txt
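
A script that wants to stay polite can check the robots file itself
before fetching anything.  The sketch below uses Python's standard
urllib.robotparser on robots.txt text that has already been downloaded;
the helper name allowed and the "wget" user-agent string are
assumptions for illustration, not anything the mirror defines.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, user_agent="wget"):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

If allowed() returns False for the directory you are after, that is the
site asking automated tools to stay out.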

Since updates are dynamic by nature, 'rsync' is the correct way to gather
and maintain a full, current set.  There are rsync sites out there that
accept anonymous connections.
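
For completeness, a minimal sketch of driving rsync from a script.  The
function names are hypothetical, and any source URL you pass in has to
be a real anonymous rsync module offered by the mirror you choose.  The
options used are rsync's -a (archive), -v (verbose) and, optionally,
--delete, which removes local files that have disappeared from the
mirror.

```python
import subprocess


def rsync_command(source, dest, delete=False):
    """Build the rsync argument list for mirroring source into dest."""
    cmd = ["rsync", "-av"]
    if delete:
        # Only enable this once dest is known to be a dedicated mirror dir.
        cmd.append("--delete")
    cmd += [source, dest]
    return cmd


def mirror(source, dest, delete=False):
    """Run rsync and raise CalledProcessError if it exits non-zero."""
    subprocess.run(rsync_command(source, dest, delete), check=True)
```

Rerunning the same command later transfers only what changed, which is
why it suits a moving target like an updates directory.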


-- 
	T o m  M i t c h e l l 
	/dev/null the ultimate in secure storage.


