Ron Goulard wrote:
On Wed, 2004-04-21 at 02:38, Tom 'Needs A Hat' Mitchell wrote:
I am taking a full backup every day, but using tar would mean pulling a complete compressed copy across every night, which is very inefficient.
On a local link I turn compression off. The CPU effort and latency needed to compress and then uncompress do not justify the transfer time saved.
Also, most digital content that is already compressed (images, rpms) does not shrink enough to justify the cycles. Some files even increase in size.
Since this depends on your content and your local system capabilities, I can only advise you to list all the possible environment knobs in your script and then benchmark by turning them on and off one at a time.
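As a rough way to try that benchmarking idea before touching the real backup script, here is a minimal Python sketch. It uses zlib as a stand-in for whatever compression the transfer tool applies, and both sample buffers are made-up placeholders (repetitive log-like text, and seeded random bytes imitating already-compressed files such as images or rpms), so the numbers only indicate the trend, not what your data will do.

```python
import random
import time
import zlib


def measure(data, level=6):
    """Return (compressed_size / original_size, seconds spent compressing)."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(compressed) / len(data), elapsed


# Hypothetical stand-ins for real backup content.
text_like = b"GET /index.html HTTP/1.1 200 1043\n" * 20000
rng = random.Random(0)  # fixed seed so repeated runs see the same bytes
already_compressed = bytes(rng.getrandbits(8) for _ in range(len(text_like)))

for name, sample in [("text-like", text_like),
                     ("already compressed", already_compressed)]:
    ratio, seconds = measure(sample)
    print(f"{name}: ratio={ratio:.3f} time={seconds * 1000:.1f} ms")
```

A ratio at or above 1.0 means the "compressed" copy is no smaller than the original, which is the case where the cycles are wasted. Swapping in a few real files from the backup set gives a more honest answer.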
Also of interest: if compression proves to be an advantage for backups, then you should look into compression on the httpd server side. It is possible to present compressed content to an aware client, which the browser then expands locally.
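For what it is worth, the server-side idea comes down to checking the client's Accept-Encoding header and labelling the reply with Content-Encoding. The sketch below is not how any particular httpd does it; `negotiate_body` is a hypothetical helper, and it deliberately ignores quality values such as `gzip;q=0`.

```python
import gzip


def negotiate_body(body, accept_encoding):
    """Return (headers, payload), gzip-compressing only for clients that ask.

    Simplification: quality values (";q=...") are stripped and ignored.
    """
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    if "gzip" in offered:
        payload = gzip.compress(body)
        return {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}, payload
    # Client did not advertise gzip: send the body untouched.
    return {}, body


# Hypothetical page content, repetitive enough to compress well.
page = b"<html>" + b"<p>hello</p>" * 500 + b"</html>"
headers, payload = negotiate_body(page, "gzip, deflate")
print(headers, len(page), len(payload))
```

An aware client reverses the step with the equivalent of `gzip.decompress(payload)`; a client that never sent the header gets the plain page.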
If you think about it, a distant proxy server could do this and make the link look faster. Some services are apparently doing this and charging extra for it. As a content provider you should do this to save bandwidth for both you and your customers. It might be interesting to check that precompressed content is not being expanded somewhere along the way to make a link look slow. I seriously doubt that any service would like to be caught doing this ... but ... ya never know.
The backup impact of this is that the pages on the web site are already compressed and will not compress any further. So why bother?
Something else to consider is whether you are keeping multiple copies of the backup. Tar is much faster than rsync _if_ the files being backed up do not already exist on the other end. Rsync gets its speed by moving only what has changed (it compares both ends of the connection before deciding what to send).
Therefore, if you are keeping only a single copy of the backup, you can use rsync, but if you are keeping multiple copies, for example a complete copy of the data for every day of the past month, then I think tar may be the better choice. (There are reasons for doing it either way, but I won't get into that here; I am just providing an option.)
Using rsync with the --link-dest option means that a hard link is created for files that haven't changed and only changed files are brought across. Add compression to that (the reason for my initial question) and you effectively have a full backup every day, with only the incremental changes being compressed and shipped across the wire. To my mind this is about as efficient as it can hope to be in terms of both backup performance and bandwidth conservation.
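To make that concrete, here is a small Python sketch that only assembles the rsync command line for one nightly run. The host name, paths, and dated directory layout are all hypothetical; the flags themselves (-a, -z, --link-dest) are standard rsync options. It assumes the backup machine pulls from the web host, so the previous snapshot lives on the receiving side.

```python
import shlex


def rsync_snapshot_command(source, dest_root, today, yesterday, compress=True):
    """Build an rsync argv for a hard-linked daily snapshot."""
    argv = ["rsync", "-a"]
    if compress:
        argv.append("-z")  # compresses data in transit only, not on disk
    # A relative --link-dest path is resolved against the destination directory,
    # so "../<yesterday>" points at the sibling snapshot from the previous run.
    argv.append(f"--link-dest=../{yesterday}")
    argv += [source, f"{dest_root}/{today}/"]
    return argv


# Hypothetical host, paths, and dates, for illustration only.
cmd = rsync_snapshot_command("webhost:/var/www/", "/backup/www",
                             "2004-04-22", "2004-04-21")
print(shlex.join(cmd))
```

Each dated directory then looks like a full copy, while unchanged files are hard links to the previous day's copy and cost no extra transfer and almost no extra disk.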
Later..