Re: Filesystem problems


 



This topic was handled off-list, which was probably a mistake.  Thus,
for your edification and enjoyment...


Basically, files can temporarily "disappear" from the file-system when a
program:
        - opens a file with fopen/open/etc, and then
        - unlinks the file (i.e.: removes the file, see unlink(2)) while
          keeping the file open.

Note that when this happens, the disk space isn't really available
for reallocation until the program ends and/or closes the file
(fclose/close/etc).  df asks the file-system how many blocks are
allocated, so it still counts that space as used; du only adds up the
files it can reach by walking the directory tree, so it doesn't.  That's
why the two can disagree under these conditions.
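
A minimal sketch of the open-then-unlink sequence described above, done
from a shell instead of a C program.  The temp file and the choice of
descriptor 3 are just for illustration:

```shell
#!/bin/sh
# Hold a file open, unlink it, and show that its data is still there.
tmpfile=$(mktemp)
echo "still here" > "$tmpfile"

exec 3< "$tmpfile"    # open the file for reading on descriptor 3
rm -f "$tmpfile"      # unlink it: the name disappears from the directory

if [ -e "$tmpfile" ]; then name_state="present"; else name_state="gone"; fi

read -r contents <&3  # the blocks are still allocated, so this still works
exec 3<&-             # closing the last descriptor is what frees the space

echo "name: $name_state, contents: $contents"
```

Between the rm and the final exec, the file is invisible to ls and du
but its blocks are still in use.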

Also note that USED + AVAIL != SIZE for df because some of the space on
a disk file-system is set aside for the system (superblocks and other
metadata, blocks reserved for root, etc).
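
To put numbers on both points, here is a rough sketch of the arithmetic
using the /opt figures from the df -h and du -hs output quoted further
down.  The values are the rounded ones df and du printed, so the results
are only approximate:

```shell
#!/bin/sh
# /opt figures copied from the quoted df -h / du -hs output (in G, rounded).
size=59; used=53; avail=3.3; du_total=27

# Space df reports as neither used nor available (reserved blocks etc.).
unaccounted=$(awk -v s="$size" -v u="$used" -v a="$avail" \
    'BEGIN { printf "%.1f", s - u - a }')

# Space df counts as used that du cannot find by walking the tree.
missing=$(awk -v u="$used" -v d="$du_total" \
    'BEGIN { printf "%.0f", u - d }')

echo "size - used - avail: ${unaccounted}G"
echo "df used - du total:  ${missing}G"
```

That comes to about 2.7G that df sets aside, and about 26G that df
counts as used but du can't see.  The second number is the one to chase.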

So how did you get here?
        - downloads that got canceled
        - some program that's messed up and poops all over its log or
          output file (that's been unlinked but is still open)
        - a log rotation that went awry
        - processes listed as "zombie" in ps/top output
        - a make where the user hit Ctrl-C (unlikely, make's pretty
          good about this kind of thing)
        - a daemon or long running program that poorly manages files
        - etc

If you can find and zap the PID that's tying up the disk space, you'll
probably see the disk space reappear.  If not, you can reboot the server.
I'd suggest rebooting and then repeating the df/du immediately afterward.

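The quickest way to find PIDs that are holding files that have been
deleted (i.e.: NLINK == 0) but are still open is
	$ su
	# lsof +aL1 /opt

(+L1 selects open files with a link count below 1; the "a" ANDs that
with the /opt file-system instead of ORing it.)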

BTW: lsof is in the lsof rpm.
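
If lsof isn't installed, roughly the same information can be dug out of
/proc.  This is a Linux-only sketch, not a replacement for lsof: the
function name list_deleted_open is made up for this example, and it
relies on the kernel showing " (deleted)" at the end of the link target
for an unlinked file.  Run it as root to see every process; otherwise
you'll only see your own.

```shell
#!/bin/sh
# Print "PID<TAB>path (deleted)" for every open descriptor whose file
# has been unlinked.  Descriptors we aren't allowed to read are skipped.
list_deleted_open() {
    for fd in /proc/[0-9]*/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)")
                pid=${fd#/proc/}   # strip the leading /proc/
                pid=${pid%%/*}     # keep only the PID component
                printf '%s\t%s\n' "$pid" "$target"
                ;;
        esac
    done
}

list_deleted_open
```

To narrow it down to one file-system, pipe the output through something
like grep /opt/.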

If this _still_ doesn't solve the issue (to within, say, 5%), then you'll
need to unmount the file-system and run "fsck -f /dev/cciss/c0d0p7".  Note
that while fsck will fix damaged file-systems, it won't find/fix files that
lsof reports as deleted but still open.

-S


shhgs wrote:
> 
> Hi, Dan
> 
> Maybe someone has removed a file while a process is still writing to
> it. Try fsck.
> 
> G.S. Huang
> 
> On 2/28/07, Alexander Apprich <a.apprich@xxxxxxxxxxxxxxxxxxxx> wrote:
> > Hi Dan,
> >
> > Dan Track wrote:
> > > On 2/27/07, Steve Siegfried <sos@xxxxxxxx> wrote:
> > [snip]
> > >>
> > > Hi
> > >
> > > Thanks everyone for your replies. Here's the relevant output. Are there
> > > some tests I can run to see what is going on?
> > >
> > > df -h
> > > Filesystem            Size  Used Avail Use% Mounted on
> > > /dev/cciss/c0d0p3     3.0G  1.9G  967M  67% /
> > > /dev/cciss/c0d0p1     147M   15M  125M  11% /boot
> > > /dev/cciss/c0d0p7      59G   53G  3.3G  95% /opt
> > > none                 1007M     0 1007M   0% /dev/shm
> > > /dev/cciss/c0d0p2     3.0G  2.5G  358M  88% /var
> > >
> > > du -hs /opt/
> > > 27G     /opt
> > >
> >
> > I've seen this before on a suse box where the user had a couple of downloads
> > running that he canceled but the browser didn't let go of the files.
> > His filesystem was filled 100%. After he killed his browser df -hl
> > showed the correct information. Maybe you can see what's going on by
> > running
> >
> >    /usr/sbin/lsof /opt | less
> >
> > on your box.
> >
> > Alex

