Re: 2.6.12.2 dies after 24 hours

> We're also applying the attached patch.  There's a bug in reiserfs that
> gets tickled by our huge MMAP usage (it's amazing what really busy
> Cyrus daemons can do to a server, ouch).  It's fixed in generic_write,
> so we take the few percent performance hit for something that doesn't
> break!

Interesting - when I hit the problem it was on mail servers under high
load (handling 60,000 emails per hour) with reiserfs as the file system.
I have seen this problem on 5 different servers, so I am confident that
it is not a hardware failure.

Sometimes the server load just rises and then the server dies; other
times the load rises but the kernel manages to recover, filling up
syslog with messages like this

Sounds like a different issue. The patch Bron included before fixes (or at least reduces to the point where it fixes it for us) a problem where processes get stuck in D state and are unkillable. A reboot is required to remove them. Apparently this is a known bug in ReiserFS (see messages below). As noted, the same bug exists in ext3. There appear to have been some patches to try to fix it for both reiserfs and ext3, but I'm not sure if they're in the mainline kernel yet.

http://www.ussg.iu.edu/hypermail/linux/kernel/0409.0/2056.html
http://hulllug.principalhosting.net/archive/index.php/t-22774.html

Rob

----- Original Message ----- From: "Vladimir Saveliev" <[email protected]>
To: "Jeremy Howard" <[email protected]>
Cc: "Hans Reiser" <[email protected]>; <[email protected]>
Sent: Friday, October 08, 2004 4:57 PM
Subject: Re: URGENT: Need fix for problem with copy_from_user inside a transaction


Hello

On Thu, 2004-10-07 at 21:35, Jeremy Howard wrote:
On Fri, 01 Oct 2004 09:13:34 -0700, "Hans Reiser" <[email protected]>
said:
> Chris, can you comment please?

Also... can you guys suggest any ways to minimise the problem, e.g.
external vs internal journal? metadata vs full journalling? changing the
elevator? ...

Would you please check whether the attached patch for
./fs/reiserfs/file.c fixes the problem?

Quick benchmark shows that using generic_file_write does not hurt
reiserfs performance too much comparing to using original
reiserfs_file_write.

                        ---Sequential Output (nosync)--- ---Sequential Input-- --Rnd Seek-
                        -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --04k (03)-
                     MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU   /sec %CPU
generic_file_write  256 16690 99.7 28332 29.9 11528 16.8 12411 77.2 28888 17.3  237.4  2.1
reiserfs_file_write 256 15647 86.8 30875 22.1 10822 14.1 11631 72.1 29184 16.1  250.0  2.0



> Vladimir Saveliev wrote:
>
> >Hello
> >
> >On Fri, 2004-10-01 at 01:53, Hans Reiser wrote:
> >
> >
> >>vs, this is not enough of an answer to share your understanding of the
> >>problem.  Please say much more.
> >>
> >>
> >
> >Reiserfs_write_file locks set of pages, and then tries to copy data to
> >them. If it is to copy data from one of pages which are locked and if
> >that page is not uptodate, pagefault requires to lock that page, but as
> >it is locked already - process deadlocks with itself.
> >
> >
> Is this when copying from one file in a formatted node to another file
> in that node?
>
> >As Chris said - fix is not trivial. Also, it is known that he did
> >already something about it, so, I thought that it would make sense to
> >find first what is his state at this problem.
> >
> >
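
The self-deadlock Vladimir describes in the quoted text can be pictured with a small user-space C model. This is only an illustrative sketch, not kernel code: `struct page_model`, `lock_page_model`, `touch_source` and `write_locked_range` are hypothetical names, and the model returns false at the point where a real task would block on its own page lock.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Hypothetical stand-in for a page cache page. */
struct page_model {
    bool locked;
    bool uptodate;
};

/* Non-recursive lock: fails where a real task would sleep forever. */
static bool lock_page_model(struct page_model *p)
{
    if (p->locked)
        return false;
    p->locked = true;
    return true;
}

static void unlock_page_model(struct page_model *p)
{
    p->locked = false;
}

/* Models the page fault taken while reading the source of the copy:
 * a page that is not uptodate must be locked before it can be read in. */
static bool touch_source(struct page_model *p)
{
    if (p->uptodate)
        return true;
    if (!lock_page_model(p))
        return false;               /* page is already locked by us */
    p->uptodate = true;
    unlock_page_model(p);
    return true;
}

/* Lock destination pages first..last, then copy from page src.
 * Caller must pass first <= last < NPAGES and src < NPAGES, with the
 * destination pages unlocked on entry. */
static bool write_locked_range(struct page_model pages[NPAGES],
                               size_t first, size_t last, size_t src)
{
    size_t i;
    bool ok;

    for (i = first; i <= last; i++)
        lock_page_model(&pages[i]);
    ok = touch_source(&pages[src]);
    for (i = first; i <= last; i++)
        unlock_page_model(&pages[i]);
    return ok;
}
```

With all pages initially unlocked and not uptodate, `write_locked_range(pages, 2, 4, 3)` reports the deadlock because the source page lies inside the locked range, while a source page outside that range (or one that is already uptodate) copies without trouble.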

Content-Disposition: attachment; filename=file.c.diff2

--- file.c~ 2004-10-02 12:29:33.223660850 +0400
+++ file.c 2004-10-08 10:03:03.001561661 +0400
@@ -1137,6 +1137,8 @@
  return result;
     }

+    return generic_file_write(file, buf, count, ppos);
+
     if ( unlikely((ssize_t) count < 0 ))
         return -EINVAL;


----- Original Message ----- From: "Chris Mason" <[email protected]>
To: "Hans Reiser" <[email protected]>
Cc: "Vladimir Saveliev" <[email protected]>; "Oleg Drokin" <[email protected]>; "Jeremy Howard" <[email protected]>; <[email protected]>
Sent: Saturday, October 09, 2004 1:05 AM
Subject: Re: URGENT: Need fix for problem with copy_from_user inside a transaction


On Fri, 2004-10-08 at 07:46 -0700, Hans Reiser wrote:
No, this is not the right answer.

>>>--- file.c~ 2004-10-02 12:29:33.223660850 +0400
>>>+++ file.c 2004-10-08 10:03:03.001561661 +0400
>>>@@ -1137,6 +1137,8 @@
>>> return result;
>>>     }
>>>
>>>+    return generic_file_write(file, buf, count, ppos);
>>>+
>>>     if ( unlikely((ssize_t) count < 0 ))
>>>         return -EINVAL;

It's not right because ext3 has exactly the same bug.  The only real
solution is to change the order in which things happen during
file_write.

Right now, we do this:

prepare_write() allocates space, starts a transaction
copy_from_user() can deadlock here because a transaction is running
commit_write() ends the transaction

This is the same in generic_file_write and reiserfs_file_write, although
reiserfs_file_write doesn't call reiserfs_prepare_write, it has the same
basic structure in terms of when the transaction starts and ends.
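
The three-step ordering listed above can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not kernel code: `struct journal`, `start_transaction`, `end_transaction`, `copy_from_user_model` and `write_current_order` are hypothetical names, and the model assumes that a page fault taken during the copy needs journal work of its own, which is one way the deadlock in this thread could arise. Where a real task would hang, the model returns false.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy journal that allows one open transaction at a time. */
struct journal {
    bool transaction_open;
};

/* Returns false where the real code would wait on the open transaction. */
static bool start_transaction(struct journal *j)
{
    if (j->transaction_open)
        return false;
    j->transaction_open = true;
    return true;
}

static void end_transaction(struct journal *j)
{
    assert(j->transaction_open);
    j->transaction_open = false;
}

/* Models copy_from_user(): if the source is not resident, the fault
 * path (by assumption) has to start a transaction of its own. */
static bool copy_from_user_model(struct journal *j, bool source_resident)
{
    if (source_resident)
        return true;
    if (!start_transaction(j))
        return false;               /* blocked behind our own transaction */
    end_transaction(j);
    return true;
}

/* Current ordering: prepare_write, copy_from_user, commit_write. */
static bool write_current_order(struct journal *j, bool source_resident)
{
    bool ok;

    if (!start_transaction(j))      /* prepare_write() step */
        return false;
    ok = copy_from_user_model(j, source_resident);
    end_transaction(j);             /* commit_write() step */
    return ok;
}
```

With a journal whose `transaction_open` starts out false, `write_current_order(&j, true)` succeeds, while `write_current_order(&j, false)` reports the deadlock because the fault happens between the start and the end of the transaction.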

The solution is to move most of the work from reiserfs_prepare_write
into reiserfs_commit_write.  I've made a number of different patches for
this, all have had problems, but I'm working through it and will have a
real tested fix.
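
A rough user-space C model of the reordering idea, in which the user copy runs before any transaction is open and the journalled work happens afterwards in the commit step. This is a sketch of the general idea only and not one of the patches mentioned here: `struct toy_journal`, `txn_begin`, `txn_end`, `fault_in_source` and `write_reordered` are hypothetical names, and the model assumes that the fault path may need a transaction of its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy journal that allows one open transaction at a time. */
struct toy_journal {
    bool open;
};

/* Returns false where the real code would wait on the open transaction. */
static bool txn_begin(struct toy_journal *j)
{
    if (j->open)
        return false;
    j->open = true;
    return true;
}

static void txn_end(struct toy_journal *j)
{
    assert(j->open);
    j->open = false;
}

/* Models the user copy: a non-resident source page triggers a fault
 * that (by assumption) needs its own short transaction. */
static bool fault_in_source(struct toy_journal *j, bool source_resident)
{
    if (source_resident)
        return true;
    if (!txn_begin(j))
        return false;
    txn_end(j);
    return true;
}

/* Reordered write: copy first with no transaction open, then do the
 * allocation and journalling inside the commit step. */
static bool write_reordered(struct toy_journal *j, bool source_resident)
{
    if (!fault_in_source(j, source_resident))
        return false;
    if (!txn_begin(j))              /* work moved into the commit step */
        return false;
    txn_end(j);
    return true;
}
```

In this model `write_reordered(&j, false)` succeeds even when the source page is not resident, because the fault is taken while no transaction is open.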

Jeremy the best thing you can do right now is to mount your filesystems
with -o nolargeio=1, and use the temporary workarounds I sent you
before.  The -o nolargeio=1 will reduce the amount of work we try to do
with each pass in reiserfs_file_write (note that Vladimir's patch above
will have very similar effects).

-chris
