Re: 2.6.11, nfsd, log_do_checkpoint()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



  Hello!

> I woke up to a mostly-dead PE1850 this morning: 
> 
> Message from syslogd@storage at Fri Apr  1 06:19:14 2005 ...
> storage kernel: Assertion failure in log_do_checkpoint() at 
> fs/jbd/checkpoint.c:365: "drop_count != 0 || cleanup_ret != 0"
> 
> Message from syslogd@storage at Fri Apr  1 06:19:14 2005 ...
> storage kernel: invalid operand: 0000 [1] SMP 
> 
> Full error:
> 
> Assertion failure in log_do_checkpoint() at fs/jbd/checkpoint.c:365: 
> "drop_count != 0 || cleanup_ret != 0"
  Could you try running a kernel with the attached patch? Are you able to
reproduce the problem even with the patch?

> ----------- [cut here ] --------- [please bite here ] ---------
> Kernel BUG at checkpoint:365
> invalid operand: 0000 [1] SMP
> CPU 1
> Modules linked in:
> Pid: 212, comm: nfsd Not tainted 2.6.11-rc5

<snip>

> nfsd was serving out a pretty heavily-used (2.5-million page web site) ext3 
> partition on the 1850's built-in LSI/MPT controller.  I'm able to duplicate 
> this somewhat consistently by putting nfsd under heavy load (say, by deleting 
> 20,000 files from a directory).
> 
> (Please Cc me on replies, I'm not subscribed.)

								Honza

-- 
Jan Kara <[email protected]>
SuSE CR Labs
linux-2.6.11-ext3-release-race.patch:
 transaction.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

--- NEW FILE linux-2.6.11-ext3-release-race.patch ---
--- linux-2.6.9/fs/jbd/transaction.c.=K0002=.orig
+++ linux-2.6.9/fs/jbd/transaction.c
@@ -1812,10 +1812,10 @@ static int journal_unmap_buffer(journal_
 			JBUFFER_TRACE(jh, "checkpointed: add to BJ_Forget");
 			ret = __dispose_buffer(jh,
 					journal-&gt;j_running_transaction);
+			journal_put_journal_head(jh);
 			spin_unlock(&amp;journal-&gt;j_list_lock);
 			jbd_unlock_bh_state(bh);
 			spin_unlock(&amp;journal-&gt;j_state_lock);
-			journal_put_journal_head(jh);
 			return ret;
 		} else {
 			/* There is no currently-running transaction. So the
@@ -1826,10 +1826,10 @@ static int journal_unmap_buffer(journal_
 				JBUFFER_TRACE(jh, "give to committing trans");
 				ret = __dispose_buffer(jh,
 					journal-&gt;j_committing_transaction);
+				journal_put_journal_head(jh);
 				spin_unlock(&amp;journal-&gt;j_list_lock);
 				jbd_unlock_bh_state(bh);
 				spin_unlock(&amp;journal-&gt;j_state_lock);
-				journal_put_journal_head(jh);
 				return ret;
 			} else {
 				/* The orphan record's transaction has
@@ -1850,10 +1850,10 @@ static int journal_unmap_buffer(journal_
 					journal-&gt;j_running_transaction);
 			jh-&gt;b_next_transaction = NULL;
 		}
+		journal_put_journal_head(jh);
 		spin_unlock(&amp;journal-&gt;j_list_lock);
 		jbd_unlock_bh_state(bh);
 		spin_unlock(&amp;journal-&gt;j_state_lock);
-		journal_put_journal_head(jh);
 		return 0;
 	} else {
 		/* Good, the buffer belongs to the running transaction.

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux