Re: unmount oops in log_do_checkpoint

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Tue, Jan 17, 2006 at 11:23:53PM +0100, Jan Kara wrote:
> > > On Tue, Jan 17, 2006 at 05:32:35PM +0100, Jan Kara wrote:
> > > > > On Tue, Jan 17, 2006 at 12:59:45PM +0100, Nick Piggin wrote:
> > > > > 
> > > > > Maybe it is because people haven't been turning on their debugging options,
> > > > > tsk tsk ;) It only oopses when DEBUG_SLAB and DEBUG_PAGEALLOC are both
> > > > > enabled. And only then when the jbd patch is not reverted. Weird.
> > > >   Hmm, that's really strange, maybe we have some use-after-free
> > > > problem or so... I'll see what I can do :).
> > > > 
> > > 
> > > Are you able to reproduce? If not I can test patches...
> >   Hmm, I was not able to reproduce the problem even with those debug
> > options set :(. As I'm looking into the code it seems that somebody
> > managed to free the transaction but did not clear the
> > j_checkpoint_transactions pointer. It's even stranger that it's during
> > umount time when there should not be much processes playing with the JBD
> > structures on that filesystem.
> >   Attached is the patch that fixes two minor possible problems I've
> > found. Neither of them should be causing your oops but one never knows
> > :). Also turn on the JBD debugging in config. Maybe it spits something
> > useful. If you still see the same oops, I'll create some debugging
> > patch.
> 
> This patch does the trick. Survived several reboots now while without
> the patch it has oopsed 100% of the time so far. Thanks!
  Good to hear :).

> I have also attached a full jbd debug output and oops for the vanilla
> 2.6.16-rc1 case, just in case that helps.
  It helped me to verify my idea why my patch helped. Thanks. The problem was
the || instead of && in log_do_checkpoint(). When the journal checkpoint
list became empty, the pointer check in the loop failed and so because
of || we tried also the second check which was using transaction->t_tid.
If the transaction was already freed, bad luck...
  Andrew, attached is the fix with logs etc. I split my original patch
into two as they are in fact unrelated things. Please apply.

								Honza
-- 
Jan Kara <[email protected]>
SuSE CR Labs
While checkpointing we have to check that our transaction still is in the
checkpoint list *and* (not or) that it's not just a different transaction
with the same address.

Signed-off-by: Jan Kara <[email protected]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.16-rc1/fs/jbd/checkpoint.c linux-2.6.16-rc1-1-checkpoint-fix/fs/jbd/checkpoint.c
--- linux-2.6.16-rc1/fs/jbd/checkpoint.c	2006-01-17 21:44:02.000000000 +0100
+++ linux-2.6.16-rc1-1-checkpoint-fix/fs/jbd/checkpoint.c	2006-01-17 23:35:49.000000000 +0100
@@ -338,7 +338,7 @@ restart:
 	 * done (maybe it's a new transaction, but it fell at the same
 	 * address).
 	 */
- 	if (journal->j_checkpoint_transactions == transaction ||
+ 	if (journal->j_checkpoint_transactions == transaction &&
 			transaction->t_tid == this_tid) {
 		int batch_count = 0;
 		struct buffer_head *bhs[NR_BATCH];

We have to check that also the second checkpoint list is non-empty before
dropping the transaction.

Signed-off-by: Jan Kara <[email protected]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.16-rc1/fs/jbd/commit.c linux-2.6.16-rc1-1-checkpoint-fix/fs/jbd/commit.c
--- linux-2.6.16-rc1/fs/jbd/commit.c	2006-01-15 00:20:12.000000000 +0100
+++ linux-2.6.16-rc1-1-checkpoint-fix/fs/jbd/commit.c	2006-01-17 23:35:19.000000000 +0100
@@ -829,7 +829,8 @@ restart_loop:
 	journal->j_committing_transaction = NULL;
 	spin_unlock(&journal->j_state_lock);
 
-	if (commit_transaction->t_checkpoint_list == NULL) {
+	if (commit_transaction->t_checkpoint_list == NULL &&
+	    commit_transaction->t_checkpoint_io_list == NULL) {
 		__journal_drop_transaction(journal, commit_transaction);
 	} else {
 		if (journal->j_checkpoint_transactions == NULL) {

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux