Re: vm86.c audit_syscall_exit() call trashes registers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The system was otherwise completely idle. The only active task was starting the X server.

The failure is 100% reproducible on my test system.

We have not run a lot of different kernels per se. We ran 2.6.9, and it was fine. When we ran RHEL 5, it came with 2.6.18. All we really did was rebuild 2.6.18 with that chunk of code removed, and the problem went away. Mind you, when that chunk of code was removed, there were a ton of errors about multiply freed audit blocks. But at least the X server EDID transfer was successful.

As far as enabling/disabling the audit functionality: I'm clueless about it. I think RHEL turned it on by default, but I don't know how to turn it on or off myself.

I will also note that the small stand-alone utility read_edid never failed. It was only when vm86 was called from inside of the X server. So perhaps there's a race condition with memory not being where it's expected to be when a large app calls out to real mode?

-Bill

----

William Cattey
Linux Platform Coordinator
MIT Information Services & Technology

W92-176, 617-253-0140, [email protected]
http://web.mit.edu/wdc/www/


On Aug 14, 2007, at 5:28 PM, Andi Kleen wrote:

On Tue, Aug 14, 2007 at 04:52:54PM -0400, William Cattey wrote:
The corruption originally looked like a race condition.

Sometimes the EDID buffer would be all zeros.
Sometimes it would contain partial data, and then the rest of the
buffer filled with zeros.
The amount of data transferred into the buffer before going to all
zeros is non-deterministic.

When we put a known value in each byte of the buffer before making
the vm86 call, the known data would always be overwritten either with
EDID data or zeros.

Hmm, that might be consistent with something going wrong with sleeping.
Was the system under high load? Perhaps something else can thrash
some real mode state when you sleep. On the other hand vm86 in user
mode can schedule anyways, so it might have already happened.

But I think the mutex was actually added post 2.6.16 so if you saw
it in 2.6.16 already it might have been something else.

Also when audit is not enabled (did you have it enabled?) the audit
function doesn't do very much?

If you can reliably reproduce it one way might be to comment
out more and more of the audit code until you find who causes
the corruption (that might cause some corrupted audit data,
but that should be fine for testing)

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux