Re: + espfix-code-cleanup.patch added to -mm tree

Stas Sergeev wrote:

Hi Zach and thanks for you really great review!

Zachary Amsden wrote:
Overall, this looks like a great cleanup. But I think there are somedetails with the patch that need fixing. It is extremely hard tofollow the entry.S changes based on patches alone - which tree doesthis apply cleanly to?
This should cleanly apply to 2.6.18-rc2-mm1.
I would like to see the final result in entry.S, there are
I'll send it to you privately.
still some details (dealing with eliminating the push/pop of %eax asa temporary for the segment comparison) which I could not followexactly in patch format.
The reason for removal was that I also removed the xorl %eax,%eax,decl %eaxstuff from error_code (replaced with $-1, as you've seen), and sopreserving
%eax became unnecessary I think.
iret_exc:
-    TRACE_IRQS_ON
-    sti
     pushl $0            # no error code
     pushl $do_iret_error
     jmp error_code
Why did you remove the sti here?
Because I simply broke my head over that code. If you say sti should
stay - I beleive you. But please, explain me how it works, before I
went crazy! :)

I read the iret pseudo-code in the Intel manuals carefully. It first
pops the iret frame into temporaries, then checks the values for
validity, and then either faults or writes the values to the real
registers. Now, if we have a bad user's iret frame, the iret will
pop it, raise an exception and push a kernel's iret frame, while
the popped user's frame at that point should be completely lost,
as far as I can understand. Now we go into GPF handler, then to a
fixup, push $0 as an error code, but what iret frame is above that
error code? Where does it come from? Doesn't the iret loses the
user's iret frame completely? Certainly I am missing something obvious
here, but I just can't follow that magic...

Did you test this with deliberately faulting IRETs? I think this isa bug.

Yes, I tested that and it works (I have a rather extensive test-suite
for that problem - faulting iret is always tested). But since I
don't understand the magic, I can't say why it works.

You need to get a #GP or #NP on the faulting iret. Several ways to dothat -

1) Issue int $0x80 from last two bytes of an LDT code segment withtruncated limit (#GP)2) Move %ss to LDT, set %esp high. Issue write LDT system call via int$0x80 that truncates the limit below %esp (#GP)3) Move %cs / %ss to LDT. Issue write LDT system call and make cs / sssegment not present (#NP)

Note that on the path back to the iret, interrupts are disabled. Theiret faults, but doesn't pop the user return frame. The check for thefixup section will happen, and fixup_exception will return to the STIinstruction in the fixup, with the user frame still above on the stack(but note the return pops EFLAGS, which disables interrupts as they werebefore for the iret). The fixup returns, pushes a fake error code ontothe stack, followed by a push of the exception handle, do_iret_error.It then re-enters the common error_code path, with saves registers anddispatches to do_iret_error, which follows the notify path, eitherkilling the process with a SIGSEGV or delivering a signal. The wholetime the user return frame is sitting up there on the stack, so takingthis exception:


DO_ERROR_INFO(32, SIGSEGV, "iret exception", iret_error, ILL_BADSTK, 0)

makes it appear as if the user instruction to which the kernel wastrying to return caused a SIGSEGV / ILL_BADSTK.

I think you actually want interrupts enabled in that path, since if youlook at the #GP handler, it is a trap gate, not an interrupt gate - so#GPs from userspace that take a very similar signal notifier path enterwith interrupts enabled. I don't think very much would go wrong ifinterrupts were left disabled, but it makes me nervous to change thatsemantic.

Bigger problem!  (*!)  It is _unsafe_ to call C-code here.

Ow, holly shit!!! What a disaster... But you are right. Well,
this will make my new patch really big...

You did not introduce this bug, I just spotted it now, and the oldcode has the same problem.

You are wrong here - I introduced that bug as the old code is
also mine. Linus attributed it to someone else though, probably
by mistake (or did he just wanted to rescue me from the rant about
that code? :). But it was mine.


Ah, ok.

There are two options I see to fixing this problem. One is to movethe GDT fixup code into assembler. The other is to compile the GDTfixup code in a separate .o file with exactly specified CFLAGS tomake sure -fomit-frame-pointer is not passed to it.
Well, unsafe is unsafe. If we can't explicitly tell gcc to prefix the
%ebp accesses with %ds, then I'd better go for an asm version.


Sounds like a plan.

-    .quad 0x0000920000000000    /* 0xd0 - ESPFIX 16-bit SS */
+    .quad 0x00cf92000000ffff    /* 0xd0 - ESPFIX SS */
Seems a bit dangerous to allow access to full 4GB through this. Canyou tighten the limit any? I suppose not, because the high bits in%esp really could be anything. But it might be nice to try settingthe limit to regs->esp + THREAD_SIZE. Of course, this is notstrictly necessary, just an extra paranoid protection mechanism.
Since, when calculating the base, I do &-THREAD_SIZE, I guess the minimal
safe limit is regs->esp + THREAD_SIZE*2... Well, may just I not dothat please? :)
For what, btw? There are no such a things for __KERNEL_DS or anything, so
I just don't see the necessity.

It helps track down any bugs that could leak through otherwise andcorrupt random memory.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: + espfix-code-cleanup.patch added to -mm tree
  - From: Stas Sergeev <[email protected]>

References:
- Re: + espfix-code-cleanup.patch added to -mm tree
  - From: Stas Sergeev <[email protected]>

Prev by Date: Re: [PATCH 1/2] include/linux: Defining bool, false and true
Next by Date: Re: [PATCH 1/2] include/linux: Defining bool, false and true
Previous by thread: Re: + espfix-code-cleanup.patch added to -mm tree
Next by thread: Re: + espfix-code-cleanup.patch added to -mm tree
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]