Re: Compiling C++ modules

On Apr 25, 2006, at 12:46:02, Avi Kivity wrote:

Kyle Moffett wrote:
Except making exceptions work in the kernel is exceptionally nontrivial (sorry about the pun).
My experience with exceptions in kernel-like code (a distributed filesystem) was excellent.

Well, from all of the discussions about it that occurred on this list, the biggest problem with exceptions in the kernel was the overhead to keep track of the exception state and the try/catch stack. The other problem was handling exceptions in atomic contexts, in the guts of the scheduler, and in a host of other hot-paths.

Which of the following shows the flow of code better?


Once you accept the idea that an exception can occur (almost) anywhere

Except they can't. Lots and lots of bits of kernel code explicitly depend on the fact that certain operations _cannot_ fail, and they make that obvious through the fact that those functions don't have any way of returning error conditions.
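To make the point about signatures concrete, here is a minimal userspace sketch. The struct and function names are hypothetical and not taken from the kernel; assert() stands in for a BUG_ON-style contract check. The operation that cannot fail returns void, so there is no error channel for a caller to forget; the one that can fail returns an int.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node, for illustration only (not a kernel type). */
struct node {
	struct node *next;
};

/*
 * Cannot fail: no allocation, no sleeping. The void return type tells
 * every caller that there is nothing to check. A NULL node is a caller
 * bug, not a runtime error, so it is asserted rather than reported.
 */
static void node_push(struct node **head, struct node *n)
{
	assert(n != NULL);
	n->next = *head;
	*head = n;
}

/*
 * Can fail: the list may be empty. The int return makes the failure
 * visible at every call site.
 */
static int node_pop(struct node **head, struct node **out)
{
	if (*head == NULL)
		return -1;	/* stand-in for a -Exxx errno value */
	*out = *head;
	*head = (*out)->next;
	return 0;
}
```

A caller of node_push() has no result to test, while a caller of node_pop() that ignores the return value is visibly wrong in the source.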

the C++ code shows you what the code does in the normal case without obfuscating it with error handling. Pretend that after every semicolon there is a comment of the form:
   /* possible exceptional return */

As I pointed out before and seem doomed to point out again, this is an _implicit_ piece of code, and in kernels implicit code is very very bad. Because of the varying contexts and close hardware dependence, you want to explicitly state everything that happens.

First of all, that extra TakeLock object chews up stack, at least 4 or 8 bytes of it, depending on your word size.
No, it's optimized out. gcc notices that &lock doesn't change and that 'l' never escapes the function.

GCC does not notice that when you use out-of-line functions. Let me remind you that many of the kernel's spinlocks and other functions are out-of-line; inlining them has significant performance penalties.

Secondly with standard integer error returns you have one or two easily-predictable assembly instructions at each step of the way, whereas with exceptions you trade the absence of error handling in the rest of the code for a lot of extra instructions at the exception throw and catch points.
The extra code is out of line (not even an if (unlikely())). So yes, total code grows, but the exceptional paths can be in a .text.exception section and not consume cache or TLB space.


Total bull.  Let me give you an example.  The following C++ code:
{
	Foo my_foo;
	function_that_throws_exception();
}

Is turned by the compiler into a slight variant on this C code:
{
	Foo my_foo = Foo();
	jmp_buf exception;
	if (setjmp(exception))
		goto out;
	
	function_that_throws_exception();
	
out:
	~Foo(&my_foo);
}

There is nothing about the above that the compiler can trivially "out-of-line", not to mention the fact that you just royally screwed the CPU's chances of getting branch prediction right. There's the fact that setjmp and longjmp are kind of hard in kernelspace. Finally, the stack usage is significantly increased; I think even in userspace jmp_buf occupies 8 longs (32 bytes) just for critical registers and instruction pointer of the exception handler. Also, the simplest case of throwing an exception has turned from the three instructions "cmp bnz ret" (compare result or pointer, branch to exception code, return) to this mess:

{
	longjmp(exception, (int)pointer);
}

The longjmp function is simple, but not _that_ simple, and it still breaks branch prediction in a bad way.
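For anyone who wants to poke at the difference, here is a small userspace sketch of the two styles side by side. It only models the setjmp-based scheme described above; some toolchains implement exceptions with table-based unwinding instead, which this does not model. All names (handler, step_ret, step_throw and so on) are made up for the illustration.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf handler;   /* hypothetical single "catch" frame */
static int cleanups_run;  /* counts how often the cleanup step ran */

/* Explicit style: failure is an ordinary return value. */
static int step_ret(int fail)
{
	return fail ? -1 : 0;
}

/* Exception style: failure jumps straight back to the setjmp() site. */
static void step_throw(int fail)
{
	if (fail)
		longjmp(handler, 1);
}

static int run_ret(int fail)
{
	int err = step_ret(fail);  /* caller sees a compare and a branch */

	cleanups_run++;            /* cleanup is written out in the open */
	return err;
}

static int run_throw(int fail)
{
	volatile int err = 0;

	if (setjmp(handler))       /* register state is saved on every call */
		err = -1;
	else
		step_throw(fail);

	cleanups_run++;            /* reached on both paths */
	return err;
}

/* Both styles must report the same outcome; only the cost differs. */
static void check_same_result(int fail)
{
	assert(run_ret(fail) == run_throw(fail));
}
```

Note that run_throw() pays for setjmp() and carries a jmp_buf even when nothing fails, while run_ret() only pays for the compare and branch.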

This is a really _really_ bad idea for a kernel. Having simple declaration statements have big side effects (like the common TakeLock object example I gave above) is bound to lead to people screwing up and forgetting about the side effects. In C it's impossible to miss the side effects of a statement; function calls are obvious, as is global memory modification.
In C++ you just have to treat declarations as executable statements. Just as you can't compile the code with a C compiler, you can't read it with a C mindset. Once you get used to it, it isn't surprising at all.

That's all well and good, until you assume that "some_type foo = 3;" is just declaring an integer through a typedef instead of declaring an object with side effects. The thing about the Linux kernel is that basically _nobody_ understands all of it, and as a result each and every bit of code must stand on its own and be fairly obvious as to side effects and such. In C++, most of the language features are designed to _hide_ those side effects and as a result it's a terrible fit for the kernel.

Let me point out _again_ how unobvious and fragile the flow of code there is. Not to mention the fact that the C++ compiler can easily notice that item1 and item2 are never used and optimize them out entirely.
Excellent! If there are no side effects, I want it out. If there are side effects, it won't optimize them out.

How can it tell? You aren't writing all your member functions inline, are you?

You also totally missed the "int flags" argument you're supposed to pass to object specifying allocation parameters,
There is no allocation here (both the C and the C++ code allocate on the stack).

Let's look in the kernel. How often do you find non-trivial constructor functions that _don't_ allocate memory, hm?

Should you want to allocate from the heap, try this:

{
   spinlock_t::guard g(some_lock);

   auto_ptr<Foo> item(new (gfp_mask) Foo); /* or pass a kmem_cache_t */

   item->do_something();
   item->do_something_else();
   return item.release();
}


I think this code speaks for itself about its lack of readability.

   if ((r = foo_do_something(item))) {

Your kernel-idiomatic C is terrible. Please don't go around writing much kernel code in this style, that's disgusting. Your multiply-duplicated return statements were also bad form, see my example for a much clearer way of doing it.

Yeah, sure, yours is 3 lines when you omit the following:
(1)  Handling allocation flags like GFP_KERNEL

done


And unreadable afterwards

(4) Reference counting, garbage collection, or another way to selectively free the allocated objects based on success or failure of other code.
Reference counting is ridiculously to do in C++. I'll spare you the details.

I'll assume you mean "ridiculously easy" there. The problem is that with the exception handling system you add refcounts to a lot of objects that don't need them. Here's an example that doesn't need a refcount. I'm registering the "item" with a subsystem, and if that fails the object is immediately freed.


{
	int result;
	struct foo *item = kmalloc(sizeof(*item), GFP_KERNEL);
	if (unlikely(!item))
		return -ENOMEM;
	
	spin_lock(&item_lock);
	
	result = item_init(item, GFP_KERNEL);
	if (unlikely(result))
		goto free;
	
	result = item_register(subsystem, item);
	if (unlikely(result))
		goto destroy;
	
out:
	spin_unlock(&item_lock);
	return result;

	/* Error handling */
destroy:
	item_destroy(item);
free:
	kfree(item);
	goto out;
}

Please note that the assembly into which this optimizes is quite efficient, the compiler can trivially order the fast-path and place all of the exception code after the function or could theoretically even put it in a different section. The side-effects are quite obvious, and it's also simple to identify exactly how many instructions this function costs; there's a memory allocation, a lock, and two function calls. Interspersed in there are exactly 3 conditional jumps to exception-handling code. Direct stack usage for this function is around 8 or 12 bytes, depending on word size.

The biggest advantage to this method is that we can tell C exactly what exceptions we expect from the various functions. Not only that, but we already know what type they are, how likely they are to occur, and exactly how to handle each kind of exception path. It's even fairly easy to read through the fast-path based on how it's laid out.
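The same goto-unwind layout can be tried outside the kernel. The sketch below is a userspace analogue under stated assumptions: calloc/free stand in for kmalloc/kfree, a counter stands in for the spinlock, and the fail_init/fail_register parameters are hypothetical knobs added only so each error path can be exercised. None of these helpers are real kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct item {
	int initialized;
	int registered;
};

static int live_items;	/* allocated but not yet freed (demo bookkeeping) */
static int lock_depth;	/* stand-in for spin_lock()/spin_unlock() */

static int item_init(struct item *it, int fail)
{
	if (fail)
		return -EINVAL;
	it->initialized = 1;
	return 0;
}

static void item_destroy(struct item *it)
{
	it->initialized = 0;
}

static int item_register(struct item *it, int fail)
{
	if (fail)
		return -EBUSY;
	it->registered = 1;
	return 0;
}

static int item_create(int fail_init, int fail_register, struct item **out)
{
	int result;
	struct item *item = calloc(1, sizeof(*item));

	if (!item)
		return -ENOMEM;
	live_items++;

	lock_depth++;			/* "lock" */

	result = item_init(item, fail_init);
	if (result)
		goto free_item;

	result = item_register(item, fail_register);
	if (result)
		goto destroy_item;

	*out = item;			/* success: the caller now owns the item */
out:
	lock_depth--;			/* "unlock" on every path */
	assert(lock_depth == 0);
	return result;

	/* Error handling, in reverse order of setup */
destroy_item:
	item_destroy(item);
free_item:
	free(item);
	live_items--;
	goto out;
}
```

Each failure point undoes exactly what was set up before it and nothing more, and every path leaves through the single unlock at "out".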

Let me point out a few problems with the C++ ways you've described of doing the same things:

(1) You can't easily allocate and initialize an object in 2 different steps. In the kernel you want to be able to sleep in kmalloc to increase chances of getting the memory you need, but you may need to take a spinlock before actually initializing the data structure (say it's on a linked list). If you split up the actual initialization into another function then you lose all of the advantages of C++ constructors.

(2) Your code either adds a refcount for "item" or unconditionally releases it at the end of the function. Yes that's fixable, but not in a way that preserves the exception-handling properties you're espousing so much. When you get an exception, how does the code tell which objects to free and which ones not to? (Answer: it can't, that's a semantic decision made by the programmer with "if" statements).

(3) You still haven't explained how adding all sorts of implicit side effects is a good thing in an operating system kernel.
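Point (1) can be sketched in plain C as follows. This is a userspace illustration, not kernel code: malloc stands in for a sleeping kmalloc, the lock_held flag stands in for a spinlock, and the entry_* names are hypothetical. Allocation happens before the lock is taken; initialization and list insertion happen while it is held.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct entry {
	int key;
	struct entry *next;
};

static struct entry *list_head;
static int lock_held;		/* stand-in for a spinlock */
static int alloc_under_lock;	/* counts allocations made while "locked" */

/* Step 1: allocate. May block, so it must run before the lock is taken. */
static struct entry *entry_alloc(void)
{
	if (lock_held)
		alloc_under_lock++;
	return malloc(sizeof(struct entry));
}

/* Step 2: initialize and link. Touches shared state, so the lock is required. */
static void entry_init_locked(struct entry *e, int key)
{
	assert(lock_held);
	e->key = key;
	e->next = list_head;
	list_head = e;
}

static int entry_add(int key)
{
	struct entry *e = entry_alloc();

	if (!e)
		return -ENOMEM;

	lock_held = 1;			/* "spin_lock" */
	entry_init_locked(e, key);
	lock_held = 0;			/* "spin_unlock" */
	return 0;
}
```

The two steps stay separate functions with separate locking rules, which is the split that a single constructor call does not express.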


Cheers,
Kyle Moffett
