Re: OT: Requesting C advice


 



On Fri, 1 Jun 2007, Les wrote:

On Fri, 2007-06-01 at 07:36 -0400, Matthew Saltzman wrote:

I know why their programs failed.  I also know that C uses a pushdown
                                                      ^some particular
                                                       implementations of
stack for variables in subroutines.  You can check it out with a very
simple program using pointers:

   #include <sttlib.h>

   int i,j,k;

   main()
   {
       int mi,mj,mk;
       int *x;
       mi=4;mj=5;mk=6;
       x=&mk;
       printf ("%d  %d  %d\n",*x++,*X++;*X++);
       x=&i;
       printf ("%d  %d  %d\n",*x++,*x++,*x++);
       i-1;j=2;k=3;
       printf ("%d  %d  %d\n",*x++,*x++,*x++);
 )

Just an exercise, you understand.  Compile and run this with several C
packages, or, if the package you choose supports it, have it compile as
K&R C and try it.

Of course, several constructs here are undefined, so there is no such
thing as "correct" or "incorrect" behavior.

After correcting obvious typos and adding #include <stdio.h> so it would
compile, I got (using gcc-4.1.1-51.fc6 with no options):

     $ ./a.out
     5  4  6
     0  0  0
     0  0  0

OOPS, forgot to reset the x pointer between the last two print
statements.  This bit of code is intended to show that globals are on a
heap and locals are on a stack.

Fixed that.  Now I get:

     $ ./a.out
     5  4  6
     0  0  0
     0  2  1

But I confess, I don't see how this code proves your point. It does demonstrate that globals are initialized by default, though.



Was that what you were expecting?



I cannot vouch for every compiler, only Microsoft, Sun, and Instant C
off the top of my head.  I have used a few other packages as well.  But
any really good programmer NEVER relies on system initialization.  It is
destined to fail you at bad times.

How much effort are you willing to expend to defend against potentially
buggy compilers (as opposed to undefined or implementation-defined
behaviors)?  The Intel fdiv bug would seem to prove that you should NEVER
rely on arithmetic instructions to provide the correct answer.  There's an
economic tradeoff between protecting yourself from all conceivable errors
and actually getting work done.


There is a difference between implementation differences and hardware
errors, which is what the Intel error was.  They had a bug in their
silicon compiler that caused that, IIRC.

I could just as easily reference some other obscure compiler bug or implementation-defined behavior and make the same point. The thing about a standard is that there are clear requirements about what is implementation-defined and what is not. Static initialization in ISO C is not one of those implementation-defined things.

I will concede that explicit initializations--even to default values--might be a useful self-documentation tool.


                                    One case is as has been pointed out
here, that NULL is sometimes 0, sometimes 0x80000000, and sometimes
0xffffffff.  Even NULL as a char may be 0xFF, 0xFF00, or 0x8000 depending
on the implementation.  But strings always end in a character NULL or
0x00 for 8 bit ascii, if you use GNU, Microsoft, or Sun C compilers.
They may do otherwise on some others.  It can byte (;-) you if you are
not careful.

In your source code, NULL is *always* written 0 (or sometimes (void *) 0
to indicate that it's intended to stand for a null pointer value, not a
NUL character value).  The string terminator character is *always* written
'\0'.  The machine's representation of that value is immaterial.  If you
type-pun to try to look at the actual machine's representation, your
program's behavior is undefined and you deserve what you get.  It's the
compiler's responsibility to ensure that things work as expected, no
matter what the machine's representation is.  (For example, '\0' == 0 must
return 1.)


'\0' is an escape forcing the 0, so of course this will be equal.

OK. But the main point is that it doesn't matter what bit pattern represents a null pointer. Your source code will always use the value 0 to represent it. For example,

	int *p;
	/* ...code that sets p... */
	if ( p == 0 ) /* *not*  if ( p == 0x80000000 ) or
				if ( p == 0xffffffff ) */
	{ /* ...handle null pointer value... */ }



   And since that is so, how are those variables initialized, and to
what value?  What is a pointer set to when it is initialized?  Hint: on
Cyber the supposed default for assigned pointers used to be the address
of the pointer.  Again, system dependencies may get you.

Pre-ANSI/ISO compilers might have initialized static memory to
all-bits-zero even when that was not the correct representation of the
default for the type being initialized.  ANSI/ISO compilers are not
allowed to do that.  The required default initializations are well
defined.  (This is the sort of thing that motivates the creation of
standards in the first place.)


   And those systems that used the first location to store the return
address are not re-entrant, without other supporting code in the
background.  I think I used one of those once as well.

There's no requirement for re-entrancy in K&R or ANSI/ISO.  In fact
several standard library routines are known to not be re-entrant.


This is true, but knowing that the base code is not reentrant due to
design constraints or due to hardware constraints makes the difference
on modern multithreaded systems, where the same executable memory can be
used for the program (if the hardware allows that).

Sure, you need to know that you can compile re-entrant code if you need it.



   PS.  A stack doesn't necessarily mean a processor call and return
stack.  It is any mechanism of memory address where the data is applied
to the current location, then the pointer incremented (or decremented
depending on the architecture).

But usually in the context of discussions about compiler architectures,
call stacks are exactly what is meant.


I am not sure that is true, because in some implementations, the data
heap and stack are in the same segment of memory, while the runtime
stack for the processor is somewhere else.  For high security systems
running  this should be a requirement.  It prevents obvious means of
inserting malicious code through variable initialization, and then stack
manipulation.  I say should be, because it has been tossed around from
time to time, but I am unsure if it has ever been formalized.

One system I worked on looked like this:
   init jump
   heap
   variable stack (push down)
   program entrance
   program
   local libraries
   relocation table
   symbol table (if not removed)
   machine stack

   Unfortunately I no longer remember which system that was.  I remember
only that some standard libraries at that time would not run on it
because they manipulated the stack.

Regards,
Les H


--
		Matthew Saltzman

Clemson University Math Sciences
mjs AT clemson DOT edu
http://www.math.clemson.edu/~mjs

