Subject: Re: [edk2] Common, pedantic, bug in embedded C. * (char *)0x00000000 = 1

From: "Mcdaniel, Daryl" <daryl.mcdaniel@intel.com>

To: "edk2-devel@lists.sourceforge.net" <edk2-devel@lists.sourceforge.net>

Date: 2013-03-09 03:58:42

A compiler that optimizes out references to NULL is doing the wrong thing.

The ISO/IEC 9899:199409 (C95) language specification says nothing about using volatile with NULL.

The language specification does NOT prohibit reading from, writing to, or executing a function that is located at, address 0.
It does prohibit the compiler from placing data or functions at that location.

It sounds like clang is broken if it doesn't have an option to disable the behavior you are reporting.
Changing references through a NULL pointer to a trap would be a nonstandard extension.

Daryl McDaniel

-----Original Message-----
From: Andrew Fish [mailto:afish@apple.com] 
Sent: Friday, March 08, 2013 11:37 AM
To: edk2-devel@lists.sourceforge.net
Subject: Re: [edk2] Common, pedantic, bug in embedded C. * (char *)0x00000000 = 1


On Mar 8, 2013, at 10:12 AM, "Mcdaniel, Daryl" wrote:

> A multi-platform compiler should not interfere with dereferencing NULL.
> No general purpose (cross) compiler for X86 should interfere with dereferencing NULL.

The point I was making is that dereferencing NULL is undefined behavior as far as the C standard is concerned, and compilers are free to optimize undefined behavior away. To dereference NULL in C you must use a volatile pointer to conform to the language definition. 
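
A minimal sketch of the volatile access described above. The helper names read8_volatile and write8_volatile are hypothetical and not part of edk2. The address is passed as a parameter so the same code can be exercised against ordinary memory:

```c
#include <stdint.h>

/* Hypothetical helpers, not edk2 APIs. The volatile qualifier marks each
   access as a side effect, so the compiler is expected to emit the load
   or store as written instead of optimizing it away. */
static uint8_t read8_volatile(uintptr_t address)
{
    return *(volatile uint8_t *)address;
}

static void write8_volatile(uintptr_t address, uint8_t value)
{
    *(volatile uint8_t *)address = value;
}
```

On a platform where address 0 is real memory, write8_volatile(0, 1) would correspond to the store in the subject line.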

It looks like clang and GCC will optimize out reads of a known NULL pointer. For clang, a store or call through a null pointer is converted to a __builtin_trap() (the X86 ud2 instruction), and this is an implementation choice to help programmers remove undefined behavior from their code. For edk2 we tell clang to replace the ud2 with a call to a function that does not exist, so we get a link failure in this case. But a read from NULL will just get optimized out!

Here is a simple example:

~/work/Compiler>cat null.c

int test ()
{
   int *x = (int *)0;
   int foo = *x;
   if (!x)
       return -1;

   return foo;
}

~/work/Compiler>clang -S -Os null.c -arch i386
~/work/Compiler>cat null.s
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test
_test:                                  ## @test
## BB#0:
	pushl	%ebp
	movl	%esp, %ebp
	movl	$-1, %eax
	popl	%ebp
	ret


.subsections_via_symbols

Paolo mentioned that this is also true for gcc:

GCC will just optimize the code as it sees fit.  As David pointed out, it means for example that (without the above option):

   int foo = *x;
   if (!x)
       return ERROR;

   return foo;

will be optimized to simply "return *x".
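
For contrast, a sketch of the same fragment with the check moved ahead of the dereference, so the compiler has no earlier access from which to infer that x is non-null. The function name read_checked and the value of ERROR are assumptions made for illustration:

```c
#include <stddef.h>

#define ERROR (-1)  /* assumed value; the fragment above leaves ERROR undefined */

/* Test the pointer before the first dereference. Because no access
   precedes the test, the compiler cannot treat the test as dead code. */
int read_checked(const int *x)
{
    if (x == NULL)
        return ERROR;

    return *x;
}
```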

Thanks,

Andrew Fish

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/edk2-devel
