Diffstat (limited to 'cil/doc/cil016.html')
-rw-r--r-- | cil/doc/cil016.html | 342 |
1 files changed, 0 insertions, 342 deletions
diff --git a/cil/doc/cil016.html b/cil/doc/cil016.html deleted file mode 100644 index 3191a9d..0000000 --- a/cil/doc/cil016.html +++ /dev/null @@ -1,342 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" - "http://www.w3.org/TR/REC-html40/loose.dtd"> -<HTML> -<HEAD> - - - -<META http-equiv="Content-Type" content="text/html; charset=ANSI_X3.4-1968"> -<META name="GENERATOR" content="hevea 1.08"> - -<base target="main"> -<script language="JavaScript"> -<!-- Begin -function loadTop(url) { - parent.location.href= url; -} -// --> -</script> -<LINK rel="stylesheet" type="text/css" href="cil.css"> -<TITLE> -Who Says C is Simple? -</TITLE> -</HEAD> -<BODY > -<A HREF="cil015.html"><IMG SRC ="previous_motif.gif" ALT="Previous"></A> -<A HREF="ciltoc.html"><IMG SRC ="contents_motif.gif" ALT="Up"></A> -<A HREF="cil017.html"><IMG SRC ="next_motif.gif" ALT="Next"></A> -<HR> - -<H2 CLASS="section"><A NAME="htoc42">16</A> Who Says C is Simple?</H2><A NAME="sec-simplec"></A> -When I (George) started to write CIL I thought it was going to take two weeks. -Exactly a year has passed since then and I am still fixing bugs in it. This -gross underestimate was due to the fact that I thought parsing and making -sense of C was simple. You probably think the same. What I did not expect was -how many dark corners this language has, especially if you want to parse -real-world programs such as those written for GCC or if you are more ambitious -and you want to parse the Linux or Windows NT sources (both of these were -written without any respect for the standard and with the expectation that -compilers will be changed to accommodate the program). <BR> -<BR> -The following examples were actually encountered either in real programs or -are taken from the ISO C99 standard or from GCC's test cases. My first -reaction when I saw these was: <EM>Is this C?</EM>. The second one was: <EM>What the hell does it mean?</EM>. 
<BR> -<BR> -If you are contemplating doing program analysis for C on abstract-syntax -trees then your analysis ought to be able to handle these things. Or, you can -use CIL and let CIL translate them into clean C code. <BR> -<BR> -<A NAME="toc24"></A> -<H3 CLASS="subsection"><A NAME="htoc43">16.1</A> Standard C</H3> -<OL CLASS="enumerate" type=1><LI CLASS="li-enumerate">Why does the following code return 0 for most values of <TT>x</TT>? (This -should be easy.) -<PRE CLASS="verbatim"><FONT COLOR=blue> - int x; - return x == (1 && x); -</FONT></PRE> -See the <A HREF="examples/ex30.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Why does the following code return 0 and not -1? (Answer: because -<TT>sizeof</TT> yields an unsigned value, thus the result of the subtraction is unsigned, thus -the shift is logical.) -<PRE CLASS="verbatim"><FONT COLOR=blue> - return ((1 - sizeof(int)) >> 32); -</FONT></PRE> -See the <A HREF="examples/ex31.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Scoping rules can be tricky. This function returns 5. -<PRE CLASS="verbatim"><FONT COLOR=blue> -int x = 5; -int f() { - int x = 3; - { - extern int x; - return x; - } -} -</FONT></PRE> -See the <A HREF="examples/ex32.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Functions and function pointers are implicitly converted to each other. -<PRE CLASS="verbatim"><FONT COLOR=blue> -int (*pf)(void); -int f(void) { - - pf = &f; // This looks ok - pf = ***f; // Dereference a function? - pf(); // Invoke a function pointer? - (****pf)(); // Looks strange but Ok - (***************f)(); // Also Ok -} -</FONT></PRE> -See the <A HREF="examples/ex33.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Initializers with designators are one of the hardest parts of ISO C. -Neither MSVC nor GCC implements them fully. GCC comes close, though. 
What are the -final values of <TT>i.nested.y</TT> and <TT>i.nested.z</TT>? (Answer: 2 and 6, -respectively.) -<PRE CLASS="verbatim"><FONT COLOR=blue> -struct { - int x; - struct { - int y, z; - } nested; -} i = { .nested.y = 5, 6, .x = 1, 2 }; -</FONT></PRE> -See the <A HREF="examples/ex34.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">This is from c-torture. This function returns 1. -<PRE CLASS="verbatim"><FONT COLOR=blue> -typedef struct -{ - char *key; - char *value; -} T1; - -typedef struct -{ - long type; - char *value; -} T3; - -T1 a[] = -{ - { - "", - ((char *)&((T3) {1, (char *) 1})) - } -}; -int main() { - T3 *pt3 = (T3*)a[0].value; - return pt3->value; -} -</FONT></PRE> -See the <A HREF="examples/ex35.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Another one with constructed literals. This one is legal according to -the GCC documentation but somehow GCC chokes on it (it works in CIL, though). This -code returns 2. -<PRE CLASS="verbatim"><FONT COLOR=blue> - return ((int []){1,2,3,4})[1]; -</FONT></PRE> -See the <A HREF="examples/ex36.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">In the example below there is one copy of “bar” and two copies of - “pbar” (static prototypes at block scope have file scope, while for all - other types they have block scope). -<PRE CLASS="verbatim"><FONT COLOR=blue> - int foo() { - static bar(); - static (*pbar)() = bar; - - } - - static bar() { - return 1; - } - - static (*pbar)() = 0; -</FONT></PRE> -See the <A HREF="examples/ex37.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">After two years of heavy use of CIL, by us and others, I discovered a bug - in the parser. 
The return value of the following function depends on what - precedence you give to casts and unary minus: -<PRE CLASS="verbatim"><FONT COLOR=blue> - unsigned long foo() { - return (unsigned long) - 1 / 8; - } -</FONT></PRE> -See the <A HREF="examples/ex38.txt">CIL output</A> for this -code fragment<BR> -<BR> -The correct interpretation is <TT>((unsigned long) - 1) / 8</TT>, which is a - relatively large number, as opposed to <TT>(unsigned long) (- 1 / 8)</TT>, which - is 0. </OL> -<A NAME="toc25"></A> -<H3 CLASS="subsection"><A NAME="htoc44">16.2</A> GCC ugliness</H3><A NAME="sec-ugly-gcc"></A> -<OL CLASS="enumerate" type=1><LI CLASS="li-enumerate">GCC has generalized lvalues. You can take the address of a lot of -strange things: -<PRE CLASS="verbatim"><FONT COLOR=blue> - int x, y, z; - return &(x ? y : z) - & (x++, x); -</FONT></PRE> -See the <A HREF="examples/ex39.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">GCC lets you omit the second component of a conditional expression. -<PRE CLASS="verbatim"><FONT COLOR=blue> - extern int f(); - return f() ? : -1; // Returns the result of f unless it is 0 -</FONT></PRE> -See the <A HREF="examples/ex40.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">Computed jumps can be tricky. CIL compiles them away in a fairly clean -way but you are on your own if you try to jump into another function this way. 
-<PRE CLASS="verbatim"><FONT COLOR=blue> -static void *jtab[2]; // A jump table -static int doit(int x){ - - static int jtab_init = 0; - if(!jtab_init) { // Initialize the jump table - jtab[0] = &&lbl1; - jtab[1] = &&lbl2; - jtab_init = 1; - } - goto *jtab[x]; // Jump through the table -lbl1: - return 0; -lbl2: - return 1; -} - -int main(void){ - if (doit(0) != 0) exit(1); - if (doit(1) != 1) exit(1); - exit(0); -} -</FONT></PRE> -See the <A HREF="examples/ex41.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">A cute little example that we made up. What is the returned value? -(Answer: 1.) -<PRE CLASS="verbatim"><FONT COLOR=blue> - return ({goto L; 0;}) && ({L: 5;}); -</FONT></PRE> -See the <A HREF="examples/ex42.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate"><TT>extern inline</TT> is a strange feature of GNU C. Can you guess what the -following code computes? -<PRE CLASS="verbatim"><FONT COLOR=blue> -extern inline foo(void) { return 1; } -int firstuse(void) { return foo(); } - -// A second, incompatible definition of foo -int foo(void) { return 2; } - -int main() { - return foo() + firstuse(); -} -</FONT></PRE> -See the <A HREF="examples/ex43.txt">CIL output</A> for this -code fragment<BR> -<BR> -The answer depends on whether the optimizations are turned on. If they are -then the answer is 3 (the first definition is inlined at all occurrences until -the second definition). If the optimizations are off, then the first -definition is ignored (treated like a prototype) and the answer is 4. 
<BR> -<BR> -CIL will misbehave on this example, if the optimizations are turned off (it - always returns 3).<BR> -<BR> -<LI CLASS="li-enumerate">GCC allows you to cast an object of a type T into a union as long as the -union has a field of that type: -<PRE CLASS="verbatim"><FONT COLOR=blue> -union u { - int i; - struct s { - int i1, i2; - } s; -}; - -union u x = (union u)6; - -int main() { - struct s y = {1, 2}; - union u z = (union u)y; -} -</FONT></PRE> -See the <A HREF="examples/ex44.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">GCC allows you to use the <TT>__mode__</TT> attribute to specify the size -of the integer instead of the standard <TT>char</TT>, <TT>short</TT> and so on: -<PRE CLASS="verbatim"><FONT COLOR=blue> -int __attribute__ ((__mode__ ( __QI__ ))) i8; -int __attribute__ ((__mode__ ( __HI__ ))) i16; -int __attribute__ ((__mode__ ( __SI__ ))) i32; -int __attribute__ ((__mode__ ( __DI__ ))) i64; -</FONT></PRE> -See the <A HREF="examples/ex45.txt">CIL output</A> for this -code fragment<BR> -<BR> -<LI CLASS="li-enumerate">The “alias” attribute on a function declaration tells the - linker to treat this declaration as another name for the specified - function. CIL will replace the declaration with a trampoline - function pointing to the specified target. -<PRE CLASS="verbatim"><FONT COLOR=blue> - static int bar(int x, char y) { - return x + y; - } - - //foo is considered another name for bar. - int foo(int x, char y) __attribute__((alias("bar"))); -</FONT></PRE> -See the <A HREF="examples/ex46.txt">CIL output</A> for this -code fragment</OL> -<A NAME="toc26"></A> -<H3 CLASS="subsection"><A NAME="htoc45">16.3</A> Microsoft VC ugliness</H3> -This compiler has few extensions, so there is not much to say here. -<OL CLASS="enumerate" type=1><LI CLASS="li-enumerate"> -Why does the following code return 0 and not -1? (Answer: because of a -bug in Microsoft Visual C. 
It thinks that the shift is unsigned just because -the second operand is unsigned. CIL reproduces this bug when in MSVC mode.) -<PRE CLASS="verbatim"><FONT COLOR=blue> - return -3 >> (8 * sizeof(int)); -</FONT></PRE><BR> -<BR> -<LI CLASS="li-enumerate">Unnamed fields in a structure seem really strange at first. It seems -that Microsoft Visual C introduced this extension, then GCC picked it up (but -in the process implemented it wrongly: in GCC the field <TT>y</TT> overlaps with -<TT>x</TT>!). -<PRE CLASS="verbatim"><FONT COLOR=blue> -struct { - int x; - struct { - int y, z; - struct { - int u, v; - }; - }; -} a; -return a.x + a.y + a.z + a.u + a.v; -</FONT></PRE> -See the <A HREF="examples/ex47.txt">CIL output</A> for this -code fragment</OL> -<HR> -<A HREF="cil015.html"><IMG SRC ="previous_motif.gif" ALT="Previous"></A> -<A HREF="ciltoc.html"><IMG SRC ="contents_motif.gif" ALT="Up"></A> -<A HREF="cil017.html"><IMG SRC ="next_motif.gif" ALT="Next"></A> -</BODY> -</HTML>