C (for example) has five kinds of looping constructs: for
, while
, do while
, goto
, and recursion. I'll skip the last two, since they lack bounds check (the programmer has to insert a stop). I used to write assembly that would implement loops for each of the first three constructs like the following:
for | while | do while |
---|---|---|
|
|
|
|
|
|
|
|
|
Along the way a break
would get translated as a goto L_done
and a continue
would get translated as a goto L_done
.
This seemed pretty straight forward. On most platforms there is an instruction for each of these steps, including a branch on negative condition. And I'd see compilers that would implement loops this way. But then I found a compiler that did something strange. It always put the loop condition at the bottom:
for | while | do while |
---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
When I first encountered this I was baffled. This compiler was adding extra jumps. (On the smallest of microcontrollers a jump is the most expensive operation). This seemed wrong.
And it was wrong - for that compiler. That compiler did produce inefficient checks and branches in that version. This is critical in a microcontroller, where you often have to a pretty predictable number of clock cycles between any two steps of code of operation. (The most stringent you need this timing, the more stripped down your code is, to the point, at the extreme, you drop down just write assembly).
But I found that this style of constructing the loops in other compilers. It was there for a reason: it simplified the design of the compiler, and made it easier to generate efficient code for more platforms.
First, it turned each of the three kind of loops into one kind of construct. This makes it easier on the code generator, both for the loop, and for the termination check. It also makes it a lot easier on the optimizer, which has to go thru and look for special cases that should be stripped out. For instance, it is a bit easier now to write a check that the loop is infinite, or runs only once, and remove the extra jumps.
Second, it made all of the loop condition checks consistent - the way I was expecting would negate the check in 2 of the three cases. Not all processors have a conditional branch that jumps on the negative case - or atleast does so efficiently. This consistency makes it more likely that implementing equivalent loops in one form or another will have similar behavior.