Operator precedence – what does (i=1)*i-- - --i*(i=-3)*i++ + ++i equal and why?
February 22, 2009
Why do you care?
As with any human language, it’s always best not only to be able to speak the language, but also to understand it. Anyone can copy and paste a bit of code from the internet into their project, just as anyone can "speak" French by using Google Translate. But what happens if for some reason the copied code doesn’t work as you wanted it to? There are obviously a million and three possible reasons, and having more of an understanding of the language really comes into its own in these annoying and stressful situations. That being said, I’m not going to write about any particular language in this post; I’ll be talking about a fictional language, CGCSL (Clinton’s Generic C Style Language), but as the name suggests, the underlying principles will (mostly) be true for any C style language (PHP, JavaScript, Java etc).
What operator precedence means
In its very simplest terms, it means which (mathematical or logical) operators are more important than others. This is a very important concept: without rules as to which operation to do first, computers would be pretty stupid. Take the example of 2 + 3 * 4. As humans, we naturally know (because we were taught at school) to do the multiplication first, then the addition. But without these rules, it would be impossible to know whether the answer is 14 or 20 (because (2 + 3) * 4 is 20, and 2 + (3 * 4) is 14). Luckily, the clever people who invent computer languages define these rules and tell us what they are. The definitions are usually quite long and boring documents, although thankfully for us, there are non-nerds out there who make them a little easier to read. Take a look at the Mozilla JavaScript operator precedence document, for example. These rules help the compiler (or interpreter) to create a parse tree which tells it what to do, and in which order.
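A quick way to see the rule in action is to evaluate the expression with and without explicit brackets. The snippet below is a minimal sketch in JavaScript (picked only because it is one of the C style languages mentioned above; the variable names are made up for illustration).

```javascript
// Multiplication binds tighter than addition, so no brackets means 2 + (3 * 4).
const implicit = 2 + 3 * 4;
// Brackets override precedence and force the addition to happen first.
const additionFirst = (2 + 3) * 4;
// Brackets that match the default precedence change nothing.
const multiplicationFirst = 2 + (3 * 4);

console.log(implicit);            // 14
console.log(additionFirst);       // 20
console.log(multiplicationFirst); // 14
```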
Parse trees
Now we’re getting onto quite complex topics which are studied in degree-level computer science. But the idea is simple: take the code, split it up into small pieces, and decide what order to execute the pieces in. Obviously, for anything other than a tiny and simple piece of code these can get very long and complicated, and so are better represented in a computer data structure than in a human-readable example. If you have any knowledge of data structures, you might have an inkling as to why they’re called trees – it turns out that a simple structure called a tree is very good at storing this kind of data.
Enough waffle, let’s look at an example
So first let’s take a look at making a parse tree for the simple problem 2 + 3 + 4 * 5. We know that the answer is 25, right? But how do we get to it? First, start by breaking the problem down into smaller parts. We can do this by adding some brackets: (2 + 3) + (4 * 5). This is now perfect for a (simple) parse tree: it is broken down into small parts, each with two things to work on. It must be said at this point that parse tree nodes don’t always have to have exactly two operands, but it helps for simplicity.
The algorithm to read (my) parse trees is a depth-first traversal (an in-order one, to be precise). In words: start at the top and keep going down the left hand branches until you reach the bottom (in this case the 2). Go up one level (to the +), then get its second argument from its right hand branch. If that branch is itself an operator, treat it as the root and repeat the steps above until the whole tree is covered. So in my example, the order in which you would read the nodes (numbers or operators) would be: 2 + (from the bottom) 3 + (from the top) 4 * 5. (Pretty cool that they come out in the same order as the original expression, no?)
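The traversal described above can be sketched in a few lines. This is only an illustration in JavaScript, not how any real compiler stores its tree: the node shape (op, left, right) and the function names readInOrder and evaluate are assumptions made up for this example.

```javascript
// A leaf is a plain number; an operator node has an op and two children.
// This is the tree for (2 + 3) + (4 * 5).
const tree = {
  op: "+",
  left: { op: "+", left: 2, right: 3 },
  right: { op: "*", left: 4, right: 5 },
};

// In-order, depth-first walk: left branch, then the node itself, then the right branch.
function readInOrder(node) {
  if (typeof node === "number") {
    return [node];
  }
  return [...readInOrder(node.left), node.op, ...readInOrder(node.right)];
}

// To get the answer, work out both branches first, then apply the operator.
function evaluate(node) {
  if (typeof node === "number") {
    return node;
  }
  const left = evaluate(node.left);
  const right = evaluate(node.right);
  switch (node.op) {
    case "+":
      return left + right;
    case "*":
      return left * right;
    default:
      throw new Error("Unknown operator: " + node.op);
  }
}

console.log(readInOrder(tree).join(" ")); // 2 + 3 + 4 * 5
console.log(evaluate(tree));              // 25
```

Reading the tree and evaluating it are two separate walks here: reading visits the operator between its branches, while evaluating has to finish both branches before the operator can be applied.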
Other useful bits
i++
As you know, this means increment i by 1 (add 1 to i). For the sake of readable code, it should normally only be used inside for, while and do loops. More specifically, it means grab the value of i, then add 1 to i once that value has been read. A CGCSL example:
i=1;
print i++; // this will output 1
print i; // this will output 2
++i
While this looks like i++, it is subtly different. This time, it means add 1 to i and then use the new value. So i++ will return a value 1 lower than ++i. A CGCSL example:
i=1;
print ++i; // this will output 2
print i; // this will also output 2
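If the difference still feels slippery, it can help to spell both operators out long-hand. The sketch below is JavaScript, and the holder object and the helper names postIncrement and preIncrement are hypothetical, invented purely to show the order of "read" and "add 1".

```javascript
// A tiny holder object so the helper functions can change the variable.
const box = { i: 1 };

// Long-hand version of i++: remember the old value, add 1, hand back the old value.
function postIncrement(holder) {
  const oldValue = holder.i;
  holder.i = holder.i + 1;
  return oldValue;
}

// Long-hand version of ++i: add 1 first, then hand back the new value.
function preIncrement(holder) {
  holder.i = holder.i + 1;
  return holder.i;
}

const fromPost = postIncrement(box); // returns 1, box.i is now 2
const fromPre = preIncrement(box);   // returns 3, box.i is now 3
console.log(fromPost, fromPre, box.i); // 1 3 3
```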
(i=1) and (i=-3)
You might not know this, but when you assign a value to a variable, the value of the right hand side of the equals operator is returned. Although you might not know that you know this, you have probably used it. Take a look at the following PHP example (from the PHP opendir page):
if ($handle = opendir('/path/to/files')) {
    // read the documentation on opendir, it states that it will return a directory handle
    // if it's able to open the directory, or false otherwise. So the value of the assignment
    // here will be a handle, or the boolean FALSE.
}
So extending that to (i=1), we can see that this code will: a) set the value of i to be 1, and b) return the value of 1.
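The same behaviour is easy to check in JavaScript. In this minimal sketch, findUser is a hypothetical lookup function (it is not from any library) that stands in for opendir: it returns something truthy on success and null otherwise.

```javascript
let i = 0;
// The assignment happens first, and its value (1) is what ends up in assigned.
const assigned = (i = 1);
console.log(assigned, i); // 1 1

// Hypothetical lookup: returns an object when the name is known, null otherwise.
function findUser(name) {
  return name === "clinton" ? { name: "clinton" } : null;
}

// The value of the assignment is used directly as the condition.
let user;
let greeting = "nobody found";
if ((user = findUser("clinton"))) {
  greeting = "hello " + user.name;
}
console.log(greeting); // hello clinton
```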
Onto the original question
With all of these new tools at our disposal, let’s try to disambiguate the line of code, create a parse tree, and work out the answer. From the list of operator precedences, we know that multiplication is more important than addition, so let’s break the problem down into addition of multiplications. (Sounds scary, but adding in a few parentheses will make it easier to see).
// the original code
(i=1)*i-- - --i*(i=-3)*i++ + ++i
// new code with extra brackets
((i=1)*i--) - (--i*(i=-3)*i++) + (++i)
Now let’s understand it piece by piece, find the answers, then add them up.
(i=1)*i--
The code inside the brackets sets i to be 1, then returns the value of 1. On the right of the multiplication, we grab the value of i (which is 1), then subtract 1 from i. So after this line of code, we know that:
- The value of i after this snippet is 0
- The value returned from this snippet is 1 (from 1 * 1)
--i*(i=-3)*i++
We know that --i takes away 1, then returns the value. Since i was 0 after the previous snippet, both the value of i and the first number we’re multiplying are now -1. Next, we set i to be -3, and the bracketed code returns a value of -3, so both i and the second number to multiply are -3. Finally, i++ returns the value of i before incrementing it by 1, so the final number for the multiplication is -3, and i is -2. Again, we know that:
- The value of i after this snippet is -2
- The value returned from this snippet is -9 (from -1 * -3 * -3)
++i
And finally, the simplest part of the code: add 1 to i (making it -1) and return the new value. So we know that:
- The value of i after this snippet is -1 (which we now don’t care about)
- The value returned from this snippet is -1
Add them up
After executing all of the sub-parts of the calculation, and substituting the answers into the brackets instead of the horrific looking expressions, the final code would look something like the following snippet, which I’m sure we can all agree equals 9.
(1) - (-9) + (-1)
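You don’t have to take my word for it. The sketch below runs the whole expression in JavaScript, first in one go and then as the three pieces worked through above (the names first, second and third are just labels for this example). It relies on operands being evaluated strictly left to right, which JavaScript and Java guarantee; C and C++ do not define the result of an expression that modifies i several times like this, so there you cannot rely on getting 9.

```javascript
// The original one-liner.
let i;
const result = (i = 1) * i-- - --i * (i = -3) * i++ + ++i;
console.log(result, i); // 9 -1

// The same calculation, one bracketed piece at a time.
let j;
const first = (j = 1) * j--;         // 1 * 1 = 1, j is now 0
const second = --j * (j = -3) * j++; // -1 * -3 * -3 = -9, j is now -2
const third = ++j;                   // -1, j is now -1
const stepByStep = first - second + third; // 1 - (-9) + (-1)
console.log(stepByStep); // 9
```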
Avert your eyes!
I have put together a parse tree for this rather nasty line of code, but I must warn you, it may well turn you into stone if you look at it from the wrong angle. You might notice that ++i and its friends are missing; this is because I have moved them around into appropriate places in the tree as an i+1 operation.
Final thoughts
As with a number of my posts, I want to finish with the classic "Just because you can, doesn’t mean you should!". This post was intended as a rather flouncy and creative way to explain operator precedence and why it is important. Just imagine the scenario: you’re maintaining a piece of code where the developer decided to show off their sKiLLz and turn a multi-line if/else into a one liner. And we’re not talking a comedy genius one liner here – programming one liners hurt!
And I include you and me in the term ‘the developer’. I’ve lost count of the number of times I’ve gone back to a ‘clever piece of code’ and had no idea what it was doing. One of the many things which I’ve learned during my programming years (almost a decade – that’s scary!) is that clever code isn’t clever. It might seem clever at the time, but real clever coding is writing more (and commented!) code which will make it easier for you to read and understand in a year’s time. One of my favourite programming quotes is:
“Programs must be written for people to read, and only incidentally for machines to execute.”
Harold Abelson (Structure and Interpretation of Computer Programs, Second Edition)
The interpreters and compilers of today are extremely clever and heavily optimise your code before running it, so this kind of fiddling with the code rarely makes any measurable difference. Of course, refactoring your algorithms will help and should be done, but using language constructs in obscure and obscene ways to try to optimise your code isn’t necessary.