I recently read a funny remark about ‘null’ comparison that seems surprising at first glance but becomes understandable upon further investigation. The following program demonstrates it:
static void Main(string[] args)
{
    var lessThanOrEqual = null <= null;
    var equal = null == null;
    Console.WriteLine(lessThanOrEqual);
    Console.WriteLine(equal);
}
The output of the program:
False
True
How can the ‘less than or equal’ operator yield false when it has the same value on both sides? And why does the equality operator behave as expected? What is the difference? This is what we are going to decipher below.
null <= null
The trick is that this expression is not what it appears to be. This is especially misleading for those with a C++ background: the expression above compares neither references nor pointers.
Comparison operators
The C# language has several comparison operators, and <= is one of them. We can learn how these operators work from the C# Language Specification (14.9 Relational and type-testing operators). Briefly, in the case of x <= y, the result is true when x is less than or equal to y; otherwise the result is false. The definition is not surprising, but it contradicts the output above. We expect 'a <= a' to be true for every 'a'; in mathematics this property of ordered sets is called reflexivity.
The specification also describes that the operator <= has many predefined versions. For example for integer types:
bool operator <=(int x, int y);
bool operator <=(uint x, uint y);
bool operator <=(long x, long y);
...
floating point types:
bool operator <=(float x, float y);
bool operator <=(double x, double y);
and for decimal and enum types.
What is null?
So far we have not seen why <= compiles with null at all. We know there are overloads for integer, floating point, decimal and enumerated types, but which of them fits null? What is null anyway?
In C#, the literal ‘null’ evaluates to the ‘null value’, which can denote either a reference that does not point to any object or the absence of a value. The second case may sound strange, but it becomes clear once we consider nullable types.
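The two meanings of null can be seen side by side in a small sketch. The class and method names below are hypothetical and only serve the illustration.

```csharp
using System;

// Hypothetical helper type, not part of the original article.
static class NullMeanings
{
    // For a reference type, null means "this reference points to no object".
    public static bool IsMissingReference(string text) => text == null;

    // For a nullable value type, null means "there is no value at all".
    public static bool IsMissingValue(int? number) => !number.HasValue;
}
```

Calling IsMissingReference(null) and IsMissingValue(null) both return true, although the first null is a reference and the second is an int? without a value.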
Many variations, fewer operators
We have seen a list of overloads of the <= operator in the C# Language Specification. One common attribute of these overloads is that both parameters have the same type. On the other hand, we know that the comparison operators work even with arguments of mixed types:
if (12M < 34L)
{
…
}
In this case a decimal and a long value are compared. This works because of implicit conversions, which most developers know. C# defines many implicit conversion rules (specification, 13.1.2), usually from a narrower type to a broader one. Which conversion is applied to the operands of a binary operator is governed by the rules in section 14.2.6.2 of the C# specification:
- If either operand is of type decimal, the other operand is converted to type decimal
- Otherwise, if either operand is of type double, the other operand is converted to type double.
…
- Otherwise, if either operand is of type uint, the other operand is converted to type uint.
- Otherwise, both operands are converted to type int.
The most important one in this list is the last bullet point. It means that when none of the earlier rules applies, both operands are converted to int.
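The promotion rules listed above can be observed by asking for the runtime type of a mixed-type expression. This is a minimal sketch: the class and method names are hypothetical, addition is used instead of <= only because its result type makes the promoted operand type visible, and the int/long case relies on the long rule that sits in the elided part of the list.

```csharp
using System;

// Hypothetical helper type, not part of the original article.
static class PromotionDemo
{
    // decimal with long: the long operand is converted to decimal.
    public static Type DecimalWithLong(decimal x, long y) => (x + y).GetType();

    // int with long: the int operand is converted to long.
    public static Type IntWithLong(int x, long y) => (x + y).GetType();

    // short with byte: no earlier rule applies, so both become int.
    public static Type ShortWithByte(short x, byte y) => (x + y).GetType();

    // The comparison from the text, a decimal compared with a long.
    public static bool MixedComparison() => 12M < 34L;
}
```

ShortWithByte shows the last bullet point at work: neither operand is an int, yet the operation is carried out on int values.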
int vs. null
Now our problem is to convert null to int. This is obviously not possible, but there is a loophole that takes us closer to our goal: the nullable types (Specification 8.19). The specification tells us that there is an implicit conversion from the null type to every nullable type. So we cannot convert null to int, but we can convert it to Nullable<int>, that is, int?. What a pity that int? is not what we wanted. Unless…
“Lifted” operators
To make nullable types first-class citizens of the language, C# introduces many sophisticated tricks. One of those tricks is the mechanism of lifted operators. If an operator has the form op(T x, T y), then it also gets a lifted form op(T? x, T? y). The lifted form is a simple extension of the original operator that also defines the result when x, y, or both are null. For operator <= the definition says that if one or both operands are null, then the result is false. (Specification 14.2.7)
What happened then?
Now we understand the result of null <= null. The C# compiler found operator <= with two null operands. Following its rules, it started searching for an appropriate overload but could not find any for the null type. So it converted the operands to int? and chose the operator <=(int? x, int? y) overload. According to the definition of the lifted operator, the result is False.
But why False?
The process is clear now, but we might still ask why null <= null is defined as False when it has the same value on both sides. The assumption that both sides have the same value is wrong. According to the C# specification, null in this case denotes the lack of a value, so there is nothing to compare. There are similar cases elsewhere: NaN behaves the same way. If that is still not convincing, look at the following example:
static Dictionary<string, int> statistics = new Dictionary<string, int>();

static int? GetScoreOf(string thing)
{
    int score;
    if (statistics.TryGetValue(thing, out score))
    {
        return score;
    }
    else
    {
        return null;
    }
}

static void Main(string[] args)
{
    var scoreOfMine = GetScoreOf("Me");
    var scoreOfYours = GetScoreOf("You");
    if (scoreOfYours <= scoreOfMine)
    {
        Console.WriteLine("I am at least as cool as you are!");
    }
}
The tiny program above has a small database of scores. If it cannot find the data, it returns null. The main section of the program gets “my” and “your” scores and compares them. What is the right behavior if both scores are missing? Should the program declare that I am at least as cool as you are? I think not, because there is no data to support that claim. It is also worth noting that it would be wrong to use an else block for the opposite conclusion: we cannot say that ‘I am less cool than you are’ either.
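The NaN analogy can be checked with a short sketch (the names are hypothetical). Note that the analogy is only partial: for NaN even == yields false, whereas null == null is true.

```csharp
using System;

// Hypothetical helper type, not part of the original article.
static class NaNComparison
{
    // Ordinary double comparison; false whenever either operand is NaN.
    public static bool LessOrEqual(double x, double y) => x <= y;

    // Ordinary double equality; also false whenever either operand is NaN.
    public static bool AreEqual(double x, double y) => x == y;
}
```

LessOrEqual(double.NaN, double.NaN) and AreEqual(double.NaN, double.NaN) both return false, while LessOrEqual(1.0, 1.0) returns true.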
null == null
By now, most readers might accept the behavior of null <= null in C#. But it seemingly contradicts the result of null == null, which is True. What is behind this?
If we read the C# specification carefully, we find the following in section 14.9:
If both operands of an equality-expression (which is == or !=) have the null type, then the overload resolution is not performed (so the compiler does not start to look for an overload of the operator==) and the expression evaluates to a constant value of true or false according to whether the operator is == or !=.
It means that ‘null == null’ is nothing more than a tricky way to express the boolean true constant. The question remains: why have the C# designers chosen this behavior? The previous example can be applied to the operator==, too:
static void Main(string[] args)
{
    var scoreOfMine = GetScoreOf("Me");
    var scoreOfYours = GetScoreOf("You");
    if (scoreOfYours == scoreOfMine)
    {
        Console.WriteLine("I am as cool as you are");
    }
}
In this case the text appears even if neither of our scores can be retrieved from the database, that is, when we have no information about our “coolness”. From this point of view C# behaves incorrectly. But the designers of C# had to consider other aspects as well. Let’s look at the following snippet:
if (scoreOfMine == null)
{
    Console.WriteLine("I do not know how cool I am");
}
Does it look wrong? Not to me. We were trying to argue that the expression in the if statement should be false, but in this example it works the way many developers expect: the value of the expression is true if scoreOfMine has no value. In an ideal world, every developer would have written the following:
if (!scoreOfMine.HasValue)
{
    Console.WriteLine("I do not know how cool I am");
}
In such a world, the value of null == null could have been False and it would not have been a problem for anyone. But because of reference types and because of other programming languages, most developers expect the expression scoreOfMine == null to be true when scoreOfMine has no value. Therefore, the designers of C# had to choose between a well-shaped world and a world that works. Apparently, they chose the world that works, at the price that this world is a bit malformed around the operators == and <=.
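The trade-off can be summarized in a small sketch with hypothetical names: the two ways of testing for a missing score agree, and the price is that two missing scores compare as equal.

```csharp
using System;

// Hypothetical helper type, not part of the original article.
static class MissingScore
{
    // The form most developers write.
    public static bool ByEquality(int? score) => score == null;

    // The form an "ideal world" would require.
    public static bool ByHasValue(int? score) => !score.HasValue;

    // Equality on int?: true when both scores are missing.
    public static bool SameScore(int? mine, int? yours) => mine == yours;
}
```

ByEquality and ByHasValue return the same result for every input, and SameScore(null, null) returns true.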
Conclusion
We investigated a strange part of C#. If we look at it from a mathematical perspective, we might deem it wrong. But C# is first of all a tool used in industry to make money. Its goal is not to be as well-rounded as a mathematical theory. Its goal is to fit the hand like a well-designed hammer. To accomplish this, it should work as developers expect in common use cases, even if that leads to contradictions in special cases.
…it should generate a compiler warning, anyway…
Original article in Hungarian: A C# védelmében (null <= null)