I was wondering if anyone knows why the decision was made in Excel (or maybe it was made before Excel and Excel later adopted it) that strings are compared case-insensitively and strings are always considered different than numbers.
Human beings need to know when two things are really equal, not just equivalent. This is a human need, and products and programming languages are enablers for it.
Products and programming languages offer at least strict equality (truly equal by value and data type, i.e. comparison with no transformation), and some also offer non-strict equality (equal by value but not by data type, i.e. equivalence). For example, take two programming languages: JavaScript and Python (both considered dynamically typed languages).
JavaScript
You have == and === (strict equality, i.e. value and type must be the same). The operator == does implicit conversion, i.e. "5" == 5 returns true, but "5" === 5 returns false. Check this link for more information.
Python
You have only ==, which does not convert between strings and numbers; in this case "5" == 5 returns False. Check this link for more information.
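The Python behavior described above can be checked directly. This is a minimal sketch; the variable names are only illustrative. It also shows that == still compares an int and a float by numeric value, so "no conversion" here refers to strings versus numbers.

```python
text_five = "5"
number_five = 5

# str vs int: Python does not try to turn the string into a number.
mixed = text_five == number_five

# Converting explicitly first makes the comparison succeed.
converted = int(text_five) == number_five

# int vs float are both numeric and compare by value.
numeric = 1 == 1.0

print(mixed, converted, numeric)  # False True True
```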
Excel
Equal Operator
Excel compares for equality through a single operator (=), which behaves similarly to Python: the type and the value must be the same (strict equality, i.e. just comparison with no transformation). This behavior ensures that comparisons are performed accurately and consistently based on the data types involved. It helps prevent unintended and potentially erroneous comparisons that could occur if data types were automatically coerced or converted during the comparison process.
These rules uphold fundamental principles of data types and comparisons in programming languages, such as type safety and explicitness, predictability and consistency, and data integrity.
Therefore:
"1" = 1 -> FALSE
0 = FALSE -> FALSE
The above examples are the most common scenarios. You can still compare values of different data types, but you need to do the conversion explicitly, which makes the formula more verbose, since there is no other way to compare. For example:
VALUE("1") = 1 -> TRUE
0 + "1" = 1 -> TRUE
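As a rough illustration of the rules above, here is a Python sketch that models how the = operator treats data types. The helpers excel_kind and excel_equals are hypothetical names made up for this answer, not part of Excel or of any library, and the text branch uses casefold() as an assumed approximation of Excel's case-insensitive text comparison.

```python
def excel_kind(value):
    """Classify a Python value into a rough Excel data type (hypothetical helper)."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "logical"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "text"
    raise TypeError(f"unsupported value: {value!r}")


def excel_equals(left, right):
    """Approximate Excel's = operator: values of different kinds are never equal."""
    if excel_kind(left) != excel_kind(right):
        return False
    if isinstance(left, str):
        # Assumption: casefold() stands in for Excel's case-insensitive comparison.
        return left.casefold() == right.casefold()
    return left == right


print(excel_equals("1", 1))             # like "1" = 1          -> False
print(excel_equals(0, False))           # like 0 = FALSE        -> False
print(excel_equals(float("1"), 1))      # like VALUE("1") = 1   -> True
print(excel_equals(0 + float("1"), 1))  # like 0 + "1" = 1      -> True
```

In the last two calls the conversion is written out with float(), which plays the role of VALUE or of the arithmetic coercion in the formulas.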
Why did Excel make those product decisions? It is difficult to say unless they publish something about it. My guess is based on the most common scenario and the target audience back in 1985: users wanted to know whether two things are equal by type and value, which is a fundamental principle of data types and comparisons in programming languages. If this was the most common scenario, then the decision was to define the equal sign in a way that enforces it. If you don't want this behavior, you end up with a more verbose formula, because you need to convert the value first (as shown before).
Case-insensitive string comparison
Probably the reason back in the 80s was compatibility with existing competitors such as Multiplan (1982) and Lotus 1-2-3 (1983), to make it easy to win over and migrate their users. Earlier programming languages from the 50s and 60s, such as BASIC, FORTRAN, or COBOL, were all case-insensitive too. Another reason is user expectations: at that time Excel's target users were spread across various industries and formed a diverse group, many of whom may not have had programming backgrounds. For such users, case-insensitive comparisons and searches can be more intuitive and aligned with their expectations.
In Python "A" == "a" returns False, but Excel returns TRUE for the equivalent expression. Excel is case-insensitive for text comparison; if you want a case-sensitive comparison, you need to state it explicitly with the EXACT function. For example:
=EXACT("aaa","Aaa") -> FALSE
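To make the contrast concrete, here is a small Python sketch of the two text comparisons. The function names excel_text_equals and excel_exact are hypothetical and only mimic the behavior described here; casefold() is an assumed stand-in for whatever case folding Excel applies internally.

```python
def excel_text_equals(left: str, right: str) -> bool:
    """Mimic Excel's = on text: case is ignored (approximation via casefold)."""
    return left.casefold() == right.casefold()


def excel_exact(left: str, right: str) -> bool:
    """Mimic Excel's EXACT on text: case matters, like Python's own ==."""
    return left == right


print("A" == "a")                     # plain Python comparison -> False
print(excel_text_equals("A", "a"))    # like "A" = "a"          -> True
print(excel_exact("aaa", "Aaa"))      # like EXACT("aaa","Aaa") -> False
```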
It is true that this behavior is not explicitly documented: the documentation does not spell out the meaning of the equal sign (=). Based on the previous explanation, it means same type and value, and for text values the comparison is case-insensitive. ExcelJet states the same: = is not case-sensitive.
The bottom line is that there is no universal practice around this. Excel's product designers made that decision because they thought it was better for their product: more user-friendly, in line with fundamental principles of data types in programming languages, and, in the end, more money.