I noticed some unexpected behavior when comparing Ruby strings, shown below:

2.3.1 :011 >   '5.6' >= '5.5'
  => true
2.3.1 :012 >   '5.6' >= '5.7'
  => false
2.3.1 :013 >   '5.6' >= '5.6.1'
  => false
2.3.1 :014 >   '5.6' <= '5.6.1'
  => true
2.3.1 :016 >   '4.6.1' <= '5.6'
  => true
2.3.1 :017 >   '4.6.1' >= '5.6'
  => false

I see online in several places people are using Gem::Version.new() to compare semantic versions. That's not what my question is here though. Can anyone explain to me how Ruby seems to be able to compare semantic version strings without the assistance of any library? What happens when I compare two strings with numeric comparison operators?

From the above tests I think I can confirm that it is not simply comparing the ASCII values of the first or last characters of each string. It is also not using string length as the primary comparison, which is what I would have expected.

Thomas Deranek
  • 2
    It's doing a strict string compare; not sure why you think it isn't. You can't exercise the differences until you give it a value where one of the "numbers" is greater than its associated string value, e.g., '11'. Oh he beat me to it. – Dave Newton May 25 '17 at 14:06
  • 2
    No, it is just a string comparison, and here is a counterexample: `'4.11' >= '4.9' #=> false` – spickermann May 25 '17 at 14:06
  • 1
    By the way, it would be helpful if you could explain what *exactly* is unclear about [the documentation](http://ruby-doc.org/core/String.html#method-i-3C-3D-3E), so that the Ruby developers can improve it for future readers. – Jörg W Mittag May 25 '17 at 14:39

2 Answers

It compares the ordinal (character code) of each character, position by position, and stops at the first index where the two strings differ. The higher the ordinal, the "bigger" the character. If one string runs out before any difference is found, the shorter string is the smaller one. Basically, it is something like:

(first_string.chars.map(&:ord) <=> second_string.chars.map(&:ord)) >= 0

As pointed out in the comments, this doesn't lead to natural ordering, which is why people use Gem::Version:

'11' > '9' # => false
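
For contrast, here is a minimal sketch of the same comparisons done through Gem::Version, which the question mentions. It assumes RubyGems is available (it normally is in a standard Ruby install); the variable names are only illustrative.

```ruby
require 'rubygems' # usually loaded already; explicit here as an assumption

# Plain string comparison: '1' (ordinal 49) sorts before '9' (ordinal 57),
# so the result is decided at the very first character.
string_result = '11' > '9'

# Gem::Version compares the numeric segments rather than the characters.
version_result = Gem::Version.new('11') > Gem::Version.new('9')
segment_result = Gem::Version.new('4.11') >= Gem::Version.new('4.9')

puts string_result   # false
puts version_result  # true
puts segment_result  # true
```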
ndnenkov

It IS comparing plain strings.

If one string is a prefix of the other (all of its characters appear, in order, at the start of the longer string), the shorter string is considered less than the longer one.

Otherwise, characters are compared one by one until the character at position "x" of one string differs from the character at position "x" of the other; the string whose character comes earlier in character-code order is considered less than the other.

'cat' < 'caterpillar' 
=> true

'cow' < 'caterpillar' 
=> false
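
The two rules above can be sketched as a small hand-written function. This is only an illustration, not Ruby's actual implementation: `compare_strings` is a hypothetical helper, and it assumes both strings use the same encoding so that comparing bytes matches what `<=>` does.

```ruby
# Hypothetical helper that mimics String#<=> for same-encoding strings.
# Returns -1, 0, or 1, like the built-in operator.
def compare_strings(left, right)
  left_bytes = left.bytes
  right_bytes = right.bytes
  shared_length = [left_bytes.length, right_bytes.length].min

  # Walk the shared positions and stop at the first mismatch.
  shared_length.times do |i|
    difference = left_bytes[i] <=> right_bytes[i]
    return difference unless difference.zero?
  end

  # No mismatch found: one string is a prefix of the other (or they are
  # equal), so the shorter string is the smaller one.
  left_bytes.length <=> right_bytes.length
end

p compare_strings('cat', 'caterpillar') # -1, 'cat' is a prefix
p compare_strings('cow', 'caterpillar') # 1, 'o' comes after 'a'
p compare_strings('5.6', '5.6.1')       # -1, same prefix rule as 'cat'
```

The last call shows why `'5.6' <= '5.6.1'` came out true in the question: it is the prefix rule, not any awareness of version numbers.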

You CANNOT rely on this to compare semantic versions correctly once any version component has more than one digit, so

'5.10' >= '5.9'
=> false

(which is not what one would hope)
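
If you want a numeric comparison without any library, one possible approach is to split each version into integers and compare the resulting arrays. This is a rough sketch, not a full semantic-version parser: `version_segments` is a hypothetical helper, and it assumes every component is a plain decimal number (something like '1.0.0-rc1' would raise an ArgumentError).

```ruby
# Hypothetical helper: turns '5.10' into [5, 10].
# Integer(part, 10) raises ArgumentError for non-numeric components.
def version_segments(version)
  version.split('.').map { |part| Integer(part, 10) }
end

# Array#<=> compares element by element, so 10 vs 9 is decided numerically.
p version_segments('5.10') <=> version_segments('5.9') # 1

# The plain string comparison gets the opposite answer.
p '5.10' <=> '5.9' # -1
```

Note that with this sketch [5, 6] still sorts before [5, 6, 0], whereas Gem::Version treats 5.6 and 5.6.0 as equal, so for real version handling Gem::Version is likely the safer choice.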

SteveTurczyn