I have a Ruby function that tells me whether two strings are "almost" equal, that is, whether the characters in both strings are identical and in the same order except for one character. For instance, these are almost equal

equal
eual

but these are not

eal
equal

(two characters are missing in the above). So with help, I have come up with this

(lcs(a,b) == shortest && longest.length - shortest.length == 1)

in which lcs is defined by

  def lcs(xstr, ystr)
    return "" if xstr.empty? || ystr.empty?

    x, xs, y, ys = xstr[0..0], xstr[1..-1], ystr[0..0], ystr[1..-1]
    if x == y
      x + lcs(xs, ys)
    else
      [lcs(xstr, ys), lcs(xs, ystr)].max_by {|x| x.size}
    end
  end

but my function is taking an extraordinarily long time. Note my benchmark below

2.4.0 :011 > timing = Benchmark.measure { StringHelper.lcs("navesxkolsky|1227000", "navsxkolsky|1227000") }
 => #<Benchmark::Tms:0x007fa1753830d8 @label="", @real=21.341279999993276, @cstime=0.0, @cutime=0.0, @stime=0.030000000000000027, @utime=21.28, @total=21.310000000000002>

Is there something I'm missing here that can get my comparison time down to like one second instead of 21?

  • Maybe Levenshtein distance fits your need: https://en.wikipedia.org/wiki/Levenshtein_distance Ruby code here: https://stackoverflow.com/questions/46402903/levenshtein-distance-in-ruby/46410685#46410685 – Sofa Sep 28 '17 at 00:00
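
For reference, the Levenshtein idea mentioned in the comment above can be sketched roughly as follows. This is a minimal dynamic-programming version that keeps only one previous row; the method name levenshtein is an assumption made here for illustration and is not taken from the linked answer. Note that a distance of 1 also covers a single substitution (e.g. "cat" vs "cut"), not just a single missing character.

```ruby
# Hypothetical helper: classic dynamic-programming edit distance.
# Runs in O(a.size * b.size) time and keeps only the previous row.
def levenshtein(a, b)
  # Distance from the empty prefix of a to each prefix of b.
  prev = (0..b.size).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i] # distance from the first i chars of a to the empty string
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1,            # delete ca
               curr[j - 1] + 1,        # insert cb
               prev[j - 1] + cost].min # keep or substitute
    end
    prev = curr
  end
  prev.last
end

levenshtein("equal", "eual") #=> 1
levenshtein("eal", "equal")  #=> 2
```

With this, "almost equal" in the edit-distance sense would be levenshtein(a, b) == 1.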

1 Answer

Try this. The main idea is that if the method is to return false, it will do so as soon as that is known, even if redundant code is required. (The method below still works if the line "return false if (sz1-sz2).abs > 1" is removed.)

def equal_but_one?(str1, str2)
  sz1 = str1.size
  sz2 = str2.size
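  # Strings whose lengths differ by more than one cannot qualify.
  return false if (sz1-sz2).abs > 1
  # Index of the first position at which the strings differ (nil if identical).
  i = [sz1, sz2].max.times.find { |k| str1[k] != str2[k] }
  return false if i.nil?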
  case sz1 <=> sz2
  when 0
    str1[i+1..-1] == str2[i+1..-1]
  when -1
    str2[i+1..-1] == str1[i..-1]
  when 1
    str1[i+1..-1] == str2[i..-1]
  end
end

equal_but_one?('cat', 'cut')     #=> true
equal_but_one?('bates', 'bats')  #=> true
equal_but_one?('buss', 'bus')    #=> true
equal_but_one?('cat', 'cat')     #=> false
equal_but_one?('pig', 'pigs')    #=> true 
equal_but_one?('pig', 'pegs')    #=> false
equal_but_one?('', '')           #=> false
equal_but_one?('', 'a')          #=> true

require 'benchmark'

Benchmark.measure { equal_but_one?("navesxkolsky|1227000", "navsxkolsky|1227000") }.real
  #=> 1.6000005416572094e-05
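
The method above also treats a single substituted character ('cat' vs 'cut') as a match. If only a single missing character should count, which is what the question's expression lcs(a,b) == shortest && longest.length - shortest.length == 1 appears to test, a minimal sketch could look like the following. The method name one_deletion_apart? is hypothetical, and the deletion-only reading of the question is an assumption.

```ruby
# Hypothetical helper: true only when the shorter string equals the longer
# one with exactly one character removed (substitutions do not count).
def one_deletion_apart?(a, b)
  shorter, longer = [a, b].sort_by(&:size)
  return false unless longer.size - shorter.size == 1

  # Index of the first mismatch; if there is none, the extra character
  # is the last character of the longer string.
  i = shorter.size.times.find { |k| shorter[k] != longer[k] } || shorter.size
  # Skip one character of the longer string; the remainders must match.
  shorter[i..-1] == longer[(i + 1)..-1]
end

one_deletion_apart?('equal', 'eual') #=> true
one_deletion_apart?('eal', 'equal')  #=> false
one_deletion_apart?('cat', 'cut')    #=> false
```

Like the method above, this makes a single pass over the strings, so it avoids the repeated recursion of the lcs approach.
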
Cary Swoveland
  • Thanks so much for this. One thing -- it is returning "false" for me in this scenario -- StringHelper.lcs("bates", "bats") even though it should return true ("e" is the only difference between the words). –  Sep 28 '17 at 15:37
  • I see. I misunderstood the question. I fixed my answer. – Cary Swoveland Sep 28 '17 at 16:33