I currently have the method below, which randomly returns one string from a known set of strings according to weighted probabilities (based on this):

def get_response(request)
  responses = ['text1', 'text2', 'text3', 'text4', 'text5', 'text6']
  weights = [5, 5, 10, 10, 20, 50]
  ps = weights.map { |w| w.to_f / weights.reduce(:+) }
  # => [0.05, 0.05, 0.1, 0.1, 0.2, 0.5]

  weighted_response_hash = responses.zip(ps).to_h
  # => {"text1"=>0.05, "text2"=>0.05, "text3"=>0.1, "text4"=>0.1, "text5"=>0.2, "text6"=>0.5}

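  # Each entry draws rand ** (1.0 / weight) and the largest draw wins,
  # so entries with higher weights are picked proportionally more often.
  weighted_response_hash.max_by { |_, weight| rand ** (1.0 / weight) }.first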
end

Now instead of a random weighted output, I want the output to be consistent based on an input string while keeping the weighted probability of the response. So for example, a call such as:

get_response("This is my request")

Should always produce the same output, while keeping the weighted probability of the output text.

I think modulo could be used here in some way, hashing the input so that it always maps to the same result, but I'm kind of lost.
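A rough sketch of what the modulo idea might look like: hash the request to a large integer, reduce it modulo the total weight, and walk the weights to find the bucket that slot falls into. The method name `get_response_by_modulo` and the choice of `Digest::SHA1` for the string-to-integer step are assumptions made for illustration only.

```ruby
require 'digest'

# Hypothetical helper: maps a request string to a response via modulo.
def get_response_by_modulo(request)
  responses = ['text1', 'text2', 'text3', 'text4', 'text5', 'text6']
  weights = [5, 5, 10, 10, 20, 50]

  # Turn the string into a large integer that depends only on its content.
  digest = Digest::SHA1.hexdigest(request).to_i(16)

  # Reduce it to a slot in 0...100 (the sum of the weights).
  slot = digest % weights.reduce(:+)

  # Walk the weights: slots 0-4 => text1, 5-9 => text2, ..., 50-99 => text6.
  responses.zip(weights).each do |response, weight|
    return response if slot < weight
    slot -= weight
  end
end
```

Because the slot depends only on the request's content, the same request always lands in the same bucket, and each bucket's width matches its weight.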

Tom
  • Have you looked at [srand](http://ruby-doc.org/core-2.0.0/Random.html#method-c-srand)? – max pleaner Jul 21 '17 at 21:14
  • @maxpleaner I'm not trying to seed the random function. I'm trying to get a consistent weighted output based on a string input. – Tom Jul 21 '17 at 21:17
  • I'm struggling to understand your question. Your method `get_response` takes an argument `request` which is not used in the method body. Can you fix that? – Cary Swoveland Jul 22 '17 at 06:06

1 Answer

What @maxpleaner was trying to say with srand is:

"srand may be used to ensure repeatable sequences of pseudo-random numbers between different runs of the program."

So, if you seed the random generator, you will always get the same results back.

For example, if you do

random = Random.new(request.hash)
response = weighted_response_hash.max_by { |_, weight| random.rand ** (1.0 / weight) }.first

you will always end up with the same response whenever you pass in the same request.
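Put together with the method from the question, this might look like the sketch below. It assumes the same responses and weights as the question. Note that `String#hash` is only stable within a single Ruby process, so this version is repeatable within one run but not necessarily across runs.

```ruby
def get_response(request)
  responses = ['text1', 'text2', 'text3', 'text4', 'text5', 'text6']
  weights = [5, 5, 10, 10, 20, 50]
  total = weights.reduce(:+).to_f
  weighted_response_hash = responses.zip(weights.map { |w| w / total }).to_h

  # Seed a private generator from the request, so the same request
  # replays the same sequence of draws (within this Ruby process).
  random = Random.new(request.hash)

  # Same weighted pick as before, but drawing from the seeded generator.
  weighted_response_hash.max_by { |_, weight| random.rand ** (1.0 / weight) }.first
end
```

Using a dedicated `Random` instance rather than calling `srand` keeps the global generator untouched for the rest of the program.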

Old code:

3.times.collect { get_response('This is my Request') }
# => ["text6", "text1", "text6"]
3.times.collect { get_response('This is my Request 2') }
# => ["text6", "text4", "text5"]

New code, seeding the random generator:

3.times.collect { get_response('This is my Request') }
# => ["text4", "text4", "text4"]
3.times.collect { get_response('This is my Request 2') }
# => ["text1", "text1", "text1"]

The output is still weighted, just now has some predictability:

randoms = 100.times.collect { |x| get_response("#{x}") }
randoms.group_by { |item| item }.collect { |key, values| [key, values.length / 100.0] }.sort_by(&:first)
# => [["text1", 0.03], ["text2", 0.03], ["text3", 0.08], ["text4", 0.11], ["text5", 0.27], ["text6", 0.48]]
Simple Lime
  • @hash is flawed and unreliable in my case. See this: https://stackoverflow.com/questions/6536885/consistent-stringhash-based-only-on-the-strings-content – Tom Jul 21 '17 at 22:54
  • Ah, interesting, I never noticed that. As the answer in your link suggests, you could also use Digest::SHA1 (`Digest::SHA1.hexdigest(request).to_i(16)` seems to work across 3 irb sessions) or convert the string to an integer reliably in whatever way suits your needs. – Simple Lime Jul 21 '17 at 23:07
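Combining the answer's seeded generator with a content-based digest could look roughly like this. It is a sketch assuming the question's responses and weights, with SHA1 standing in for any stable string-to-integer conversion.

```ruby
require 'digest'

def get_response(request)
  responses = ['text1', 'text2', 'text3', 'text4', 'text5', 'text6']
  weights = [5, 5, 10, 10, 20, 50]
  total = weights.reduce(:+).to_f
  weighted_response_hash = responses.zip(weights.map { |w| w / total }).to_h

  # The digest depends only on the string's bytes, so the seed does not
  # change from one Ruby process to the next (unlike String#hash).
  seed = Digest::SHA1.hexdigest(request).to_i(16)
  random = Random.new(seed)

  weighted_response_hash.max_by { |_, weight| random.rand ** (1.0 / weight) }.first
end
```

The seed here is a 160-bit integer, which `Random.new` accepts. Any other deterministic conversion from string to integer would serve the same purpose.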