
I have a test which executes a function which uses random things. I would like to use hypothesis (or something else ?) to run it several times and know, when it fails, which random seed was used.

How can I do that?

My goal is to test my code several times to ensure it does not fail because of its use of random.

Chris
bux
  • It doesn't sound like a good overall testing strategy, as you might never find the random values that make your function fail, so what's the point of testing? – user2314737 Aug 12 '20 at 19:13
  • The function is a game function which simulates a fight between several characters. Randomness is used in several portions of the code (chance to dodge, chance to land a good shot, etc.). The goal is to test that this randomness does not produce unexpected behavior. (In reality, players report a crash and I don't know how it happens. I hope to find it this way.) – bux Aug 12 '20 at 19:21
  • Hypothesis sounds like a good choice - have you tried to use it? And what didn't work? – MrBean Bremen Aug 12 '20 at 19:53
  • I didn't understand how to do that by reading the doc :p – bux Aug 12 '20 at 20:29

3 Answers

Hypothesis is an excellent fit for your use case - if you use it right. First off, why it works: it's not random, but pseudorandom. When a test fails with a complex example, it will shrink the example until it finds a minimal failing test case, and gives you that. It also stores a database of failing test cases, so replaying old failures is one of the first things it tries.

Now, the drawback is that generated test cases take longer to run, but the benefit is that you can be much more confident your code is robust.

I have no idea what your code looks like, but just to give you a mock-up:

from hypothesis import strategies as st
from hypothesis import assume

# In this example, damage and miss chance are based
# on the length of the name of the attack.
# Character is assumed to be a class from your own game code.
samus = Character(health=100, attacks=['punch', 'shoot'])
wario = Character(health=70, attacks=['growl', 'punch'])
bowser = Character(health=250, attacks=['growl', 'kidnap_princess'])

st_character = st.sampled_from([samus, wario, bowser])
st_n_rounds = st.integers(min_value=0, max_value=10)

@st.composite
def fight_sequence(draw):
  player_one = draw(st_character)
  player_two = draw(st_character)

  # don't test entire fights, just simulate one, record the steps,
  # and check that the end state is what you expect
  actions = [
    dict(type='choose', player_number=1, player=player_one),
    dict(type='choose', player_number=2, player=player_two)
  ]

  # this filters out all test cases where players have the same character
  assume(player_one != player_two)

  n_rounds = draw(st_n_rounds)
  both_alive = True

  def _attack(player, other):
    if not both_alive:
      return

    attack = draw(st.sampled_from(player.attacks))
    response = draw(st.integers(min_value=0, max_value=len(attack)))
    response_type = 'miss' if response == 0 else 'crit' if response == len(attack) else 'hit'
    actions.append(dict(type='attack', player=player, attack=attack, response=response_type))

    if response_type == 'hit':
       other.health -= len(attack)
    elif response_type == 'crit':
       other.health -= len(attack) * 2

    if other.health <= 0:
      actions.append(dict(type='ko', player=other))

  for _ in range(n_rounds):
    _attack(player_one, player_two)
    _attack(player_two, player_one)
  return actions

Then in your test case, feed the playback script to your code and check that the results line up. I hope you can use this for inspiration.
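To show how such a composite strategy plugs into a test via `@given`, here is a minimal self-contained sketch of the pattern (with a simplified list strategy standing in for `fight_sequence()`, and a trivial invariant check standing in for your game engine):

```python
from hypothesis import given, strategies as st

# Sketch only: in the real test you would write @given(actions=fight_sequence())
# and replay the generated actions through your game code.
@given(actions=st.lists(st.sampled_from(['punch', 'shoot', 'growl']), max_size=20))
def test_playback(actions):
    health = 100
    for attack in actions:
        health -= len(attack)  # damage = length of the attack name, as above
    # invariant: health never increases during a fight
    assert health <= 100

test_playback()
```

If the invariant ever fails, Hypothesis shrinks the action list to a minimal failing sequence and records it in its database for replay.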

Ruben Helsloot
Yes, Hypothesis sounds like a good approach. For example:

from unittest import TestCase
from hypothesis import given
from hypothesis.strategies import integers
import hypothesis_random as undertest


class Test(TestCase):
    @given(seed=integers())
    def test_uses_random(self, seed):
        undertest.uses_random(seed)

If your function raises an error, you will get a traceback for the exception and the falsifying example from Hypothesis that triggered it as output from the test, e.g.

Falsifying example: test_uses_random(
    self=<test_hypothesis_random.Test testMethod=test_uses_random>, seed=-43,
)

Error
Traceback (most recent call last):
...
Chris
Hypothesis' st.random_module() strategy is designed for exactly this use-case.
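A minimal sketch of that approach, assuming a toy `pick_attack` function standing in for your game logic:

```python
import random
from hypothesis import given, strategies as st

def pick_attack():
    # toy stand-in for game code that uses the global random module (assumption)
    return random.choice(['punch', 'shoot', 'growl'])

@given(st.random_module())
def test_pick_attack(rnd):
    # st.random_module() seeds the global `random` module before each run;
    # if the test fails, Hypothesis reports the seed in the falsifying
    # example so the failure can be reproduced
    assert pick_attack() in ('punch', 'shoot', 'growl')

test_pick_attack()
```

The seed used for each run appears in the falsifying example on failure, which is exactly the "which seed was used" information the question asks for.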

Zac Hatfield-Dodds