
Consider the problem of generating strings out of a set of possible strings, in such a way that once a string is chosen, it cannot be chosen again. For this task I would like to use QuickCheck's Gen functions.

If I look at the type of the function I'm trying to write, it looks pretty much like a state monad. Since I'm using another monad, namely Gen, inside the state monad, I wrote my first attempt using StateT.

arbitraryStringS :: StateT GenState Gen String
arbitraryStringS =
  mapStateT stringGenS get

where:

newtype GenState = St {getStrings :: [String]}
  deriving (Show)

removeString :: String -> GenState -> GenState
removeString str (St xs) = St $ delete str xs

stringGenS :: Gen (a, GenState) -> Gen (String, GenState)
stringGenS genStSt =
  genStSt >>= \(_, st) ->
  elements (getStrings st) >>= \str ->
  return (str, removeString str st)

Something that troubles me about this implementation is that stringGenS ignores the first component of the pair it receives. Secondly, my end goal is to define a random generator for JSON values that makes use of a resource pool (which contains not only strings). Using StateT led me to implement "stateful" variants of QuickCheck's elements, listOf, etc.

I was wondering whether there's a better way of achieving this, or whether such complexity is inherent to defining stateful variants of existing monads.

Damian Nadales
  • I would do it the other way around: store the created `String`s, or at least the seeds, and check each seed or generated string for membership in a `Set` of seeds or `String`s. – epsilonhalbe Aug 17 '16 at 07:16
  • Another option is to use UUIDs to generate "most probably" unique strings. If you only have a finite set of strings, you will eventually run out. You could work around this by combining elements of big base sets, but you will still run into duplicate strings. If you need "real uniqueness", I'd go with a base set plus an infinite set such as the natural numbers, and combine the two. – epsilonhalbe Aug 17 '16 at 07:17
  • It is important that the strings come from the resource pool. This can be used to generate tests using data existing in some database. – Damian Nadales Aug 17 '16 at 07:21
  • (1) Why must the generated strings be unique? This doesn't sound like a standard testing use case. (2) Why do you want to use the `Gen` monad? Perhaps the [random monad](https://hackage.haskell.org/package/MonadRandom-0.1.3/docs/Control-Monad-Random.html) would work better. – Petr Aug 17 '16 at 08:08
  • Forget about the uniqueness requirement. I just edited my question, so we can focus on generating random objects from a resource pool. About (2): I haven't considered this yet, but I'm curious about how to combine `Gen` with `StateT`. Besides, by using `Gen` I can make use of its functions like `elements`, `listOf`, etc. – Damian Nadales Aug 17 '16 at 08:15
  • Is [`shuffle :: [a] -> Gen [a]`](https://hackage.haskell.org/package/QuickCheck-2.9.1/docs/Test-QuickCheck.html#v:shuffle) not exactly what you want? What exactly is the purpose of the `State`? You can get away with just `Reader` here. – user2407038 Aug 17 '16 at 19:04
  • This problem arises in a more complex context, where I need to generate random data for objects that have hierarchical dependencies. For instance, if I need to generate an object containing an artist and an album, then once I have chosen the artist, I can only generate albums by that artist. – Damian Nadales Aug 18 '16 at 06:45
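
The artist/album dependency described in the last comment could be sketched roughly as follows. This is only an illustration under assumptions: the names `Artist`, `Album`, `Pool` and `artistAndAlbum` are hypothetical and do not come from the question, and the pool is assumed to contain at least one artist that still has an album (otherwise `elements` is called on an empty list and fails).

```haskell
import Control.Monad.State (StateT, evalStateT, gets, modify)
import Control.Monad.Trans.Class (lift)
import qualified Data.Map as Map
import Test.QuickCheck (Gen, elements, generate)

-- Hypothetical types, introduced only for this sketch.
type Artist = String
type Album  = String

-- Hypothetical resource pool: the albums still available, keyed by artist.
type Pool = Map.Map Artist [Album]

-- Pick an artist that still has albums, then one of that artist's albums,
-- and remove the chosen album from the pool so it cannot be drawn again.
-- Assumes at least one artist in the pool has a non-empty album list.
artistAndAlbum :: StateT Pool Gen (Artist, Album)
artistAndAlbum = do
  candidates <- gets (Map.toList . Map.filter (not . null))
  (artist, albums) <- lift (elements candidates)
  album <- lift (elements albums)
  modify (Map.adjust (filter (/= album)) artist)
  return (artist, album)
```

Something like `generate (evalStateT artistAndAlbum pool)` would then produce one pair in IO. If nothing has to be removed from the pool, plain `Gen` is already enough for the dependency itself, since the second `elements` call can simply depend on the result of the first.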

1 Answer

The combination of StateT and Gen could look like this:

import Control.Monad (replicateM)
import Control.Monad.State
import Data.List (delete)
import Test.QuickCheck

-- A more efficient solution would be to use Data.Set.
-- Even better, Data.Trie and ByteStrings:
-- https://hackage.haskell.org/package/bytestring-trie-0.2.4.1/docs/Data-Trie.html
newtype GenState = St { getStrings :: [String] }
  deriving (Show)

removeString :: String -> GenState -> GenState
removeString str (St xs) = St $ delete str xs

stringGenS :: StateT GenState Gen String
stringGenS = do
  s <- get
  str <- lift $ elements (getStrings s)
  modify $ removeString str
  return str

The catch is that, because each such computation needs the state, you can't run several of them separately in Gen and still have them share it. The only reasonable option is to generate several unique random strings together, threading the same state through all of them, as in

evalStateT (replicateM 10 stringGenS)

which is of type GenState -> Gen [String].
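
One thing to watch for: `elements` fails on an empty list, so asking for more strings than the pool holds would crash. A minimal sketch of a guarded variant, assuming it is acceptable to return fewer strings when the pool runs out (the names `drawString` and `drawUpTo` are hypothetical, and the state is a bare `[String]` to keep the example self-contained):

```haskell
import Control.Monad (replicateM)
import Control.Monad.State (StateT, evalStateT, get, put)
import Control.Monad.Trans.Class (lift)
import Data.List (delete)
import Data.Maybe (catMaybes)
import Test.QuickCheck (Gen, elements, generate)

-- Hypothetical helper (not part of QuickCheck): draw one string from the
-- pool and remove it, or return Nothing once the pool is exhausted.
drawString :: StateT [String] Gen (Maybe String)
drawString = do
  pool <- get
  if null pool
    then return Nothing
    else do
      str <- lift (elements pool)
      put (delete str pool)
      return (Just str)

-- Draw at most n strings; fewer come back if the pool runs out first.
drawUpTo :: Int -> [String] -> Gen [String]
drawUpTo n pool = catMaybes <$> evalStateT (replicateM n drawString) pool
```

For example, `generate (drawUpTo 10 ["a", "b", "c"])` yields the three strings in some random order. Note that `delete` removes only the first occurrence, so a pool with duplicate entries can still produce the same string more than once.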

Petr
  • Thanks. The use of `lift` is definitely more elegant. Regarding the State, as I mentioned in my previous comment, since I need to generate data that has a hierarchical relationship, the choice of one element limits my future choices, and that is the reason why I want to pass the state from one generator to the other. The generation of strings was just illustrative. – Damian Nadales Aug 18 '16 at 06:53
  • Beware that `Gen` doesn't give you any particular distribution. As it's used for testing, it tries to catch various corner cases, rather than being uniform in any way. – Petr Aug 18 '16 at 11:09
  • Good point. I definitely need to consider this in the future. It might be that I'll end up using a `Random` monad. However, I hope I can use the same technique of composing monad transformers. – Damian Nadales Aug 18 '16 at 11:33
  • MonadRandom is actually better, because unlike Gen it also provides a transformer, RandT. So you can also compose the other way around. – Petr Aug 18 '16 at 11:43
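
A rough sketch of what "the other way around" might look like, with `RandT` from the MonadRandom package on the outside and a plain `State` holding the pool on the inside. The name `pickUnique` is hypothetical, and the sketch assumes the pool is non-empty whenever it is called.

```haskell
import Control.Monad.Random (RandT, StdGen, evalRandT, getRandomR, mkStdGen)
import Control.Monad.State (State, evalState, get, put)
import Control.Monad.Trans.Class (lift)
import Data.List (delete)

-- Hypothetical helper: pick a random string from the pool kept in the
-- inner State monad and remove it. Assumes the pool is non-empty.
pickUnique :: RandT StdGen (State [String]) String
pickUnique = do
  pool <- lift get
  i <- getRandomR (0, length pool - 1)
  let str = pool !! i
  lift (put (delete str pool))
  return str
```

It could be run with something like `evalState (evalRandT pickUnique (mkStdGen 42)) ["a", "b", "c"]`, where the seed 42 is arbitrary.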