4

I've been exploring using more newtype wrappers in my code to make more distinct types. I also do a lot of cheap serialization using Read/Show, particularly as a simple form of strongly-typed config file. I ran into this today:

The example starts like this, and I define a simple newtype to wrap around Int, along with a named field for unwrapping:

module Main where

import Debug.Trace ( trace )
import Text.Read ( readEither )


newtype Bar = Bar { unBar :: Int }
   deriving Show

Custom instance to read one of these from a simple Int syntax. The idea here is it would be great to be able to put "42" into a config file instead of "Bar { unBar = 42 }"

This instance also has trace "logging" so we can see when this instance is really used when observing the problem.

instance Read Bar where
   readsPrec _ s = [(Bar i, "")]
      where i = read (trace ("[debug \"" ++ s ++ "\"]") s)

Now another type containing a Bar. This one will just auto-derive Read.

data Foo = Foo { bar :: Bar }
   deriving (Read, Show)


main :: IO ()
main = do

Deserializing the Bar type alone works fine and uses the Read instance above

   print $ ((readEither "42") :: Either String Bar)
   putStrLn ""

But for some reason Foo, containing a Bar, and automatically derived into Read, is not drilling down and picking up Bar's instances! (Notice that the debug message from trace isn't displayed either)

   print $ ((readEither "Foo { bar = 42 }") :: Either String Foo)
   putStrLn ""

So ok, how about the default Show form for Bar, should match the default Read right?

   print $ ((readEither "Foo { bar = Bar { unBar = 42 } }") :: Either String Foo)

No! Doesn't work either!! Again, no debug message.

Here's the execution output:

  $ stack exec readbug
  [debug "42"]
  Right (Bar {unBar = 42})

  Left "Prelude.read: no parse"

  Left "Prelude.read: no parse"

This looks buggy to me but I'd love to hear that I'm doing it wrong.

A fully working example of the code above is available. See the file src/Main.lhs in a test project on darcshub

dino
  • 1,123
  • 9
  • 13
  • 3
    This is a very nice question. I love just how easy you've made it for someone to start debugging your code. Hopefully my answer is helpful in identifying the particular problem you are having. That aside, I would _not_ recommend ever using `Read` for anything more than debugging - and then make sure that `read . show = id`. I would put my config in either a JSON (and use `aeson` to encode/decode), or (if you insist on a custom parser) use something like `attoparsec` or `megaparsec`. `Read` is a phenomenally inefficient parser, because it is willing to backtrack anywhere. – Alec Dec 16 '16 at 02:30
  • 2
    You're wrong: the derived instance for `Foo` *is* using the `Read` instance for `Bar` that you wrote! It's just that the `Foo` instance is failing before it bothers to force the value of `Bar` (hence never forces the thunk with the `trace` in it), because `Bar` incorrectly reports that it consumed all the remaining input and so the `Foo` reader doesn't see the `}` it needs to succeed. – Daniel Wagner Dec 16 '16 at 03:29
  • @Alec I hadn't considered using JSON for configs. Keeps the typing and hierarchical structure. And then you end up with a config file that's usable by other languages/systems. I'll explore this with newtypes a bit now. Thanks! – dino Dec 16 '16 at 04:49

1 Answers1

4

The problem is in Read. readsPrec needs to consider the possibility that it might see some more stuff after the Bar. Quoting the Prelude:

readsPrec d s attempts to parse a value from the front of the string, returning a list of (<parsed value>, <remaining string>) pairs. If there is no successful parse, the returned list is empty.

In your case, you want:

instance Read Bar where
   readsPrec d s = [ (Bar i, s') | (i, s') <- readsPrec d tracedS ]
      where tracedS = trace ("[debug \"" ++ s ++ "\"]") s

Then, the following works:

ghci> print $ ((readEither "Foo { bar = 42 }") :: Either String Foo)
[debug " 42 }"]
Right (Foo {bar = Bar {unBar = 42}})

Your other problem, namely:

So ok, how about the default Show form for Bar, should match the default Read right?

 print $ ((readEither "Foo { bar = Bar { unBar = 42 } }") :: Either String Foo)

is your fault: you defined a Read instance for Bar such that read . show isn't an identity operation. When Foo derives Read, it uses Bars Read instance (it doesn't try to regenerate the code that Bar would've generated if you had derived Read on it).

Alec
  • 31,829
  • 7
  • 67
  • 114
  • 1
    I know it's kind of a tangent, but I can't help suggesting it: `instance Read Bar where readsPrec = coerce (readsPrec @Int)` – Daniel Wagner Dec 16 '16 at 03:26
  • @Alec Ah! I didn't realize this difference between `read` and `readsPrec` but it makes a lot more sense to use the same function. Thank you. – dino Dec 16 '16 at 03:58
  • @DanielWagner Ooh, I do like the terseness of that. I didn't know about Data.Coerce. I wonder if it's more efficient. Also requires `-XTypeApplications` Thanks! – dino Dec 16 '16 at 04:18
  • 1
    @dino `coerce` can sometimes make a big efficiency difference -- but in this case it will almost certainly be unnoticeable. – Daniel Wagner Dec 16 '16 at 04:33