1

The program below works if run with runhaskell or if compiled but not with -O2. If compiled with -O2 it seems to hang.

I'm using GHC 7.10.2.

I've changed the min/max iterations to 10 and 20 respectively. It will generate anywhere from 20 to 100 MB of output into the file test.out. Run time is about 15 - 60 secs.

Program Explanation

Below is a multi-threaded program that has a pool of workers and a manager. The workers generate traces to be used in plotting a Buddhabrot, put it in a queue, and a manager periodically empties the queue and writes the data to disk. When a certain amount of data has been generated, the program stops.

But when the program runs the manager thread only does one check, and then it gets stuck (the worker threads are still running). However, if I remove the part where the manager thread writes to file, then everything seems to work. I just don't understand why...

import Control.Concurrent
import Control.Concurrent.Async
import Control.Concurrent.STM
import Control.Monad
  ( forever
  , unless
  )
import Control.Monad.Loops
import System.IO
import System.Random

import qualified Data.Binary as B
import qualified Data.ByteString.Lazy as BS

type Coord = (Double, Double)

type Trace = [Coord]

-- | Represents a rectangle in the complex plane, bounded by a lower left
-- coordinate and an upper right coordinate.
data Plane
  = Plane { ll :: Coord, ur :: Coord }
  deriving (Show)

-- | Adds two coordinates.
(+.) :: Coord -> Coord -> Coord
(r1, i1) +. (r2, i2) = (r1 + r2, i1 + i2)

-- | Multiplies two coordinates.
(*.) :: Coord -> Coord -> Coord
(r1, i1) *. (r2, i2) = (r1*r2 - i1*i2, r1*i2 + r2*i1)

-- | Computes the square of a coordinate.
square :: Coord -> Coord
square (r, i) = (r*r - i*i, 2*r*i)

-- | Distance from origin to a given coordinate.
distFromOrigin :: Coord -> Double
distFromOrigin (r, i) = r*r + i*i

-- | A structure for passing data to the worker threads.
data WorkerData
  = WorkerData { wdMinIt :: Int
               , wdMaxIt :: Int
               , wdTraceQueue :: TQueue Trace
                 -- ^ A queue of traces to be written to disk.
               }

-- | A structure for passing data to the manager thread.
data ManagerData
  = ManagerData { mdOutHandle :: Handle
                   -- ^ Handle to the output file.
                , mdNumTraces :: Integer
                  -- ^ Number of traces to gather.
                , mdTraceQueue :: TQueue Trace
                  -- ^ A queue of traces to be written to disk.
                }

-- | Encodes an entity to binary bytestring.
encode :: B.Binary a => a -> BS.ByteString
encode = B.encode

-- | Writes a lazy bytestring to file.
writeToFile :: Handle -> BS.ByteString -> IO ()
writeToFile = BS.hPut

mkManagerData :: TQueue Trace -> IO ManagerData
mkManagerData t_queue =
  do let out_f = "test.out"
     out_h <- openBinaryFile out_f WriteMode
     let num_t = 1000
     return $ ManagerData { mdOutHandle = out_h
                          , mdNumTraces = num_t
                          , mdTraceQueue = t_queue
                          }

mkWorkerData :: TQueue Trace -> IO WorkerData
mkWorkerData t_queue =
  do let min_it =  10 -- 1000
         max_it =  20 -- 10000
     return $ WorkerData { wdMinIt = min_it
                         , wdMaxIt = max_it
                         , wdTraceQueue = t_queue
                         }

-- | The actions to be performed by the manager thread.
runManager :: ManagerData -> IO ()
runManager m_data =
  do execute 0
     return ()
  where execute count =
          do new_traces <- purgeTQueue $ mdTraceQueue m_data
             let new_count = count + (toInteger $ length new_traces)
             putStrLn $ "Found " ++ (show $ new_count) ++ " traces so far. "
             if length new_traces > 0
             then do putStrLn $ "Writing new traces to file..."
                     _ <- mapM (writeToFile (mdOutHandle m_data))
                               (map encode new_traces)
                     putStr "Done"
             else return ()
             putStrLn ""
             unless (new_count >= mdNumTraces m_data) $
               do threadDelay (1000 * 1000) -- Sleep 1s
                  execute new_count

-- | The actions to be performed by a worker thread.
runWorker :: WorkerData -> IO ()
runWorker w_data =
  forever $
    do c <- randomCoord
       case computeTrace c (wdMinIt w_data) (wdMaxIt w_data) of
         Just t  -> atomically $ writeTQueue (wdTraceQueue w_data) t
         Nothing -> return ()

-- | Reads all values from a given 'TQueue'. If any other thread reads from the
-- same 'TQueue' during the execution of this function, then this function may
-- deadlock.
purgeTQueue :: Show a => TQueue a -> IO [a]
purgeTQueue q =
  whileJust (atomically $ tryReadTQueue q)
            (return . id)

-- | Generates a random coordinate to trace.
randomCoord :: IO Coord
randomCoord =
  do x <- randomRIO (-2.102613, 1.200613)
     y <- randomRIO (-1.237710, 1.239710)
     return (x, y)

-- | Computes a trace, using the classical Mandelbrot function, for a given
-- coordinate and minimum and maximum iteration count. If the length of the
-- trace is less than the minimum iteration count, or exceeds the maximum
-- iteration count, 'Nothing' is returned.
computeTrace
  :: Coord
  -> Int
     -- ^ Minimum iteration count.
  -> Int
     -- ^ Maximum iteration count.
  -> Maybe Trace
computeTrace c0 min_it max_it =
  if isUsefulCoord c0
  then let step c = square c +. c0
           computeIt c it = if it < max_it
                            then computeIt (step c) (it + 1)
                            else it
           computeTr [] = error "computeTr: empty list"
           computeTr (c:cs) = if length cs < max_it
                              then computeTr (step c:(c:cs))
                              else (c:cs)
           num_it = computeIt c0 0
       in if num_it >= min_it && num_it <= max_it
          then Just $ reverse $ computeTr [c0]
          else Nothing
  else Nothing

-- | Checks if a given coordinate is useful by checking if it belongs in the
-- cardioid or period-2 bulb of the Mandelbrot.
isUsefulCoord :: Coord -> Bool
isUsefulCoord (x, y) =
  let t1 = x - 1/4
      p = sqrt (t1*t1 + y*y)
      is_in_cardioid = x < p - 2*p*p + 1/4
      t2 = x + 1
      is_in_bulb = t2*t2 + y*y < 1/16
  in not is_in_cardioid && not is_in_bulb

main :: IO ()
main =
  do t_queue <- newTQueueIO
     m_data <- mkManagerData  t_queue
     w_data <- mkWorkerData t_queue
     let num_workers = 1
     workers <- mapM async (replicate num_workers (runWorker w_data))
     runManager m_data
     _ <- mapM cancel workers
     _ <- mapM waitCatch workers
     putStrLn "Tracing finished"

Why It Fails

After reviewing the answers below, I finally realized why it doesn't work as intended. The program does not hang, but the time it takes for the manager thread to encode a single trace is in the order of tens of seconds (and when encoded it consumes several megabytes)! This means that even if there are some tens of traces in the queue when exhausted -- on my machine the workers manage to produce about 250 traces before the queue is exhausted by the manger thread -- it will take forever before the next exhaust.

Hence it matters little what solution I choose unless the work of the manager thread is greatly reduced. For that, I will have to abandon my idea of dumping each individual trace to file and instead process it once generated.

gablin
  • 4,678
  • 6
  • 33
  • 47
  • 3
    Voted to close. You should try to find the smallest possible program that produces this behavior. It's a good practice both for you and for us: for you, there's a good chance that minimization will make it obvious even to you what's wrong; and for us, a little bit of your time spent will be the same amount of time saved for each one of us -- a net savings if there are at least two people who try to help you at all. – Daniel Wagner Aug 01 '16 at 18:44
  • http://stackoverflow.com/help/mcve – chi Aug 01 '16 at 18:55
  • I'm voting to reopen this question. I've found that it works if run interpreted or not compiled with -O2, but it seems to hang when compiled with -O2. I've also modified the parameters (min/max iterations) to make it complete in less than a minute. – ErikR Aug 01 '16 at 21:47
  • Are you including the `-threaded` option? – beerboy Aug 01 '16 at 21:50
  • Initially no, but I've since retested with -threaded, and the results are the same. -O2 seems to hang, but no -O option works. – ErikR Aug 01 '16 at 22:01
  • The problem is that worker thread can deprive the manager thread from removing elements from the TQueue. I've written up the details (along with code examples) here: https://github.com/erantapaa/mandel – ErikR Aug 02 '16 at 01:38
  • Good point ErikR. Yes, starvation is possible using STM. A simpler example is here http://stackoverflow.com/questions/3031878/stm-monad-problem/3034057#3034057 – Thomas M. DuBuisson Aug 02 '16 at 05:08
  • @DanielWagner: I actually tried to first make a smaller example, but then the problem went away. So this is as small as I could get it. – gablin Aug 02 '16 at 07:40

1 Answers1

3

The problem is two-fold:

(1) The manager thread doesn't process any Traces until it has exhausted the queue.

(2) The worker thread can add elements to the queue very, very quickly.

This results in a race that the manager thread rarely wins. [ This also explains the observed behavior with -O2 - the optimization just made the worker thread faster. ]

Adding some debugging code shows that the worker can add items to the queue in excess of 100K Traces per second. Moreover, even though the manager is only interested in writing out the first 1000 Traces, the worker doesn't stop at this limit. So, under certain circumstances, the manager is never able to exit this loop:

purgeTQueue q = whileJust (atomically $ tryReadTQueue q) (return . id)

The simplest way to fix the code is to have the manager thread use readTQueue to read and process just one item off the queue at a time. This will also block the manager thread when the queue us empty obviating the need to the manager thread to periodically sleep.

Changing purgeTQueue to:

purgeTQueue = do item <- atomically $ readTQueue (mdTraceQueue m_data)
                 return [item]

and removing the threadDelay from runManager fixes the problem.

Example code available in the Lib4.hs module at: https://github.com/erantapaa/mandel

ErikR
  • 51,541
  • 9
  • 73
  • 124