I've been reading about the new IO manager in GHC, which uses asynchronous event notifications and eschews blocking I/O to achieve high throughput.

Which IO activities are eligible for management by the new asynchronous IO code? Reading and writing of files and network activity? Database access? Are there kinds of IO where the manager has to resort to blocking?

Bill
  • Wow... 4 favourites but only one upvote. That's strange. – fuz Dec 15 '10 at 11:26
  • @FUZxxl StackOverflow has favorites? I totally just noticed that because of your comment. – alternative Jul 13 '11 at 19:20
  • @monadic Yeah, there are. Just hit the star button right under a question to favourite it. If something changes, you get a notification as if it were your own question. – fuz Jul 13 '11 at 19:50

2 Answers

Any file descriptor that can be managed by epoll/kqueue is eligible. Libraries that want asynchronous treatment of I/O need to cooperate with the I/O manager by

  • making file descriptors non-blocking, and
  • calling the threadWaitRead and threadWaitWrite functions in GHC.Conc before retrying a system call that previously returned EWOULDBLOCK.

This has already been done for the Handle and Socket types. If you use, for example, a binding to a C database library, you will get blocking behavior, because that library won't cooperate with the I/O manager.
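
To make the two cooperation steps above concrete, here is a minimal sketch, assuming a POSIX system and the unix package. The helper name readNonBlocking and the pipe-based demo are illustrative and not taken from any library; the sketch leans on throwErrnoIfMinus1RetryMayBlock from Foreign.C.Error, which re-runs the system call after running the given wait action whenever the call fails with EAGAIN/EWOULDBLOCK.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Concurrent (forkIO, threadDelay)
import Data.Word (Word8)
import Foreign.C.Error (throwErrnoIfMinus1RetryMayBlock)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (Ptr)
import GHC.Conc (threadWaitRead)
import System.Posix.IO (FdOption (NonBlockingRead), createPipe, fdWrite, setFdOption)
import System.Posix.Types (CSsize (..), Fd)

-- Raw read(2). An "unsafe" import is acceptable here only because the
-- descriptor is non-blocking, so the call returns immediately.
foreign import ccall unsafe "unistd.h read"
  c_read :: CInt -> Ptr Word8 -> CSize -> IO CSsize

-- Hypothetical helper: try the read; if it fails with EAGAIN/EWOULDBLOCK,
-- park this Haskell thread in the I/O manager via threadWaitRead and retry.
readNonBlocking :: Fd -> Ptr Word8 -> Int -> IO Int
readNonBlocking fd buf n =
  fromIntegral <$>
    throwErrnoIfMinus1RetryMayBlock "readNonBlocking"
      (c_read (fromIntegral fd) buf (fromIntegral n))
      (threadWaitRead fd)

main :: IO ()
main = do
  (readEnd, writeEnd) <- createPipe
  -- Step 1: make the descriptor non-blocking.
  setFdOption readEnd NonBlockingRead True
  -- A second Haskell thread writes after a short delay.
  _ <- forkIO $ do
    threadDelay 100000
    _ <- fdWrite writeEnd "hello"
    return ()
  -- Step 2: the first read attempt would block, so the helper waits
  -- for readiness instead of tying up an OS thread.
  bytes <- allocaBytes 16 $ \buf -> do
    n <- readNonBlocking readEnd buf 16
    peekArray n buf
  putStrLn (map (toEnum . fromEnum) bytes)
```

While the main thread is parked in threadWaitRead, other Haskell threads keep running, which is the behavior a blocking C library call would not give you.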

tibbe

A somewhat satisfactory answer:

The heart of the new GHC IO manager is a kqueue()/epoll() event loop. So I would expect anything which can be built on top of this to be eligible -- if not now, then later. In particular this means:

  • File IO
  • Network IO

The code (I looked at it some months ago and things might have changed) also contains support for registering and running timeouts of various kinds through a priority (search) queue. This suggests that most sleep-like calls can also piggyback on the interface.
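
As an illustration of what "sleep-like calls" look like from user code, here is a small sketch using threadDelay and System.Timeout.timeout from base. Whether a given GHC version and runtime actually routes these through the I/O manager's timeout queue is an assumption here, not something this example demonstrates; it only shows the user-facing calls.

```haskell
module Main (main) where

import Control.Concurrent (threadDelay)
import System.Timeout (timeout)

-- Run an action that sleeps for the given number of microseconds,
-- but give up after the given limit (also in microseconds).
sleepWithLimit :: Int -> Int -> IO (Maybe String)
sleepWithLimit limit sleepFor =
  timeout limit (threadDelay sleepFor >> return "finished")

main :: IO ()
main = do
  slow <- sleepWithLimit 100000 2000000  -- limit expires first
  fast <- sleepWithLimit 2000000 100000  -- action completes first
  print slow  -- Nothing
  print fast  -- Just "finished"
```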

About database access: you often reach the database through a network socket, so calling forkIO and doing the DB access in a separate thread should be doable, fast, and safe. Data can be communicated back to the rest of the application with one of the usual concurrency primitives, such as Chan or an STM TChan.
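
A minimal sketch of that pattern, using only base: each query runs in its own forkIO thread and hands its result back over a Chan. The runQuery function is a hypothetical stand-in that merely sleeps; a real application would call its database client there, and that client would need to do its socket I/O through Handle or Socket to benefit from the I/O manager.

```haskell
module Main (main) where

import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Monad (forM_, replicateM)

-- Hypothetical stand-in for a database query: it only sleeps briefly
-- and returns a fake row.
runQuery :: Int -> IO String
runQuery n = do
  threadDelay 50000
  return ("row " ++ show n)

-- Start one lightweight thread per query and collect every result
-- from a shared channel. Results arrive in completion order.
collectRows :: Int -> IO [String]
collectRows count = do
  results <- newChan
  forM_ [1 .. count] $ \n ->
    forkIO (runQuery n >>= writeChan results)
  replicateM count (readChan results)

main :: IO ()
main = do
  rows <- collectRows 3
  mapM_ putStrLn rows
```

The order of the printed rows is not fixed, since the threads race to write to the channel.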

I don't think there are kinds of IO where the manager has to resort to blocking per se, but I can imagine that some libraries may circumvent the new IO manager and go straight for the jugular. They will, of course, block.

I GIVE CRAP ANSWERS