Consider the following code: http://hpaste.org/90394

I am memory-mapping a large 460 MB file to a lazy ByteString. The length of the ByteString is reported as 471053056.

When `nxNodeFromID file 110000` is changed to use a lower node ID, e.g. 10000, it works perfectly. However, as soon as I try to serialize anything past exactly 2^18 bytes (262144) of the ByteString, I get `Segmentation fault/access violation in generated code` and the program terminates.

I'm running Windows and using GHC 7.4.2.

Please advise whether this is my fault, an issue with the laziness, or an issue with Haskell itself.

Micha
kvanbere
  • Your `getNXNode` doesn't match the `NXNode` data definition. If that's intentional, it would be worth a comment. But I don't see how that would cause a segfault here. – Daniel Fischer Jun 25 '13 at 08:44
  • @DanielFischer `NXNode 0 <$> ...` :) – kvanbere Jun 25 '13 at 08:52
  • Yes, but you `skip` 20 bytes, and read only 12 per node. – Daniel Fischer Jun 25 '13 at 08:55
  • Sorry, I misunderstood. `getNXMetadata` contributes to 2 (type) + 8 (data) optional bytes (which are null if they aren't used) totaling 20 for the size of the node (when added to the 4 + 4 + 2 read in `getNXNode`). – kvanbere Jun 25 '13 at 08:59
  • Makes sense. I suppose once `getNXMetadata` is complete, it becomes more or less obvious. – Daniel Fischer Jun 25 '13 at 09:01
  • Does the problem persist if you use `mmapFileByteString` instead of `mmapFileByteStringLazy`? (You'd need to wrap the returned strict `ByteString` of that in a `fromChunks [buf]` or so to get a lazy `ByteString` for the rest of your code to work with.) – Daniel Fischer Jun 25 '13 at 09:14
  • Yes, that works (however, I think I lose laziness now?). Is this segfault a weird glitch in GHC? – kvanbere Jun 25 '13 at 09:30
  • Sure, you lose laziness. But the idea was to get a hint what the problem could be. I suspect it may be "-- FIXME: might be we need NOINLINE pragma here, investigate later" (concerning `mapChunk handle (offset,size) = unsafePerformIO $`), but it could be something else. Can't really investigate, though. (Well, if I _had to_, I could boot into Windows; but I'd still need a file to work on, and a great incentive to expose myself to the inconvenience.) – Daniel Fischer Jun 25 '13 at 09:41
  • Sorry, that's not my area of expertise. Besides, I don't think a Windows core-dump would work with gdb on Linux. You could try to contact the maintainer of `mmap`, he should have a better idea what the problem might be. – Daniel Fischer Jun 25 '13 at 09:53
  • BTW you should make the fields of your type strict. Better semantics. – Don Stewart Jun 25 '13 at 09:58
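
The workaround suggested in the comments above (map the file with `mmapFileByteString` and wrap the strict result with `fromChunks`) might look roughly like the following. This is only a sketch: the helper names `wrapStrict` and `mmapAsSingleChunk` and the path `data.nx` are made up for illustration and do not come from the original paste.

```haskell
import qualified Data.ByteString      as BS
import qualified Data.ByteString.Lazy as BL
import           System.IO.MMap       (mmapFileByteString)

-- Wrap one strict ByteString as a lazy ByteString with a single chunk.
-- (Hypothetical helper name.)
wrapStrict :: BS.ByteString -> BL.ByteString
wrapStrict buf = BL.fromChunks [buf]

-- Map the whole file read-only as one strict region, then expose it
-- as a lazy ByteString so code written against Data.ByteString.Lazy
-- keeps working. (Hypothetical helper name.)
mmapAsSingleChunk :: FilePath -> IO BL.ByteString
mmapAsSingleChunk path = do
  buf <- mmapFileByteString path Nothing  -- Nothing = map the entire file
  return (wrapStrict buf)

main :: IO ()
main = do
  file <- mmapAsSingleChunk "data.nx"     -- hypothetical path
  print (BL.length file)
```

The whole file is mapped in one go here, so the chunk-by-chunk mapping that `mmapFileByteStringLazy` provides is given up; it is useful mainly for narrowing down whether the lazy mapping is the source of the crash.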

1 Answer

Note that I have updated mmap to correctly include a NOINLINE pragma at a strategic point in the code. mmap-0.5.9 is up for grabs. Let me know if the issue persists. Edit: yes, I'm the author of mmap.
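
For readers wondering why the pragma matters: a definition built on `unsafePerformIO` may have its side effect duplicated if GHC inlines it, and `NOINLINE` rules that out. The snippet below is a minimal sketch of that general idiom using a global `IORef`; it is not the code from mmap, and the name `counter` is purely illustrative.

```haskell
import Data.IORef       (IORef, modifyIORef, newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Without NOINLINE, GHC would be free to inline `counter` at each use
-- site, which could run `newIORef` more than once and hand out
-- distinct references. The pragma keeps it a single shared value.
{-# NOINLINE counter #-}
counter :: IORef Int
counter = unsafePerformIO (newIORef 0)

main :: IO ()
main = do
  modifyIORef counter (+ 1)
  modifyIORef counter (+ 1)
  readIORef counter >>= print  -- prints 2
```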

Gracjan Polak
  • While it appears you're the mmap author, this is not totally clear from your answer. I would consider adding more information. – Syon Sep 05 '13 at 17:28
  • Hi, I have the same error with the latest version of your library. https://stackoverflow.com/questions/53715138/address-boundary-error-in-haskell-application – user1685095 Dec 11 '18 at 19:20