My Haskell application reads its input as a list of ByteString values, and I'm using Text.Regex.Posix.ByteString.regexec to find matches. Some of the input contains a byte with character code 253 (a 1/2 symbol in one IBM PC character set), and the pattern '.' (dot, "match any character") doesn't seem to match it. Is there any way to make it match?

2 Answers

That doesn't make sense. Why would you want to match a half-character? The pattern . will match the full character.

David Knipe
This works for me on a Windows Haskell install:

> import Data.ByteString.Char8 (ByteString, pack)
> import Text.Regex.Posix ((=~))
> length (pack ['\1'..'\253'] =~ "." :: [[ByteString]])
252

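I.e. dot matches the characters in that range, including code 253. The input has 253 characters and there are 252 matches; the one missing is presumably the newline, since regex-posix compiles patterns with its newline-sensitive option by default, so dot skips '\n'.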

Note that the library calls out to the underlying POSIX regex matcher, which I assume typically comes from glibc.

So I would expect any issue you are seeing to come from that particular underlying C implementation.
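
One way to check what the C matcher does on a given machine is to test every byte value on its own. This is a minimal sketch, assuming the regex-posix and bytestring packages are installed; the names dotMatches and unmatchedBytes are made up for illustration. The result may depend on the platform's C library and possibly on its locale settings, so no expected output is shown.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)
import Text.Regex.Posix ((=~))

-- True when the pattern "." matches a one-byte ByteString holding this byte.
-- (dotMatches is a hypothetical helper, not part of regex-posix.)
dotMatches :: Word8 -> Bool
dotMatches w = B.singleton w =~ "."

-- Every byte value from 1 to 255 that "." fails to match.
-- Byte 0 is left out on the assumption that the C API treats NUL
-- as a string terminator.
unmatchedBytes :: [Word8]
unmatchedBytes = filter (not . dotMatches) [1 .. 255]

main :: IO ()
main = print unmatchedBytes
```

If 253 shows up in the printed list, the problem is reproducible on that system independently of the rest of the application.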

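Something like Text.Regex.TDFA.ByteString (from the regex-tdfa package) might give you more predictable behavior here, since it is implemented entirely in Haskell and does not depend on the system's C regex library.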
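
As a rough sketch of that suggestion, assuming the regex-tdfa package is installed, the same kind of match can be written against Text.Regex.TDFA. The sample input and the name dotMatches are hypothetical and only serve to put a byte 253 between two ASCII letters.

```haskell
import qualified Data.ByteString as B
import Text.Regex.TDFA ((=~))

-- Hypothetical sample input: the bytes 'a', 253, 'b'.
sample :: B.ByteString
sample = B.pack [97, 253, 98]

-- All matches of "." in the input, one inner list per match.
dotMatches :: B.ByteString -> [[B.ByteString]]
dotMatches input = input =~ "."

main :: IO ()
main = print (length (dotMatches sample))
```

Comparing the printed count with the number of bytes in the input shows whether every byte, including 253, was matched by dot.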

sclv