My Haskell application reads input as a list of ByteString, and I'm using Text.Regex.Posix.ByteString.regexec to find matches. Some input contains a character with code 253 (it's a 1/2 symbol in one IBM PC character set), and it seems that the pattern '.' (i.e., dot, "match any character") doesn't match it. Is there any way to make it match?
2 Answers
0
That doesn't make sense. Why would you want to match a half-character? The pattern '.' will match the full character.

David Knipe
It's only half a character in certain encodings such as UTF-8. I'm looking to match 8-bit characters. – Michael restore Monica Cellio Oct 25 '13 at 07:31
0
This works for me on a Windows Haskell install:
> length $ ((pack ['\1'..'\253']) =~ "." :: [[ByteString]])
252
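I.e., dot matches the characters in this range, including code 253. (The range holds 253 characters; the count is 252 presumably because, with the default compile options, dot does not match the newline character, code 10.)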
Note that the library calls out to the underlying POSIX regex matcher, typically, I assume, the one from glibc. So I would imagine any issue you have would be with that particular underlying C implementation.
Something like Text.Regex.TDFA.ByteString might give you clearer behavior in this case, since it is written entirely in Haskell.
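
For reference, here is a minimal sketch of the same check using the regex-tdfa package instead of the POSIX backend. It assumes regex-tdfa and bytestring are installed; the helper name countDotMatches is made up for illustration and is not part of either library. As far as I know, the ByteString instances treat each byte as one 8-bit character, so byte 253 should count as an ordinary character, but treat this as something to try rather than a guaranteed fix.

```haskell
import qualified Data.ByteString as B
import Text.Regex.TDFA ((=~))

-- Hypothetical helper: count how many times the pattern "." matches
-- in a strict ByteString. Each match is returned as a list holding
-- the matched text, so the outer list length is the match count.
countDotMatches :: B.ByteString -> Int
countDotMatches input = length (input =~ "." :: [[B.ByteString]])

main :: IO ()
main = do
  -- Three bytes: 'A', the byte with code 253, and 'B'.
  let input = B.pack [65, 253, 66]
  print (countDotMatches input)
```

If the printed count equals the number of bytes in the input (newlines aside), dot is matching the high byte as well.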

sclv