How to parse string literal in Megaparsec?

Asked Jan 28 '17 at 18:12


I've spent some time reading the docs, but the tutorials only cover the most basic languages, and the full documentation of the library is hard to work through. I get lost too easily.

How would I create a "someStringUtf8" parser?

Does the charLiteral work only with ASCII chars or does it support any kind of UTF8 character (numbers etc.)?

asked Jan 28 '17 at 18:12

ditoslav


There is no such thing as a UTF8-character. There are ASCII / Western / ... / Unicode characters, and there are _encodings_ of such characters (UTF-8 is an encoding). `Text` actually uses UTF-16 at the moment, so you'll probably have that in megaparsec; but this needn't bother you because the interface completely abstracts over the encoding and you only deal with characters. This obviously works fine with ASCII (which sure includes numbers, what made you think otherwise?), but it shouldn't have any difficulty with general Unicode either. – leftaroundabout Jan 28 '17 at 18:49
I mean does `charLiteral` mean `([a-z] | [A-Z])` or is it all characters? It seems I'm confusing "characters" and "letters" – ditoslav Jan 28 '17 at 19:31
Indeed seems so! A character is simply any value of type `Char`, such as `'a'` or `'2'` or `'ξ'` or `'♣'` or `' '` or even unprintable characters like `'\r'`. – leftaroundabout Jan 28 '17 at 19:42
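
To make the point about `Char` concrete, here is a minimal sketch of a double-quoted string literal parser built on `charLiteral`. It assumes the module layout of megaparsec 6 or later (`Text.Megaparsec.Char.Lexer`; older releases exposed the same function from a differently named module), a `String` input stream, and no custom error component. The names `Parser` and `stringLiteral` are illustrative choices, not something the library defines.

```haskell
import Data.Void (Void)
import Text.Megaparsec (Parsec, manyTill, parseMaybe)
import Text.Megaparsec.Char (char)
import qualified Text.Megaparsec.Char.Lexer as L

-- Hypothetical alias: no custom error type (Void), String as the input stream.
type Parser = Parsec Void String

-- Consume the opening quote, then keep reading character literals until the
-- closing quote. charLiteral accepts any Char, not just ASCII letters, and
-- also decodes Haskell-style escapes such as \n or \".
stringLiteral :: Parser String
stringLiteral = char '"' *> manyTill L.charLiteral (char '"')
```

With these assumptions, `parseMaybe stringLiteral "\"ξ♣ 2\""` should yield `Just "ξ♣ 2"`, and an input without a closing quote should yield `Nothing`. Because `manyTill` checks for the closing quote before each character, an escaped quote inside the literal is read by `charLiteral` rather than ending the string.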

0 Answers