I'm writing a simple wrapper class for scanning a stream of characters, one character at a time.
Scanner scanner("Hi\r\nYou!");
const char* current = scanner.cchar();
while (*current != 0) {
    printf("Char: %d, Column: %d, Line: %d\n", *current, scanner.column(), scanner.line());
    current = scanner.read();
}
C:\Users\niklas\Desktop>g++ main.cpp -o main.exe
C:\Users\niklas\Desktop>main.exe
Char: 72, Column: 0, Line: 0
Char: 105, Column: 1, Line: 0
Char: 13, Column: 0, Line: 1
Char: 10, Column: 0, Line: 2
Char: 89, Column: 1, Line: 2
Char: 111, Column: 2, Line: 2
Char: 117, Column: 3, Line: 2
Char: 33, Column: 4, Line: 2
This example already shows the problem I'm stuck with. One can interpret \r as a new-line, as well as \n. But together (\r\n) they are just a single new-line as well!
The function that updates the line and column numbers is this:
void _processChar(int revue) {
    char chr = _source[_position];
    if (chr == '\r' or chr == '\n') {
        _line += revue;
        _column = 0;
    }
    else {
        _column += revue;
    }
}
Sure, I could just look at the character that follows the one at the current position, but I do not check for NUL termination on the source, because I want to be able to process character streams that may contain \0 characters without stopping at that point.
How can I handle CRLF this way?
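For reference, the look-ahead mentioned above does not need a terminator if the length of the source is known. The following is only a minimal sketch under that assumption: `newline_length` and its `len` parameter are hypothetical names, not part of my Scanner class.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the Scanner class above).
// Returns how many characters the line break starting at src[pos] occupies:
// 2 for "\r\n", 1 for a lone '\r' or '\n', 0 if there is no line break there.
// The explicit length `len` bounds the look-ahead, so no terminator is needed
// and embedded '\0' characters are treated like any other character.
std::size_t newline_length(const char* src, std::size_t len, std::size_t pos) {
    if (pos >= len) {
        return 0;                       // past the end: nothing to inspect
    }
    if (src[pos] == '\r') {
        // Only peek at pos + 1 when it is still inside the buffer.
        return (pos + 1 < len && src[pos + 1] == '\n') ? 2 : 1;
    }
    return src[pos] == '\n' ? 1 : 0;
}
```

A scanner built on this would presumably skip the returned number of characters and bump the line counter once per line break.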
Edit 1: DOH! The following look-behind version seems to work fine. Is it safe in all cases, or am I overlooking an issue?
void _processChar(int revue) {
    char chr = _source[_position];
    bool is_newline = (chr == '\r' or chr == '\n');
    if (chr == '\n' and _position > 0) {
        is_newline = (_source[_position - 1] != '\r');
    }
    if (is_newline) {
        _line += revue;
        _column = 0;
    }
    else {
        _column += revue;
    }
}
Thanks!