I've been having an issue with parsing through a file and the use of seekg(). Whenever a certain character is reached in a file, I want to loop until a condition is met. The loop works fine for the first iteration, but when it loops back, the file seemingly skips a character, which causes the loop to not behave as expected.

Specifically, the loop works fine if it is all contained in one line in the file, but fails when there is at least one newline within the loop in the file.

I should mention I am working on this on Windows, and I feel like the issue arises from how Windows ends lines with \r\n.

Using seekg(-2, std::ios::cur) after looping back fixes the issue when the beginning loop condition is immediately followed by a newline, but does not work for a loop contained in the same line.

The code is structured by having an Interpreter class hold the file pointer and relevant variables, such as the current line and column. This class also has a functional map defined like so:

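    // Define function type for command map
    typedef void (Interpreter::*function)(void);

    // Map for all the commands (pointers to members need the &Interpreter:: form)
    std::map<char, function> command_map = {
        {'+', &Interpreter::increment_cell},
        {'-', &Interpreter::decrement_cell},
        {'>', &Interpreter::increment_ptr},
        {'<', &Interpreter::decrement_ptr},
        {'.', &Interpreter::output},
        {',', &Interpreter::input},
        {'[', &Interpreter::begin_loop},
        {']', &Interpreter::end_loop},
        {' ', &Interpreter::next_col},
        {'\n', &Interpreter::next_line}
    };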

It iterates through each character, deciding if it has functionality or not in the following function:

// Iterating through the file
void Interpreter::run() {
    char current_char;
    if(!this->file.eof() && this->file.good()) {
        
        while(this->file.get(current_char)) {

            // Make sure character is functional command (ie not a comment)
            if(this->command_map.find(current_char) != this->command_map.end()) {

                // Print the current command if in debug mode
                if(this->debug_mode && current_char != ' ' && current_char != '\n') {
                    std::cout << this->filename << ":" << this->line << ":" 
                              << this->column << ": " << current_char << std::endl;
                }

                // Execute the command
                (this->*(command_map[current_char]))();
            }

            // If it is not a functional command, it is a comment. The rest of the line is ignored
            else{
                std::string temp_line = "";
                std::getline(file, temp_line);
                this->line++;
                this->column = 0;
            }
            this->temp_pos = file.tellg();
            this->column++;
        }
    }

    else {
        std::cout << "Unable to find file " << this->filename << "." << std::endl;
        exit(1);
    }

    file.close();
}

The beginning of the loop (signaled by a '[' char) sets the beginning loop position to this->temp_pos:

void Interpreter::begin_loop() {
    this->loop_begin_pointer = this->temp_pos;
    this->loop_begin_line = this->line;
    this->loop_begin_col = this->column;
    this->run();
}

When the end of the loop (signaled by a ']' char) is reached, if the condition for ending the loop is not met, the file cursor position is set back to the beginning of the loop:

void Interpreter::end_loop() {
    // If the cell's value is 0, we can end the loop
    if(this->char_array[this->char_ptr] == 0) {
        this->loop_begin_pointer = -1;
    }
    // Otherwise, go back to the beginning of the loop
    if(this->loop_begin_pointer > -1){
        this->file.seekg(this->loop_begin_pointer, std::ios::beg);
        this->line = this->loop_begin_line;
        this->column = this->loop_begin_col;
    }
}

I was able to put in debugging information and can show stack traces for further clarity on the issue.

Stack trace with one line loop ( ++[->+<] ):

+ + [ - > + < ] [ - > + < ] done.

This works as intended.

Loop with multiple lines:

++[
-
>
+<]

Stack trace:

+ + [ - > + < ] > + < ] <- when it looped back, it "skipped" '[' and '-' characters.

This loops forever since the end condition is never met (ie the value of the first cell is never 0 since it never gets decremented).

Oddly enough, the following works:

++[
-
>+<]

It follows the same stack trace as the first example. This working and the last example not working is what has made this problem hard for me to solve.

Please let me know if more information is needed about how the program is supposed to work or its outputs. Sorry for the lengthy post, I just want to be as clear as possible.

Edit 1: The class has the file object as std::ifstream file;. In the constructor, it is opened with this->file.open(filename), where filename is passed in as an argument.

Idalas
  • Did you open your file in binary mode? Please show a [mre] – Alan Birtles Nov 24 '21 at 21:22
  • @AlanBirtles The file is defined like so in the class: ```std::ifstream file```. In the constructor it is opened as ```this->file.open(filename)``` – Idalas Nov 24 '21 at 21:25
  • That will open in text mode and may be performing some conversions, the most famous of which is combining the carriage return and line feed characters in a Windows format text file into a single newline character. – user4581301 Nov 24 '21 at 22:20
  • @user4581301 Right I had a feeling something like that might be happening. But is that affecting where seekg() goes to, even when seekg() is set to go to the position in the file that the loop starts at (and not the space next to it, which could be a newline)? – Idalas Nov 24 '21 at 23:28
  • @Idalas yes, text mode conversions affect seek positions – Remy Lebeau Nov 24 '21 at 23:43
  • @RemyLebeau Okay so would that mean I need to correct its position by 2 bytes for each newline when it seeks back to the beginning? For example, something like ```this->file.seekg(this->loop_begin_pointer, std::ios::beg); this->file.seekg(-2*newline_count, std::ios::cur);``` – Idalas Nov 25 '21 at 00:16
  • For a file opened in text mode, the only valid positions to pass to `seekg` are zero or a value previously returned by `tellg` (so you can return to the position you were at once). You can't do arithmetic on offsets. – Igor Tandetnik Nov 25 '21 at 04:31
  • The question suffers from a kind of [XY problem](https://xyproblem.info/). You have a solution that doesn't work, you explained what doesn't work, and you ask how to fix the `seekg` issue, when in fact you should describe what problem your code is solving and then explain your approach. Looping over a small file's content is unusual and most probably the wrong approach to your actual issue. – Marek R Nov 30 '21 at 10:34

1 Answer

For a file stream, seekg is ultimately defined in terms of fseek from the C standard library. The C standard has this to say:

7.21.9.2/4 For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET.

So for a file opened in text mode, you can't do any arithmetic on offsets. You can rewind to the beginning, position at the end, or return to the position you were at previously and captured with tellg (which ultimately calls ftell). Anything else would exhibit undefined behavior.
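
As a rough illustration of the "return to a position captured with tellg" case, here is a minimal sketch. The helper name replay_loop_body and its passes parameter are made up for this example and are not part of the question's Interpreter class. It records the tellg() value taken just before '[' is read and later passes that value to seekg unchanged, with no arithmetic applied.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper, not the asker's code: opens `path` in text mode,
// finds the first '[', then re-reads the span from '[' to the matching
// position of the first ']' `passes` times. The only seek target used is
// a value that tellg() itself returned.
std::string replay_loop_body(const std::string& path, int passes) {
    std::ifstream in(path);              // text mode, as in the question
    std::streampos mark = in.tellg();    // position of the char about to be read
    char c;

    // Advance until '[' has been read; `mark` stays one character behind,
    // so it ends up pointing at the '[' itself.
    while (in.get(c) && c != '[') {
        mark = in.tellg();
    }

    std::string trace;
    if (!in) {
        return trace;                    // file missing or no '[' found
    }

    for (int i = 0; i < passes; ++i) {
        in.clear();                      // drop any eof/fail state before seeking
        in.seekg(mark);                  // unmodified tellg() value, no offsets added
        while (in.get(c)) {
            if (c != ' ' && c != '\n' && c != '\r') {
                trace += c;              // record only the command characters
            }
            if (c == ']') {
                break;
            }
        }
    }
    return trace;
}
```

With the multi-line example from the question saved to a file, replay_loop_body(path, 2) returns "[->+<][->+<]". If offset arithmetic is really needed, the stream can instead be opened with std::ios::binary, where positions are plain byte counts; in that case any '\r' in the file arrives as an ordinary character and has to be handled explicitly.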

Igor Tandetnik