My task involves finding the longest common substring in two txt files using suffix arrays. I have done the following:

#include <iostream>
#include <cstring>
#include <algorithm>
#include <fstream>

int main() {
    char* charArrayA = charArrayFromTxtFile("~/txt_file1.txt");
    char* charArrayB = charArrayFromTxtFile("~/txt_file2.txt");

    int lengthA = strlen(charArrayA);
    int lengthB = strlen(charArrayB);

    char* suffixArrayA[lengthA];
    char* suffixArrayB[lengthB];

    for(int i = 0; i < lengthA; i++) { suffixArrayA[i] = &charArrayA[i]; }
    for(int i = 0; i < lengthB; i++) { suffixArrayB[i] = &charArrayB[i]; }
    charArrayA[lengthA] = 0;
    charArrayB[lengthB] = 0;

    ...

    return 0;
}

However, when I run this portion of code, I get the following error at the line containing the SECOND for-loop:

Thread 1: EXC_BAD_ACCESS (code=2, address=0x7ffeef1446e0)

For reference, the function I use to create charArrayA and charArrayB is:

char* charArrayFromTxtFile(std::string fileName) {
    std::ifstream filename;             // Variable for file
    int length;                         // Number of characters
    filename.open(fileName);
    filename.seekg(0, std::ios::end);   // Goes to the end of the file
    length = filename.tellg();          // Location of the end (index, length of file)
    filename.seekg(0, std::ios::beg);   // Go back to the beginning
    char* charArray = new char[length]; // Allocate a char array of "length" file
    filename.read(charArray, length);   // Write characters from txt file into the char array
    filename.close();

    return charArray;
}

Does anybody know why the first txt file doesn't give me any trouble, but the second one does? I'd appreciate any guidance. Thanks so much guys!

P.S. This is my first Stack Overflow question, so hopefully I was clear enough. I'd appreciate any feedback on how the question is written as well! :D

Sebastian
  • For an array of size `n` the highest index you can access is `n-1`. The lines `charArrayA[lengthA]` and `charArrayB[lengthB]` always access one element past the end of the array, which is undefined behavior. – François Andrieux Dec 13 '17 at 20:05
  • `suffixArrayA[lengthA]` is a variable length array and is not standard C++. It's supported by some compilers as an extension, but it's not a portable construct. Notably, gcc supports it by default. – François Andrieux Dec 13 '17 at 20:06
  • Since you are already using `std::string` in `charArrayFromTxtFile`, I don't understand why you chose to return an owning raw pointer to a dynamic array. You could simply return another `std::string` and make everything much simpler and safer. – François Andrieux Dec 13 '17 at 20:08
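
The last comment's suggestion could look roughly like the sketch below. `readTextFile` is a hypothetical name, not something from the original post, and the error handling is an assumption about what the caller would want. One detail worth checking in any version: `std::ifstream` does not expand `~` (that is done by the shell), so a path such as `"~/txt_file1.txt"` is likely to fail to open unless it is replaced with a full path.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original post): reads a whole file
// into a std::string, which owns its memory and tracks its own length,
// so no manual null termination or delete[] is needed.
std::string readTextFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        // Assumption: failing loudly is preferable to returning garbage.
        throw std::runtime_error("cannot open " + path);
    }
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

With this version, `text.size()` replaces `strlen()`, and `text.c_str()` still provides a null-terminated `const char*` if a C-style pointer is required.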

1 Answer

Your charArrayFromTxtFile() function does not null-terminate the string charArray.

charArray[length - 1] = '\0';

This must be done before you call strlen() on the arrays.

Corey Taylor
  • Close: `char* charArray = new char[length+1];` and then `charArray[length] = '\0';` Otherwise you wipe out the last byte in the file. – user4581301 Dec 13 '17 at 20:39
  • I left it to the user to figure out what valid input they need. This simply shows the null termination. – Corey Taylor Dec 13 '17 at 20:41
  • You will find that, more often than not, that approach just sets you up for a volley of questions that should have been resolved with the first answer. – user4581301 Dec 13 '17 at 20:53
  • The program is incomplete and cannot be run. I wouldn't answer anything else about it. – Corey Taylor Dec 13 '17 at 20:54
  • I apologise for the late response guys, thanks for your feedback! I tried null terminating my char arrays, and I'm still running into the same EXC_BAD_ACCESS error. What's interesting is that if I run the program with just one txt file, it works just fine, but as soon as I attempt to run it with two files (that is, charArrayA and charArrayB), it gives me problems. I also tried dynamically allocating the char array outside the function, but it gives me the same error. I hope that clears up things more. (By the way, I am using char arrays because my professor specifically asked for them) – Sebastian Dec 15 '17 at 17:55
  • Did you remove the bad code? charArrayA[lengthA] = 0; charArrayB[lengthB] = 0; – Corey Taylor Dec 15 '17 at 18:24
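
Putting the comments together, a char-array version might look like the sketch below. It allocates `length + 1` bytes so the terminator does not overwrite file data, and it stores the suffix pointers in a heap-allocated `std::vector` instead of a variable length array. The thread does not confirm the cause of the remaining crash. One possible explanation is that the two stack-allocated pointer arrays (typically 8 bytes per character on a 64-bit system) exceed the stack limit for large files, which would fit the observation that one file works and two do not. `makeSuffixPointers` is a hypothetical helper, and sorting with `strcmp` is an assumption about the elided part of the program.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Variant of the post's charArrayFromTxtFile: allocates length + 1 bytes
// and null-terminates after the last byte actually read. Returns nullptr
// if the file cannot be opened. The caller must delete[] the result.
char* charArrayFromTxtFile(const std::string& fileName) {
    std::ifstream file(fileName, std::ios::binary);
    if (!file) return nullptr;
    file.seekg(0, std::ios::end);
    const std::streamoff length = file.tellg();
    if (length < 0) return nullptr;
    file.seekg(0, std::ios::beg);
    char* charArray = new char[static_cast<std::size_t>(length) + 1];
    file.read(charArray, length);
    charArray[file.gcount()] = '\0';
    return charArray;
}

// Hypothetical helper: one pointer per suffix, kept on the heap rather
// than in a stack-allocated variable length array, then sorted
// lexicographically (assumed to be what the elided code does).
std::vector<const char*> makeSuffixPointers(const char* text) {
    const std::size_t n = std::strlen(text);
    std::vector<const char*> suffixes(n);
    for (std::size_t i = 0; i < n; ++i) suffixes[i] = text + i;
    std::sort(suffixes.begin(), suffixes.end(),
              [](const char* a, const char* b) { return std::strcmp(a, b) < 0; });
    return suffixes;
}
```

If `std::vector` is not allowed for the assignment, `new const char*[n]` paired with `delete[]` keeps the array on the heap in the same way.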