I'm having trouble figuring out where and why I'm getting a segmentation fault.

I'm writing a C program that prompts the user for a regular expression, compiles it, and then reads a string containing multiple sentences:

int main(void){

  char RegExp[50];
  regex_t CompiledRegExp;
  char *para;
  char delim[] = ".!?,";
  char *sentence;
  char *ptr1;

  printf("Enter regular expression: ");
  fgets(RegExp, 50, stdin);

  if (regcomp(&CompiledRegExp, RegExp, REG_EXTENDED|REG_NOSUB) != 0) {
    printf("ERROR: Something wrong in the regular expression\n");
    exit(EXIT_FAILURE);
  }

  printf("\nEnter string: ");

strtok_r then splits the string on any of the delimiters .,?! and each resulting token (sentence) is passed as the string argument to regexec, which checks whether the previously compiled regular expression matches anywhere in that token:

  if (fgets(para, 1000, stdin)) {

    char *ptr = para;
    sentence = strtok_r(ptr, delim, &ptr1);

    while(sentence != NULL){

      printf("\n%s", sentence);

      if (regexec(&CompiledRegExp,sentence,(size_t)0,NULL,0) == 0) {
        printf("\nYes");
      } else {
        printf("\nNo");
      }
      ptr = ptr1;
      sentence = strtok_r(ptr, delim, &ptr1);

    }
  }
  regfree(&CompiledRegExp);
}

It's probably a silly mistake on my part, but any help in locating the cause of the segfault would be greatly appreciated!

EDIT: Moved regfree to a more suitable location. However, the segfault still occurs. I'm pretty sure it has something to do with either how the regular expression is being read in or how it is being compared in regexec. Clueless, though.

too honest for this site
higz555
  • What about the debugger? – Dan Mašek Apr 22 '16 at 22:27
  • Compile the program for debugging and run the program under a debugger. The debugger will tell you exactly what happened. – wallyk Apr 22 '16 at 22:27
  • The gdb debugger doesn't give me any specifics - just claims that a segfault was found – higz555 Apr 22 '16 at 22:31
  • Afraid to say you are using the debugger wrong. When GDB halts, *bt* will list the stack trace leading up to the halt and *print nameOfVariable* will print out the current state of nameOfVariable. When stopped for a segfault, you can then look at what led up to it and start reading the variables to see which may have contributed to badness. – user4581301 Apr 22 '16 at 23:12
  • Your regex isn't working because you didn't cut the newline from fgets off it. – Joshua Apr 22 '16 at 23:26
  • @Joshua Adding `len = strlen(para); para[len-1] = '\0';` still causes regex to fail – higz555 Apr 22 '16 at 23:37
  • @Joshua Forgot I had two fgets. Eliminating the newline character from both made it work. Thank you! – higz555 Apr 22 '16 at 23:46
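
The newline fix discussed in these comments can be sketched as a small helper. This is only an illustration, not the asker's final code: the name chomp is made up here. It uses strcspn rather than len-1 indexing, so it also behaves when the input is empty or does not end in a newline (for example, when the input ends at EOF), cases in which `para[len-1] = '\0'` would cut a real character or index before the start of the buffer.

```c
#include <string.h>

/* Hypothetical helper (not from the thread): cut the string at the
 * first '\n', if there is one. strcspn returns the length of the
 * leading run that contains no '\n', so when no newline is present
 * this just rewrites the existing terminator. */
void chomp(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}
```

In the question's program it would be called once after each of the two fgets calls, on RegExp and on para.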

3 Answers

Instead of this:

char *para;
fgets(para, 1000, stdin);

Write this:

char para[1000];
fgets(para, 1000, stdin);

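In the first variant, para is an uninitialized pointer, so it holds an indeterminate address, and fgets writes the user-entered string to wherever that address happens to be. Most probably that address is invalid, which crashes your program immediately. In the second variant, para is an array of 1000 bytes, so fgets has real storage to write into.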
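
If para has to stay a pointer (say, because the buffer is too large for the stack), the same idea applies: it must point at real storage before fgets is called. A minimal sketch using malloc, where the helper name read_paragraph is hypothetical and the size is supplied by the caller:

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): allocate `size` bytes and
 * read one line from `in` into them. Returns NULL on a bad size,
 * allocation failure, EOF, or read error. The caller frees the result. */
char *read_paragraph(FILE *in, size_t size)
{
    if (size == 0 || size > INT_MAX)   /* fgets takes an int count */
        return NULL;

    char *buf = malloc(size);
    if (buf == NULL)
        return NULL;

    if (fgets(buf, (int)size, in) == NULL) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

With this, the question's code could use `para = read_paragraph(stdin, 1000);` followed by a NULL check, and `free(para);` once the string is no longer needed.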

Roland Illig
  • Fixed segmentation fault - thank you! Now my regular expressions aren't being correctly analysed. Back to the drawing board. – higz555 Apr 22 '16 at 22:52

You called regfree inside the loop. The second time around the loop you call regexec on freed memory with undefined behavior.
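
The lifetime this answer describes (compile once, match as often as needed, free once after the last match) might look like the sketch below. The function count_matches and its parameters are invented for illustration. Only regcomp, regexec, and regfree are the real POSIX calls, used with the same flags as in the question.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): count how many of the n
 * strings in `lines` match `pattern`. Returns -1 if the pattern does
 * not compile. */
int count_matches(const char *pattern, const char *const *lines, size_t n)
{
    regex_t re;
    int hits = 0;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;               /* nothing was compiled, nothing to free */

    for (size_t i = 0; i < n; i++) {
        /* re stays valid for every iteration of the loop */
        if (regexec(&re, lines[i], 0, NULL, 0) == 0)
            hits++;
    }

    regfree(&re);                /* exactly once, after the last regexec */
    return hits;
}
```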

Joshua

You are using strtok_r() incorrectly.

To parse a string with strtok_r(), in the first call the first argument is a pointer to the string you want parsed. Subsequent calls to strtok_r() to parse the same string should have NULL passed as the first argument, with the same third argument so the function can pick up where it left off. What you're doing:

ptr = ptr1;  
sentence = strtok_r(ptr, delim, &ptr1); 

makes no sense.
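
For comparison, here is a sketch of the documented calling pattern: the string on the first call, NULL on every later call, and the same save pointer throughout. The helper name split_sentences and its out/max parameters are assumptions made for this example. The delimiter set used in the usage note is the one from the question.

```c
#define _POSIX_C_SOURCE 200809L  /* expose strtok_r under strict -std modes */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): split `text` in place on
 * any character in `delims`, storing up to `max` token pointers in
 * `out`. Returns the number of tokens stored. */
size_t split_sentences(char *text, const char *delims, char **out, size_t max)
{
    size_t n = 0;
    char *save = NULL;

    for (char *tok = strtok_r(text, delims, &save);   /* first call: the string */
         tok != NULL && n < max;
         tok = strtok_r(NULL, delims, &save)) {       /* later calls: NULL */
        out[n++] = tok;
    }
    return n;
}
```

Called as `split_sentences(para, ".!?,", tokens, 8)` on a writable buffer, it leaves each sentence NUL-terminated inside para. Leading spaces are kept because a space is not in the delimiter set.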

Michael Burr
  • My understanding was that the pointer within strtok_r was pointing to the split string after the delimiter was found and so it could recursively cut through the string. It works for me. – higz555 Apr 22 '16 at 23:01
  • Makes sense to me. I use strtok_r like that a lot. – Joshua Apr 22 '16 at 23:25