
The language I am working in is C.

I am trying to use a mix of built-in C string functions to take a list of space-separated tokens and "convert" it into a list in which each quoted section is a single token.

A string like

echo "Hello 1 2 3 4" test test2

gets converted to

[echo] ["Hello] [1] [2] [3] [4"] [test] [test2]

I then use my code (at bottom) to attempt to convert it into something like

[echo] [Hello 1 2 3 4] [test] [test2]

For some reason the second 'token' in the quoted section gets overwritten. Here's a snippet of the code that runs over the token list and converts it to the new one.

    for (int i = 0; i < counter; i++) {
            if ( (strstr(tokenized[i],"\"") != NULL) && (inQuotes == 0)) {
                    inQuotes = 1;
                    tokenizedQuoted[quoteCounter] = tokenized[i];
                    strcat(tokenizedQuoted[quoteCounter]," ");
            } else if ( (strstr(tokenized[i],"\"") != NULL) && (inQuotes == 1)) {
                    inQuotes = 0;
                    strcat(tokenizedQuoted[quoteCounter],tokenized[i]);
                    quoteCounter++;
            } else {
                    if (inQuotes == 0) {
                            tokenizedQuoted[quoteCounter] = tokenized[i];
                            quoteCounter++;
                    } else if (inQuotes == 1) {
                            strcat(tokenizedQuoted[quoteCounter], tokenized[i]);
                            strcat(tokenizedQuoted[quoteCounter], " ");
                    }
            }
    }

Clay Benson

1 Answer


In short, appending a space to a string means that the memory the `char *` points to needs one more byte. Since you do not provide it, `strcat` pushes the terminator one byte forward and overwrites the first byte of the following "word" with `\0`, so the `char *` to that word is interpreted as the empty string. Note that writing to a location that has not been reserved is undefined behavior, so really ANYTHING could happen (from a segmentation fault to "correct" results with no errors).
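To make the layout concrete, here is a minimal sketch that only inspects addresses and never writes out of bounds. It assumes the tokens were produced by `strtok` over a single buffer, which the question does not show; `tokens_are_adjacent` and `demo_adjacent` are hypothetical names used for illustration.

```c
#include <string.h>

/* Returns 1 when the byte after a's terminator is the first byte of b,
   meaning there is no spare room to append anything to a. */
static int tokens_are_adjacent(const char *a, const char *b)
{
    return a + strlen(a) + 1 == b;
}

/* Hypothetical demo: tokenize a line the way the question presumably does
   (an assumption) and compare the opening quoted token with its neighbor. */
static int demo_adjacent(void)
{
    char line[] = "echo \"Hello 1 2\" test";
    char *tok_echo  = strtok(line, " ");   /* echo   */
    char *tok_hello = strtok(NULL, " ");   /* "Hello */
    char *tok_one   = strtok(NULL, " ");   /* 1      */

    (void)tok_echo;                        /* not needed for the check */
    return tokens_are_adjacent(tok_hello, tok_one);
}
```

Under that assumption the check reports that the tokens are adjacent: `strtok` replaced the separating space with `\0`, so appending even one character to `"Hello` lands on the `1` that follows it.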

Use `malloc` to create a new buffer for the expanded result with enough bytes for it (do not forget to free the old buffers if they were `malloc`'d).
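As a rough sketch of that advice, the helper below measures the tokens first, allocates once, and only then concatenates. `join_tokens` is a hypothetical name, not something from the question, and it keeps the quote characters in place; stripping them would be a separate step.

```c
#include <stdlib.h>
#include <string.h>

/* Joins tokens[first..last] with single spaces into a new malloc'd buffer.
   Returns NULL if the allocation fails. The caller must free the result. */
static char *join_tokens(char *const tokens[], int first, int last)
{
    size_t total = 1;                        /* room for the final '\0' */
    for (int i = first; i <= last; i++)
        total += strlen(tokens[i]) + 1;      /* token plus one separator */

    char *joined = malloc(total);            /* at most one spare byte */
    if (joined == NULL)
        return NULL;

    joined[0] = '\0';
    for (int i = first; i <= last; i++) {
        strcat(joined, tokens[i]);           /* safe: space was reserved */
        if (i < last)
            strcat(joined, " ");
    }
    return joined;
}
```

In the question's loop, one option would be to remember the index of the token that opens the quote and, on reaching the closing one, store `join_tokens(tokenized, start, i)` in `tokenizedQuoted[quoteCounter]`. Entries created this way have to be passed to `free` later, while entries that merely alias `tokenized[i]` must not be.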

SJuan76
  • Is there any way to do this if I never malloc'd the array at the start? My 2 arrays are char* tokenized[maxArgs] and tokenizedQuoted[maxArgs] – Clay Benson Dec 14 '13 at 18:05
  • The original storage of the strings is not relevant (other than for avoiding a memory leak). Do you need to enlarge a string? Reserve enough space for the new string and perform your operation in that space. If you used `strtok`, you only need to free the original buffer. It is hard to give more advice without seeing the code that fills these arrays. – SJuan76 Dec 14 '13 at 18:24