
I'm new to C and I've been working on this task for about 7 hours now - please don't say I didn't try.

I want to parse the request path in a self-written web server in C. Let's say I call

http://localhost:8080/hello/this/is/a/test.html

then the server receives

GET /hello/this/is/a/test.html HTTP/1.1

I want to extract /hello/this/is/a/test.html, i.e. the complete string between "GET " (note the white space after GET) and the first white space after the path.

What I tried so far:

int main() {
  ...
  char * getPathOfGetRequest(char *);
  char *pathname = getPathOfGetRequest(buf);

  printf("%s\n\n%s", buf, pathname);
  ...
}

char * getPathOfGetRequest(char *buf) {

  char *startingGet = "GET ";
  char buf_cpy[BUFLEN];
  memcpy(buf_cpy, buf, sizeof(buf));

  char *urlpath = malloc(1000);
  char *path = malloc(1000);
  urlpath = strstr(buf_cpy, startingGet);

  char delimiter[] = " ";

  path = strtok(urlpath, delimiter);
  path = strtok(NULL, delimiter);  

  return path;
}

The pathname always has only 4 correct chars, which may or may not be followed by other unrelated chars, like /hell32984cn)/$"§$. I guess it has something to do with strlen(startingGet), but I can't see the connection. Where is my mistake?

sqe
  • Perhaps it would be helpful if the question indicated, specifically, what parsed result of `/hello/this/is/a/test.html` is expected? Just removal of 'GET ' from the beginning of the string? Returned in allocated storage? – Mahonri Moriancumer Jun 28 '14 at 02:11
  • You're right, thank you. I edited the question. I need to parse everything between the white space after GET and the end of the following string, up to the next white space. – sqe Jun 28 '14 at 02:15
  • unless you absolutely need to get the program under XXkb, just use sscanf - strtok/strsep are horrible. – technosaurus Jun 28 '14 at 02:30
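
The sscanf() suggestion in the last comment could look roughly like the sketch below. The names parse_get_path and PATH_MAX_LEN are made up for illustration and the 256-byte limit is an assumption, not something from the question. Note that white space in a scanf format matches zero or more white-space characters, so this check is looser than an exact "GET " prefix test.

```c
#include <stdio.h>

/* Hypothetical limit for this sketch; the field width in the format
   string below must stay one less than this value. */
#define PATH_MAX_LEN 256

/* Copies the request target of a "GET <path> ..." line into out, which
   must have room for PATH_MAX_LEN bytes.
   Returns 1 on success, 0 if the line does not match. */
int parse_get_path(const char *request, char out[PATH_MAX_LEN])
{
    /* "GET" must match literally; %255s then skips white space, reads at
       most 255 non-white-space characters and appends the '\0'. */
    return sscanf(request, "GET %255s", out) == 1;
}
```

The caller supplies the buffer, so nothing has to be freed afterwards.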

1 Answer


Question code with commentary:

char * getPathOfGetRequest(char *buf) {

  char *startingGet = "GET ";
  char buf_cpy[BUFLEN];
  memcpy(buf_cpy, buf, sizeof(buf));

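The above memcpy will copy only a few bytes from buf to buf_cpy. This is because buf is a pointer to char, so sizeof(buf) is the size of a pointer (typically 4 bytes on a 32-bit build, 8 on a 64-bit one), not the length of the request. With 8-byte pointers only "GET /hel" is copied, which matches the four correct path characters followed by garbage. Instead of sizeof(), it would have been better to use strlen(buf) + 1, so that the terminating '\0' is copied as well, after checking that the result fits into BUFLEN.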
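
To make the difference concrete, here is a minimal sketch. The helper names size_seen_by_sizeof, bytes_to_copy and copy_request are invented for this illustration and are not part of the question's code.

```c
#include <stddef.h>
#include <string.h>

/* Inside a function, a char * parameter is just a pointer, so sizeof
   yields the pointer size no matter how long the string is. */
size_t size_seen_by_sizeof(const char *buf)
{
    return sizeof(buf);
}

/* strlen() counts the characters before the terminator; add 1 so the
   '\0' is copied too. */
size_t bytes_to_copy(const char *buf)
{
    return strlen(buf) + 1;
}

/* Bounded copy: returns 0 on success, -1 if src (including its
   terminator) does not fit into dst_size bytes. */
int copy_request(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (dst_size == 0 || len >= dst_size)
        return -1;
    memcpy(dst, src, len + 1);
    return 0;
}
```

In the question's function, something like copy_request(buf_cpy, sizeof(buf_cpy), buf) would work, because buf_cpy is a real array there and sizeof gives its full size.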

  char *urlpath = malloc(1000);
  char *path = malloc(1000);

  urlpath = strstr(buf_cpy, startingGet);

Perhaps the questioner is not clear on why urlpath was allocated 1000 bytes of memory. In any case, the above assignment will cause that 1000 bytes to be leaked, and defeats the purpose of the 'urlpath=malloc(1000)'.

The actual effect of the above statement is urlpath = buf_cpy; (assuming the request begins with 'GET '), as strstr() returns a pointer to the beginning of 'GET ' inside buf_cpy, or NULL if it is not found.
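
A small sketch of that point: strstr() allocates nothing, and its result is simply an address inside the searched string, so a plain pointer variable is enough. The helper name offset_of is invented for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of needle within haystack, or -1 if it is absent.
   The pointer returned by strstr() points into haystack itself, so
   subtracting the two pointers gives the position of the match. */
long offset_of(const char *haystack, const char *needle)
{
    const char *hit = strstr(haystack, needle);

    return hit ? (long)(hit - haystack) : -1L;
}
```

For a request that starts with "GET " the offset is 0, which is why the assignment boils down to urlpath = buf_cpy.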

  char delimiter[] = " ";

  path = strtok(urlpath, delimiter);

Likewise, the above assignment will cause the 1000 bytes allocated to path to be leaked, and defeats the purpose of the 'path=malloc(1000)' above.

  path = strtok(NULL, delimiter); 

  return path;
}

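Note also that the returned path points into buf_cpy, a local array whose lifetime ends when the function returns, so the caller would be reading invalid memory even if the copy were correct.

An alternative coding: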

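#include <ctype.h>   /* isspace() */
#include <stdio.h>   /* fprintf() */
#include <stdlib.h>  /* malloc() */
#include <string.h>  /* strncmp(), memcpy() */

char *getPathOfGetRequest(const char *buf) 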
   {
   const char *start = buf;
   const char *end;
   char       *path=NULL;
   size_t      pathLen;

   /* Verify that there is a 'GET ' at the beginning of the string. */
   if(strncmp("GET ", start, 4))
      {
      fprintf(stderr, "Parse error: 'GET ' is missing.\n");
      goto CLEANUP;
      }

   /* Set the start pointer at the first character beyond the 'GET '. */
   start += 4;

   /* From the start position, set the end pointer to the first white-space character found in the string. */
   end=start;
   while(*end && !isspace((unsigned char)*end))
      ++end;

   /* Calculate the path length, and allocate sufficient memory for the path plus string termination. */
   pathLen = (end - start);
   path = malloc(pathLen + 1);
   if(NULL == path)
      {
      fprintf(stderr, "malloc() failed. \n");
      goto CLEANUP;
      }

   /* Copy the path string to the path storage. */
   memcpy(path, start, pathLen);

   /* Terminate the string. */
   path[pathLen] = '\0';

CLEANUP:

   /* Return the allocated storage, or NULL in the event of an error, to the caller. */
   return(path);
   }

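And, finally, if 'strtok()' must be used (note that strtok() modifies buf in place, and that the string returned by the POSIX function strdup() must be released with free() by the caller):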

#include <string.h>  /* strtok(), strdup() */

char *getPathOfGetRequest(char *buf)
   {
   char *path  = NULL;

   if(strtok(buf, " "))
      {
      path = strtok(NULL, " ");
      if(path)
         path=strdup(path);
      }

   return(path);
   }
Mahonri Moriancumer
  • This works like a charm - thanks a bunch Mahonri, for your effort and fast reaction! So, could you please explain in a few words what I did wrong or what I could do better? I can't see why my approach didn't work. – sqe Jun 28 '14 at 02:44