I have a buffer that holds a line I read from a CSV file. I split the line with strtok(), using "," as the delimiter. So now my string looks like this:

char buff[BUFFER_SIZE] = "1000" "CAP_SETPCAP" "CAP_NET_RAW"

I now want to compare each section of the string against a known value, but for the life of me I cannot get it to work. I want to do it without hard-coding anything, meaning I don't want to assume how many characters I need to skip. For example, to start at CAP_SETPCAP I don't want to have to write buff+5. Does anybody know a better way to handle this?

#include<stdio.h>
#include<stdlib.h>
#include<string.h>

#define BUFFER_SIZE 1024

int main(int argc, char *argv[]) {

       FILE *fp = fopen("csvtest.csv", "r");
       char buff[BUFFER_SIZE];


       fgets(buff, 1024, fp);
       char *csvData = strtok(buff, ",");
       while(csvData != NULL){
             csvData = strtok(NULL, ",");
       }

      int i;
      while(buff[i] != '\0'){
         strcmp(buff, "CAP_NET_RAW")
         printf("Match found");
         i++;
      }

      //or I wanted to do string comparison, but I kept getting
      //segmentation fault (core dumped)

      char *found;
      found = strstr(buff, "CAP_NET_RAW");
      printf("%s\n", found);

      fclose(fp);

      return 0;
}
Roberto Caboni
ldd12345
  • For what it's worth you don't check the result of your `fgets` call. – tadman Apr 29 '20 at 20:45
  • If these are packets you probably want to set the buffer size to be in line with the expected packet size, like ~1500 bytes, not 1024. Additionally if you have a constant for your `BUFFER_SIZE` then please *use it*, don't just hard-code 1024 all over the place. – tadman Apr 29 '20 at 20:45
  • This code also appears to have numerous crippling syntax errors, like the `if (strcmp(...)` is missing a closing bracket. – tadman Apr 29 '20 at 20:47
  • I copied it by hand, I was just trying to show the process. – ldd12345 Apr 29 '20 at 20:49
  • I did check the result of the fgets as well I left that out because once again i copied by hand – ldd12345 Apr 29 '20 at 20:50
  • Note that your `while (buff[i] ...)` loop will run forever if there's no match, so it's bound to jam eventually. – tadman Apr 29 '20 at 20:50
  • If this isn't the code you're actually using that's going to make debugging your actual code a lot harder. – tadman Apr 29 '20 at 20:50
  • Doesn't `strstr(buff, "CAP_NET_RAW");` find what you are searching for? Does the segfault occur before it? – Roberto Caboni Apr 29 '20 at 20:53
  • no the segfault occurs during that. I commented everything out and just ran the strstr() and it won't work – ldd12345 Apr 29 '20 at 20:55
  • I think you need to do the `strstr` on `csvData` into the loop `while(csvData != NULL){` or not use strtok at all – Ôrel Apr 29 '20 at 20:55
  • What if you substitute the print after `strstr` with `printf("%s\n", found? found : "NULL");` ? – Roberto Caboni Apr 29 '20 at 21:00
  • So I commented everything out except" char *found, found = strstr(buff, "CAP_NET_RAW");, printf("Match found: %s\n", found. but it prints to the screen "CAP_NET_RAW" and "CAP_SYS_ADMIN". – ldd12345 Apr 29 '20 at 21:06
  • I use a state machine to parse csv files like 20 lines of code, strok wont work as commas can be part of the field, plus can segfault just lke the other C lib calls, but if you assume thats it I would still just parse through it in a one line loop, either make a list of field ends in another array or copy each field to another array/string as you go. and to see what I am talking about put some commas and quotes in various fields then save as to a csv file...quotes are the key moreso than commas, but see what you see... – old_timer Apr 29 '20 at 21:08
  • if parsing is not your problem but dealing with the fields that is another story I always operate per field before I fetch the next...YMMV – old_timer Apr 29 '20 at 21:10
  • `CAP_SYS_ADMIN"`? There's not such string in your example. Is it after `"CAP_NET_RAW"`? If yes, it is expected, since the found is a substring starting with `"CAP_NET_RAW"`'til the end of buff. – Roberto Caboni Apr 29 '20 at 21:29
  • I apologize it printed CAP_NET_RAW and CAP_SETPCAP – ldd12345 Apr 29 '20 at 21:32

1 Answer

Your code has three different sections. Let's analyze them:


1. The strtok section

You get the data from the file and then you iterate on strtok:

fgets(buff, 1024, fp);
char *csvData = strtok(buff, ",");
while(csvData != NULL){
    csvData = strtok(NULL, ",");
}

You don't do anything with the tokens you find: csvData is overwritten with the next token on every iteration, and when the loop ends it is NULL.

The only lasting effect is that the commas in the original array buff have been overwritten with '\0'. If you print buff you will only see "1000", because strtok placed a string terminator right after that substring.


2. Searching "CAP_NET_RAW"

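You now iterate on buff[i] until the string terminator. But the string terminator is right after the first substring "1000"! (Note also that i is never initialized, so reading buff[i] is undefined behavior; it should start at 0.)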

int i;
while(buff[i] != '\0'){
    strcmp(buff, "CAP_NET_RAW")
    printf("Match found");
    i++;
}

Furthermore, you search for CAP_NET_RAW, but even without the inner-terminators issue the comparison would never succeed. That's because (1) the string actually present in buff is "CAP_NET_RAW" (with double quotes); (2) that token is the last one on the row, so it will still have the trailing '\n' (fgets doesn't remove it).

By the way: I copied the code after your edit, and now there's no check on the strcmp() return value. I suppose it is a typo. Note: strcmp returns 0 if the strings match.


3. The strstr attempt

Finally you look for the string using the strstr function. That's a clever idea. But as said before, buff doesn't contain it as far as string functions are concerned. The array does still contain those bytes, but string utilities stop at the first '\0' they find.

  char *found;
  found = strstr(buff, "CAP_NET_RAW");
  printf("%s\n", found);

So found will be NULL, and passing NULL for %s is undefined behavior: printf tries to read the characters the pointer refers to (that is, it dereferences the pointer), which typically ends in a segmentation fault.


4. Conclusions

As a very simple way to find the only string you care about, I suggest using strstr alone, without calling strtok first. Alternatively, you can still use strtok, but save each token in its own string (or keep a pointer to it) so that you can access it later.

Roberto Caboni
  • I read the buffer data in from a csv so it actually originally looked like this: 1000,CAP_SETPCAP,CAP_NET_RAW but yes you're right I can't ever get passed the 1000 part of the string when trying to print it back and yes the found returns null. – ldd12345 Apr 29 '20 at 21:30
  • So no, double quotes in your original string? I guessed it from your sample line. Well, in that case just ignore those statements. – Roberto Caboni Apr 29 '20 at 21:32
  • no double quotes, I was just trying to show seperation, I can see how that is confusing – ldd12345 Apr 29 '20 at 21:36
  • I got strstr to work when I put it before strtok(). Thank you. – ldd12345 Apr 29 '20 at 21:42
  • @ldd12345 `%s` tells `printf`: _"The argument corresponding to this format is a string (a pointer to char). Copy to the output all the characters it points to until you find '\0'"_. Resolving the address contained into a pointer in order to read its contents is said "dereferencing" the pointer. – Roberto Caboni Apr 29 '20 at 21:49