6

I think it's the very first strtok call that's failing. It's been a while since I've written C and I'm at a loss. Thanks very much.

#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
  char *str = "one|two|three";

  char *tok = strtok(str, "|");

  while (tok != NULL) {
    printf("%s\n", tok);
    tok = strtok(NULL, "|");
  }

  return 0;
}
nc.
  • 7,179
  • 5
  • 28
  • 38
  • 1
    Next time, don't be at a loss: you can use any debugger to find out exactly where the invalid access occurred. Or better yet, use valgrind, which can detect invalid accesses even when they don't crash your program. – John Zwinck Dec 18 '10 at 22:45
  • so, is valgrind a superset of gdb functionality? I'm trying to figure whether it'd be best to do a crash course in gdb, or in valgrind. – nc. Dec 18 '10 at 22:57
  • valgrind and gdb are complementary. valgrind is noninteractive, but substantially pickier about memory accesses than gdb. – zwol Dec 18 '10 at 23:24
  • Tangential editorializing about your code: `strsep` is functionally equivalent to `strtok` but with a superior calling convention. It's a nonstandard BSDism, but everyone except Windows has it, and it's a shame that C99 invented `strtok_r` instead of picking it up. – zwol Dec 18 '10 at 23:25
  • `strsep` has significantly different semantics in the presence of multiple consecutive delimiter characters. – R.. GitHub STOP HELPING ICE Dec 18 '10 at 23:43

3 Answers3

7

String literals should be assigned to a const char*, as modifying them is undefined behaviour. I'm pretty sure that strtok modifies it's argument, which would explain the bad things that you see.

Puppy
  • 144,682
  • 38
  • 256
  • 465
  • 1
    Yes, you are exactly right. `strtok()` modifies its input string – SiegeX Dec 18 '10 at 22:43
  • 2
    You are right: strtok modifies its first argument. Assigning a string literal to a non-const char* is supported for legacy reasons but should not be done in new code. – John Zwinck Dec 18 '10 at 22:43
  • 2
    With GCC, I believe that `-Wwrite-strings` (not included in `-Wall`) will get you a compiler warning when you try to do this. Of course, if your code isn't all const-correct, you could end up with some spurious warnings here too, e.g. when passing a string literal into a function of your own which doesn't modify its arguments, but is declared as `f(char *)` instead of `f(const char *)`. – Cascabel Dec 18 '10 at 22:47
  • @Jefromi - It's not included in `-Wextra` either, which surprised me, although probably for the reasons you state. Now I have to change my default warnings levels... – Chris Lutz Dec 18 '10 at 23:08
  • It's not in `-Wall` nor `-Wextra` because it can be an *enormous* amount of work to make all uses of string literals const-correct, especially in old code, and the odds of your actually finding a bug are pretty low. – zwol Dec 18 '10 at 23:21
2

There are 2 problems:

  1. Make str of type char[]. GCC gives the warning foo.cpp:5: warning: deprecated conversion from string constant to ‘char*’ which indicates this is a problematic line.

  2. Your second strtok() call should have NULL as its first argument. See the docs.

The resulting working code is:

#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
  char str[] = "one|two|three";

  char *tok = strtok(str, "|");

  while (tok != NULL) {
    printf("%s\n", tok);
    tok = strtok(NULL, "|");
  }

  return 0;
}

which outputs

one
two
three
moinudin
  • 134,091
  • 45
  • 190
  • 216
  • I actually don't get that warning out of GCC; it's perfectly clean. I just tried -Wall and still don't see it (gcc version 4.2.1 (Apple Inc. build 5664)). Any suggestions on how to get that? – nc. Dec 18 '10 at 22:56
  • He got that warning because he compiled your code as C++ rather than C. The moral equivalent of `-Wwrite-strings` is on by default in C++. – zwol Dec 18 '10 at 23:23
0

I'm not sure what a "bus" error is, but the first argument to strtok() within the loop should be NULL if you want to continue parsing the same string.

Otherwise, you keep starting from the beginning of the same string, which has been modified, by the way, after the first call to strtok().

Jonathan Wood
  • 65,341
  • 71
  • 269
  • 466
  • sorry, yeah, I was using NULL in subsequent calls, but when I trimmed it down for pasting I missed that part. – nc. Dec 18 '10 at 22:55