1

I am trying to implement a binary search tree in C and am getting a segmentation fault when trying to run my code. Essentially in my code I am reading in a line of data from a file, creating a node for it and then inserting that into my binary search tree. I am not even sure if my implementation for this code is correct as I'm getting a segmentation fault when I try to run it. I'm very new to C programming and especially allocating memory so I am aware there are probably a large amount of errors in my code, even in the core structure of the code itself. As such any help with finding the reason behind the segmentation fault or help with the core structure of the code itself would be appreciated.

TheAddie
  • 73
  • 1
  • 6

4 Answers4

1

This is maybe more of a comment than an answer, but I don't have enough "reputation" to comment, so...

The problem has to do with the use, and not, of pointers, as commented by @Lundin earlier.

In this line: *node->left = insertNode(node->left, newNode);

node->left can be NULL. Inside the call to insertNode, you check for this and assign newNode to the local copy of node->left, but that has no effect on node->left, so when the return value of insertNode is written to the location pointed to by node-left (NULL), you have a guaranteed crash!

  • So would I fix this by assigning newNode to the actual copy of node-> left using a pointer? – TheAddie Aug 30 '18 at 09:34
  • Redeclaring insertNode like this would probably be a step in the right direction: void insertNode(bst_t **node, bst_t *newNode). That way you can work on the actual pointer that may be NULL. Since you are mixing passing pointers and values all over the code, it is difficult to suggest local fixes. – Johnny Johansson Aug 30 '18 at 10:16
0

You are free'ing a pointer to your stack :

int main(int argc, char *argv[]) {

bst_t root;                                       <<< root is allocated on stack
bst_t newNode;
int c;

while ((c = getchar()) != EOF){
  newNode = createNode();
  root = insertNode(&root,&newNode);
  }
  freeTree(&root);                                <<< Your are passing a pointer to root ( pointer to a stack element )
  return 0;
}



void freeTree(bst_t *parent){                    <<< parent is a pointer to a stack element
  if(! parent){
    return;
  }
  freeTree(parent->left);
  freeTree(parent->right);
  free(parent);                                  <<< you free a stack element => segmentation fault
}

You either have to allocated the root node so it can be free'ed or handle the root node in your freeTree


EDIT : Here is the syntax to malloc it :

bst_t *root = malloc(sizeof(bst_t));

malloc function returns a pointer to a newly allocated space, and takes as parameter the size of the allocation.

side note : If you wanna be really clean about it, malloc can return null if there is no memory available ( very rare ), so it is a good practice to check the return.
more info

Yanis.F
  • 612
  • 6
  • 18
  • How would I allocate the root note so it can be freed here? Sorry my understanding of pointers and allocating memory is not very great – TheAddie Aug 30 '18 at 09:30
0

I believe the problem is in the readData function. When you use scanf, you have to use the address-of operator & to read into variables.

The following code:

void readData(bst_t *node) {
while (scanf("%[^,],%[^,],%[^,],%[^,],%[^,],%[^,],%[^,],%[^,],%[^,],"
             "%[^,],%[^,],%[^,],%[^,],%[^,]",
             node->data.ID, node->data.name, node->data.sex,
             node->data.height, node->data.weight, node->data.team,
             node->data.NOC, node->data.games, node->data.year,
             node->data.season, node->data.city, node->data.sport,
             node->data.event, node->data.medal) == 14) {
    continue;
}

should be changed to:

void readData(bst_t *node) {
while (scanf("%[^,],%[^,],%[^,],%[^,],%[^,],%[^,],%[^,],%[^,],%[^,],"
             "%[^,],%[^,],%[^,],%[^,],%[^,]",
             &node->data.ID, &node->data.name, &node->data.sex,
             &node->data.height, &node->data.weight, &node->data.team,
             &node->data.NOC, &node->data.games, &node->data.year,
             &node->data.season, &node->data.city, &node->data.sport,
             &node->data.event, &node->data.medal) == 14) {
    continue;
}

Also some modifications you should make:
1. Pass arguments and return values using pointers to structures rather than structures.
2. Allocate memory on the heap and not on the stack when dealing with linked lists (Even for the root node).
3. When allocating memory on the heap, make sure that the size allocated is the size of the structure and not the size of pointer to structure.
4. Make sure you free all the memory allocated on the heap and don't mistakenly free the memory allocated on the stack as you seem to be doing here.

P.W
  • 26,289
  • 6
  • 39
  • 76
  • Hi, thanks for the help although I'm a bit confused. I understand about using pointers to structs rather than the actual structs but I'm not quite sure where I'm not doing that? (My level of understanding with pointers is not great). Additionally in terms of the heap and the stack, I think I understand these concepts but where do they come into the program here? As in how would I allocate and free the memory on the heap rather than the stack. Cheers – TheAddie Aug 30 '18 at 09:27
  • The functions `createNode` and `insertNode` are returning structs and not pointers to structs. when you use malloc, calloc or realloc to allocate memory, you are allocating memory on the heap. When you just declare a variable (and not a pointer to it) you are placing that variable on the stack. For example: `int* i; i = (int*) malloc(sizeof(int)); //heap allocation` `int j; //stack allocation` – P.W Aug 30 '18 at 09:34
  • So would changing the declaration in createNode to `bst_t *newNode;` `newNode = (struct bst_t *)malloc(sizeof(struct bst_t));` fix this? – TheAddie Aug 30 '18 at 09:48
  • What a function returns is preceded before the function. `bst_t insertNode(bst_t *node, bst_t *newNode)` should be `bst_t* insertNode(bst_t *node, bst_t *newNode)`. And of course, what you said in your comment also is correct. – P.W Aug 30 '18 at 09:53
  • Ok I think I understand this now and have fixed it in the code. However obviously there is still more wrong with the code to cause segmentation faults, assuming the allocation is now fine then there is the freeing that's causing the issue, so how would I go about freeing the heap rather than the stack here? – TheAddie Aug 30 '18 at 10:02
  • Declare `root` and `newNode` in `main` as pointers to `bst_t` and memory will be allocated on the heap for them. – P.W Aug 30 '18 at 10:04
  • Ok thanks very much, still getting seg faults but I definitely am understanding this more! – TheAddie Aug 30 '18 at 10:16
  • I tested out the code after making the changes on my PC and it works fine. – P.W Aug 30 '18 at 10:21
  • I think this may be as I am trying in with a large data set 100,000+ records which is what gives a segmentation fault but ill update my code in the main post to see if theres anything I missed, if you get a chance to look at it that would be much appreciated – TheAddie Aug 30 '18 at 10:30
  • When I just run the code by itself I get no segmentation fault it seems, just with the dataset – TheAddie Aug 30 '18 at 10:47
0

There are multiple problems in your code:

  • root is uninitialized in the main function. It must be initialized to NULL before passing it to insertNode().

  • you should only parse a single line in readdata() and return a success indicator.

  • you should iterate in the main loop until readdata fails to read more records from the file. You current test while ((c = getchar()) != EOF) is inappropriate and corrupts this input stream.

Here are modified versions of the readdata and main functions:

bst_t *createNode(void) {
    bst_t *newNode;
    newNode = (struct bst_t *)malloc(sizeof(struct bst_t));
    newNode->left = NULL;
    newNode->right = NULL;
    return newNode;
}

int readData(bst_t *node) {
    return scanf("%[^,],%[^,],%[^,],%[^,],%[^,],%[^,],%[^,],%[^,],%[^,],"
                 "%[^,],%[^,],%[^,],%[^,],%[^,]",
                 node->data.ID, node->data.name, node->data.sex,
                 node->data.height, node->data.weight, node->data.team,
                 node->data.NOC, node->data.games, node->data.year,
                 node->data.season, node->data.city, node->data.sport,
                 node->data.event, node->data.medal) == 14;
}

int main(int argc, char *argv[]) {
    bst_t *root = NULL;
    bst_t *newNode;

    while ((newNode = createNode()) != NULL) {
        if (readData(newNode)) {
            /* record was read correctly: insert into the tree */
            root = insertNode(root, newNode);
        } else {
            /* end of file or input error: stop reading the file */
            free(newNode);
            break;
        }
    }
    freeTree(root);
    return 0;
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • While this does seem to make sense to me, when implementing it this way the code never actually executes the if loop inside the while loop in main. I.e readData is never called an no nodes are created – TheAddie Aug 31 '18 at 05:16
  • It does however fix segmentation issues – TheAddie Aug 31 '18 at 05:16
  • You might have missed some parentheses when copying the code. Do you retype from the answer or use cut and paste directly? – chqrlie Aug 31 '18 at 08:08
  • Try the above version of `createNode` that does not read the data. – chqrlie Aug 31 '18 at 18:16
  • Maybe a problem with the readData function reading too much or too little on the first run through and therefore not satisfying the conditions on the next? – TheAddie Sep 01 '18 at 01:35
  • After doing a lot of testing it seems that the issue definitely is that after the first readData function it acts as though there is no more data to be read. Also I'm not sure if this was clear but I am reading in from a file using stdin – TheAddie Sep 01 '18 at 08:39
  • @TheAddie: `readData` returns false if there is no data to read **or** if the data does not match the format string. If the CSV file contains empty fields, `scanf()` cannot parse them with `%^,]` because the conversion specification fails if there is no character different from `,` for any given field. Can you post in an EDIT the first few lines for the CSV file? – chqrlie Sep 02 '18 at 16:11
  • Sorry for the late response but I've managed to fix the issue, using fgets instead of scanf and reading per line manages to get the desired result. – TheAddie Sep 03 '18 at 02:07
  • @TheAddie: excellent improvement. reading lines with `fgets()` and scanning them with `sscanf()` or another more precise method is indeed a better approach. – chqrlie Sep 03 '18 at 17:51