-1

Hello and TIA for your help. As I am new to to posting questions, I welcome any feedback on how this quesiton has been asked. I have researched much in SO without finding what I thought I was looking for.

I'm still working on it, and I'm not really good at C.

My purpose is extracting data from certain specific tags from a given XML and writing it to file. My issue arises because as I try to fill up the data struct I created for this purpose, at a certain point the realloc() function gives me a pointer to an address that's out of bounds.

If you look at this example

#include <stdio.h>

int main() {
    char **arrayString = NULL;
    char *testString;
    testString = malloc(sizeof("1234567890123456789012345678901234567890123456789"));
    strcpy(testString, "1234567890123456789012345678901234567890123456789");
    int numElem = 0;
    while (numElem < 50) {
        numElem++;
        arrayString = realloc(arrayString, numElem * sizeof(char**));
        arrayString[numElem-1] = malloc(strlen(testString)+1);
        strcpy(arrayString[numElem-1], testString);
    }
    printf("done\n");
    return 0;
}

it does a similar, but simplified thing to my code. Basically tries to fill up the char** with c strings but it goes to segfault. (Yes I understand I am using strcpy and not its safer alternatives, but as far as I understand it copies until the '\0', which is automatically included when you write a string between "", and that's all I need)

I'll explain more in dephth below.

In this code i make use of the libxml2, but you don't need to know it to help me.

I have a custom struct declared this way:

 struct List {
    char key[24][15];
    char **value[15];
    int size[15];
 };

struct List *list; //i've tried to make this static after reading that it could make a difference but to no avail

Which is filled up with the necessary key values. list->size[] is initialized with zeros, to keep track of how many values i've inserted in value.

value is delcared this way because for each key, i need an array of char* to store each and every value associated with it. (I thought this through, but it could be a wrong approach and am welcome to suggestions - but that's not the purpose of the question)

I loop through the xml file, and for each node I do a strcmp between the name of the node and each of my keys. When there is a match, the index of that key is used as an index in the value matrix. I then try to extend the allocated memory for the c string matrix and then afterwards for the single char*.

The "broken" code, follows, where

  • read is the index of the key abovementioned.
  • reader is the xmlNode
  • string contained the name of the xmlNode but is then freed so consider it as if its a new char*
  • list is the above declared struct
if (xmlTextReaderNodeType(reader) == 3 && read >= 0)
    {
        /* pull out the node value */
        xmlChar *value;
        value = xmlTextReaderValue(reader);     
        if (value != NULL) {
            free(string);
            string=strdup(value);           
            /*increment array size */
            list->size[read]++;
            /* allocate char** */ list->value[read]=realloc(list->value[read],list->size[read] * sizeof(char**));
            if (list->value[read] == NULL)
                return 16;
            /*allocate string (char*) memory */
            list->value[read][list->size[read]-1] = realloc(list->value[read][list->size[read]-1], sizeof(char*)*sizeof(string));
            if (list->value[read][list->size[read]-1] == NULL)
                return 16;
            /*write string in list */
            strcpy(list->value[read][list->size[read]-1], string);
        }
        /*free memory*/
        xmlFree(value);
    }
    xmlFree(name);
    free(string);

I'd expect this to allocate the char**, and then the char*, but after a few iteration of this code (which is a function wrapped in a while loop) i get a segfault.

Analyzing this with gdb (not an expert with it, just learned it on the fly) I noticed that indeed the code seems to work as expected for 15 iteration. At the 16th iteration, the list->value[read][list->size[read]-1] after the size is incremented, list->value[read][list->size[read]-1] points to a 0x51, marked as address out of bounds. The realloc only brings it to a 0x3730006c6d782e31, still marked as out of bounds. I would expect it to point at the last allocated value.

Here is an image of that: https://i.stack.imgur.com/WQhXt.jpg

How can I properly allocate the needed memory without going out of bounds?

9Snick4
  • 31
  • 7

1 Answers1

1

Your code has quite a few problems:

  1. You are not including all the appropriate headers. How did you get this to compile? If you are using malloc and realloc, you need to #include <stdlib.h>. If you are using strlen and strcpy, you need to #include <string.h>.
  2. Not really a mistake, but unless you are applying sizeof to a type itself you don't have to use enclosing brackets.
  3. Stop using sizeof str to get the length of a string. The correct and safe approach is strlen(str)+1. If you apply sizeof to a pointer someday you will run into trouble.
  4. Don't use sizeof(type) as argument to malloc, calloc or realloc. Instead, use sizeof *ptr. This will avoid your incorrect numElem * sizeof(char**) and instead replace it with numElem * sizeof *arrayString, which correctly translates to numElem * sizeof(char*). This time, though, you were saved by the pure coincidence that sizeof(char**) == sizeof(char*), at least on GCC.
  5. If you are dynamically allocating memory, you must also deallocate it manually when you no longer need it. Use free for this purpose: free(testString);, free(arrayString);.
  6. Not really a mistake, but if you want to cycle through elements, use a for loop, not a while loop. This way your intention is known by every reader.

This code compiles fine on GCC:

#include <stdio.h> //NULL, printf
#include <stdlib.h> //malloc, realloc, free
#include <string.h> //strlen, strcpy

int main()
{
    char** arrayString = NULL;
    char* testString;
    testString = malloc(strlen("1234567890123456789012345678901234567890123456789") + 1);
    strcpy(testString, "1234567890123456789012345678901234567890123456789");
    for (int numElem = 1; numElem < 50; numElem++)
    {
        arrayString = realloc(arrayString, numElem * sizeof *arrayString);
        arrayString[numElem - 1] = malloc(strlen(testString) + 1);
        strcpy(arrayString[numElem - 1], testString);
    }
    free(arrayString);
    free(testString);
    printf("done\n");
    return 0;
}
DarkAtom
  • 2,589
  • 1
  • 11
  • 27
  • Thank you for pointing out the errors. As you can see I'm still learning. Weird about point 2, as I was able to compile it on my machine. I understand about point 6, but compiler issues won't allow me to use for loops. Would you be so kind to explain points 3 and 4? – 9Snick4 Sep 16 '19 at 08:15
  • @9Snick4 point 3: Declare a `char*` pointer and allocate memory into it using `malloc`. Then, use `strcpy` to copy a string into it. If you use `strlen` on the pointer, you get the length of the string without the NULL character. If you use `sizeof` you get `sizeof(char*)` which is not what you want. – DarkAtom Sep 16 '19 at 20:04
  • point 2 was a mistake, I compiledy code wrong when I wrote it. The correct statement is: you don't have to enclose the object within () when applying `sizeof` to it but you **can** if you want. If you are applying `sizeof` to a type, then you **have to** use () to enclose it. – DarkAtom Sep 16 '19 at 20:07
  • @9Snick4 point 4: don't apply sizeof to the type of the pointer when using `malloc`, but rather to the type you point to. You used `sizeof(char**)` in you `malloc` when you should have used `sizeof(char*)`. In order to avoid mistakes like this you can simply use `sizeof *ptr` and the compiler will get the correct type for you. – DarkAtom Sep 16 '19 at 20:10
  • Thank you for taking the time to write that to me. I appreciate that you shared your expertise with me. – 9Snick4 Sep 17 '19 at 07:41