0

Problem: when printing items in a hash table, non top-level items do not print correctly. I add items line-by-line from a text file. First, number of lines are determined, and then, the hash table is constructed, with the size found in the first loop of the file.

hash constructor

hash::hash(unsigned int tSize)
{
    this->tableSize = tSize;
    hashTable.resize(tableSize);
    for (int i = 0; i < tableSize; i++)
    {
        hashTable[i] = new item;
        hashTable[i]->firstLetters = "__";
        hashTable[i]->str = "____";
        hashTable[i]->next = NULL;
    }
}

adding items, setting values, and printing them

void hash::add(std::string key)
{
    int index = hafn(key);
    if (hashTable[index]->str == "____")
    {
        hashTable[index]->str = key;
        setFirstLetters(index,key[0],key[1]);
    }
    else
    {
        item *iter = hashTable[index];
        item *n_ptr = new item;
        n_ptr->str = key;
        setFirstLetters(n_ptr, key[0], key[1]);
        n_ptr->next = NULL;
        while (iter->next != NULL)
        {
            iter = iter->next;
        }
        iter->next = n_ptr;
    }
}

void hash::setFirstLetters(item *n_ptr, char a, char b)
{
    n_ptr->firstLetters[0] = a;
    n_ptr->firstLetters[1] = b;
}

void hash::setFirstLetters(int index, char a, char b)
{
    hashTable[index]->firstLetters[0] = a;
    hashTable[index]->firstLetters[1] = b;
}


void hash::print()
{
    int num;
    for (int i = 0; i < tableSize; i++)
    {
        num = numInIndex(i);
        printer(num, i);
        if (num > 1)
        {
            int c = 0;
            for (int j = num - 1; j > 0; j--)
            {
                printer(c, num, i);
                c++;
            }
        }
    }
}

void hash::printer(int num, int i)
{
    item *iter = hashTable[i];
    cout << "-------------------------------" << endl;
    cout << "index = " << i << endl;
    cout << iter->str << endl;
    cout << iter->firstLetters << endl;
    cout << "# of items = " << num << endl;
    cout << "-------------------------------" << endl;
    cout << endl;
}

void hash::printer(int numIn, int num, int i)
{
    item *iter = hashTable[i];
    for (int j = 0; j < numIn; j++)
    {
        iter = iter->next;
    }
    cout << "-------------------------------" << endl;
    cout << "index = " << i << endl;
    cout << iter->str << endl;
    cout << std::flush;
    cout << iter->firstLetters << endl; //this does not work, though the pointer is pointing to the correct values
    //printf("%s\n", iter->firstLetters.c_str()); //this works, even without flushing
    cout << "# of items = " << num << endl;
    cout << "-------------------------------" << endl;
    cout << endl;
}


    struct item
    {
        std::string str;
        std::string firstLetters;
        item *next;
    }; 

The problem is that firstLetters does not print correctly. firstLetters is set correctly. However, in third- and later level items (ie, using the same hash index), firstLetters does not print at all.

In order to be more clear, here is an output example:

-------------------------------
index = 15
will
wi
# of items = 3
-------------------------------

-------------------------------
index = 15
will
wi
# of items = 3
-------------------------------

-------------------------------
index = 15
good

# of items = 3
-------------------------------

Notice, under the heading "adding items, setting values, and printing them," and in the method hash::add(std::string key), I use setFirstLetters() to access the elements inside std::string firstLetters without first initializing them. This causes any change of these values to be lost, essentially. When accessing iter->firstLetters when it's time to print the values, no actual data can be accessed. To avoid this undefined behavior, I changed the definition of hash::add(std::string key) to set the value of firstLetters before attempting to change them.

New definition of hash::add()

void hash::add(std::string key)
{
    int index = hafn(key);
    if (hashTable[index]->str == "____")
    {
        hashTable[index]->str = key;
        setFirstLetters(index,key[0],key[1]);
    }
    else
    {
        item *iter = hashTable[index];
        item *n_ptr = new item;
        n_ptr->firstLetters = "__";        //here
        n_ptr->str = key;
        setFirstLetters(n_ptr, key[0], key[1]);
        n_ptr->next = NULL;
        while (iter->next != NULL)
        {
            iter = iter->next;
        }
        iter->next = n_ptr;
    }
}

Adding this line fixes the output.
This is the new output:

-------------------------------
index = 15
will
wi
# of items = 3
-------------------------------

-------------------------------
index = 15
will
wi
# of items = 3
-------------------------------

-------------------------------
index = 15
good
go
# of items = 3
-------------------------------

This type of behavior counts as "undefined behavior." There are several moving parts here and the only cause of this behavior is due to lack of initialization. In cases where std::cout fails, but printf() does not, make sure any members/variables are initialized before attempting to access them. In specific cases using std::string, the [] operator can only function properly after initializing normally, using the = operator or other member functions of std::string.

Adam
  • 39
  • 1
  • 9
  • I have been using the debugger continually. On the line that `firstLetters` is printed, that variable is correct inside `struct item` – Adam Nov 12 '18 at 23:42
  • You seem to be expecting us to debug your uncompilable code. –  Nov 12 '18 at 23:44
  • I am not. The code does compile. I haven't given every single piece of code, only the relevant ones. I just need help on situations that `cout` will not print, even though the output stream has been flushed. – Adam Nov 12 '18 at 23:45
  • 2
    Bug in there somewhere that isn't jumping out at me. Could we trouble you for a [mcve]? This is important because if you don't know what the bug is, how can you know what is and what isn't relevant? The real beauty of the MCVE is usually you get in part way and the reduced code to bug ratio reveals the bug to you. – user4581301 Nov 12 '18 at 23:46
  • Unrelated: `endl` is a line ending AND a stream flush. Generally you want to buffer data in memory until you're forced to flush as flushing can be really expensive. You can replace most of those `endl`s with a newline character `'\n'` and profit. – user4581301 Nov 12 '18 at 23:48
  • I know that putting together a compile-able example that demonstrates the problem is a huge pain in the butt. However, it is better that you post the complete code instead of each of us making our own complete code, which may miss the bug you want us to help you to solve. Help us help you. – Eljay Nov 12 '18 at 23:54
  • That's, I guess, the problem with the idea of MCVE: generally complete and minimal are incompatible. In this case, I'll focus on complete instead of minimal, and add all 4 files necessary, I'll edit that in – Adam Nov 12 '18 at 23:58
  • About the only thing that could explain your results is undefined behavior. The most likely source is a problem with the pointer. – Mark Ransom Nov 13 '18 at 00:03
  • @MarkRansom Thanks, though when I debug the code the pointer is pointing to the correct reference, and `cout` seems to check the correct string. It just doesn't print. – Adam Nov 13 '18 at 00:07
  • @Adam When creating new secondary `item` objects when adding keys, you don't allocate any memory for `firstLetters` before accessing `firstLetters[0]` and `firstLetters[1]`, so you are trashing random memory (using `firstLetters.at(...)` instead of `firstLetters[...]` would have caught that mistake, by throwing an `std::out_of_range` exception). Only the `hash` constructor is allocating any memory for `firstLetters` in the top-level `item` objects. So, inside of `setFirstLetters()`, you need to add a `firstLetters = "__";` statement before then assigning the individual characters. – Remy Lebeau Nov 13 '18 at 00:12
  • @RemyLebeau Right, but I go through the entire file first, then find how many words there are. After I do that, I allocate memory for `firstLetters` with the constructor. How else should I allocate memory for `firstLetters`? It is already defined as `"__"` at the time of creating a `new item` and adding keys – Adam Nov 13 '18 at 00:13
  • 1
    @Adam You are initializing `firstLetters = "__"` only for the initial top-level `item` objects that the `hash` constructor creates, but you are NOT initializing `firstLetters` to *anything* in the secondary `item` objects that `hash::add()` creates, so `n_ptr->firstLetters` is blank when `setFirstLetters(n_ptr, ...)` is called, so `setFirstLetters()` tries to index into invalid memory. If you want `firstLetters` to alows be initialized as `"__", you should add a constructor to the `item` struct for that. – Remy Lebeau Nov 13 '18 at 00:14
  • And there's undefined behavior in a nutshell. The perverse part is sometimes things look like they're working, as they did with your `printf`. Job well done @RemyLebeau. – Mark Ransom Nov 13 '18 at 00:31
  • @MarkRansom Is it possible to have an answer setup/have this moved so that it can better benefit people searching for similar problems? I tried to edit the question to be as close as possible (although there's still the issue with the fact that 100% of my code is required to reproduce the problem, yet the problem is smaller in scope than 100% of the code) to the requirements – Adam Nov 13 '18 at 00:33
  • @Adam An answer can't be posted until the question first receives 5 votes to be reopened. – Remy Lebeau Nov 13 '18 at 01:00
  • An answer is unlikely to help anyone else, since undefined behavior probably won't show the same symptoms without your exact code. I'm not going to vote to reopen. – Mark Ransom Nov 13 '18 at 01:13

0 Answers0