Pushing to a stack containing ONLY unique values in C

Question

I've implemented a stack with pointers, that works like it's suppose too. Now, I need it push to the stack, without it pushing a duplicate. For example, if I push '2' into the stack, pushing another '2' will still result with only one '2' in the stack because it already exists.

Below is how I went about trying to create the new push function. I know that I'm suppose to traverse the stack and check it for the element I'm adding, but I guess I'm doing that wrong? Can anyone help me out?

    typedef struct Node {
        void *content;
        struct Node *next;
    } Node;

    typedef struct Stack {
        Node *head;
        int count; 
    } Stack;

    void push(Stack *stack, void *newElem) {
        Node *newNode = (Node*) malloc(sizeof(Node));
        if (stack->count > 0) {
             int i;
             for (i = 0, newNode = stack->head; i < stack->count; i++, newNode =
                 newNode->next) {
                   if (newNode->content == newElem) return;
             }
        } else {
            newNode->next = stack->head;
            newNode->content = newElem;
            stack->head = newNode;
            stack->count++;
        }
    }

Note that you should not do the `malloc()` until you know you need to add the item. If the item you pushed already existed, you'd leak memory. You have a problem with knowing how to compare the values (content) of two nodes; how big is the space pointed to by the `content` and what is an appropriate comparator function. — Jonathan Leffler, Mar 29 '13 at 19:44

score 3 · Answer 1 · answered Mar 29 '13 at 19:38

3

if (newNode->content == newElem)

You are comparing two pointers. I guess you want to check whether their contents are equal:

#include <string.h>

if (memcmp(newNode->content, newElem, size) == 0)

The value size may be indicated by the caller. In your case, it should be sizeof(int).

Moreover, once you have traversed the stack, you don't add the element to your data structure.

answered Mar 29 '13 at 19:38

md5

23,373
3
44
93

How about the for-loop? Am I iterating through the stack correctly? I'm still receiving segmentation faults – tomato Mar 29 '13 at 19:52
@EjayTumacder: Apart from the fact you should not reuse `newNode` to iterate through your stack (otherwise, it results in a memory leak), what you are doing seems "rational". Check whether you are inserting the nodes correctly. – md5 Mar 29 '13 at 19:56

score 2 · Answer 2 · answered Mar 29 '13 at 19:43

The problem is that if your stack is non-empty, and you don't find the element already in the stack, you don't do anything. You need to get rid of the else keyword and make that code unconditional. Then, you allocate space for the new Node before you know if you need it or not, and, even worse, overwrite the newly allocated pointer with your iteration over the stack to see if you need to push it or not. So move the malloc down after the } ending the if

score 1 · Accepted Answer · answered Mar 29 '13 at 19:52

1

You already have a working

void push(Stack *stack, void *newElem);

right?

So, why not write a new function

int push_unique(Stack *stack, void *newElem) {
    if (find_value(stack, newElem) != NULL) {
        return 1; // indicate a collision
    }
    push(stack, newElem); // re-use old function
    return 0; // indicate success
}

Now you've reduced the problem to writing

Node *find_value(Stack *stack, void *value);

... can you do that?

answered Mar 29 '13 at 19:52

Useless

64,155
6
88
132

Would I just iterate through the stack using the for-loop that I had posted in the question, and then use memcmp to see if they are equal? Thanks for your help, very useful advice – tomato Mar 29 '13 at 19:56
Pretty much, yes. I don't know whether you need the `memcmp` or whether your original pointer comparison is OK. – Useless Mar 30 '13 at 11:38
@Useless, a helpful observation. As suggested in my answer below, I would additionally favor find_value() operating against a Hashtable to get O(1) search time – Apr 28 '13 at 10:08

score 1 · Answer 4 · edited Dec 16 '17 at 01:52

I'm not sure you realized it, but your proposed implementation is performing a linear search over a linked list. If you're pushing 2,000 elements on a stack with an average of 2 duplicates of each element value, that's 2,000 searches of a linked list averaging between 500-750 links(it depends on when, IE:what order, the duplicates are presented to the search function in. This requires 1 million+ compares. Not pretty.

A MUCH more efficient duplicate detection in find_value() above could use a hash table, with search time O(1), or a tree, with search time O(log N). The former if you know how many values you're potentially pushing onto the stack, and the latter if the number is unknown, like when receiving data from a socket in real-time. (if the former you could implement your stack in an array instead of a much slower, and more verbose linked-list)

In either case, to properly maintain the hashtable, your pop() function would need to be paired with a hashtable hashpop() function, which would remove the matching value from the hashtable.

With a Hashtable, your stack could just point to the element's value sitting in it's hash location - returned from find_value(). With a self-balancing tree however, the location of the node, and thus the element value, would be changing all the time, so you'd need to store the element's value in the stack, and the tree. Unless you're writing in a very tight memory environment, the performance the 2nd data structure would afford would be well worth the modest cost in memory.

Pushing to a stack containing ONLY unique values in C

4 Answers4