2

I am working on a query processor that reads in long lists of document id's from memory and looks for matching id's. When it finds one, it creates a DOC struct containing the docid (an int) and the document's rank (a double) and pushes it on to a priority queue. My problem is that when the word(s) searched for has a long list, when I try to push the DOC on to the queue, I get the following exception: Unhandled exception at 0x7c812afb in QueryProcessor.exe: Microsoft C++ exception: std::bad_alloc at memory location 0x0012ee88..

When the word has a short list, it works fine. I tried pushing DOC's onto the queue in several places in my code, and they all work until a certain line; after that, I get the above error. I am completely at a loss as to what is wrong because the longest list read in is less than 1 MB and I free all memory that I allocate. Why should there suddenly be a bad_alloc exception when I try to push a DOC onto a queue that has a capacity to hold it (I used a vector with enough space reserved as the underlying data structure for the priority queue)?

I know that questions like this are almost impossible to answer without seeing all the code, but it's too long to post here. I'm putting as much as I can and am anxiously hoping that someone can give me an answer, because I am at my wits' end.

The NextGEQ function reads a list of compressed blocks of docids block by block. That is, if it sees that the lastdocid in the block (in a separate list) is larger than the docid passed in, it decompresses the block and searches until it finds the right one. Each list starts with metadata about the list with the lengths of each compressed chunk and the last docid in the chunk. data.iquery points to the beginning of the metadata; data.metapointer points to wherever in the metadata the function currently is; and data.blockpointer points to the beginning of the block of uncompressed docids, if there is one. If it sees that it was already decompressed, it just searches. Below, when I call the function the first time, it decompresses a block and finds the docid; the push onto the queue after that works. The second time, it doesn't even need to decompress; that is, no new memory is allocated, but after that time, pushing on to the queue gives a bad_alloc error.

Edit: I cleaned up my code some more so that it should compile. I also added in the OpenList() and NextGEQ functions, although the latter is long, because I think the problem is caused by a heap corruption somewhere in it. Thanks a lot!

struct DOC{

    long int docid;
    long double rank;

public:
    DOC()
    {
        docid = 0;
        rank = 0.0;
    }

    DOC(int num, double ranking)
    {
        docid = num;
        rank = ranking;

    }

     bool operator>( const DOC & d ) const {
       return rank > d.rank;
    }

      bool operator<( const DOC & d ) const {
       return rank < d.rank;
    }
    };

struct listnode{

int* metapointer;
int* blockpointer;
int docposition;
int frequency;
int numberdocs;
int* iquery;
listnode* nextnode;

};

void QUERYMANAGER::SubmitQuery(char *query){

    listnode* startlist;

        vector<DOC> docvec;
        docvec.reserve(20);
        DOC doct;


    //create a priority queue to use as a min-heap to store the documents and rankings;


        priority_queue<DOC, vector<DOC>,std::greater<DOC>> q(docvec.begin(), docvec.end());

        q.push(doct);

    //do some processing here; startlist is a pointer to a listnode struct that starts the   //linked list

        //point the linked list start pointer to the node returned by the OpenList method

        startlist = &OpenList(value);
        listnode* minpointer;
        q.push(doct);


        //start by finding the first docid in the shortest list
            int i = 0;
            q.push(doct);
            num = NextGEQ(0, *startlist);
            q.push(doct);
            while(num != -1)
               {

            q.push(doct);

    //the is where the problem starts - every previous q.push(doct) works; the one after
    //NextGEQ(num +1, *startlist) gives the bad_alloc error

            num = NextGEQ(num + 1, *startlist);

         //this is where the exception is thrown
            q.push(doct);               
        }

    }



//takes a word and returns a listnode struct with a pointer to the beginning of the list
//and metadata about the list 
listnode QUERYMANAGER::OpenList(char* word)
{
    long int numdocs;

    //create a new node in the linked list and initialize its variables


    listnode n;
    n.iquery = cache -> GetiList(word, &numdocs);
    n.docposition = 0;
    n.frequency = 0;
    n.numberdocs = numdocs;

   //an int pointer to point to where in the metadata you are
    n.metapointer = n.iquery;
    n.nextnode = NULL;
  //an int pointer to point to the uncompressed block of data, if there is one
    n.blockpointer = NULL;



    return n;


}


int QUERYMANAGER::NextGEQ(int value, listnode& data)
{
     int lengthdocids;
     int lengthfreqs; 
     int lengthpos;
     int* temp;
     int lastdocid;


     lastdocid = *(data.metapointer + 2);

while(true)
{

         //if it's not the first chunk in the list, the blockpointer will be pointing to the 
        //most recently opened block and docpos to the current position in the block
    if( data.blockpointer && lastdocid >= value)
    {

            //if the last docid in the chunk is >= the docid we're looking for,
            //go through the chunk to look for a match


        //the last docid in the block is in lastdocid; keep going until you hit it
        while(*(data.blockpointer + data.docposition) <= lastdocid)
        {
            //compare each docid with the docid passed in; if it's greater than or equal to it, return a pointer to the docid
             if(*(data.blockpointer + data.docposition ) >= value)
             {

                 //return the next greater than or equal docid
                 return *(data.blockpointer + data.docposition);
             }
             else
             {
                 ++data.docposition;
             }
        }

        //read through the whole block; couldn't find matching docid; increment metapointer to the next block;
        //free the block's memory

        data.metapointer += 3;
        lastdocid = *(data.metapointer + 3);
        free(data.blockpointer);
        data.blockpointer = NULL;
    }


        //reached the end of a block; check the metadata to find where the next block begins and ends and whether 
        //the last docid in the block is smaller or larger than the value being searched for


        //first make sure that you haven't reached the end of the list
            //if the last docid in the chunk is still smaller than the value passed in, move the metadata pointer
           //to the beginning of the next chunk's metadata; read in the new metadata


            while(true)
         //  while(*(metapointers[index]) != 0 )
           {
               if(lastdocid < value && *(data.metapointer) !=0)
               {
               data.metapointer += 3;
               lastdocid = *(data.metapointer + 2);
               }


           else if(*(data.metapointer) == 0)
           {
               return -1;
             }

           else
               //we must have hit a chunk whose lastdocid is >= value; read it in
           {
                //read in the metadata
           //the length of the chunk of docid's is cumulative, so subtract the end of the last chunk 
           //from the end of this chunk to get the length

               //find the end of the metadata


                temp = data.metapointer;

            while(*temp != 0)
            {
                temp += 3;
            }
                temp += 2;
    //temp is now pointing to the beginning of the list of compressed data; use the location of metapointer
    //to calculate where to start reading and how much to read

         //if it's the first chunk in the list,the corresponding metapointer is pointing to the beginning of the query
        //so the number of bytes of docid's is just the first integer in the metadata
                if(  data.metapointer == data.iquery)
                {
                    lengthdocids = *data.metapointer;

                }

                else
                {
                    //start reading from the offset of the end of the last chunk (saved in metapointers[index] - 3)
                    //plus 1 = the beginning of this chunk

                    lengthdocids = *(data.metapointer) - (*(data.metapointer - 3));
                    temp += (*(data.metapointer - 3)) / sizeof(int); 

                   }


           //allocate memory for an array of integers - the block of docid's uncompressed
           int* docblock = (int*)malloc(lengthdocids * 5 );

           //decompress docid's into the block of memory allocated
            s9decompress((int*)temp, lengthdocids /4, (int*) docblock, true);

            //set the blockpointer to point to the beginning of the block
            //and docpositions[index] to 0
            data.blockpointer = docblock;
            data.docposition = 0;
            break;

                }

           } 

}
}

Thank you very much, bsg.

bsg
  • 825
  • 2
  • 14
  • 34
  • You are running out of memory. It seems that the fault does not lie with the library telling you you are out of memory; rather the fault lies with the fact that you're putting too much data into it. You may need some sort of on-disk structure if you are exhausting `std::priority_queue`'s limits. – Billy ONeal Apr 07 '10 at 16:41
  • I don't understand. All I'm putting in to the queue is a structure containing a single integer and a single double. Why is that exhausting the queue's limits? – bsg Apr 07 '10 at 16:44
  • Apparently not. `std::bad_alloc` means "Out of memory". Check for infinite loops or anything of that sort which may be inserting more things into the queue than you think are actually being inserted into the queue. – Billy ONeal Apr 07 '10 at 16:49
  • I think the key here is somehow that when I read in a short list, there are no memory issues. I can't figure out why, though. The long lists are not that long. And it works every time until I call NextGEQ that second time. I don't know why. – bsg Apr 07 '10 at 16:52
  • Have you tried breaking into a debugger when the exception is thrown to see exactly what the state of things is? – Michael Burr Apr 07 '10 at 17:07
  • Can you clean up the code a bit? It's impossible to read. – Nikolai Fetissov Apr 07 '10 at 17:25
  • Yes, of course. I've been debugging endlessly for days. The queue seems to have extra space, the doc is an ordinary DOC with just an integer and a double in it, etc. – bsg Apr 07 '10 at 17:25
  • If it helps, this is the code that throws the exception (in the xmemory class) // allocate storage for _Count elements of type _Ty return ((_Ty _FARQ *)::operator new(_Count * sizeof (_Ty))); 00422288 mov eax,dword ptr [_Count] 0042228B shl eax,4 0042228E push eax 0042228F call operator new (411834h) 00422294 add esp,4 – bsg Apr 07 '10 at 17:27
  • typename A::template rebind::other::pointer _Ty _FARQ *_Allocate(_SIZT _Count, _Ty _FARQ *) { // check for integer overflow if (_Count <= 0) _Count = 0; else if (((_SIZT)(-1) / _Count) < sizeof (_Ty)) _THROW_NCEE(std::bad_alloc, NULL); return ((_Ty _FARQ *)::operator new(_Count * sizeof (_Ty))); } – bsg Apr 07 '10 at 17:29
  • @bsg: Can you provide a small compilable sample that reproduces the problem? Also please update your old question (http://stackoverflow.com/questions/2575275/bad-alloc-exception-when-using-new-for-a-struct-c) instead of opening new ones. – Georg Fritzsche Apr 07 '10 at 17:31
  • @bsg The fact that you are getting a `bad_alloc` error probably does not mean that you have exhausted memory, but more likely that you have corrupted your heap in some way so that the allocation system thinks you have, or is in an undefined state. I point this out simply so you don't go searching for possibly imaginary memory leaks. –  Apr 07 '10 at 17:43
  • @Neil - thanks a lot! I was getting nervous, b/c I really don't think I have any memory leaks; the error occurs near the beginning of the program where all allocated memory is being used. How might I have corrupted my heap? What are some common reasons? @gf I will try, but my code is rather complex and any compilable sample that can reproduce the problem will probably not be short. I did have another question open, but I wasn't getting much response after a while and my problem had changed so much that it had very little to do with the title of the post. Sorry about that. – bsg Apr 07 '10 at 17:59
  • What are the sizes of the queue and num when it throws the exception? That while(num != -1) loop has the potential to run forever if NextGEQ has a bug preventing it from returning -1 in some cases. – Mark B Apr 07 '10 at 19:26
  • @Mark The queue has 3 items in it, I believe. num is 1. You're right about the loop - I'll change it to while num <= 0. I don't think NextGEQ does have a bug like that, but better safe than sorry. – bsg Apr 07 '10 at 20:11

4 Answers4

1

QUERYMANAGER::OpenList returns a listnode by value. In startlist = &OpenList(value); you then proceed to take the address of the temporary object that's returned. When the temporary goes away, you may be able to access the data for a time and then it's overwritten. Could you just declare a non-pointer listnode startlist on the stack and assign it the return value directly? Then remove the * in front of other uses and see if that fixes the problem.

Mark B
  • 95,107
  • 10
  • 109
  • 188
1

Another thing you can try is replacing all pointers with smart pointers, specifically something like boost::shared_ptr<>, depending on how much code this really is and how much you're comfortable automating the task. Smart pointers aren't the answer to everything, but they're at least safer than raw pointers.

David Thornley
  • 56,304
  • 9
  • 91
  • 158
0

Assuming you have heap corruption and are not in fact exhausting memory, the commonest way a heap can get corrupted is by deleting (or freeing) the same pointer twice. You can quite easily find out if this is the issue by simply commenting out all your calls to delete (or free). This will cause your program to leak like a sieve, but if it doesn't actually crash you have probably identified the problem.

The other common cause cause of a corrupt heap is deleting (or freeing) a pointer that wasn't ever allocated on the heap. Differentiating between the two causes of corruption is not always easy, but your first priority should be to find out if corruption is actually the problem.

Note this approach won't work too well if the things you are deleting have destructors which if not called break the semantics of your program.

  • Hmmm... I am actually freeing a pointer that was never allocated, but it never caused issues before, and when I take it out, it still doesn't work. Thanks, though - will have to fix that. – bsg Apr 07 '10 at 19:12
  • @bsg Freeing a pointer that was never allocated is technically OK, provided the pointer is set to NULL. However, doing so is normally a sign of an iffy design. –  Apr 07 '10 at 19:18
  • @Neil Yeah, it was old code from before I refactored and didn't realize that it could have been moved somewhere better. – bsg Apr 07 '10 at 20:11
  • Getting odder and odder... I added a loop to add 20 default DOC's to the queue after the list is open, and they all go in fine, and somehow that makes the rest of them get added too... until about 8 iterations in, when I get the assertion failure: Expression: _BLOCK_TYPE_IS_VALID(pHead->nBlockUse). Any idea what's going on here? Thanks. – bsg Apr 07 '10 at 23:29
  • Could it be that I need a copy constructor for one or both of the struct types? (DOC and listnode) – bsg Apr 08 '10 at 00:01
  • 1
    @bsg That assertion failure is proof your heap is corrupted. As for the copy ctor, DOC would appear not to need one. As for list node, it depends what you are doing with it - the key question there is would your code work the same in C - if so you probably don't need one. –  Apr 08 '10 at 10:02
0

Thanks for all your help. You were right, Neil - I must have managed to corrupt my heap. I'm still not sure what was causing it, but when I changed the malloc(numdocids * 5) to malloc(256) it magically stopped crashing. I suppose I should have checked whether or not my mallocs were actually succeeding! Thanks again! Bsg

bsg
  • 825
  • 2
  • 14
  • 34