Recursive Function Seriously slowing performance?

Question

I have a function which I am calling an infinite amount of times (or until a condition is met). The problem with this recursive function is that on a higher level, it is called by a worker thread which pushes and pops to a deque. In the meantime, I am iterating through a file with the main thread which is a lot quicker than the recursive function which is processing data that is retrieved from iterating through that main loop. An example of the code which is severely slowing things down is below:

bool CCapsule::DecodeElementList( ElementList &elementList, char* pszBuffer, unsigned int uiBufferLength, std::string strSrcAddress, std::string strDestAddress, std::vector<std::string> &vsContents, bool bIsInner )
{
    // Create an buffer to contain pszBuffer.
    RBuffer rBuffer;
    rBuffer.data = pszBuffer;

    // Create an RElement List and Element Entry to store elements
    RRet rRet;

    RElementEntry rElementEntry;

    // Decode Element List
    if(rDecodeElementList(&m_rDecodeIter, &rElementList, 0) >= RET_SUCCESS)
    {
        std::vector<std::string> _vsContents;
        while ((RRet = rDecodeElementEntry(&m_rDecodeIter, &rElementEntry)) != RRET_END_OF_CONTAINER)
        {
            if (RRet < RET_SUCCESS)
            {
                return false;
            }

            std::string strEntryName = "";

            if (rElementEntry.data != NULL)
                rElementEntry.data;

            switch(rElementEntry.dataType)
            {
            case R_MSG:
                // Create a new RCapsule to encapsulate the inner message.
                m_pInnerMessage = new CCapsule();
                if ( false == m_pInnerMessage->Load( pszBuffer, uiBufferLength, strSrcAddress, strDestAddress, _vsContents, true ) )
                {
                    // An error occurred, clean up.
                    delete m_pInnerMessage;
                    m_pInnerMessage = 0;
                    return false;
                }
                break;
            case R_ELEMENT_LIST:
                // Decode Element List
                RElementList rInnerElementList;
                DecodeElementList(rInnerElementList, pszBuffer, uiBufferLength, strSrcAddress, strDestAddress, _vsContents, true);
                break;
            case R_FIELD_LIST:
                // Decode Field List
                DecodeFieldList(pszBuffer, uiBufferLength);
                break;
            case R_DATE:
                {
                    // Decode DATE
                    RDate rDate;
                    RRet = rDecodeDate( &m_rDecodeIter, &rDate );
                    if ( RRet != RET_SUCCESS )
                    {
                        return false;
                    }
                    std::stringstream sstream;
                    sstream << static_cast<int>( rDate.day ) << "/" << static_cast<int>( rDate.month ) << "/" << rDate.year;
                    _vsContents.push_back(sstream.str());
                }
                break;
            case R_DATETIME:
                {
                    // Decode DATETIME
                    RDateTime rDateTime;
                    RRet = rDecodeDateTime( &m_rDecodeIter, &rDateTime );
                    if ( RRet != RET_SUCCESS )
                    {
                        return false;
                    }

                    RBuffer rStringBuffer;
                    RRet = rDateTimeToString( &rStringBuffer, R_DATETIME,  &rDateTime );
                    if ( RRet != RET_SUCCESS )
                    {
                        return false;
                    }

                    std::stringstream sstream;
                    sstream << static_cast<int>( rDateTime.date.day ) << "/" << static_cast<int>( rDateTime.date.month ) << "/" << static_cast<int>( rDateTime.date.year) << " " << static_cast<int>( rDateTime.time.hour )
                        << ":" << static_cast<int>( rDateTime.time.minute ) << ":" << static_cast<int>( rDateTime.time.second ) << "." << static_cast<int>( rDateTime.time.millisecond ); 
                    _vsContents.push_back(sstream.str());
                }
                break;
            case R_DOUBLE:
                // Decode DOUBLE
                RDouble rDouble;
                RRet = rDecodeDouble( &m_rDecodeIter, &rDouble );
                _vsContents.push_back(boost::lexical_cast<std::string>(rDouble));
                //m_sStringStream << rDouble << ",";
                break;
            case R_UINT:
                // Decode UINT
                RUInt rUInt;
                RRet = rDecodeUInt( &m_rDecodeIter, &rUInt );
                _vsContents.push_back(boost::lexical_cast<std::string>(rUInt));
                //m_sStringStream << rUInt << ",";
                break;
            case R_ASCII_STRING:
                {
                // Decode STRING
                RBuffer rStringBuffer;
                RRet = rDecodeBuffer( &m_rDecodeIter, &rStringBuffer );
                std::string strData(rStringBuffer.data);
                _vsContents.push_back(strData);
                //m_sStringStream << rStringBuffer.data << ",";
                }
                break;
            case R_NO_DATA:
                RRet = RET_SUCCESS;
                break;
            default:
                RRet = RET_FAILURE;
                break;
            }
        }

        std::stringstream ssReport;
        std::copy(_vsContents.cbegin(),_vsContents.cend(),std::ostream_iterator<std::string>(ssReport,","));
        vsContents.push_back(ssReport.str());

    }
    else
    {
        return false;
    }
    return true;
}

Unfortunately it has to be in this sequential order as the vector of strings stored will contain a list of comma separated elements which i will output later to a csv list. This code simply decodes certain elements and pushes the resulting strings back to a vector of strings which is then put into a stringstream.

Does anyone have any suggestions on how to speed up performance??? Thanks...

You might want to try the Code Review site for improving performance type of questions. — BartoszKP, Sep 09 '14 at 22:46
You may pass `std::string` by const reference. You may remove useless statement as `rElementEntry.data;` — Jarod42, Sep 09 '14 at 22:55
This code isn't complete. There is no evidence that recursion is the problem. There is no evidence that this method is the problem, and not one of the methods it calls. — user207421, Sep 09 '14 at 23:13
If i comment out the while loop then the deque pushes back and item and pops an item without there being an actual queue so the size of the deque is always 1 or 0. When I add the while loop, the size of the deque and go into its 100,000s.... Everything works speedily if the while loop is commented out, or probably more specifically, the call to DecodeElementList. I have tested it myself... — Patricia Aydin, Sep 09 '14 at 23:17

score 3 · Answer 1 · answered Sep 09 '14 at 23:01

Recursion itself will have no effect on performance. Your performance issues will be of the more conventional kind, eg.:

Use a profiler to measure performance. That is easier if you split things out into multiple functions (eg. one per data type that you read in, here).
Verify (with the profiler, or some other timer) that the problem is indeed in the code you posted, and not the code that it calls. We can't see half of those functions, and it might well be the case that it is they which take up most of the processing time, being called repeatedly from this one.
Prefer reusing old objects to creating new ones, and prefer stack allocated objects to heap allocated ones, as memory management is expensive.
Reserve space in std::vectors prior to filling them in, to reduce the amount of reallocation and copying done as you add items to them.
Initialise objects in the scope where you use them (eg. move rRet into the if statement) to avoid paying the cost of initialisating objects that never get used
Check for bugs before you check for performance. Nonsensical statements such as if (rElementEntry.data != NULL) rElementEntry.data; won't do anything useful but they may cost you CPU cycles.
Replace std::stringstream use for string formatting with something quicker like sprintf. Yes, I know it's more dangerous, but it's almost always faster. Just be careful and use the safer version such as snprintf or your platform equivalent. Same probably goes for lexical_cast, sorry.
If you absolutely must use std::stringstream, use std::ostringstream instead - you don't need the istream part, right?

Hi Kylotan and thanks for the advice :). I have commented out the while loop and it performs much faster (the deque pops and pushes so that there is always either 1 element or 0) so i know the problem is within that DecodeElementList function. Also in terms of reserving space for the vector, i don't know how big the vector will be so not too sure how i would be able to reserve the space... — Patricia Aydin, Sep 09 '14 at 23:07
Commenting out the while loop isn't going to tell you how much of the relative time is spent in functions called from within the loop. As for reserving space, you don't need to know how many slots you'll need. It has no effect on what can be contained, just on how many allocations it takes. http://www.cplusplus.com/reference/vector/vector/reserve/ — Kylotan, Sep 10 '14 at 00:38

score 1 · Answer 2 · answered Sep 09 '14 at 22:59

You can try to replace the recursive algorithm by an iterative one, by pushing the parameters that'd normally be passed to the recursive function onto a stack.

Something Like:

Stack<Object> stack = new Stack<Object>;
stack.push(first_object);
while( !stack.isEmpty() ) {

   // Do something

     my_object = stack.pop();

  // Push other objects on the stack.
}

This technique is called recursion elimination. Check this link out.

http://cs.saddleback.edu/rwatkins/CS2B/Lab%20Exercises/Stacks%20and%20Recursion%20Lab.pdf

I'm not sure how i'd implement that in my code. Really what is causing it to slow down is when it calls DecodeElementList an n number of times... i don't know how i would use the above to change this... — Patricia Aydin, Sep 09 '14 at 23:03

score 0 · Answer 3 · answered Sep 10 '14 at 00:44

I feel there are two things are significant in your code:

You algorithm seems like a DFS in a graphic or pre-order travel in a n-branch tree. So there should be an easy way to do it in an iterative way. For each of the node in the graph or tree, you can define a separate Access() method for it and make the code cleaner.
Get away from the stream and use snprintf as @Kylotan suggested.

Above to change could make your code way faster than before. If it still can't meet your bar, then you need to try some profiling tool.

Recursive Function Seriously slowing performance?

3 Answers3