8

I have the following implementation of Kadane's algorithm to solve the problem of the maximum subarray of an array:

public static decimal FindBestSubsequence
    (this IEnumerable<decimal> source, out int startIndex, out int endIndex)
{
    decimal result = decimal.MinValue;
    decimal sum = 0;
    int tempStart = 0;

    List<decimal> tempList = new List<decimal>(source);

    startIndex = 0;
    endIndex = 0;

    for (int index = 0; index < tempList.Count; index++)
    {
        sum += tempList[index];
        if ((sum > result) || 
            (sum == result && (endIndex - startIndex) < (index - tempStart)))
        {
            result = sum;
            startIndex = tempStart;
            endIndex = index;
        }
        else if (sum < 0)
        {
            sum = 0;
            tempStart = index + 1;
        }
    }

    return result;
}

It fails when I use a sequence that starts with a negative number like -1, 2, 3 giving a result of 4, [0,2] instead of 5, [1,2].

For the life of me that I cannot find where the error is. Maybe its a defect on the algorithm design?

Thanks in advance.

default locale
  • 13,035
  • 13
  • 56
  • 62
Ignacio Soler Garcia
  • 21,122
  • 31
  • 128
  • 207

6 Answers6

8

Your initial implementation suffered from unnecessarily complicated and partially wrong checks within the main scan cycle. These checks are two:

  • if greater intermediate sum is found, store it constituents as a temporary result;
  • independently, if sum got negative, reset it to 0 and prepare to build a new sequence from next scan position.

Refactored FindBestSubsequence method implementation is listed below:

public static decimal FindBestSubsequence (this IEnumerable<decimal> source, out int startIndex, out int endIndex)
{
    decimal result = decimal.MinValue;
    decimal sum = 0;
    int tempStart = 0;

    List<decimal> tempList = new List<decimal>(source);

    startIndex = 0;
    endIndex = 0;

    for (int index = 0; index < tempList.Count; index++)
    {
        sum += tempList[index];
        if (sum > result)
        {
            result = sum;
            startIndex = tempStart;
            endIndex = index;
        }
        if (sum < 0)
        {
            sum = 0;
            tempStart = index + 1;
        }
    }

    return result;
}

Now not only for -1,2,3 the code above produces correct answer 5,[1,2] but also it correctly processes arrays of all negative numbers without any extra code: entering -10,-2,-3 will return -2,[1,1].

Gene Belitski
  • 10,270
  • 1
  • 34
  • 54
  • 1
    Perfect. I just took an already existent implementation in C that seemed standard and ported it to C#. Yours pass all my unit tests so I think its the best option. Thanks! – Ignacio Soler Garcia Mar 28 '12 at 08:13
  • 1
    Additionally, if you are refactoring it, I would iterate the IEnumerable directly, there is no need to create a copy of the list. And passing multiple 'out' arguments is usually a bad practice, a custom return type would be better. – vgru Mar 28 '12 at 16:41
  • 2
    Agree with the copy of the list. Don't agree with creating a new return type as in this case seems pretty obvious the usage of start index and end index. – Ignacio Soler Garcia Mar 28 '12 at 19:44
3

In your example you always have sum > result even if sum<0 in the first iteration of The loop because 0 > decimal.MinValue.

So you never go to your second case.-

You need to change the first if by adding a condition sum > 0:

if ((sum >0 ) & ((sum > result) || 
    (sum == result && (endIndex - startIndex) < (index - tempStart))))
{
    ...
}
else if (sum < 0)
{
    ...
}

Update:

As explained in my comment you can just change the initialisation of result to 0 :

decimal result = 0;

From wikipedia :

This subarray is either empty (in which case its sum is zero) or consists of one more element than the maximum subarray ending at the previous position

Therefore if the array contains only negative numbers the solution is an empty subarray with sum 0.

abatishchev
  • 98,240
  • 88
  • 296
  • 433
Ricky Bobby
  • 7,490
  • 7
  • 46
  • 63
1

Change this line:

decimal result = decimal.MinValue;

to

decimal result = 0;
vgru
  • 49,838
  • 16
  • 120
  • 201
  • Thanks makes the algorith return 0 when all the values are negative. With an input of -1, -2, -3 the best subarray is -1. – Ignacio Soler Garcia Mar 27 '12 at 13:39
  • @SoMoS: that's right, I just compared your code to the Wikipedia article you posted. This also means that their python example suffers from the same problem. – vgru Mar 27 '12 at 13:42
  • 1
    (wikipedia) Kadane's algorithm consists of a scan through the array values, computing at each position the maximum subarray ending at that position. This subarray is either empty (in which case its sum is zero) or consists of one more element than the maximum subarray ending at the previous position. – Ricky Bobby Mar 27 '12 at 13:44
0

Built upon Gene Belitski's answer and comments:

    public static void Main()
    {
        var seq = new[] { -10M, -2M, -3M };
        var stuff = seq.FindBestSubsequence();

        Console.WriteLine(stuff.Item1 + " " + stuff.Item2 + " " + stuff.Item3);
        Console.ReadLine();
    }

    public static Tuple<decimal, long, long> FindBestSubsequence(this IEnumerable<decimal> source)
    {
        var result = new Tuple<decimal, long, long>(decimal.MinValue, -1L, -1L);

        if (source == null)
        {
            return result;
        }

        var sum = 0M;
        var tempStart = 0L;
        var index = 0L;

        foreach (var item in source)
        {
            sum += item;
            if (sum > result.Item1)
            {
                result = new Tuple<decimal, long, long>(sum, tempStart, index);
            }

            if (sum < 0)
            {
                sum = 0;
                tempStart = index + 1;
            }

            index++;
        }

        return result;
    }
Community
  • 1
  • 1
Jesse C. Slicer
  • 19,901
  • 3
  • 68
  • 87
0

For each position you should take the maximum of the value there (from the original sequence) and your sum as you have written it. If the original number is bigger, then it's better to start summing 'from beginning', i.e. sum = max(sum+tempList[index],tempList[index]); Then you won't need the case for sum < 0 at all.

Stefan Marinov
  • 570
  • 4
  • 14
0

At the end this is how I corrected the algorithm to handle all the scenarios, just in case it helps to someone:

    public static decimal FindBestSubsequence (this IEnumerable<decimal> source, out int startIndex, out int endIndex)
    {
        decimal result = decimal.MinValue;
        decimal sum = 0;
        int tempStart = 0;

        List<decimal> tempList = new List<decimal>(source);

        if (tempList.TrueForAll(v => v <= 0))
        {
            result = tempList.Max();
            startIndex = endIndex = tempList.IndexOf(result);
        }
        else
        {
            startIndex = 0;
            endIndex = 0;

            for (int index = 0; index < tempList.Count; index++)
            {
                sum += tempList[index];

                if (sum > 0 && sum > result || (sum == result && (endIndex - startIndex) < (index - tempStart)))
                {
                    result = sum;
                    startIndex = tempStart;
                    endIndex = index;
                }
                else if (sum < 0)
                {
                    sum = 0;
                    tempStart = index + 1;
                }
            }
        }

        return result;
    }
Ignacio Soler Garcia
  • 21,122
  • 31
  • 128
  • 207
  • Thanks to Ricky Bobby and Groot to point me in the right direction. – Ignacio Soler Garcia Mar 27 '12 at 16:12
  • The code above still allows for few important improvements, such as removal of unnecessary special case processing arrays of all negatives. You may check my implementation for refactored `FindBestSequence`. – Gene Belitski Mar 28 '12 at 04:27