4

I am trying to complete Project Euler Problem 14 in c++ and I am honestly stuck. Right now when I run the problem it gets stuck at So Far: the number with the highest count: 113370 with the count of 155 So Far: the number with the highest count but when I try changing the i value to over 113371 it works. What is going on??

The question is:

The following iterative sequence is defined for the set of positive integers: n → n/2 (n is even) n → 3n + 1 (n is odd)

Using the rule above and starting with 13, we generate the following sequence:

13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1 It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1. Which starting number, under one million, produces the longest chain?

#include<stdio.h>
int main() {
    int limit = 1000000;
    int highNum, number, i;
    int highCount = 0;
    int count = 0;
    for( number = 13; number <= 1000000; number++ )
    {
        i = number;
        while( i != 1 ) {
            if (( i % 2 ) != 0 ) {
                i = ( i * 3 ) + 1;
                count++;
            }
            else {
                count++;
                i /= 2;
            }
        }
        count++;
        printf( "So Far: the number with the highest count: %d with the count of %d\n",
                     number, count );
        if( highCount < count ) {
            highCount = count;
            highNum = number;
        }
        count = 0;
        //break;
    }
    printf( "The number with the highest count: %d with the count of %d\n",
            highNum, highCount );
}
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
PictureMeAndYou
  • 533
  • 2
  • 5
  • 16
  • 12
    If the program works correctly for relatively small numbers, but "gets stuck in an infinite loop" for large numbers, it's likely not an infinite loop, your algorithm is just slow. – Cory Kramer Oct 24 '14 at 21:43
  • 3
    Your problem is that you are recomputing the same partial results over and over. – isekaijin Oct 24 '14 at 21:45
  • 2
    check the limits of data types you are using, use unsigned long int – nayab Oct 24 '14 at 21:50
  • Look up the concept of [memoization](http://en.wikipedia.org/wiki/Memoization). :) – Qix - MONICA WAS MISTREATED Oct 24 '14 at 21:56
  • @Nadosh it's only a `1000000` it's not ever the 2.1 cap – PictureMeAndYou Oct 24 '14 at 22:06
  • 2
    @PictureMeAndYou: Your limit is 1000000 but as you compute the length of each sequence some _intermediate_ values will exceed the range of a 32-bit `int`. – Blastfurnace Oct 24 '14 at 22:15
  • I can't help but wonder what would happen if you computed all chains starting from zero going in the opposite direction: `{1}, {2}, {4}, {8}, {16}, {32, 5}, {64, 10}, ...` and keeping track of which elements in 0-1000000 have not yet been reached, would that be faster or slower than the "obvious" algorithm mentioned by everyone else? No need of memoization then, or any sort of history. I don't know if it's more or less memory, the number of elements in each pass would grow exponentially. – Mooing Duck Oct 24 '14 at 22:29
  • 1
    Actually, along that same line of thought, in the obvious strategy, you can immediately discard every number between 1-500000 as the longest chain, because the value double that is always going to have a chain one longer. You can also discard all numbers where `(N-1)%3==0`. That cuts the number of possibilities down by 66% right there. – Mooing Duck Oct 24 '14 at 22:35
  • Wait, my strategy will fail, because at a chain of 64, it tries to store a temporary that overflows a 64bit value. – Mooing Duck Oct 24 '14 at 23:28
  • I tested it, even if you discard large temporaries, my idea is absurdly slow. – Mooing Duck Oct 24 '14 at 23:28
  • @MooingDuck: Starting at 500000 is really bad for the branch-predictor. I tested it with my linked variant doing less memoization. – Deduplicator Oct 25 '14 at 00:06

4 Answers4

3

You are getting integer overflow. Update your code like this and see it yourself:

if (( i % 2 ) != 0 ) {
    int prevI = i;
    i = ( i * 3 ) + 1;
    if (i < prevI) {
        printf("oops, i < prevI: %d\n", i);
        return 0;
    }
    count++;
}

You should change the type of i to long long or unsigned long long to prevent the overflow.

(And yes, cache the intermediate results)

Anton Savin
  • 40,838
  • 8
  • 54
  • 90
3

Remember all intermediate results (up to some suitably high number).
Also, use a big-enough type:

#include <stdio.h>

static int collatz[4000000];
unsigned long long collatzmax;

int comp(unsigned long long i) {
  if(i>=sizeof collatz/sizeof*collatz) {
      if(i>collatzmax)
        collatzmax = i;
      return 1 + comp(i&1 ? 3*i+1 : i/2);
  }
  if(!collatz[i])
      collatz[i] = 1 + comp(i&1 ? 3*i+1 : i/2);
  return collatz[i];
}

int main() {
  collatz[1] = 1;
  int highNumber= 1, highCount = 1, c;
  for(int i = 2; i < 1000000; i++)
    if((c = comp(i)) > highCount) {
      highCount = c;
      highNumber = i;
    }
  printf( "The number with the highest count: %d with the count of %d\n",
        highNumber, highCount );
  printf( "Highest intermediary number: %llu\n", collatzmax);
}

On coliru: http://coliru.stacked-crooked.com/a/773bd8c5f4e7d5a9

Variant with smaller runtime: http://coliru.stacked-crooked.com/a/2132cb74e4605d5f

The number with the highest count: 837799 with the count of 525
Highest intermediary number: 56991483520

BTW: The highest intermediary encountered needs 36 bit to represent as an unsigned number.

Deduplicator
  • 44,692
  • 7
  • 66
  • 118
  • @Jarod42: The correct magic number is a bit too high. Added code to calculate it. – Deduplicator Oct 24 '14 at 22:36
  • **+1** I think, you need to iterate only over odd numbers. – lapk Oct 24 '14 at 22:37
  • 1
    @PetrBudnik: I'm not convinced that's a valid optimization (though I determined several that are valid along similar reasoning) – Mooing Duck Oct 24 '14 at 22:41
  • @PetrBudnik: You only need to iterate two out of every three numbers 500000-1000000 though: where `N%3==1` is false. In the end I bet it's about the same since most of the rest get memoized anyway. – Mooing Duck Oct 24 '14 at 23:30
0

With your algorithm, you compute several time identical series. you may cache result for previous numbers and reuse them.

Something like:

void compute(std::map<std::uint64_t, int>& counts, std::uint64_t i)
{
    std::vector<std::uint64_t> series;
    while (counts[i] == 0) {
        series.push_back(i);
        if ((i % 2) != 0) {
            i = (i * 3) + 1;
        } else {
            i /= 2;
        }
    }
    int count = counts[i];
    for (auto it = series.rbegin(); it != series.rend(); ++it)
    {
        counts[*it] = ++count;
    }
}

int main()
{
    const std::uint64_t limit = 1000000;

    std::map<std::uint64_t, int> counts;
    counts[1] = 1;
    for (std::size_t i = 2; i != limit; ++i) {
        compute(counts, i);
    }
    auto it = std::max_element(counts.begin(), counts.end(),
        [](const std::pair<std::uint64_t, int>& lhs, const std::pair<std::uint64_t, int>& rhs)
        {
            return lhs.second < rhs.second;
        });
    std::cout << it->first << ":" << it->second << std::endl;
    std::cout << limit-1 << ":" << counts[limit-1] << std::endl;
}

Demo (10 seconds)

Jarod42
  • 203,559
  • 14
  • 181
  • 302
0

Don't recompute the same intermediate results over and over!

Given

typedef std::uint64_t num;  // largest reliable built-in unsigned integer type

num collatz(num x)
{
    return (x & 1) ? (3*x + 1) : (x/2);
}

Then the value of collatz(x) only depends on x, not on when you call it. (In other words, collatz is a pure function.) As a consequence, you can memoize the values of collatz(x) for different values of x. For this purpose, you could use a std::map<num, num> or a std::unordered_map<num, num>.

For reference, here is the complete solution.

And here it is on Coliru, with timing (2.6 secs).

Community
  • 1
  • 1
isekaijin
  • 19,076
  • 18
  • 85
  • 153
  • Yes I know there are faster ways to do this, but I want to try my way because it was original for me and understand why it doesn't work. – PictureMeAndYou Oct 24 '14 at 22:05
  • 2
    @PictureMeAndYou: Your way does not work *precisely* because your algorithm is too slow. – isekaijin Oct 24 '14 at 22:06
  • AFAIK, I thought it just takes a while to compile and run I never knew it would completely stop? – PictureMeAndYou Oct 24 '14 at 22:12
  • 1
    @PictureMeAndYou: If you write a program that would take a thousand years to complete running, for all practical intents and purposes, your program will never stop. – isekaijin Oct 24 '14 at 22:15
  • @EduardoLeón: His code only takes ~3.7 seconds on my machine once I change everything to `long long`. – Mooing Duck Oct 25 '14 at 00:12