4

I know that there was a program like this:

#include <cstdlib>
#include <iostream>
#include <string>


int main() {
    const std::string alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    std::string temp = "1234567890";
    srand(MAGICNUMBER);
    for (int i = 0;; ++i) {
        for (int j = 0; j < 10; ++j)
            temp[j] = alphabet[rand() % alphabet.size()];
        std::cout << temp << std::endl;
    }
}

Basically, random 10-symbol string generator.

I also know that the 124660967-th generated string was "2lwd9JjVnE". Is there a way to find what the MAGICNUMBER is, or, at least, the next string in the sequence?

Brute-forcing would be painful, given the time it takes to generate one such sequence, but I have some info about the compiler used (if that helps?): it was 64-bit g++ 4.8 for Linux.

UPD. Finding the next item would already be very helpful; can I do that in a reasonable amount of time (especially without a seed)?

Akiiino
  • 2
    As C++ does not specify the method used, other than that it is predictable, try all seeds. – chux - Reinstate Monica Feb 09 '16 at 18:51
  • good edit. Yes, specifying the compiler resolves the problem mentioned by @chux.. whether it *helps*.. not sure. – Karoly Horvath Feb 09 '16 at 18:58
  • Bruteforce might not be *that* bad. The seed is only 32 bits, so even without parallelization you'll probably be done in an hour or so. E.g. it took my PC about 10 minutes to find the 32-bit seed of a mt19937. – Baum mit Augen Feb 09 '16 at 19:00
  • 1
    @BaummitAugen But it takes like a minute to generate one such sequence; wouldn't it take much, _much_ more than an hour? – Akiiino Feb 09 '16 at 19:03
  • @Akiiino Oh, seems like I underestimated that. If each sequence really takes that long, you would need to wait about 8k years. That is probably a bit too much. :) – Baum mit Augen Feb 09 '16 at 19:06
  • This is called *breaking* a PRNG. It's what cryptologists do. If you are a good cryptologist and you know exactly what the algorithm of your PRNG is and it is not very strong, you might be able to find an attack that is faster than brute force. Then you publish your result (or perhaps pass it to the spooks if you're working for a three-letter agency). – n. m. could be an AI Feb 09 '16 at 19:10
  • 1
    As a first step, try `srand(t)` where `t = time()` of the last few seconds. Little harm in trying the [_usual suspects_.](https://www.youtube.com/watch?v=vtSmfws0_To) – chux - Reinstate Monica Feb 09 '16 at 19:30
  • If it's an LCG, you can brute force a seed that will generate the number you know as the first number, and the rest of the sequence will be the same as the original sequence. – alain Feb 09 '16 at 19:33
  • Knowing that it's LCG means you can do better than brute force. Even if it's not LCG but you do know what the algorithm is you can still use brute force (even if it may take a very long time). – sh1 Feb 10 '16 at 19:19

3 Answers

3

Yes, given typical rand() implementations this is likely to be possible, and even fairly easy.

rand() is typically a linear congruential generator (LCG), in which each internal state of the generator is computed from the previous state by a simple arithmetic equation: x1 = (a*x0 + c) % m. You'll need to know the constants a, c and m used by the particular implementation you're targeting, and the method of producing the output value from the state (usually the output is either the entire state or the upper half of the state). It's also important that the state is typically only 32 bits. A larger state would be more difficult.

So you need to find a state for the pRNG such that the next ten states produce the particular sequence of indices that produce the 10 characters you're looking for: 2lwd9JjVnE. So assuming the entire state is output by rand(), you need to find some 32-bit number x such that:

x % 62 == 54
(x1 = (a*x + c) % m) % 62 == 11
(x2 = (a*x1 + c) % m) % 62 == 22
(x3 = (a*x2 + c) % m) % 62 == 3
(x4 = (a*x3 + c) % m) % 62 == 61
(x5 = (a*x4 + c) % m) % 62 == 35
(x6 = (a*x5 + c) % m) % 62 == 9
(x7 = (a*x6 + c) % m) % 62 == 47
(x8 = (a*x7 + c) % m) % 62 == 13
(x9 = (a*x8 + c) % m) % 62 == 30
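
The right-hand sides above are just the positions of the known characters in the alphabet from the question. A minimal sketch of that mapping, where the names `kAlphabet` and `target_indices` are made up for this illustration:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same alphabet as in the question's generator.
const std::string kAlphabet =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

// Map each character of `s` to its position in the alphabet, i.e. the value
// that rand() % 62 must have produced for that character.
std::vector<int> target_indices(const std::string& s) {
    std::vector<int> out;
    for (char ch : s) {
        const std::string::size_type pos = kAlphabet.find(ch);
        assert(pos != std::string::npos);  // character must be in the alphabet
        out.push_back(static_cast<int>(pos));
    }
    return out;
}
```

Calling `target_indices("2lwd9JjVnE")` gives the ten residues used in the constraints.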

This could be done without too much difficulty by trying all 2^32 possible state values (assuming the typical 32-bit state). However, since the constants used were probably chosen to ensure that the RNG runs through a complete 32-bit period, you can simply choose any state at all and run it until you find this sequence.
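
A rough sketch of the exhaustive search, assuming the generator really is an LCG whose whole state is the output and whose modulus is at most 2^32. The `Lcg` struct and `find_state` are hypothetical names, and the constants have to be supplied by you; none are implied here for any real `rand()`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical LCG description: x_next = (a * x + c) % m, with the whole
// state used as the output. The real constants are NOT known here.
struct Lcg {
    std::uint64_t a, c, m;
    std::uint64_t next(std::uint64_t x) const { return (a * x + c) % m; }
};

// Return the first state x in [0, m) such that x, next(x), next(next(x)), ...
// reduced modulo `base` match `targets`; return m if there is no such state.
std::uint64_t find_state(const Lcg& g, const std::vector<int>& targets,
                         std::uint64_t base) {
    // Keeping a, c and m within 32 bits means a * x + c cannot overflow 64 bits.
    assert(g.m > 0 && g.m <= (std::uint64_t{1} << 32) && base > 0);
    for (std::uint64_t start = 0; start < g.m; ++start) {
        std::uint64_t x = start;
        bool ok = true;
        for (int t : targets) {
            if (x % base != static_cast<std::uint64_t>(t)) { ok = false; break; }
            x = g.next(x);
        }
        if (ok) return start;
    }
    return g.m;
}
```

Most candidates fail on the first residue, so the inner loop usually exits after one comparison. If the implementation outputs only part of the state, the comparison would have to be applied to that part instead.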

Either way, once you know the state that produces these values, you then simply have to run the generator backwards for 124660967 * 10 steps in order to find which state was used as the original seed. To do that you'll need to compute the modular multiplicative inverse of a mod m. Alternatively you could run it forward for (period - 124660967*10) steps.
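
Running backwards can be sketched as follows for the special case m = 2^32 with an odd multiplier, so that plain `uint32_t` wraparound does the reduction. The function names are invented for this example, and the constants are again whatever the target implementation uses.

```cpp
#include <cassert>
#include <cstdint>

// Inverse of an odd `a` modulo 2^32 via Newton's iteration; each round
// doubles the number of correct low bits (3 -> 6 -> 12 -> 24 -> 48).
std::uint32_t inverse_mod_2_32(std::uint32_t a) {
    assert(a % 2 == 1);     // only odd numbers are invertible modulo 2^32
    std::uint32_t inv = a;  // a * a == 1 (mod 8), so 3 bits are already right
    for (int i = 0; i < 4; ++i) inv *= 2u - a * inv;
    return inv;
}

// One forward step of x -> a*x + c (mod 2^32); unsigned overflow wraps.
std::uint32_t step_forward(std::uint32_t x, std::uint32_t a, std::uint32_t c) {
    return a * x + c;
}

// One backward step: solve a*prev + c == x (mod 2^32) for prev.
std::uint32_t step_backward(std::uint32_t x, std::uint32_t a, std::uint32_t c) {
    return inverse_mod_2_32(a) * (x - c);
}
```

Calling `step_backward` in a loop for the required number of steps walks from the matched state back towards the seed. The exact step count depends on whether the known string is counted from 0 or from 1.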

bames53
  • So, if I understand correctly, not only is it possible to find out what the seed was, it's also _very_ easy to find the next string -- just by generating them until I find the one I know, and the next one would be the same regardless of the seed, right? – Akiiino Feb 09 '16 at 19:42
  • Tried it for fun, but I did not find a matching state. Anything obvious wrong with that? http://coliru.stacked-crooked.com/a/63d28ccd0d26b1c5 The LCG parameters are from Wiki btw. – Baum mit Augen Feb 09 '16 at 20:34
  • 1
    @BaummitAugen It looks to me like those constants must not be correct. When I used them the results are not the same as what I get from `rand()` when using gcc. [Here's](http://melpon.org/wandbox/permlink/JRFZGBeA0LZfBusx) a program that tests if an LCG uses certain constants; for srand it says these constants aren't correct; testing `minstd_rand0` the same way shows a match with the constants wikipedia lists for that LCG. – bames53 Feb 09 '16 at 22:00
  • I think glibc's implementation of `rand()` is not LCG (it contains LCG but is normally not used, or is used only for seeding, as I recall). The method described here still works if you can get the right implementation of `rand()`, but it could take substantially longer. – sh1 Feb 10 '16 at 19:13
  • Actually, we can _see_ that it's not LCG. LCG states alternate between odd and even values, and `%62` won't change that (and this is exactly why `rand()%foo` is considered bad). In the list of results given we have three odd values in a row. An LCG cannot do this. – sh1 Feb 10 '16 at 19:48
  • @sh1 An LCG could be dropping some number of lower bits, but I think you're right that in this case this isn't an LCG. glibc [apparently uses](http://stackoverflow.com/a/3933182/365496) a LFSR by default. – bames53 Feb 10 '16 at 20:27
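
The parity argument from the comments can be checked mechanically. This is only a sketch: `parity_alternates` and `lcg_mod62` are hypothetical helpers, and the toy LCG assumes a modulus of 2^32 with odd a and c and the whole state used as output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// True if consecutive values strictly alternate between even and odd.
bool parity_alternates(const std::vector<int>& v) {
    for (std::size_t i = 1; i < v.size(); ++i)
        if ((v[i] % 2) == (v[i - 1] % 2)) return false;
    return true;
}

// First n outputs (mod 62) of x -> a*x + c (mod 2^32), whole state as output.
std::vector<int> lcg_mod62(std::uint32_t x, std::uint32_t a, std::uint32_t c, int n) {
    assert(a % 2 == 1 && c % 2 == 1);  // odd a and c make the low bit flip each step
    std::vector<int> out;
    for (int i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(x % 62));  // 62 is even, so parity survives
        x = a * x + c;
    }
    return out;
}
```

Any sequence from `lcg_mod62` passes `parity_alternates`, while the residues of "2lwd9JjVnE" (54, 11, 22, 3, 61, 35, 9, 47, 13, 30) do not, which supports the point that the observed output does not come from such an LCG used directly.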
2

No, it's practically impossible. As @chux pointed out in their comment, the exact implementation isn't specified in the C++ standard.

You'll need to check all of the sequences generated by all possible seeds. That would take an unreasonable amount of computing time.


Though if the compiler is well known and the implementation is open source (as it is in your specific case), there could be ways to find the initial seed value from a known rand() result at a known position in the sequence.

Community
πάντα ῥεῖ
0

If you have access to the program, disassemble it to attempt to learn what the magic number was.

Otherwise the standard doesn't specify anything about storing the srand value so you're stuck with alternate approaches, such as brute-forcing all seeds, or possibly trying to store the sequence of random numbers looking for the ten in a row that generate the string you're interested in.

Mark B