Spelled numbers to digits - understanding short obfuscated code

Question

Can someone help me understand this code?

long long n,u,m,b;main(e,r){for(;n++||(e=getchar()|32)>=0;b=
"ynwtsflrabg"[n%=11]-e?b:b*8+n)for(r=b%64-25;e<47&&b;b/=8)for
(n=19;n;"1+DIY/.K430x9G(kC["[n]-42&255^b||(m+=n>15?n:n>9?m%
u*~-u:~r?n+!r*16:n*16,b=0))u=1ll<<6177%n--*4;printf("%llx",m);}

I found some explanations, but they are very brief. What I know so far: the code reads spelled-out numbers and prints the corresponding decimal number. It looks for certain combinations of characters that identify the number words and stores them in an octal representation instead of hexadecimal. It assigns values to the combinations it finds, then either adds those values to the result (for words like "five" or "thirteen") or somehow shifts the digits (for words like "thousand"). That's all I understood from the explanations I found: https://www.ioccc.org/2012/kang/hint.html, http://j.mearie.org/post/7462182919/spelt-number-to-decimal.

I tried to work through this code expression by expression, but I kept going in circles: I can't tell what a variable is for because it is only modified later on, and that modification depends on another variable that is also modified later in the loops. I just don't know where to start.

I tried to comment every single piece of the code and to remove some obfuscations and ambiguities:

long long n, u, m, b; // = 0

int main(int e, int r) {

/* loop1:*/ for(;
/* --condition: */
/*   if not:*/  n++ || // n++ == 0 <=> n == 0        // Set n += 1
/*     then:*/
/*   if not:*/  (e = getchar() | 32) >= 0;           // Set e <- input char
                // `| 32` transforms upper case to lower case
                // getchar < 0 <=> EOF
/*     then break*/
/* --increment: */
                b =
/*           if:*/  "ynwtsflrabg"[n %= 11] - e ?     // Set n %= 11
/*       then b=*/  b                          :
/*       else b=*/  b * 8 + n)

    /* loop2:*/ for(r = (b % 64) - 25;               // Set r
    /* --condition: */
    /*    if not:*/ (e < 47) && b;
    /*      break*/
    /* --increment: */
                    b /= 8)                          // Set b

        /* loop3:*/ for(n = 19;                      // Set n
        /* --condition: */
        /*   if not:*/  n; // <=> n != 0
        /*     then break */
        /* --increment: */
        /*   if not:*/  ("1+DIY/.K430x9G(kC["[n] - 42) & 255 ^ b ||
                        /* <=> {7, 1, 26, 31, 47, 5, 4, 33, 10, 9, 6, 78, 
                                15, 29, 254, 65, 25, 49, 214}[n] == b */
        /*     then:*/  (m +=                       // Set m
        /*             if:*/  n > 15           ?
        /*        then m+=*/  n                :
        /*        else if:*/  n > 9            ?
        /*        then m+=*/  m % u * ~-u      :
                              // <=> m % u * (u - 1)
        /*        else if:*/  ~(int)r          ?
        /*        then m+=*/  n + !(int)r * 16 :
        /*        else m+=*/  n * 16,
                        b = 0))                      // Set b

        /* --body: */   u = 1ll << (6177 % n-- * 4); // Set u, n
                        // <=> u = pow(2, 6177 % n-- * 4)

    printf("%llx\n", m);
}

Most of this rewritten code may look even scarier than the original, but I needed a step-by-step walk-through, so maybe it will help someone else too.

It seems that the output is built in the last loop, at m += .... But I can't understand the condition above it: ("1+DIY/.K430x9G(kC["[n] - 42) & 255 ^ b. Because of the ||, the m += ... part only runs when this expression is 0, so it amounts to: {7, 1, 26, 31, 47, 5, 4, 33, 10, 9, 6, 78, 15, 29, 254, 65, 25, 49, 214}[n] == b, where my {...} notation stands for an array holding the decoded character values. Those numbers seem meaningless to me at this point.

Edit

Someone has voted to close this question for being unclear. To make clear what I need:

I would appreciate any hints or insights on any of the techniques used in this code. Like for example:

Author of the code used deliberate obfuscation: i["foo"] instead of "foo"[i].
Author shortened the (u - 1) expression as ~-u

My main problems are with these parts:

"1+DIY/.K430x9G(kC["[n]-42&255^b - Why those characters?
m+=n>15?n:n>9?m%u*~-u:~r?n+!r*16:n*16,b=0) - Why those conditions on n?
e<47 - Why 47 - the / character?

It's not portable C and therefore not particularly clever. There are better examples. But your analysis thus far is worth an upvote as far as I'm concerned. — Bathsheba, Sep 27 '18 at 15:40
Try stepping through the code in your debugger of choice - that can often help when trying to understand someone else's code. — Paul R, Sep 27 '18 at 15:49
A string literal is also an array expression, and can be subscripted like any other array expression - `"foo"[1]` evaluates to `'o'`. — John Bode, Sep 27 '18 at 15:52
@Bathsheba: By not portable C you mean the old K&R syntax: `main(e, r)`? — egst, Sep 27 '18 at 16:01
@PaulR: I have no debugger installed right now, so I just didn't think of it. But I'll try it. — egst, Sep 27 '18 at 16:01
Why do you want to understand this code? It's just a joke. There's no reason to be able to understand it, as no serious C programmer will ever write such code. — Support Ukraine, Sep 27 '18 at 16:16
No, it's the fact that it assumes ASCII encoding. Most obfuscation contests insist on portable C. — Bathsheba, Sep 27 '18 at 16:16
The close votes are because the question is off-topic according to the principles that govern this site. Could you answer your own question with a couple of sessions with a debugger? Could this question conceivably be of use to someone else in the future? — Jim Garrison, Sep 27 '18 at 16:24
@4386427: It was a part of an assignment for a programming course to get some bonus points which I probably won't make in time, so I am doing this just for fun to be honest. And also I think it's good to practice some C language details, that you don't normally think about. — egst, Sep 27 '18 at 16:26
@McSim wow, an assignment for a programming course! Really? Hopefully the remaining part of the course wasn't stupid as well — Support Ukraine, Sep 27 '18 at 16:28
Perhaps the purpose of the assignment is to remind you to DOCUMENT YOUR CODE! All commercial code you write will eventually be picked up by someone else, who will have to do this sort of analysis if the commentary is not clear. — Gem Taylor, Sep 27 '18 at 16:49
@GemTaylor Your point is "normally" good but in this case it doesn't apply. This code is written as a joke - nothing more - nothing less. It's on purpose that it's made hard to understand. — Support Ukraine, Sep 27 '18 at 16:54
I read that this code was written for some kind of short code competition. As part of the programming course, it was one of a series of "fun" homework tasks for extra points. To be honest, I learnt a few interesting properties of the C language while analyzing this, so after all it wasn't such a stupid assignment. — egst, Sep 27 '18 at 17:51
