2

I'm implementing arithmetic functions for numbers in a quote notation representation, which is a form of p-adic numbers. Taking the basic idea from this paper, I have a struct and a constructor.

enum { base = 10 };

/*
   d[n+m-1] ... d[n+1] d[n] ' d[n-1] ... d[2] d[1] d[0]

   Sum(i=0..n d[i]*b^i) + Sum(i=n+1..n+m d[i]*b^i/(b^m-1))
 */
typedef struct q { 
    int b,n,m,d[];
} *q; 

This is mostly the same as the paper, but I've changed the indexing a little.

/* nb. not defined for LONG_MIN */
q qlong(long i){ 
    if (i<0) return qneg(qlong(-i));
    int m=0;
    int n=i?ceil(log(i)/log(base)):0;
    q z=malloc(sizeof*z + (n+m)*sizeof*z->d);
        z->b=base; z->n=n; z->m=m;
    int j;
    for(j=0; j<n; j++){
        z->d[j] = i % base;
        i /= base;
    }   
    qprint(z);
    return z;
}

Using the bounds presented in the paper, I can estimate the digits of the result of addition.

q qadd(q x,q y){
    int m=lcm(x->m,y->m);
    int n=max(x->n,y->n)+m+2;
    int t,c=0;
    q z=malloc(sizeof*z + (n+m)*sizeof*z->d);
        z->b=base; z->n=n; z->m=m;
    int k;
    for(k=0; k<n+m; k++){
        t = qdig(x,k) + qdig(y,k) + c;
        z->d[k] = t % base;
        c = t / base;
    }

    findrepetition(z);
    return z;
}

But this may perform unnecessary calculations once it hits a repeated state. And even more calculations in the normalization step.

In his earlier paper, Hehner presents an algorithm for detecting repetitions.

Hehner termination algorithm

void findrepetition(q z){
    int i;
    int j;
    qprint(z);
    for(i=1; i<z->n+z->m-1; i*=2) {
        for(j=i+1; j<2*i; j++) {
            if (z->d[j-1]==z->d[i-1]) {
                z->n=i;
                z->m=j-i;
                return;
            }
        }
    }
}

But can I incorporate this into my existing calculation loop instead of making a second pass? I've avoided i and j in qadd so their use in these loops doesn't conflict with my usage.

Some of this material was previously posted to comp.lang.c and there is a linked video which is an excellent introduction to the whole topic.

luser droog
  • 18,988
  • 3
  • 53
  • 105
  • Out of curiosity, is this just for fun, or for a practical application? I'd think that the inefficiency of the representation (the size of the repeated component in the representation of 1/n will be on the order of phi(n) base-b digits for "most" n) would rule this format out for anything practical. – Mark Dickinson Sep 13 '15 at 18:56
  • Um, it's a little of both. It's a [personal project](https://github.com/luser-dr00g/inca/tree/master/olmec) which I hope will be useful to others. The authors of the papers I've cited certainly believed it to be practical, but the fact that it's not widely known or used does suggest that there may be problems. – luser droog Sep 13 '15 at 19:53
  • Yes, I think those authors may have been a bit short-sighted, perhaps understandably given the date of publication. Consider that for `p=2`, the simple fraction `1/10000000019` takes over ten billion bits to represent in this notation! (Essentially because the order or 2 modulo `10000000019` is `10000000018`.) And it's not much better for `p=3` or `p=5`, either: in both those cases the repeating part has period `5000000009`. – Mark Dickinson Sep 13 '15 at 20:00
  • Hm. I'm testing with p=10 as a shortcut until I write radix-conversion functions (I know this will complicate division since 10 is not prime), but my plan was to use the largest prime number < 2^32-1. Will it have the same problems? ... I assume that any number system I choose will have *some* pathological little corners. But I really like the fact that this one can handle integers, big ints, rationals, and (sacrificing the cyclic repeating part) high-precision floating-point. This should greatly simplify my type handling in the rest of the interpreter. – luser droog Sep 13 '15 at 20:15
  • It'll have the same problems. This isn't a case of pathological corners, it's a systemic problem in the whole representation scheme. No matter what base `p` you use, you can expect *most* fractions of the form `m / n` to require on the order of `phi(n)` digits for the repeating part in that base (where `phi` is Euler's totient function). Sometimes it'll be exactly `phi(n)` digits, sometimes `phi(n) / 2` or something smaller, and occasionally you'll get lucky and the repeating part will be much shorter than `phi(n)` digits. But those short cases are the exception, rather than the rule. – Mark Dickinson Sep 13 '15 at 20:20
  • Well that's very discouraging. :( One of the common APL functions I'll be implementing is the reciprocal 1/x. So, that'll typically be expensive. :( – luser droog Sep 13 '15 at 20:25
  • @MarkDickinson much belatedly, I've followed up on the issues you've raised over on [math.se](http://math.stackexchange.com/questions/1615290/is-there-a-better-representation-than-p-adics-for-exact-computer-arithmetic). I quote your comments there, so I'd very much appreciate your input. – luser droog Jan 17 '16 at 06:06

0 Answers0