Defining a type hierarchy including function types in Prolog

Question

I'm revisiting Prolog after studying it in college, and would like to describe a type hierarchy that includes function types. So far, this is what I got (SWISH link):

% subtype/2 is true if the first argument is a direct subtype of
% the second.
subtype(bit, byte).
subtype(byte, uint16).
subtype(uint16, uint32).
subtype(uint32, uint64).
subtype(uint64, int).

subtype(int8, int16).
subtype(int16, int32).
subtype(int32, int64).
subtype(int64, int).

% isa/2 checks if there's a sequence of types that takes
% from X to Y.
isa(X,Y) :- subtype(X,Y).
isa(X,Y) :-
    subtype(X,Z),
    isa(Z,Y).

This program works well for the following queries:

?- subtype(bit, int).
true
?- findall(X,isa(X,int),IntTypes).
IntTypes = [uint64, int64, bit, byte, uint16, uint32, int8, int16, int32]

I then added the following definition for function subtypes just above isa, where a function is a complex term func(ArgsTypeList, ResultType):

% Functions are covariant on the return type, and
% contravariant on the arguments' type.
subtype(func(Args,R1), func(Args,R2)) :-
    subtype(R1, R2).
subtype(func([H1|T],R), func([H2|T],R)) :-
    subtype(H2, H1).
subtype(func([H|T1],R), func([H|T2],R)) :-
    subtype(func(T1,R), func(T2,R)).

Now, I am still able to do some finite checks, but even trying to enumerate all subtypes of byte fails with stack overflow.

?- isa(func([int,int], bit), func([bit,bit], int)).
true
?- isa(X, byte).
X = bit ;
Stack limit (0.2Gb) exceeded

What am I doing wrong?

Paulo Moura · Answer 1 · 2018-08-26T17:01:42.443

The problem only occurs, as you noticed, when you added the second set of subtype/2 definitions. When you call the goal isa(X, byte) and ask for a second solution, you use the second clause for isa/2, resulting in a call to subtype/2 with both arguments unbound. Eventually, you end up calling the second set of subtype/2 definitions. The first argument, unbound in the query, is unified with the func(Args,R1) term, where both arguments are variables. Thus, the recursive call is going to eventually repeat the unification between a variable and a func(Args,R1) term creating an ever increasing term, with increasing recursive calls, which eventually exhausts the stack.

To make it more clear, note that asking for a second solution results in using the second clause for isa/2 with the following bindings:

isa(X,byte) :- subtype(X,Z), isa(Z, byte).

Everytime a solution for the subtype(X,Z) goal, it results in a failure for the next goal, isa(Z, byte). Thus, you keep backtracking into the first clause of the second set of subtype/2 clauses.

The usual solution to understand this issues is to use the Prolog system trace mechanism. For some reason, I couldn't use it with SWI-Prolog, which appears that you're using given your reference to SWISH, but I had better luck with GNU Prolog:

{trace}
| ?- isa(X, byte).
1    1  Call: isa(_279,byte) ? 
2    2  Call: subtype(_279,byte) ? 
2    2  Exit: subtype(bit,byte) ? 
1    1  Exit: isa(bit,byte) ? 

X = bit ? ;
...
17    7  Exit: subtype(func([byte|_723],int),func([bit|_723],int)) ? 
...
20    8  Exit: subtype(func([bit,byte|_839],int),func([bit,bit|_839],int)) ? 
...
21    9  Call: subtype(_806,bit) ? 
21    9  Fail: subtype(_806,bit) ? 
...

24    9  Exit: subtype(func([bit,bit,byte|_985],int),func([bit,bit,bit|_985],int)) ? 
...
25    9  Call: subtype(_806,bit) ? 
25    9  Fail: subtype(_806,bit) ?

I have omitted most of the trace lines for brevity but you can see that a func/2 term with a growing list in the first argument is being built.

How to solve the problem? Maybe distinguish between simple and compound types? For example:

simple_subtype(bit, byte).
simple_subtype(byte, uint16).
simple_subtype(uint16, uint32).
simple_subtype(uint32, uint64).
simple_subtype(uint64, int).

simple_subtype(int8, int16).
simple_subtype(int16, int32).
simple_subtype(int32, int64).
simple_subtype(int64, int).

% Functions are covariant on the return type, and
% contravariant on the arguments' type.
compound_subtype(func(Args,R1), func(Args,R2)) :-
    simple_subtype(R1, R2).
compound_subtype(func([H1|T],R), func([H2|T],R)) :-
    simple_subtype(H2, H1).
compound_subtype(func([H|T1],R), func([H|T2],R)) :-
    compound_subtype(func(T1,R), func(T2,R)).

% subtype/2 is true if the first argument is a direct subtype of
% the second.
subtype(X,Y) :- simple_subtype(X,Y).
subtype(X,Y) :- compound_subtype(X,Y).

% isa/2 checks if there's a sequence of types that takes
% from X to Y.
isa(X,Y) :-
    subtype(X,Y).
isa(X,Y) :-
    subtype(X,Z),
    isa(Z,Y).

Still, the second and third clauses for compound_subtype/2 are problematic as they impose no bound in the length of the list...

Thanks for your attention, simply splitting them wasn't enough, though. One thing I'd like is for functions to be able to return functions, too. On SWISH, I found that I can use `trace, isa(X, int).` to trigger the debugging view. — Bruno Kim, Aug 27 '18 at 01:04
Is there an effective or practical limits to the number of function arguments and the depth of function composition? If so, the code could be modified to allow enumerating all possible solutions. — Paulo Moura, Aug 27 '18 at 08:58
thanks for your attention, I managed to find a solution with some code duplication, and added it as an answer, if you're interested. — Bruno Kim, Sep 03 '18 at 12:23

score 0 · Accepted Answer · answered Sep 03 '18 at 12:22

I was able to avoid the issue of unbounded left-recursion by including the logic for supertypes, and using one or another depending on which of the variables are bound.

First, I defined a clause for simple types, enumerating all that are used later explicitly:

simple_type(bit).
simple_type(byte).
% ...
simple_type(int).

I then restricted the subtype rule only for cases where the first term is already bound, using nonvar.

subtype(func(Args,R1), func(Args,R2)) :-
    nonvar(R1),
    subtype(R1, R2).
subtype(func([H1|T],R), func([H2|T],R)) :-
    nonvar(H1),
    supertype(H1, H2).
subtype(func([H|T1],R), func([H|T2],R)) :-
    nonvar(T1),
    subtype(func(T1,R), func(T2,R)).

I then defined a supertype rule, that is just the opposite of subtype for simple types...

supertype(X, Y) :-
    simple_type(X),
    subtype(Y, X).

...but is fully duplicated for function types.

supertype(func(Args,R1), func(Args,R2)) :-
    nonvar(R1),
    supertype(R1, R2).
supertype(func([H1|T],R), func([H2|T],R)) :-
    nonvar(H1),
    subtype(H1, H2).
supertype(func([H|T1],R), func([H|T2],R)) :-
    nonvar(T1),
    supertype(func(T1,R), func(T2,R)).

isa is still the same, with two additions:

A type is the same as itself (i.e., an int is-a int).
If the first term is not bound but the second is, use the inverse rule typeof.

_

isa(X,Y) :- X = Y.
isa(X,Y) :- subtype(X,Y).
isa(X,Y) :-
    nonvar(X),
    subtype(X,Z),
    isa(Z,Y).
isa(X,Y) :-
    var(X), nonvar(Y), typeof(Y,X).

Finally, typeof is just the opposite of isa, using supertype instead of subtype:

typeof(X,Y) :- X = Y.
typeof(X,Y) :- supertype(X,Y).
typeof(X,Y) :-
    nonvar(X),
    supertype(X,Z),
    typeof(Z,Y).
typeof(X,Y) :-
    var(X), nonvar(Y), isa(Y,X).

I noticed that there are a lot of inefficiencies and duplicate results with these rules, but at least it's working :)

?- setof(X, isa(func([byte, byte], uint32), X), All), length(All, L).

All = [func([bit, bit], int), func([bit, bit], uint32), func([bit, bit], uint64), func([bit, byte], int), func([bit, byte], uint32), func([bit, byte], uint64), func([byte, bit], int), func([byte, bit], uint32), func([byte, bit], uint64), func([byte, byte], int), func([byte, byte], uint32), func([byte, byte], uint64)],
L = 12

Defining a type hierarchy including function types in Prolog

2 Answers2