What's wrong with Prolog's append?

Question

According to my university's course in logic we could expect a different outcome than defined by Prolog for the following query:

append([], a, X)

(which unifies for X=a).

However I don't get what they're aiming at? What should be expected as a valid response, given that append should unify X for (in this example) the concatenation of [] and a?

I assume they may be expecting a return of false or [a]; however I suppose that should be the result of concatenating a and [], not [] and a (since [] is the tail of [a]).

@WillemVanOnsem As in, we could claim a isn't (by definition) a list and therefore append(a, [], X) shouldn't succeed? — Skyfe, Mar 21 '17 at 16:45
furthermore that the `append/3` in (swi-)Prolog is not implemented like most students implement it. — Willem Van Onsem, Mar 21 '17 at 16:46
@Skyfe ,when I'm trying in swipl: append(a,[],L). returns false and not L = a. Where did you try the above and gave you L=a ???( Did you mean A instead of a, as a variable...?) — coder, Mar 21 '17 at 16:52
@coder My bad, I turned the `[]` and `a` around lol. I actually meant `append([], a, X)` (edited the post). — Skyfe, Mar 21 '17 at 17:00
As complementary post to the answers below you can check the swipl implementation of append/3 here: http://www.swi-prolog.org/pldoc/doc/swi/library/lists.pl?show=src — coder, Mar 21 '17 at 17:11

score 7 · Answer 1 · answered Mar 21 '17 at 17:09

The point here is that we expect append/3 to hold only for lists.

In the query you show, a is not a list, yet append/3 still holds.

Thus, the relation is in fact more general than we would initially expect: It holds for other cases too!

The reason why this is so can be seen from the first clause of the traditional definition of append/3:

append([], Bs, Bs).

This clause alone already makes the query succeed! No additional pure clause can prevent this. Thus, it is this clause that must be restricted if we want the relation to hold only for lists. This means, we must put a constraint on the second argument, which we do by stating it in the body of the clause:

append([], Bs, Bs) :- ... (left as an exercise)

This obviously comes at a price: Performance.

So, the trade-off here is between performance and precision. In Prolog, we often accept such a trade-off because we implicitly use such predicates only with the intended terms. On the other hand, for many predicates, we want to benefit from domain errors or type errors if they are not called with the expected types.
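As a rough illustration of the "type errors" option mentioned above, here is a minimal sketch. It assumes SWI-Prolog, whose library(error) provides must_be/2; the name checked_append/3 is made up for this example. Note that this variant raises errors instead of failing, so it is no longer a pure relation.

```prolog
:- use_module(library(error)).   % must_be/2 (assumes SWI-Prolog)

% checked_append/3 is a hypothetical name, not a library predicate.
% The base case insists that the second argument is a proper list.
checked_append([], Bs, Bs) :-
    must_be(list, Bs).
checked_append([A|As], Bs, [A|Cs]) :-
    checked_append(As, Bs, Cs).

% ?- checked_append([1], [b], Cs).
%    Cs = [1,b].
% ?- checked_append([1], a, Cs).
%    raises a type error: a is not a list
% ?- checked_append([1], _, Cs).
%    raises an instantiation error: the second argument is unbound
```

The check runs once, in the base case, so it still walks the whole second list and costs the extra time discussed above.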

Thanks for the explanation, makes sense! I guess `append([], [H|T], [H|T])` would suffice? — Skyfe, Mar 21 '17 at 17:16
It is a step in the right direction, but still too general. For example, with that clause, the query `?- append([], [a|b], Cs).` **succeeds** with `Cs = [a|b]`, but `[a|b]` is **not** a list either! Again, this shows that you have to *change* this very clause: No additional pure clause can prevent this overly general case that is now still a consequence of the program. — mat, Mar 21 '17 at 17:21

false · Answer 2 · 2022-08-12T09:44:00.197

Your course is aiming at a very important point of Prolog programming.

Manuals are often quite sloppy on the precise definition of append/3 and similar predicates. In fact, the complete definition is so complex that it is often preferred to define only part of the actual relation. Consider the first definition in the Prolog prologue:

append(Xs, Ys, Zs) is true if Zs is the concatenation of the lists Xs and Ys.

Note the if. The definition thus gives cases where the relation holds, but it does not explicitly exclude further cases. To exclude further cases, it would say iff instead. The cases mentioned (that we are talking about lists) are the intended use of the predicate. So which cases may now be additionally included? Those cases where the precondition (that the arguments are lists) does not hold.

Consider a definition of append/3 with 'iff' in place of 'if':

append([], Xs, Xs) :-
   list(Xs).
append([X|Xs], Ys, [X|Zs]) :-
   append(Xs, Ys, Zs).

list([]).
list([X|Xs]) :-
   list(Xs).

The cost for appending two lists is now |Xs|+|Ys|. That is quite an overhead compared to |Xs| alone.

But the situation is even worse. Consider the query:

?- append([1,2], Ys, Zs).
   Ys = [], Zs = [1,2]
;  Ys = [_A], Zs = [1,2,_A]
;  Ys = [_A,_B], Zs = [1,2,_A,_B]
;  ... .

So we get infinitely many answers to this query. Contrast this to the usual definition:

?- append([1,2], Ys, Zs).
   Zs = [1,2|Ys].

There is a single answer only! It contains all the answers for all lists plus some odd cases as you have observed. So the usual definition for append has better termination properties. In fact, it terminates if either the first or the third argument is a list of known length¹.

Note that the answer contains Ys. In this manner infinitely many answers can be collapsed into a single one. This in fact is the power of the logical variable! We can represent infinitely many solutions with finite means. The price to pay is some extra solutions² that may lead to programming errors. Some precaution is thus required.
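To make the collapsing concrete, here is a small sketch. It restates the usual two clauses under the made-up name app/3 (only to avoid clashing with a library-provided append/3) and shows how binding Ys afterwards specializes the single answer.

```prolog
% app/3: the usual definition, renamed (hypothetical name) so that it
% does not clash with a library append/3.
app([], Ys, Ys).
app([X|Xs], Ys, [X|Zs]) :-
    app(Xs, Ys, Zs).

% One answer stands for every possible Ys:
% ?- app([1,2], Ys, Zs).
%    Zs = [1,2|Ys].
%
% Binding Ys later picks out one list solution:
% ?- app([1,2], Ys, Zs), Ys = [3].
%    Ys = [3], Zs = [1,2,3].
%
% The same answer also covers the extra, non-list solutions:
% ?- app([1,2], Ys, Zs), Ys = a.
%    Ys = a, Zs = [1,2|a].
```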

1 It also terminates in some further obscure cases like append([a|_],_,[b|_]).

2 append([a], Zs, Zs). produces (in many systems) an answer, too.

Willem Van Onsem · Accepted Answer · 2017-03-21T22:06:04.627

However I don't get what they're aiming at?

Knowing exactly what they are aiming at is of course impossible without asking them.

Nevertheless I think they aim to show that Prolog is (more or less) untyped. append/3 is documented as:

append(?List1, ?List2, ?List1AndList2)

List1AndList2 is the concatenation of List1 and List2.

So clearly one expects that the three arguments are lists and a is not a list. a is not the concatenation of [] and a since one would consider the two not "concatenatable".

Now this still succeeds, because append/3 is usually implemented as:

append([],T,T).
append([H|T],T2,[H|R]) :-
    append(T,T2,R).

So if you give it append([],a,X)., it will simply unify with the first clause and unify X = a.

The same "weird" behavior happens with append([14],a,X). Here X = [14|a], which is not a list either. This is because the Prolog interpreter does not "know" it is working with lists. For Prolog, [A|B] is a compound term built with a functor like any other.
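A short sketch of that point: the term [14|a] can be taken apart by plain unification, with no list check involved. The helper pair_parts/3 is a made-up name for this example; is_list/1 is assumed to be available as a built-in, as it is in SWI-Prolog and several other systems.

```prolog
% pair_parts/3 (hypothetical helper): split a list cell into its two
% arguments by unification alone. Nothing requires Tail to be a list.
pair_parts([Head|Tail], Head, Tail).

% ?- pair_parts([14|a], H, T).
%    H = 14, T = a.
% ?- pair_parts([14], H, T).
%    H = 14, T = [].
%
% Only an explicit check tells the two terms apart:
% ?- is_list([14|a]).
%    false.
% ?- is_list([14]).
%    true.
```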

A more "type safe" way to handle this could be:

append([],[],[]).
append([H|T],T2,[H|R]) :-
    append(T,T2,R).
append([],[H|T],[H|R]) :-
    append([],T,R).

Or more elegantly:

list([]).
list([_|T]) :-
    list(T).

append([],T,T) :-
    list(T).
append([H|T],T2,[H|R]) :-
    append(T,T2,R).

since here we check whether the second argument is a list. The downside, however, is that append/3 now runs in O(m+n), with m the length of the first list and n the length of the second list, whereas the original code takes only O(m) time. Furthermore, note that Prolog will not raise a warning or error at parse time. It will only fail to append [] with a at the moment you run the query.

Not checking types means you have fewer guarantees when a program compiles, or loads into an interpreter, without raising errors. This can be a good thing, but the problem is that you might call some predicates in a way they don't expect, which may raise errors later on. That is why statically typed languages are sometimes used: they "guarantee" (at least to some extent) that if you run the program, no such errors will occur. Of course that does not mean that the program cannot fail in other ways (or simply make no sense). Haskell, for instance, is statically typed and has an append like:

(++) [] t2 = t2
(++) (h:t) t2 = h:((++) t t2)

The definition is "more or less" the same, but Haskell will derive that the type of (++) is (++) :: [a] -> [a] -> [a]. Because it knows the types of the input and output of every function, it can reason about them, and therefore it will raise an error at compile time if you give (++) something other than a list.

Whether that is a good thing is of course a different question: dynamically typed programming languages are designed that way deliberately since it allows more flexibility.

Thank you for explaining, sounds like that's what they were aiming at. Would `append([], [H|T], [H|T])` be a correct way to handle this? — Skyfe, Mar 21 '17 at 17:17
@Skyfe: in that case there is still no guarantee that `T` itself is a valid list, etc. Furthermore `append([],[],X)` should also succeed. — Willem Van Onsem, Mar 21 '17 at 17:19
Your "type safe" way changes quite a lot. Better add `list(Xs)` to the fact. — false, Mar 21 '17 at 21:31
@false: but then - given `list/1` only succeeds for a grounded list - `append/3` no longer works multi-directional. — Willem Van Onsem, Mar 21 '17 at 21:33
@false: but if we use these facts as guards, the result is that the program will keep backtracking on the `list/1` predicate until it finally comes up with the correct link: after it succeeds, the result is grounded (we can of course use double negation), but that makes it rather complex. — Willem Van Onsem, Mar 21 '17 at 21:39
Please give a concrete example where you think your version is better than this one! — false, Mar 21 '17 at 21:42
@false: the problem is that you do not write an `append/3` itself, you write an implementation for `list/1`. Indeed that `list/1` will come validate/generate all lists, but the question is how to incorporate it within `append/3` in an elegant and efficient way. — Willem Van Onsem, Mar 21 '17 at 21:43
`append([], Xs, Xs) :- list(Xs).` that's what I meant by "better add `list(Xs)` to the fact" — false, Mar 21 '17 at 21:44
