15

Is there a way to create a Mathematica pattern which matches expressions whose heads may be arbitrarily deep, i.e. something like f[___][___][___]...?

Perce
  • 253
  • 1
  • 5
  • 1
    [WReach](http://stackoverflow.com/users/211232/wreach) has answered similar question in the thread ["Currying with Mathematica"](http://stackoverflow.com/questions/5686494/currying-with-mathematica/5687109#5687109). But his code works only for fixed depth of the expression. – Alexey Popkov May 11 '11 at 02:14
  • Welcome to StackOverflow. See [this FAQ](http://meta.stackexchange.com/questions/7931/faq-for-stack-exchange-sites) if you have question about the site. Please remember to [vote for and accept](http://stackoverflow.com/faq#howtoask) good answers. – Mr.Wizard May 11 '11 at 09:16
  • 3
    +1, for me, this question comes up most often when trying to match `Derivative[_][_][_]`, and it is annoying every time. – rcollyer May 11 '11 at 11:35

5 Answers5

12

The suggested solution

There seems to be no built-in construct to pattern-test nested heads automatically. We can achieve the goal by writing a function which would, for any given (sub)expression of the form f[___]...[___], efficiently determine f (which, with a slight abuse of terminology, we may call a symbolic head for the expression). Here is the code:

ClearAll[shead];
SetAttributes[shead, HoldAllComplete];
shead[expr_] := Scan[Return, Unevaluated[expr], {-1}, Heads -> True];

Here is how it can be used (I will use the same set of tests as @Sasha):

In[105]:= Cases[{f[1], g[f[1]], f[1, 2, 3][1], f[1][2][3][4]}, x_ /; shead[x] === f]

Out[105]= {f[1], f[1, 2, 3][1], f[1][2][3][4]}

The pattern syntax

If you prefer to use the syntax suggested by @Sasha, that version would look like

Clear[headPattern];
headPattern[head_] := _?(Function[Null, shead[#] === head, HoldFirst]);

In[108]:= Cases[{f[1], g[f[1]], f[1, 2, 3][1], f[1][2][3][4]}, headPattern[f]]

Out[108]= {f[1], f[1, 2, 3][1], f[1][2][3][4]}

Further explanations and comments

How it works

Here are some hints for the logic that lead to this solution, and how it works. The solution will be most concise and efficient if we manage to leverage some of the built-in expression-traversal functions. Some that come to mind are Map, Scan,Cases,MapIndexed,Position. Given that we need the heads, we'd need to pass the Heads->True option. I used Scan, since this one is easy to stop at any point (unlike other mentioned constructs, for which you'd typically need to throw an exception to stop them "in the middle", which is rather inelegant and induces some overhead as well) as soon as we find what we want. Our result will be the very first thing Scan finds on its depth-first expression traversal, so it is expected to be very efficient (it does not traverse the entire expression).

Avoiding the evaluation leaks

Another comment is on evaluation. You can see that HoldAllComplete attribute is used in shead, and Unevaluated is used in its body. These are very important - they serve to prevent possible evaluation of expressions passed to the function. It may matter in cases like this:

In[110]:= m = n = 0;
g[x_] := n++;
h[x_] := m++;
{Cases[Hold[f[g[1]][h[2]]], x_ /; shead[x] === f :> Hold[x], Infinity], {m, n}}

Out[113]= {{Hold[f[g[1]][h[2]]]}, {0, 0}}

Here, we see what we'd expect - even though Cases has been traversing the entire expression and feeding its (sub)parts to shead, no evaluation of sub-parts was triggered by shead. Now we define a naive version of shead which "leaks evaluation":

sheadEval[expr_] := Scan[Return, expr, {-1}, Heads -> True]

And now,

In[114]:= {Cases[Hold[f[g[1]][h[2]]], x_ /; sheadEval[x] === f :> Hold[x], Infinity], {m, n}}

Out[114]= {{Hold[f[g[1]][h[2]]]}, {2, 1}}

The latter behavior is unsatisfactory generally. The whole code-is-data paradigm, so useful in meta - programming, is very powerful in Mathematica because you can use rules to destructure code. Possible (unwanted) evaluation during the pattern- matching would greatly impair it. The whole problem is in the sub-parts. Wrapping Hold only prevents the whole expression from evaluation. Functions like Cases and other similar functions for code destructuring are so great because they don't evaluate sub-parts when doing the structural (syntactic) matching.

Comment on symbolic heads

The last comment here (mostly about definitions) is that the shead function returns not exactly what is normally called symbolic head in Mathematica. The difference is for atomic expressions. For example, shead[f] returns f, while for atomic expressions, the true symbolic head should coincide with the head of an expression (Symbol in this case). I have developed the symbolicHead function with this behavior here, and that one can also be successfully used in place of shead in the above, although shead is more efficient.

Community
  • 1
  • 1
Leonid Shifrin
  • 22,449
  • 4
  • 68
  • 100
  • If I could vote this more than once I would. You are the master instructor. – Mr.Wizard May 11 '11 at 11:30
  • @Mr.Wizard Thanks! But it was our comment-based discussion that prompted me to do it here. I wish I always had the attitude to find decent wording for my answers. – Leonid Shifrin May 11 '11 at 11:34
6

A recursive matching strategy could be used here:

curried[head_] := _head | (x_[___] /; MatchQ[Hold[x], _[curried[head]]])

Usage:

In[26]:= $testCases = {f, f[1], g[f[1]], f[1,2,3][1], f[1][2][3][4]};
         Cases[$testCases, curried[f]]

Out[27]= {f[1],f[1,2,3][1],f[1][2][3][4]}

Update

At Leonid's suggestion, Unevaluated can be used as a clearer and faster way to avoid evaluation leaks in the pattern condition:

curried[head_] := _head | (x_[___] /; MatchQ[Unevaluated[x], curried[head]])
WReach
  • 18,098
  • 3
  • 49
  • 93
  • A very nice use of recursive patterns - +1. – Leonid Shifrin May 12 '11 at 07:43
  • So _that's_ how to make that work. It was the first thing I tried, and I promptly made an infinite recursion. – Mr.Wizard May 12 '11 at 12:02
  • 2
    While I very much like the idea, I must say that the performance of this solution is inferior for even moderately deeply nested heads. The following alternative `curriedAlt[head_] := _head | (x_[___] /; MatchQ[Unevaluated[x], curriedAlt[head]])` offers much better performance while is essentially the same idea. One can see the difference with e.g. this code `Cases[{HeadCompose[f, Sequence @@ Range[15]]}, curried[f]]` and the same with `curriedAlt`. – Leonid Shifrin May 12 '11 at 19:36
  • @Leonid `Unevaluated` reflects the intent much more clearly than `Hold` and so I prefer it even if there were not a performance difference. The `MatchQ[Hold[...], _[...]]` pattern was just me blindly following an idiom that I had used in an unrelated context. I'll update my answer to include your suggestion. – WReach May 12 '11 at 21:24
  • I see - I have also used a lot the pattern similar to the one you mentioned. Whether or not to like design and other patterns, but we seem to often think in them. – Leonid Shifrin May 12 '11 at 21:34
5

How about the following:

In[277]:= 
ArbitrarilyDeepHeadPattern[
  head_Symbol] := _?(Function[
    MemberQ[
      Position[#, _head, {0, Infinity}, Heads -> True], {0 ...}]])

In[281]:= Cases[{f[1], g[f[1]], f[1, 2, 3][1], f[1][2][3][4]}, 
 ArbitrarilyDeepHeadPattern[f]]

Out[281]= {f[1], f[1, 2, 3][1], f[1][2][3][4]}
Sasha
  • 5,935
  • 1
  • 25
  • 33
3

WReach's answer made me reexamine a recursive definition, which I tried yesterday but gave up on.

I realize now that what I had actually works, it just throws an error. It is a toy compared to Leonid's fine method, but I have a fondness for terse code, so I post it here for interest or amusement. Make sure you do not have $RecursionLimit set to Infinity before running this.

Cases[
  {f, f[1], g[f[1]], f[1, 2, 3][1], f[1][2][3][4]}, 
  f // Blank@#|#0[#]@_&
]

Or even:

Cases[
  {f, f[1], g[f[1]], f[1, 2, 3][1], f[1][2][3][4]},
  p=_f|p@_
]
Mr.Wizard
  • 24,179
  • 5
  • 44
  • 125
  • Sorry to disappoint you, but the error message is there for a reason. What you get at the end is a very deeply nested pattern with depth controlled by `$RecursionLimit`. But, it will only work for expressions with head nesting roughly not exceeding that depth. This works: `Block[{$RecursionLimit = 20, p, n = 17}, MatchQ[HeadCompose[f, Sequence @@ Range[n]], p = _f | p@_]] `, but if you change `n` to `18`, it will stop working. Ironically, you'd need to set `$RecursionLimit` to `Infinity` for your construction to work. – Leonid Shifrin May 12 '11 at 15:03
  • @Leonid like I said this is just a toy, but I think it is reasonably safe to assume that the head depth is not going to exceed 256, the default $RecursionLimit. – Mr.Wizard May 12 '11 at 15:13
  • 1
    @Mr.Wizard I would not make any assumptions for the head depth, since it has nothing to do with the `$RecursionLimit`. But, at least conceptually, here is a way to cure your method. The problem is that there seems to be no way to evaluate `Alternatives` lazily (that is, not evaluate its second part until the first part is found to not match, by the pattern - matching), because, while the alternatives are tried from left to right, this is done inside the pattern-matcher and is out of our control. A lazy version of `Alternatives` would save your method. Below is a possible implementation. – Leonid Shifrin May 12 '11 at 15:44
  • 1
    Continuing... `ClearAll[lazyAlternatives]; SetAttributes[lazyAlternatives, HoldRest]; lazyAlternatives[expr_] := expr; lazyAlternatives[first_, rest__] := _?(Function[ MatchQ[#, first] || MatchQ[#, lazyAlternatives[rest]]])`. Now, you can test: `MatchQ[HeadCompose[f, Sequence @@ Range[100]], p = lazyAlternatives[_f, p@_]]` gives `True`, without any generated errors. But, of course, the code is no longer as brief as you would like, since we need this auxiliary function. For very deeply nested heads, you will still to increase the `$RecursionLimit`, for this to work. – Leonid Shifrin May 12 '11 at 16:07
  • In a sense, the method suggested by @WReach also implements a version of lazy `Alternatives`, where the laziness is achieved by a separate head `curried` and the `HoldAll` attributes of `Condition`. His version seems to be the most elegant way to achieve that. – Leonid Shifrin May 12 '11 at 16:17
  • Mr.Wizard But I quite like the ideas - so +1 – Leonid Shifrin May 12 '11 at 16:24
  • @Leonid, I love your analyses. – Mr.Wizard May 12 '11 at 23:31
  • @Mr.Wizard My suggestion does not really eliminate the problem with `$RecursionLimit`, it just moves it from definition-time to run-time. My other solution with `symbolicHead` function (linked from the other post) is also recursive, but has no such problem since `symbolicHead` is properly tail-recursive (in the Mathematica sense), and so depends on the `$IterationLimit` rather than `$RecursionLimit`, and poses no danger of stack exhaustion (apart from the fact that default `$IterationLimit` is much larger than default `RecursionLimit`). – Leonid Shifrin May 13 '11 at 04:48
1

Here is an alternative version of @Leonid's shead, to find the symbolic head of an expression. (You should use the rest of his solution as is.) My function doesn't involve any recursion, but instead it uses that Level has a special case where setting the levelspec to {-1} returns all atomic expressions, the first of which is the head itself:

shead2[expr_] := First@Level[expr, {-1}, Heads -> True];

It works in pattern matching just the same as shead does:

In[264]:= Cases[{f[1], g[f[1]], f[1, 2, 3][1], f[1][2][3][4]}, 
 x_ /; shead2[x] === f]

Out[264]= {f[1], f[1, 2, 3][1], f[1][2][3][4]}

And to help understand how it works, here is how Level behaves with levelspec set to {-1}:

In[263]:= 
Level[#, {-1}, Heads -> True] & /@ {f[1], g[f[1]], f[1, 2, 3][1], 
  f[1][2][3][4]}

Out[263]= {{f, 1}, {g, f, 1}, {f, 1, 2, 3, 1}, {f, 1, 2, 3, 4}}
Community
  • 1
  • 1
Seth
  • 154
  • 9