12

As Rich Hickey says, the secret sauce of Lisp languages is the ability to directly manipulate the Abstract Syntax Tree through macros. Can this be achieved in any non-Lisp dialect languages?

Mike
  • Just a note: I'm not sure if it's right to call it "*syntactic* abstraction". In fact, the syntax is probably more rigid than that of any other language (other than Whitespace): it's all just a list (and strings, etc. -- but that's beside the point). – user541686 Jun 26 '11 at 20:44
  • @Mehrdad: that's the syntax of s-expressions - not Lisp. – Rainer Joswig Jun 26 '11 at 21:02
  • @Rainer: Er, what? Then what's the syntax of Lisp? – user541686 Jun 26 '11 at 21:06
  • @Mehrdad: lists, etc. are s-expressions - a syntax for textual representation of data. Lisp is a family of programming languages. Compare for example the Common Lisp spec. Here is for example the syntax for the FLET construct from the HyperSpec ( http://www.lispworks.com/documentation/HyperSpec/Body/s_flet_.htm ) : flet ((function-name lambda-list [[local-declaration* | local-documentation]] local-form*)*) declaration* form* . There are many such syntactic forms. Some are built-in like FLET, CATCH, TAGBODY, IF ... - some are implemented as macros. Lisp syntax is defined on top of s-expressions – Rainer Joswig Jun 26 '11 at 21:13
  • @Rainer: I realize that Lisp is a family of programming languages (I was referring to Scheme in my example), but I'm still not understanding what you mean. Every language has a syntax; in Scheme, the syntax is the same as that of a list (which it has defined). What do you mean when you say that's not the syntax? – user541686 Jun 26 '11 at 21:16
  • @mehrdad: not every list is a valid Lisp program. These lists have a structure that is described in the Scheme standard, the Common Lisp standard, ... for example (lambda (foo bar) baz) is an anonymous function in Common Lisp. It expects that the first element is LAMBDA, the second element is a parameter list and then follows a sequence of forms, where the first form can be a documentation string and/or declarations. A declaration can be a type declaration, an optimize declaration, .... – Rainer Joswig Jun 26 '11 at 21:18
  • @Rainer: I said "it's all just a list", not "all lists are Lisp"... I think you're talking about the converse of what I said, which clearly isn't true. – user541686 Jun 26 '11 at 21:25
  • @mehrdad: syntactic abstraction via macros is defined on top of lists. So it is not 'just' lists, but lists + syntax. What you have inside these lists has to follow 'rules' and in the case of macros there is a large degree of freedom - macros are implementing new rules - so it is not that rigid at all. – Rainer Joswig Jun 26 '11 at 21:46
  • @Mehrdad, do not forget the reader macros. You can implement any kind of syntax on top of them (even a C-like one). – SK-logic Jun 28 '11 at 09:36
  • @rainer, technically the LAMBDA form you used as an example is not a syntactic element at all. A lambda form has the form #'(lambda (foo bar) baz). If you omit the #' part, it becomes a macro call that expands into the proper lambda form. A subtle difference, but important when discussing the syntax of Lisp. – Elias Mårtenson Jul 13 '11 at 10:40
  • @elias martenson: no. The lambda expression starts with lambda. FUNCTION is a function that extracts the functional value and returns it. For example this is valid: ((lambda (x) (- x 3)) 4). The same with FUNCTION around it is not valid. – Rainer Joswig Jul 13 '11 at 18:48

6 Answers

22

Being able to "directly manipulate the abstract syntax tree" by itself is nothing new, though it's something that very few languages have. For example, many languages these days have some kind of an eval function -- but it should be obvious that that's not manipulating the abstract syntax tree; instead, it is a manipulation of the concrete syntax -- the direct source code. Incidentally, the mentioned functionality in D falls under the same category, as does CPP: both deal with raw source text.
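To make the eval-versus-AST distinction concrete, here is a minimal sketch. It uses Python and its standard `ast` module purely for illustration (Python is not one of the languages discussed in this thread), and the `SwapAddForSub` class is a made-up name. The first half evaluates source text; the second half parses once and then rewrites the tree itself.

```python
import ast

# Concrete syntax: the program is text, and eval works on that text.
source = "1 + 2 * 3"
text_result = eval(source)

# Abstract syntax: parse once, then rewrite the tree itself.
class SwapAddForSub(ast.NodeTransformer):
    """Illustrative transformer: rewrite every `a + b` node into `a - b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node

tree = ast.parse(source, mode="eval")
tree = ast.fix_missing_locations(SwapAddForSub().visit(tree))
tree_result = eval(compile(tree, "<ast>", "eval"))

print(text_result)  # 7
print(tree_result)  # -5, because the tree now means 1 - 2 * 3
```

Note that the rewrite never touches a string: operator precedence is already encoded in the tree, so there is nothing to re-parse or re-parenthesize.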

To give an example of a language that does have that feature (but not something that would be considered macros proper), see OCaml. It has a syntactic extension system, Camlp4, which is essentially a compiler extension toolkit whose central purpose is working with the OCaml abstract syntax. But this is still not what makes the corresponding feature in Lisps so great.

The important feature of Lisps is that the extensions that you get using macros are part of the language in the same way that any other syntactic form is. To put this differently, when you use something like if in a Lisp, there is no difference in functionality whether it's implemented as a macro or as a primitive form. (Actually there is a minor difference: in some cases it's important to know the set of primitive forms that don't expand further.) More specifically, a Lisp library can provide plain bindings and macros, which means that libraries can extend the language in a much more interesting way than the usual boring extensions you get in most languages, capable of adding only plain bindings (functions and values).

Now, viewed in this light, something like the D facility is very similar in nature. But the fact that it deals with raw text rather than ASTs limits its utility. If you look at the example on that page,

mixin(GenStruct!("Foo", "bar"));

you can see how this doesn't look like part of the language -- to make it more like Lisp, you'd use it in a natural way:

GenStruct(Foo, bar);

with no need for a mixin keyword that marks where a macro is used, no need for that !, and the identifiers being specified as identifiers rather than strings. Even better, the definition should be expressed more naturally, something like (inventing some bad syntax here):

template expression GenStruct(identifier Name, identifier M1) {
    return [[struct $Name$ { int $M1$; }; ]]
}

One important thing to note here is that since D is a statically typed language, ASTs have crept into this mental exercise in an explicit way -- as the identifier and expression types (I'm assuming here that template marks this as a macro definition, but it still needs a return type).

In Lisp, you're essentially getting something very close to this functionality, rather than the poor string solution. But you get even more -- Lisp intentionally puns over the basic list type, and unifies the ASTs with the runtime language in a very simple way: the AST is made of symbols and lists and other basic literals (numbers, strings, booleans), and those are all part of the runtime language. In fact, for those literals, Lisp takes another step forward, and uses the literals as their own syntax -- for example, the number 123 (a value that exists at runtime) is represented by a syntax which is also the number 123 (but now it's a value that exists at compile-time). The bottom line of this is that macro-related code in Lisp tends to be far easier to deal with than what other languages call "macro"s. Imagine, for example, making the D example code create N int fields in a struct (where N is a new input to the macro) -- that would require using some function to translate a string into a number.
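As a rough illustration of the "N int fields" point, here is a sketch in Python rather than Lisp or D. The representation (nested lists for code, strings for symbols) and the names `gen_struct` and `gen_struct_text` are assumptions made up for this example. The tree-based version receives the count as a number it can use directly; the text-based version receives it as text and has to convert it first.

```python
# Hypothetical representation: code is nested Python lists, symbols are strings.

def gen_struct(name, n):
    """Tree-based 'macro': returns the tree for a struct with n int fields."""
    fields = [["int", f"field{i}"] for i in range(n)]  # n is already a number
    return ["struct", name, *fields]

def gen_struct_text(name, n_text):
    """Text-based counterpart: the count arrives as text and must be parsed."""
    n = int(n_text)  # the extra string-to-number step
    body = " ".join(f"int field{i};" for i in range(n))
    return f"struct {name} {{ {body} }};"

tree = gen_struct("Foo", 3)
text = gen_struct_text("Foo", "3")

print(tree)  # ['struct', 'Foo', ['int', 'field0'], ['int', 'field1'], ['int', 'field2']]
print(text)  # struct Foo { int field0; int field1; int field2; };
```

The tree result can be handed to further transformations as ordinary data, while the text result would have to be parsed again before anything could inspect it.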

Eli Barzilay
6

Lisp

The reasons LISP is "special" are...

The built-in functionality is very economical:

  • The only built-in data structures are atoms, or lists
  • The syntax is implemented in terms of the list data structure
  • There are very few "system functions"

It supports functions in such a way that new function definitions are indistinguishable from built-in functions:

  • The calling syntax is identical
  • Evaluation of arguments can be fully controlled

It supports macros in such a way that arbitrary Lisp code can always be defined in terms of a domain-specific language:

  • The calling syntax is just like custom function-call syntax, which is just like built-in function-call syntax
  • Evaluation of arguments is completely controllable
  • Arbitrary Lisp code-generation is possible
  • Macros are evaluated at runtime, so the macro's implementation can call existing code while generating new code

With the above features, you can:

  • Re-implement Lisp-within-Lisp, in very little code
  • Add any existing programming idioms in a way that is indistinguishable from built-in features

E.g. you can easily implement systems for namespaces, any data structure, classes, polymorphism, and multiple-dispatch on top of Lisp, and such features will work like they were built into Lisp.
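The points above can be sketched with a toy evaluator. This is Python, not Lisp, and everything in it (the list-based program representation, `MACROS`, `ENV`, `evaluate`, `unless`) is a hypothetical construction for illustration, not how any real Lisp is implemented. It shows a macro that is called exactly like a built-in form and that decides whether its argument is ever evaluated.

```python
import operator

# Hypothetical mini-Lisp: programs are nested Python lists, symbols are strings.
MACROS = {}
ENV = {"+": operator.add, "-": operator.sub, "<": operator.lt}

def evaluate(form, env=ENV):
    if isinstance(form, str):        # a symbol: look it up
        return env[form]
    if not isinstance(form, list):   # a literal evaluates to itself
        return form
    head, *args = form
    if head == "if":                 # the only primitive conditional here
        test, then, other = args
        return evaluate(then if evaluate(test, env) else other, env)
    if isinstance(head, str) and head in MACROS:
        # A macro receives unevaluated code and returns new code to evaluate.
        return evaluate(MACROS[head](*args), env)
    fn = evaluate(head, env)         # ordinary call: evaluate all arguments
    return fn(*[evaluate(a, env) for a in args])

def unless(test, body):
    """Macro: an ordinary function from code (lists) to code (lists)."""
    return ["if", test, None, body]

MACROS["unless"] = unless

result = evaluate(["unless", ["<", 3, 1], ["+", 40, 2]])
# "boom" is not defined anywhere; it is never looked up because the
# macro expands to an `if` whose taken branch is the literal None.
skipped = evaluate(["unless", ["<", 1, 3], ["boom"]])

print(result)   # 42
print(skipped)  # None
```

From the caller's side, `unless` and `if` look the same, which is the "indistinguishable from built-in features" property in miniature.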

Other languages

But it all depends on your definition. Some levels of "syntactic abstraction" are supported in other languages in quite varied ways. Some of these ways are more powerful than others, and nearly match Lisp's flexibility.

Some examples:

In Boo, you can use syntactic macros to define new DSLs that will automatically be handled by the compiler. With this, you can implement any language feature on top of existing features. The limitation compared to Lisp is that these are evaluated at compile time, so run-time code generation isn't directly supported.

In Javascript, the data structures are generic and flexible (everything is either a built-in type, or an associative array). It also supports invoking functions directly from associative arrays. With this, you can implement several language features on top of existing features, such as classes and namespaces.

Because Javascript is a dynamic language (names of function calls are evaluated at runtime), and because it exposes built-in features within the context of data structures, it is fully "reflective" and fully mutable.

Because of this, you can replace or shim the existing system functionality with your own functionality. This is often quite useful in shimming in your own runtime debugging features, or for sand-boxing (by un-defining system calls you don't want isolated code to access).
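The shimming idea is not specific to Javascript. As a hedged sketch of the same technique in Python (chosen only for illustration; `calls` and `traced_sqrt` are made-up names), a library function can be replaced at runtime with a wrapper that records its arguments and then delegates to the original:

```python
import math

calls = []                      # hypothetical log used only for this illustration
original_sqrt = math.sqrt

def traced_sqrt(x):
    """Shim: record the argument, then delegate to the original function."""
    calls.append(x)
    return original_sqrt(x)

math.sqrt = traced_sqrt         # replace the library function at runtime
try:
    value = math.sqrt(16)       # the call site is unchanged but goes through the shim
finally:
    math.sqrt = original_sqrt   # restore the original binding

print(value)  # 4.0
print(calls)  # [16]
```

Sand-boxing works the same way in reverse: instead of wrapping the function, the binding is replaced with one that refuses the call.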

Lua is quite similar to Javascript in most of these ways.

The C++ pre-processor allows you to define your own DSL with a somewhat similar syntax to existing function calls. It does not let you control evaluation (which is the source of a lot of bugs, and why most people say C/C++ macros are "Evil"), but it does support a somewhat limited form of code generation.

The code generation support in C/C++ macros is limited because macros are evaluated before your code is compiled, and can't be controlled via C code. It is nearly completely limited to textual substitution. This greatly limits the type of code that can be generated.
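The evaluation problem with textual substitution can be simulated without a C compiler. The following Python sketch mimics what a definition like `#define SQUARE(x) x * x` does to its argument; `expand_square_textually` and `square` are hypothetical names, and the simulation is an assumption about how to illustrate the issue, not actual preprocessor behavior in every detail.

```python
def expand_square_textually(arg_text):
    """Mimic `#define SQUARE(x) x * x`: paste the argument text in unchanged."""
    return f"{arg_text} * {arg_text}"

def square(value):
    """An ordinary function receives the already-evaluated value."""
    return value * value

a = 3
expanded = expand_square_textually("a + 1")
macro_result = eval(expanded, {"a": a})   # parsed as a + (1 * a) + 1
function_result = square(a + 1)

print(expanded)         # a + 1 * a + 1
print(macro_result)     # 7
print(function_result)  # 16
```

The pasted text has no notion of where the argument begins and ends, so operator precedence silently regroups it. A macro that works on trees receives `a + 1` as a single node and cannot make this mistake.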

The C++ template feature is quite powerful (compared with C/C++ macros) for syntactical additions to the language. It can turn a lot of runtime code evaluation into compile-time code evaluation, and can do static assertions on your existing code. It can reference existing C++ code, in a limited way.

But template meta-programming (TMP) is very unwieldy because it has a terrible syntax, is a very strictly limited subset of C++, has quite limited code generation ability, and can't be evaluated at runtime. C++ templates also arguably output the most difficult error messages you will ever encounter in programming :)

Note that this hasn't kept template meta-programming from being an active area of research in many communities. See the boost project, of which a good portion is devoted to TMP-support libraries, and TMP-implemented libraries.

Duck typing can allow you to define a syntax on objects that lets you substitute implementations at runtime. This is similar to how Javascript defines functions on associative arrays.

I can't say for Python (since I don't know it very well), but duck typing is often more limited than Javascript's dynamic features because of a lack of reflectivity, mutability, and exposure of system functionality through reflectable/mutable interfaces. For example, C#'s duck typing is limited in all these ways.
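For what substituting implementations through duck typing looks like in practice, here is a small Python sketch (the classes `ListLogger` and `NullLogger` and the function `run` are invented for this example). The caller relies only on a method named `write` being present, so either object can be passed in at runtime.

```python
class ListLogger:
    """Keeps every message in memory."""

    def __init__(self):
        self.lines = []

    def write(self, message):
        self.lines.append(f"list: {message}")

class NullLogger:
    """Accepts messages and discards them."""

    def write(self, message):
        pass

def run(logger):
    # No shared base class or interface is declared; only the method name matters.
    logger.write("started")
    return logger

kept = run(ListLogger()).lines
run(NullLogger())  # same call, different behavior

print(kept)  # ['list: started']
```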

Merlyn Morgan-Graham
  • The point of macros in Lisp is not to control evaluation. The point of macros is to describe source transformations (procedural in many Lisps, declarative in some). – Rainer Joswig Jun 26 '11 at 21:49
  • @Rainer: Is that really worth a down-vote? If it is inaccurate, I welcome edits :) – Merlyn Morgan-Graham Jun 26 '11 at 21:55
  • @Rainer: Actually, let me edit it to appease your pedantry. My intention was to short-hand that "functions are definable with a syntax nearly indistinguishable from built-in functions. Macros complete this by additionally allowing you to control evaluation". It sounds like your point is that you can also *generate* code on the fly with macros, which is important. Is that what I'm missing? – Merlyn Morgan-Graham Jun 26 '11 at 22:04
  • @Merlyn Morgan-Graham: I simply don't understand what you are saying. I have no idea what you mean with "completeness", "limited", LISP has been written as Lisp for several decades, it is called Principia (not principa), much more powerful compared to what?, how are Javascript prototypes linked to syntactic abstraction, the C preprocessor is not working on the AST at all, duck typing is syntactic abstraction?, ... You would need to explain your points a bit - in the current form I would not think that your answer helps me in the context of the original question... – Rainer Joswig Jun 26 '11 at 22:06
  • @Rainer: I went through and elaborated on just about everything, and tried to take out the conversational tangents :) It's quite a bit messier now, but maybe it answers the original question a bit more thoroughly and unambiguously. – Merlyn Morgan-Graham Jun 26 '11 at 22:57
  • As far as the "built-in" data structures in, say, Common Lisp, you have symbols, numbers, hash tables, arrays, strings, streams, packages, ... – Vatine Jun 27 '11 at 12:02
  • @Vatine: I was under the impression that in "Lisp" you had lists that were usable to represent both hash tables and arrays. Maybe what I said isn't true about all Lisps. – Merlyn Morgan-Graham Jun 28 '11 at 19:56
  • @Merlyn Morgan-Graham: In Common Lisp, for sure, there is a very different under-the-hood implementation between arrays, hash tables and lists. There is something approaching a unified set of functions to interact with arrays and lists (but not hash tables), but there's also "list-native" and "array-native" interactions. Lists-only is something that would surprise me in any "production-quality" lisp implementation from the last 20-25 years (maybe more), but wouldn't surprise me in a 50-year-old implementation. – Vatine Jun 30 '11 at 17:48
6

For the sake of completeness, in addition to the already mentioned languages and preprocessors:

SK-logic
3

I'm not sure if you'd call it "syntactic abstraction" per se, but D certainly can do much of what Lisp can do: the mixin keyword lets you convert a string into code (in a much better manner than C macros), and combined with templates (which are much better than those in C++) it lets you do pretty much anything you want.

user541686
  • That's still much closer to CPP macros than to Lisp macros. (Not to mention the utterly confused mention of "hygienic" -- used in a sense that basically means "macro names are scoped", which means that, for example, Emacs Lisp has hygienic macros too...) – Eli Barzilay Jun 27 '11 at 05:12
2

Prolog would be such a language. There are many Prolog dialects. One idea is that their basic building block is a term (similar to an s-expression encoding a function). There are parsers that provide macro facilities for that.

Rainer Joswig
1

I would say Tcl qualifies -- well, depending on whether you consider Tcl a Lisp or not.

The standard grouping characters { } are actually just a string literal (with no variable interpolation), and there's an eval, so you can easily define your own control flow or looping syntax (and people often do).
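To show the shape of that idea without Tcl, here is a rough Python analogue. The function `repeat_until` is made up for this sketch, and Python's `exec`/`eval` stand in for Tcl's eval, so treat it as an approximation of the technique rather than a description of Tcl itself. The loop body and the condition are plain strings until the new control construct decides to run them.

```python
def repeat_until(body, condition, variables):
    """Hypothetical do-until loop: run `body` once, then again until `condition` holds."""
    while True:
        exec(body, {}, variables)             # the body is just a string until now
        if eval(condition, {}, variables):    # so is the condition
            return variables

state = repeat_until("n = n * 2", "n > 100", {"n": 1})

print(state["n"])  # 128
```

Like the D mixin discussed in another answer, this works on raw text, so it shares the limitations of text-based code generation.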

Ken