43

I know that they are implemented extremely unsafely in C/C++. Can't they be implemented in a safer way? Are the disadvantages of macros really bad enough to outweigh the massive power they provide?

Casebash
  • 4
    Exactly what power do macros provide that can't fairly easily be achieved in another way? – Chinmay Kanchi Sep 05 '10 at 01:43
  • 2
    In C#, it took a core language extension to wrap creating a simple property and backing field into one declaration. And this takes more than lexical macros: how can you create a feature corresponding to Visual Basic's `WithEvents` modifier? You'd need something like semantic macros. – Jeffrey Hantin Sep 18 '10 at 04:14
  • A lot of stuff the C preprocessor is used for is prehistoric and unnecessary even in C and C++. However, looking at stuff like the [Preprocessor Library](http://www.boost.org/doc/libs/release/libs/preprocessor) or like [boost_foreach](http://www.boost.org/doc/libs/1_46_0/doc/html/foreach.html) it really makes me wonder too why other languages have apparently eschewed the concept completely. – Martin Ba Jun 07 '11 at 13:02
  • 4
    The problem with macros is that, like all powerful mechanisms, it allows a programmer to break the assumptions that others have made or will make. Assumptions are the key to reasoning and without the ability to feasibly reason about logic, making progress becomes prohibitive. – dan_waterworth Jul 30 '12 at 18:09
  • 3
    @Chinmay: macros generate code. There's no feature in the Java language with that power. – kevin cline Apr 11 '13 at 04:11
  • 2
    @Chinmay Kanchi: Macros allow you to execute (evaluate) code at compile time instead of at run time. – Giorgio Apr 11 '13 at 10:26
  • __Rust__ is an example of a modern language with macros [done right](https://doc.rust-lang.org/book/macros.html). – Emil Laine Nov 10 '15 at 01:20
  • Instead of coding things directly you can code non-deterministically with logic/constraint programming that generates Lisp code for you. – aoeu256 Jul 22 '19 at 12:09

7 Answers

63

I think the main reason is that macros are lexical. This has several consequences:

  • The compiler has no way of checking that a macro is semantically closed, i.e. that it represents a “unit of meaning” like a function does. (Consider #define TWO 1+1 — what does TWO*TWO equal? 3.)

  • Macros are not typed like functions are. The compiler cannot check that the parameters and return type make sense. It can only check the expanded expression that uses the macro.

  • If the code doesn’t compile, the compiler has no way of knowing whether the error is in the macro itself or the place where the macro is used. The compiler will either report the wrong place half of the time, or it has to report both even though one of them is probably fine. (Consider #define min(x,y) (((x)<(y))?(x):(y)): What should the compiler do if the types of x and y don’t match or don’t implement operator<?)

  • Automated tools cannot work with them in semantically useful ways. In particular, you can’t have things like IntelliSense for macros that work like functions but expand to an expression. (Again, the min example.)

  • The side-effects of a macro are not as explicit as they are with functions, causing potential confusion for the programmer. (Consider again the min example: in a function call, you know that the expression for x is evaluated only once, but here you can’t know without looking at the macro.)
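Two of the pitfalls above can be reproduced in a few lines of C. This is only a sketch: `TWO` and `min` are copied from the bullets, while `next_value`, `two_times_two`, `min_evaluation_count` and `check_lexical_pitfalls` are helper names made up for the illustration.

```c
#include <assert.h>

/* Both macros are taken verbatim from the bullets above. */
#define TWO 1+1
#define min(x,y) (((x)<(y))?(x):(y))

/* Hypothetical helper with a side effect: counts how often it is evaluated. */
static int calls = 0;
static int next_value(void) { return ++calls; }

/* TWO*TWO expands to 1+1*1+1, so precedence gives 3 rather than 4. */
int two_times_two(void) { return TWO*TWO; }

/* Returns how many times min evaluated its first argument. */
int min_evaluation_count(void)
{
    calls = 0;
    /* Expands to ((next_value()) < (10)) ? (next_value()) : (10). */
    int smallest = min(next_value(), 10);
    assert(smallest == 2);   /* the value of the second evaluation */
    return calls;
}

void check_lexical_pitfalls(void)
{
    assert(two_times_two() == 3);
    assert(min_evaluation_count() == 2);
}
```

A function `int min(int, int)` would evaluate each argument exactly once, which is the difference the last bullet is pointing at.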

Like I said, these are all consequences of the fact that macros are lexical. When you try to turn them into something more proper, you end up with functions and constants.

Timwi
  • 33
    Not all macro languages are lexical. For example, Scheme macros are syntactic, and C++ templates are semantic. Lexical macro systems have the distinction that they can be tacked onto any language without being aware of its syntax. – Jeffrey Hantin Oct 04 '10 at 19:25
  • 8
    @Jeffrey: I guess my “macros are lexical” is really shorthand for “when people refer to macros in a programming language they generally think of lexical macros”. It is unfortunate that Scheme would use this loaded term for something that is fundamentally different. C++ templates, however, are not widely referred to as macros, presumably precisely because they are not entirely lexical. – Timwi Oct 04 '10 at 21:20
  • 27
    I think that Scheme's use of the term macro dates back to Lisp, which probably means it predates most other uses. IIRC the C/C++ system was originally called the *preprocessor*. – Bevan Oct 09 '10 at 09:12
  • 18
    @Bevan is right. Saying macros are lexical is like saying birds can't fly because the bird you're most familiar with is a penguin. That said, most (but not all) of the points you raise also apply to syntactic macros, though perhaps to a lesser degree. – Laurence Gonsalves Mar 04 '11 at 06:45
15

Macros can, as Scott notes, allow you to hide logic. Of course, so do functions, classes, libraries, and many other common devices.

But a powerful macro system can go further, enabling you to design and utilize syntax and structures not normally found in the language. This can be a wonderful tool indeed: domain-specific languages, code generators and more, all within the comfort of a single language environment...

However, it can be abused. It can make code harder to read, understand and debug, increase the time necessary for new programmers to become familiar with a codebase, and lead to costly mistakes and delays.

So for languages intended to simplify programming (like Java or Python), such a system is anathema.

Shog9
  • 3
    And then Java went off the rails by adding foreach, annotations, assertions, generics-by-erasure, ... – Jeffrey Hantin Sep 18 '10 at 04:13
  • 2
    @Jeffrey: and let's not forget, plenty of *3rd-party* code generators. On second thought, let's forget those. – Shog9 Sep 18 '10 at 04:30
  • 4
    Another example of how Java was simplistic instead of simple. – Jeffrey Hantin Sep 20 '10 at 22:30
  • Dynamically typed languages like Python can get along much better without macros than statically typed languages like Java. Java is the only language I know where a separate copy of `min` is needed for each numeric type. – kevin cline Jul 30 '12 at 20:25
  • 3
    @JeffreyHantin: What's wrong with foreach? – Casebash Apr 11 '13 at 04:54
  • "harder to read, understand and debug". The nightmares of debugging C code which uses many macros are more than enough reason to leave them out of any modern language. – Dunk Apr 11 '13 at 18:59
  • They can be worth their weight in gold when you need a simple syntax for something specific. But yes, they can be - and have been - abused pretty badly. @Dunk – Shog9 Apr 11 '13 at 19:01
  • @Casebash: My issue is not with foreach specifically, but that adding foreach support required revising the Java language standard. Powerful macros or pluggable compiler extensions would have eliminated the need to revise the language definition to implement a simple syntactic sugar like foreach, or for that matter a complicated one like embedding a domain specific sublanguage. – Jeffrey Hantin Apr 11 '13 at 22:11
  • 1
    @Dunk This analogy might be a stretch... but I think language extensibility is kind of like plutonium. Costly to implement, very difficult to machine into desired shapes, and remarkably dangerous if misused, but also amazingly powerful when applied well. – Jeffrey Hantin Apr 11 '13 at 22:18
  • 1
    "However, it can be abused. It can make code harder to read, understand and debug, increase the time necessary for new programmers to become familiar with a codebase, and lead to costly mistakes and delays.": Replace it with class inheritance and you get a valid statement (inheritance can be overused, creating complex, unreadable, difficult-to-maintain class hierarchies). Yet, object-orientation, classes and inheritance are widely supported. – Giorgio Apr 12 '13 at 10:11
14

Yes, macros can be designed and implemented better than in C/C++.

The problem with macros is that they are effectively a language syntax extension mechanism that rewrites your code into something else.

  • In the C / C++ case, there is no fundamental sanity checking. If you are careful, things are OK. If you make a mistake, or if you overuse macros you can get into big problems.

    Add to this that many simple things you can do with (C/C++ style) macros can be done in other ways in other languages.

  • In other languages such as various Lisp dialects, macros are better integrated with the core language syntax, but you can still get problems with declarations in a macro "leaking". This is addressed by hygienic macros.
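The "leaking" problem is not specific to Lisp; the same kind of accidental capture can be shown with the C preprocessor. A rough sketch, where `REPEAT`, `REPEAT_SAFER` and the function names are hypothetical and exist only for this illustration:

```c
#include <assert.h>

/* Hypothetical macro: its loop counter `i` is visible to the caller's statement. */
#define REPEAT(n, stmt) for (int i = 0; i < (n); ++i) { stmt; }

/* Usual C workaround: pick a name the caller is unlikely to use.
   This is a convention only; nothing enforces it, unlike a hygienic macro system. */
#define REPEAT_SAFER(n, stmt) \
    for (int repeat_i_ = 0; repeat_i_ < (n); ++repeat_i_) { stmt; }

int sum_with_capture(void)
{
    int i = 5;
    int sum = 0;
    REPEAT(3, sum += i);        /* the caller means the outer i ... */
    assert(i == 5);             /* ... which is itself left untouched */
    return sum;                 /* but the macro's i was used: 0 + 1 + 2 */
}

int sum_without_capture(void)
{
    int i = 5;
    int sum = 0;
    REPEAT_SAFER(3, sum += i);  /* now refers to the caller's i */
    return sum;                 /* 5 + 5 + 5 */
}

void check_capture(void)
{
    assert(sum_with_capture() == 3);
    assert(sum_without_capture() == 15);
}
```

A hygienic system renames the macro's own bindings automatically, so the caller never has to know which names the macro uses internally.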


Brief Historical Context

Macros (short for macro-instructions) first appeared in the context of assembly language. According to Wikipedia, macros were available in some IBM assemblers in the 1950s.

The original LISP didn't have macros; they were first introduced in the early-to-mid 1960s and were carried into dialects such as MacLisp (see https://stackoverflow.com/questions/3065606/when-did-the-idea-of-macros-user-defined-code-transformation-appear and http://www.csee.umbc.edu/courses/331/resources/papers/Evolution-of-Lisp.pdf). Prior to that, "fexprs" provided macro-like functionality.

The earliest versions of C didn't have macros (http://cm.bell-labs.com/cm/cs/who/dmr/chist.html). A preprocessor was added circa 1972-73; at first it supported only #include and parameterless #define, and macros with arguments came soon afterwards.

The M4 macro-preprocessor originated circa 1977.

More recent languages apparently implement macros where the model of operation is syntactic rather than textual.

So when someone talks about the primacy of a particular definition of the term "macro", it is important to note that the meaning has evolved over time.

Stephen C
  • 9
    C++ is blessed(?) with TWO macro systems: the preprocessor and template expansion. – Jeffrey Hantin Sep 18 '10 at 04:12
  • I think claiming that template expansion is a macro is stretching the definition of macro. Using your definition makes the original question meaningless and just plain wrong, since most modern languages in fact do have macros (according to your definition). However, using macro as the overwhelming majority of developers would use it makes it a good question. – Dunk Apr 11 '13 at 19:04
  • @Dunk, definition of a macro as in Lisp predates the "modern" dumb understanding as in the pathetic C preprocessor. – SK-logic Apr 11 '13 at 19:06
  • @SK:Is it better to be "technically" correct and obfuscate the conversation or is it better to be understood? – Dunk Apr 11 '13 at 19:19
  • 2
    @Dunk, terminology is not owned by the ignorant hordes. Their ignorance should never be taken into consideration, otherwise it will lead to dumbing down the whole domain of computer science. If someone understands "macro" only as a reference to the C preprocessor, it's nothing but ignorance. I doubt there is a significant proportion of people who never heard of Lisp or even of, pardonnez mon français, VBA "macros" in Office. – SK-logic Apr 11 '13 at 19:28
  • @SK-logic - And macros in assembly language predated Lisp macros. – Stephen C Apr 12 '13 at 11:40
  • @StephenC, Lisp macros - 1963. I do not know what was the earliest macro assembler. Macro-11 is from 70s, and, AFAIR, previous DEC assemblers did not have any macros. Not sure about IBM stuff. – SK-logic Apr 12 '13 at 17:26
7

Macros can be implemented very safely in some circumstances - in Lisp for example, macros are just functions that return transformed code as a data structure (s-expression). Of course, Lisp benefits significantly from the fact that it is homoiconic and the fact that "code is data".

This Clojure example shows how easy macros can be. It specifies a default value to be used if an exception is thrown:

(defmacro on-error [default-value code]
  `(try ~code (catch Exception ~'e ~default-value)))

(on-error 0 (+ nil nil))               ;; would normally throw NullPointerException
=> 0                                   ;; but we get the default value

Even in Lisps though, the general advice is "don't use macros unless you have to".

If you aren't using a homoiconic language, then macros get much trickier and the various other options all have some pitfalls:

  • Text-based macros - e.g. the C preprocessor - simple to implement but very tricky to use correctly as you need to generate the correct source syntax in textual form, including any syntactical quirks
  • Macro-based DSLs - e.g. the C++ template system. Complex, can itself result in some tricky syntax, can be extremely complex for compiler and tool writers to handle correctly since it introduces significant new complexity into the language syntax and semantics.
  • AST/bytecode manipulation APIs - e.g. Java reflection / bytecode generation - theoretically very flexible but can get very verbose: it can require a lot of code to do quite simple things. If it takes ten lines of code to generate the equivalent of a three line function, then you haven't gained much with your meta-programming endeavours...
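As an illustration of the "syntactical quirks" in the first bullet, here is a small C sketch of a classic one: a multi-statement macro used under an unbraced `if`. The macro and function names are invented for the example, and the `do { ... } while (0)` wrapper is the customary C idiom rather than anything specific to this answer.

```c
#include <assert.h>

/* Hypothetical two-statement macro written as plain text. */
#define INC_TWICE_BAD(x) (x)++; (x)++

/* Customary fix: wrap the statements so the expansion is a single statement. */
#define INC_TWICE(x) do { (x)++; (x)++; } while (0)

int inc_twice_bad_if(int enabled)
{
    int n = 0;
    if (enabled)
        INC_TWICE_BAD(n);   /* expands to: if (enabled) (n)++; (n)++; */
    return n;
}

int inc_twice_if(int enabled)
{
    int n = 0;
    if (enabled)
        INC_TWICE(n);       /* the whole body is guarded by the if */
    return n;
}

void check_statement_macros(void)
{
    assert(inc_twice_bad_if(0) == 1);   /* second increment escaped the if */
    assert(inc_twice_bad_if(1) == 2);
    assert(inc_twice_if(0) == 0);
    assert(inc_twice_if(1) == 2);
}
```

The macro author has to know this idiom in advance; the preprocessor itself has no notion of "statement" and so cannot warn about it.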

Furthermore, everything a macro can do can ultimately be achieved in some other way in a Turing-complete language (even if this means writing a lot of boilerplate). As a result of all this trickiness, it's not surprising that many languages decide that macros aren't really worth all the effort to implement.

mikera
  • Ugh. Not more "code is data is a good thing" garbage. Code is code, data is data, and failing to properly segregate the two is **a security hole.** You would think that the emergence of SQL Injection as one of the largest vulnerability classes in existence would have discredited the idea of making code and data easily interchangeable once and for all. – Mason Wheeler Apr 11 '13 at 03:35
  • 11
    @Mason - I don't think you understand the code-is-data concept. All C program source code is also data - it just happens to be expressed in text format. Lisps are the same, except they express the code in practical intermediate data structures (s-expressions) that enable it to be manipulated and transformed by macros before compilation. In both cases sending untrusted input to the compiler could be a security hole - but that's hard to do accidentally and it would be your fault for doing something stupid, not the compiler's. – mikera Apr 11 '13 at 04:06
  • 1
    Rust is another interesting safe macro system. It has strict rules/semantics to avoid the kinds of problems associated with C/C++ lexical macros, and uses a DSL to capture parts of the input expression into macro variables to be inserted later. It's not as powerful as Lisps' ability to call any function on the input, but it does provide a large set of useful manipulation macros which can be called from other macros. – zstewart Apr 14 '16 at 20:03
  • 1
    Ugh. Not more **security** garbage. Otherwise, blame every system with von Neumann architecture built-in, e.g. every IA-32 processor with capability of self-modifying code. I doubt you can fill that hole physically... Anyway, you have to face the fact in general: isomorphism between code and data is the nature of the world [in several ways](http://wiki.c2.com/?DataAndCodeAreTheSameThing). And it is (probably) you, the programmer, who has the duty to maintain the security requirements, which is not the same as artificially applying premature segregation everywhere. – FrankHB Jun 24 '18 at 15:59
5

To answer your questions, think about what macros are predominantly used for (Warning: brain-compiled code).

  • Macros used to define symbolic constants #define X 100

This can easily be replaced with: const int X = 100;

  • Macros used to define (essentially) inline type-agnostic functions #define max(X,Y) (X>Y?X:Y)

In any language that supports function overloading, this can be emulated in a much more type-safe manner by having overloaded functions of the correct type, or, in a language that supports generics, by a generic function. The macro will happily attempt to compare anything including pointers or strings, which might compile, but is almost certainly not what you wanted. On the other hand, if you made macros type-safe, they offer no benefits or convenience over overloaded functions.

  • Macros used to specify shortcuts to often-used elements. #define p printf

This is easily replaced by a function p() that does the same thing. This is quite involved in C (requiring you to use the va_arg() family of functions) but in many other languages that support variable numbers of function arguments, it is much simpler.
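The three replacements above might look roughly like this in C11. Treat it as a sketch under a few assumptions: C has no function overloading, so `_Generic` stands in for it here (and is itself spelled as a macro, although the selected function is an ordinary typed one); an `enum` constant is used because a C `const int` is not a constant expression; and the names `max_int`, `max_double`, `max_of` and `check_replacements` are made up for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* 1. Symbolic constant without #define. */
enum { X = 100 };

/* 2. Typed functions instead of a textual max macro. */
static inline int max_int(int a, int b) { return a > b ? a : b; }
static inline double max_double(double a, double b) { return a > b ? a : b; }

/* C11 type dispatch: only int and double are accepted, anything else
   (pointers, strings) is rejected at compile time. */
#define max_of(a, b) \
    _Generic((a) + (b), int: max_int, double: max_double)(a, b)

/* 3. A real function p() that forwards to printf via the va_list family. */
int p(const char *format, ...)
{
    va_list args;
    va_start(args, format);
    int written = vprintf(format, args);
    va_end(args);
    return written;
}

void check_replacements(void)
{
    int buffer[X];                    /* X works as an array size */
    int i = 3;
    assert(sizeof buffer / sizeof buffer[0] == 100);
    assert(max_of(i++, 7) == 7);
    assert(i == 4);                   /* argument evaluated exactly once */
    assert(max_of(2.5, 1) == 2.5);
    assert(p("%d\n", 42) == 3);       /* prints "42" and a newline */
}
```

In a language with overloading or generics, the `_Generic` line disappears and the two `max_*` functions simply share one name.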

Supporting these features within a language rather than in a special macro language is simpler, less error prone and far less confusing to others reading the code. In fact, I can't think of a single use-case for macros that can't easily be duplicated in another way. The only place where macros are truly useful is when they are tied to conditional compilation constructs like #if (etc.).

On that point, I won't argue with you, since I believe that non-preprocessor solutions to conditional compilation in popular languages are extremely cumbersome (like bytecode injection in Java). But languages like D have come up with solutions that do not require a preprocessor and are no more cumbersome than using preprocessor conditionals, while being far less error-prone.

Chinmay Kanchi
  • 1
    if you have to #define max, please put brackets around the parameters, so there are no unexpected effects from operator precedence... like #define max(X,Y) ((X)>(Y)?(X):(Y)) – foo Jan 13 '11 at 23:35
  • You do realise that it was merely an example... The intent was to illustrate. – Chinmay Kanchi Jan 14 '11 at 10:39
  • +1 for this systematic dealing with the matter. I like to add that conditional compilation can easily become a nightmare - I remember that there was a program named "unifdef" (??) whose purpose was to make it visible what remained after postprocessing. – Ingo Apr 11 '11 at 10:29
  • 7
    `In fact, I can't think of a single use-case for macros that can't easily be duplicated in another way`: in C at least, you can't form identifiers (variable names) using token concatenation without using macros. – Charles Salvia Mar 10 '12 at 17:20
  • Macros make it possible to ensure that certain code and data structures remain "parallel". For example, if there are a small number of conditions with associated messages, and one will need to persist them in a concise format, attempting to use an `enum` to define the conditions and a constant string array to define the messages could result in problems if the enum and the array get out of sync. Using a macro to define all the (enum,string) pairs and then including that macro twice with other proper definitions in scope each time would let one put each enum value next to its string. – supercat Aug 13 '12 at 17:53
  • There are variations on the third bullet that can't be done via functions in C, for example: `#define FOREACH(elem,itemList) for(elem=itemList;elem!=NULL;elem=elem->next)`. (OK, definition isn't that great, but I wanted to keep it short enough to put in a comment.) That sort of thing is a useful factorization in that it avoids a whole bunch of stupid errors; it wouldn't be necessary if the language could do true syntactic extensions, but C is pretty limited. – Donal Fellows Apr 11 '13 at 11:21
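The "parallel" enum/string technique described in the comments above is often called an X-macro. A minimal sketch, assuming three invented conditions; every name here (`CONDITION_LIST`, `AS_ENUM`, `AS_TEXT`, `describe`, the `COND_*` constants) is hypothetical:

```c
#include <assert.h>

/* Single source of truth: each condition sits next to its message. */
#define CONDITION_LIST(X) \
    X(COND_OK,      "ok") \
    X(COND_TIMEOUT, "timed out") \
    X(COND_REFUSED, "connection refused")

/* Two different expansions of the same list. */
#define AS_ENUM(name, text) name,
#define AS_TEXT(name, text) text,

enum condition { CONDITION_LIST(AS_ENUM) COND_COUNT };

static const char *const condition_text[] = { CONDITION_LIST(AS_TEXT) };

/* Looks up the message for a condition, with a fallback for bad values. */
const char *describe(enum condition c)
{
    return ((unsigned)c < COND_COUNT) ? condition_text[c] : "unknown";
}

void check_parallel_tables(void)
{
    assert(COND_COUNT == 3);
    assert(sizeof condition_text / sizeof condition_text[0] == COND_COUNT);
    assert(describe(COND_TIMEOUT)[0] == 't');
}
```

Adding a fourth condition means adding one line to `CONDITION_LIST`; the enum and the string table cannot drift apart.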
2

Let's start by noting that MACROs in C/C++ are very limited, error prone, and not really that useful.

MACROs as implemented in, say, LISP or z/OS assembler language can be reliable and incredibly useful.

But because of the abuse of the limited functionality in C they have gathered a bad reputation. So nobody implements macros any more; instead you get things like templates, which do some of the simple stuff macros used to do, and things like Java's annotations, which do some of the more complex stuff macros used to do.

James Anderson
2

The biggest problem I have seen with macros is that when heavily used they can make code very difficult to read and maintain since they allow you to hide logic in the macro that may or may not be easy to find (and may or may not be trivial).

Scott Dorman