112

I often see in C and C++ code the following convention:

some_type val;
val = something;

some_type *ptr = NULL;
ptr = &something_else;

instead of

some_type val = something;
some_type *ptr = &something_else;

I initially assumed that this was a habit left over from the days when you had to declare all local variables at the top of the scope. But I've learned not to dismiss so quickly the habits of veteran developers. So, is there a good reason for declaring in one line, and assigning afterwards?

Robert Harvey
Jonathan Sterling
  • 15
    +1 for "I've learned not to dismiss so quickly the habits of veteran developers." That's a wise lesson to learn. – Wildcard Nov 30 '15 at 15:56

7 Answers

102

C

In C89 all declarations had to be at the beginning of a block ({ ... }), but this requirement was later dropped (first through compiler extensions and then in the C99 standard).

C++

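These examples are not the same. some_type val = something; copy-initializes val: it calls the copy constructor (or a converting constructor if something has a different type, and the compiler is allowed to elide the copy), while some_type val; val = something; calls the default constructor and then the operator= function. This difference is often critical.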
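
To make the difference visible, here is a minimal sketch. `Tracked` and the two helper functions are hypothetical names made up for this illustration; the type simply records which special member functions run in each form.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical type that records which special member functions ran.
struct Tracked {
    static std::vector<std::string> log;   // shared call log for the demo

    int value = 0;

    Tracked() { log.push_back("default ctor"); }
    Tracked(const Tracked& other) : value(other.value) {
        log.push_back("copy ctor");
    }
    Tracked& operator=(const Tracked& other) {
        value = other.value;
        log.push_back("copy assignment");
        return *this;
    }
};

std::vector<std::string> Tracked::log;

// One-line form: copy-initialization from an existing object.
std::vector<std::string> init_in_one_line(const Tracked& something) {
    Tracked::log.clear();
    Tracked val = something;      // copy constructor only
    assert(val.value == something.value);
    return Tracked::log;
}

// Two-line form: default construction followed by assignment.
std::vector<std::string> declare_then_assign(const Tracked& something) {
    Tracked::log.clear();
    Tracked val;                  // default constructor
    val = something;              // then operator=
    assert(val.value == something.value);
    return Tracked::log;
}
```

Calling `init_in_one_line` should leave a single "copy ctor" entry in the log, while `declare_then_assign` should leave "default ctor" followed by "copy assignment". For a type whose default constructor is expensive, or that has none at all, the two forms are clearly not interchangeable.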

Habits

Some people prefer to declare variables first and assign values to them later, in case they later reorganize the code so that the declarations sit in one place and the assignments in another.

As for the pointers, some people just have the habit of initializing every pointer to NULL or nullptr, no matter what they do with that pointer.

orlp
  • 1
    Great distinction for C++, thanks. What about in plain C? – Jonathan Sterling May 07 '11 at 20:35
  • Thanks for the edit! I'll be accepting your answer in a few minutes, once Stack Overflow allows me to. – Jonathan Sterling May 07 '11 at 20:40
  • @Jonathan Sterling: I reformatted my answer to provide a more clear divide between the different subjects. – orlp May 07 '11 at 20:42
  • @nightcracker No, they had (since ANSI C) to be at the beginning of a scope. –  May 07 '11 at 20:44
  • @unapersson: Excuse me, I'll change it. – orlp May 07 '11 at 20:45
  • 13
    The fact that MSVC still doesn't support declarations except at the beginning of a block when it's compiling in C mode is the source of endless irritation for me. – Michael Burr May 07 '11 at 22:03
  • 5
    @Michael Burr: This is because MSVC doesn't support C99 at all. – orlp May 07 '11 at 22:05
  • @nightcraker: I understand that, but even before C99 many compilers supported this as an extension as you mentioned in the answer, and MSVC does support language extensions to C (hey - you can use '//' to start a comment). I wish they'd make this one of them, along with several other C99-isms that align with C++. I know they *can* do it (it's already in the compiler for C++ mode), they just have no interest in it. Fair enough, but I still find it irritating. – Michael Burr May 07 '11 at 22:14
  • 4
    "some_type val = something; calls the copy constructor": it *may* call the copy constructor, but the Standard allows the compiler to elide the default-construction of a temporary, copy construction of val and destruction of the temporary and directly construct val using a `some_type` constructor taking `something` as sole argument. This is a very interesting and unusual edge case in C++... it means there's a presumption about the semantic meaning of these operations. –  May 09 '11 at 09:15
  • with VS2010 you can write (almost) C99 compiling as C++, but when you compile it as C the compiler barfs. To clarify, I have used VS2010 in C++ mode, writing and testing C99 that I have later moved to my embedded application and the code compiled without any errors. –  May 10 '11 at 12:12
  • 1
    I really do not understand why it would be a critical difference to use the copy constructor vs the =operator. Can someone give an example of what difference that would make? Aren't `int x = 0;` and `int x; x =0;` the same thing? – temporary_user_name Oct 10 '12 at 23:58
  • 2
    @Aerovistae: for built-in types they are the same, but the same can not always be said for user-defined types. – orlp Oct 11 '12 at 18:48
  • From Google's C++ Style Guide: https://google.github.io/styleguide/cppguide.html#Local_Variables – Nikolay Tsenkov May 19 '16 at 08:29
  • If a developer writes the first example in two lines because it makes a difference, then there MUST be a comment describing what that difference is. Without a very visible comment I'll consider it a bug, and not a harmless one. – gnasher729 Feb 12 '22 at 14:30
29

You have tagged your question C and C++ at the same time, while the answer is significantly different in these languages.

Firstly, the wording of the title of your question is incorrect (or, more precisely, irrelevant to the question itself). In both of your examples the variable is declared and defined simultaneously, in one line. The difference between your examples is that in the first one the variables are either left uninitialized or initialized with a dummy value and are then assigned a meaningful value later. In the second example the variables are initialized right away.

Secondly, in the C++ language, as @nightcracker noted in his answer, these two constructs are semantically different. The single-line form relies on initialization, while the two-line form relies on assignment. In C++ these operations are overloadable and can therefore lead to different results (although one can note that producing non-equivalent overloads of initialization and assignment is not a good idea).

In the original standard C language (C89/90) it is illegal to declare variables in the middle of the block, which is why you might see variables declared uninitialized (or initialized with dummy values) at the beginning of the block and then assigned meaningful values later, when those meaningful values become available.

In C99 language it is OK to declare variables in the middle of the block (just like in C++), which means that the first approach is only needed in some specific situations when the initializer is not known at the point of declaration. (This applies to C++ as well).
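
As an illustration of "the initializer is not known at the point of declaration", here is a small sketch, assuming C++11 or later for the lambda. The function names are invented for this example. The first function uses the declare-then-assign form because the value depends on a branch; the second shows one possible way to keep a single initialization by moving the branch into an immediately invoked lambda.

```cpp
#include <cassert>
#include <string>

// Two-step form: the value depends on a branch, so the variable is
// declared first and assigned once the branch has been taken.
std::string describe_two_step(int n) {
    std::string label;            // default-constructed, holds ""
    if (n < 0) {
        label = "negative";
    } else if (n == 0) {
        label = "zero";
    } else {
        label = "positive";
    }
    assert(!label.empty());       // every branch assigned a value
    return label;
}

// One-step alternative: the branch lives in an immediately invoked
// lambda, so the variable can be const and initialized where declared.
std::string describe_one_step(int n) {
    const std::string label = [n]() -> std::string {
        if (n < 0) return "negative";
        if (n == 0) return "zero";
        return "positive";
    }();
    return label;
}
```

Both functions return the same strings; which one reads better is a matter of taste, and for a simple two-way choice a conditional expression would do the same job.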

  • You probably should go back and reread my examples: the first set has the variables declared and defined in ***two*** lines. I probably should have made clearer that I was interested in the reasoning for writing in this way in both C and C++: as it turned out, there's a very important distinction in C++, but in today's C, the distinction you mention in your last paragraph holds. – Jonathan Sterling May 07 '11 at 20:50
  • Thank you for taking the time to answer my question: whilst some of your criticism seems to be based on confusion or misreading, your analysis of how this works in C vs. C++ is spot-on and very helpful. – Jonathan Sterling May 07 '11 at 20:51
  • 2
    @Jonathan Sterling: I read your examples. You probably need to brush up on the standard terminology of C and C++ languages. Specifically, on the terms *declaration* and *definition*, which have specific meanings in these languages. I'll repeat it again: in both of your examples the variables are declared and defined in one line. In C/C++ the line `some_type val;` immediately *declares* and *defines* the variable `val`. This is what I meant in my answer. –  May 07 '11 at 20:53
  • 1
    I see what you mean there. You're definitely right about *declare* and *define* being rather meaningless the way I used them. I hope you accept my apologies for the poor wording, and poorly-thought-out comment. – Jonathan Sterling May 07 '11 at 20:55
  • ceil(+0.5). You started good by spotting that both are definitions, but continued with 'declarations'... You cannot *declare* locals in both languages, they are only 'defined'. –  May 07 '11 at 20:59
  • @ybungalobill According to Wikibooks, “declare” actually _is_ the right terminology (http://en.wikibooks.org/wiki/C_Programming/Variables). Are they wrong? – Jonathan Sterling May 07 '11 at 21:02
  • @Jonathan Sources like that are not definitive, if you want to get into an argument with Andrey (who is pretty reliable, in my experience) you had better bone up on the C and C++ Standard documents themselves. –  May 07 '11 at 21:07
  • @unapersson I agree! If you read this comment thread, you'll see that I'm not arguing with Andrey (I acknowledged that he was correct a few comments ago). I'm actually arguing with @ybungalobill; I'm not saying he's wrong, but I am saying that there's a good reason for my misconception, if it is a misconception. – Jonathan Sterling May 07 '11 at 21:09
  • 1
    So, if the consensus is that “declare” is the wrong word, I'd suggest that someone with a better knowledge of the standard than me edit the Wikibooks page. – Jonathan Sterling May 07 '11 at 21:11
  • 2
    In any other context declare would be the right word, but since declare is a _well-defined concept_, with consequences, in C and C++ you can not use it as loosely as you could in other contexts. – orlp May 07 '11 at 21:15
  • Well put, @nightcracker. I'll definitely try to remember that the next time I use the word. So, does everyone agree that I should have said “define” instead of “declare”? If so, I'll edit the question appropriately. – Jonathan Sterling May 07 '11 at 21:18
  • 2
    @ybungalobill: You are wrong. *Declaration* and *definition* in C/C++ are not mutually exclusive concepts. Actually, *definition* is just a specific form of *declaration*. Every definition is a declaration at the same time (with few exceptions). There are defining declarations (i.e. definitions) and non-defining declarations. Moreover, normally the term *declaration* is used all the time (even if it is a definition), except for the contexts when the distinction between the two is critical. –  May 07 '11 at 21:52
  • 1
    @Jonathan Sterling: *Declare* is the right word, even if the declaration is a definition at the same time. This is the accepted practice in C/C++ standard language: unless you care to distinguish between definitions and non-defining declarations, you just use the term *declaration*. In the context of your question, the distinction does not matter, so you can just use the term *declaration*. The wording in your current title is perfectly fine. –  May 07 '11 at 21:55
  • 1
    @AndreyT Thanks for clarifying. Your help means a lot to me. Once again, I'd like to apologize for being so overly defensive when you first answered. I was being barraged with meaningless complaints, and took your very valid criticisms as more of the same: nevertheless, there's no excuse for shunning the help and advice of a very helpful answerer. So, thank you! – Jonathan Sterling May 07 '11 at 22:01
14

I think it's an old habit, left over from "local declaration" times. So, to answer your question: no, I don't think there's a good reason. I never do it myself.

4

I said something about that in my answer to a question by Helium3.

Basically, I say it's a visual aid to easily see what is changed.

if (a == 0) {
    struct whatever *myobject = 0;
    /* did `myobject` (the pointer) get assigned?
    ** or was it `*myobject` (the struct)? */
}

and

if (a == 0) {
    struct whatever *myobject;
    myobject = 0;
    /* `myobject` (the pointer) got assigned */
}
pmg
4

The other answers are pretty good. There's some history around this in C. In C++ there's the difference between a constructor and an assignment operator.

I'm surprised no one mentions the additional point: keeping declarations separate from use of a variable can sometimes be a lot more readable.

Visually speaking, when reading code, the more mundane artifacts, such as the types and names of variables, are not what jump out at you. It's the statements that you're usually most interested in, spend most time staring at, and so there's a tendency to glance over the rest.

If I have some types, names, and assignment all going on in the same tight space, it's a bit of information overload. Further, it means that something important is going on in the space that I usually glance over.

It may seem a bit counter-intuitive to say, but this is one instance where making your source take up more vertical space can make it better. I see this as akin to why you shouldn't write jam-packed lines which do crazy amounts of pointer arithmetic and assignment in a tight vertical space -- just because the language lets you get away with such things doesn't mean you should do it all the time. :-)

asveikau
2

In C, this was the standard practice because variables had to be declared at the start of a block, unlike in C++, where they can be declared anywhere in the function body and used from that point on. Pointers were set to 0 or NULL simply to make sure that the pointer did not point to garbage. Otherwise, I can't think of any significant advantage that would compel anyone to do it that way.
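
For what it's worth, here is a sketch of a case where starting a pointer at null and assigning it later is meaningful rather than just habit: null acts as a well-defined "nothing found" state. The function name is hypothetical and the example assumes C++11 (`nullptr`, range-based `for`).

```cpp
#include <cassert>
#include <vector>

// Returns a pointer to the first even element, or nullptr if there is
// none. The pointer starts out null so that "not found" is a defined
// state instead of an indeterminate value.
const int* find_first_even(const std::vector<int>& xs) {
    const int* found = nullptr;
    for (const int& x : xs) {
        if (x % 2 == 0) {
            found = &x;           // points into the caller's vector
            break;
        }
    }
    return found;
}
```

The returned pointer is only valid while the caller's vector is alive and unmodified. Had `found` been left uninitialized, returning it after an unsuccessful search would be undefined behaviour.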

Vite Falcon
2

Pros for localising variable definitions and their meaningful initialisation:

  • if variables are habitually assigned a meaningful value when they first appear in the code (another perspective on the same thing: you delay their appearance until a meaningful value is available) then there's no chance of them accidentally being used with a meaningless or uninitialised value (which can easily happen if some initialisation is accidentally bypassed due to conditional statements, short-circuit evaluation, exceptions etc.)

  • can be more efficient

    • avoids overheads of setting initial value (default construction or initialisation to some sentinel value like NULL)
    • operator= can sometimes be less efficient and require a temporary object
    • sometimes (esp. for inline functions) the optimiser can remove some/all inefficiencies

  • minimising the scope of variables in turn minimises average number of variables concurrently in scope: this

    • makes it easier to mentally track the variables in scope, the execution flows and statements that might affect those variables, and the import of their value
    • at least for some complex and opaque objects, this reduces resource usage (heap, threads, shared memory, descriptors) of the program
  • sometimes more concise as you're not repeating the variable name in a definition then in an initial meaningful assignment

  • necessary for certain types such as references and when you want the object to be const
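
A rough sketch of the last point, with made-up names (`Celsius`, `warmest`) and assuming C++11: references, const objects and types without a default constructor cannot be declared first and given a value later, so they have to be introduced at the point where their value is known.

```cpp
#include <cassert>
#include <vector>

// Hypothetical type with no default constructor: it cannot be declared
// first and assigned later.
struct Celsius {
    explicit Celsius(double degrees) : degrees(degrees) {}
    double degrees;
};

double warmest(const std::vector<double>& readings) {
    assert(!readings.empty());

    // const objects and references must be initialized where they are
    // declared: `const double first;` or `const double& r;` on their
    // own would not compile.
    const double& first = readings.front();
    Celsius best(first);          // `Celsius best;` would not compile either

    for (double r : readings) {
        if (r > best.degrees) {
            best = Celsius(r);    // later assignment is still fine
        }
    }

    // Introduced only once its value is known, so it can be const.
    const double result = best.degrees;
    return result;
}
```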

Arguments for grouping variable definitions:

  • sometimes it's convenient and/or concise to factor out the type of a number of variables:

    the_same_type v1, v2, v3;

    (if the reason is just that the type name is overly long or complex, a typedef can sometimes be better)

  • sometimes it's desirable to group variables independently of their usage to emphasise the set of variables (and types) involved in some operation:

    type v1;
    type v2;
    type v3;

    This emphasises the commonality of type and makes it a little easier to change them, while still sticking to a variable per line which facilitates copy-paste, // commenting etc.

As is often the case in programming, while there can be a clear empirical benefit to one practice in most situations, the other practice really can be overwhelmingly better in a few cases.

Tony
  • I wish more languages would distinguish the case where code declares and sets the value of a variable which would never be written elsewhere, though new variables could use the same name [i.e. where behavior would be the same whether the later statements used the same variable or a different one], from those where code creates a variable that must be writable in multiple places. While both use cases will execute the same way, knowing when variables may change is very helpful when trying to track down bugs. – supercat Apr 28 '14 at 22:50