Variable assignment in ECMAScript specification in detail

Question

I am trying to wrap my head around what the ECMAScript specification suggests about variable assignment.

Introduction:

Coming from Java, this is pretty straightforward. Variables are stored at a memory location, and assigning one variable to another copies the value of the first into the memory location of the second.

int a = 4; // 0x0001: 4
int b = a; // 0x0002: 4

In the case of reference types, this works the same, only that the value being copied is a reference.

Object a = new Object(); // 0x0001: 0x0010 => 0x0010: Object
Object b = a; // 0x0002: 0x0010 => 0x0010: Object

So, after spending some time reading through the ECMAScript specification, I still can't quite put my finger on what mental model it wants us to apply for JavaScript. I am aware that the specification does not imply anything about a memory model, but it still has to use some kind of mental model of variables.

Problem:

In the specification, there are Declarative Environment Records, which provide bindings between names and values. https://tc39.es/ecma262/#sec-declarative-environment-records-setmutablebinding-n-v-s

It attempts to change the bound value of the current binding of the identifier whose name is N to the value V.

V is a value of an ECMAScript language type, which is exclusively one of the following: Undefined, Null, Boolean, String, Symbol, Number, BigInt, and Object.

But as we all know

let a = { a: 4 }
let b = a

will not copy the value of a to b, but rather reference the same object. However, the specification explicitly speaks of ECMAScript language types as bound values - not references.
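For illustration, the observable behaviour can be checked with a minimal sketch like the following. The names `first`, `second`, and `copy` are made up for this example and do not come from the specification.

```javascript
// Two names bound to the same object value.
let first = { a: 4 };
let second = first;

// A mutation made through one name is visible through the other,
// because there is only one object.
second.a = 5;
console.log(first.a);          // 5
console.log(first === second); // true

// An explicit shallow copy creates a distinct object with the same contents.
let copy = { ...first };
copy.a = 6;
console.log(first.a);        // 5
console.log(first === copy); // false
```

Only the explicit spread creates a second object; plain assignment never does.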

You might say the variable is bound to the ECMAScript language value via a reference. But that would be an implementation detail, and so the spec does not explain the difference between objects and primitives in a consistent manner.

Furthermore, MDN states that primitives differ from objects only by their immutability. If primitives were copied from one place to another, this would not even be worth mentioning; it would be a given.

So my current guess is:

In JS, variables always point to values. When you assign one variable to another, the second variable ends up pointing to the same value, so it is more like setting a reference to a value than physically copying the value. In that sense, MDN's explanation that primitives differ by their immutability makes perfect sense. It also aligns with most of the V8 implementation (except for Smis, i.e. small integers).
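To make this guess concrete: no operation changes a primitive value in place, so operations on the value itself cannot reveal whether it was shared or copied. A small sketch, with variable names chosen only for illustration:

```javascript
// Strings are immutable: operations produce new values rather than
// changing the existing one.
let s = "abc";
let t = s;           // t is bound to the same value as s
t = t.toUpperCase(); // rebinds t to a new string value
console.log(s);      // abc
console.log(t);      // ABC

// Numbers behave the same way: += rebinds the name, it does not alter 4.
let x = 4;
let y = x;
y += 1;
console.log(x);      // 4
console.log(y);      // 5
```

In both cases the first name keeps its value because the second name was rebound, not because anything was necessarily copied beforehand.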

Is this correct, or am I overlooking something?

JacquesB · Accepted Answer · 2023-05-29T08:07:50.593

3

Your understanding is correct, but the explanation gets confusing because you are mixing terminology from lower-level languages like Java with the terminology of the ECMAScript specification, which operates at a higher level. In particular, the meaning of "value" becomes ambiguous.

The conceptual model used in the ECMAScript specification is that variables are bindings between names and values. An assignment binds a name to a new value. Multiple names can be bound to the same value. Objects are a kind of value, so are strings and numbers. Some values are immutable, some are mutable.

This is all you need to know. This model does not talk about memory locations, references, reference types, copying etc. since these are implementation details.
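As a rough illustration of this model, the bindings can be pictured as a map from names to values. The sketch below is a toy and not the specification's algorithm: the class `ToyEnvironmentRecord` is hypothetical, and its two methods only loosely mirror the spec's SetMutableBinding and GetBindingValue operations.

```javascript
// Toy model of a declarative environment record: a map from names to values.
// This is NOT the specification's algorithm, only an illustration of the idea.
class ToyEnvironmentRecord {
  constructor() {
    this.bindings = new Map();
  }

  // Loosely mirrors SetMutableBinding(N, V, S): bind name N to value V.
  setMutableBinding(name, value) {
    this.bindings.set(name, value);
  }

  // Loosely mirrors GetBindingValue(N, S): return the value bound to N.
  getBindingValue(name) {
    if (!this.bindings.has(name)) {
      throw new ReferenceError(`${name} is not defined`);
    }
    return this.bindings.get(name);
  }
}

const env = new ToyEnvironmentRecord();

// let a = { a: 4 }; let b = a;
env.setMutableBinding("a", { a: 4 });
env.setMutableBinding("b", env.getBindingValue("a"));

// Both names are bound to the same value; nothing was duplicated.
console.log(env.getBindingValue("a") === env.getBindingValue("b")); // true

// Assignment rebinds one name and leaves the other binding untouched.
env.setMutableBinding("b", 42);
console.log(env.getBindingValue("a").a); // 4
console.log(env.getBindingValue("b"));   // 42
```

The toy says nothing about where values live; it only records which name is currently bound to which value, which is the level at which the specification operates.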

When you say "the value being copied is a reference", you are using terminology from Java, where "value" refers to the reference, not to the object it references. But when ECMAScript talks about "value" it refers to the object itself.

JavaScript can be implemented similarly to Java, where a binding is a variable memory location and where the assignment a = b copies a reference from the b variable to the a variable.

edited May 29 '23 at 08:07

answered May 28 '23 at 12:43

JacquesB


"This model does not talk about memory locations, references" – And neither does Java, BTW. The OP's assertion that in Java, variables get stored at memory locations is just plain wrong. There is nothing anywhere in the JLS that talks about memory locations. It would be perfectly legal for a standards-conformant Java implementation to print the values out on paper and not store them in memory at all, or change the memory locations of variables at runtime, or do whatever else it wants. – Jörg W Mittag May 28 '23 at 15:56
@JörgWMittag, even if things were printed, there would still have to be a scheme to organise these printouts and their contents, and allow automatic access to them, akin to "memory" as we know it. There's all kinds of weird card tricks that could be performed behind the scenes, but the effect would still have to be the same as if none of those tricks were occurring and we were dealing with our usual conceptual models about how objects work. If it didn't, people would just declare the written standard to be non-conformant or under-specified. – Steve May 28 '23 at 19:13
@JörgWMittag: The Java Virtual Machine Specification talks about memory locations, stack, heap, references etc. – JacquesB May 28 '23 at 19:32
@JacquesB, is there an example of a JavaScript implementation where variable names are not aliases for storage addresses, and *more importantly*, where the behaviour of the program is inconsistent with a conceptualisation in which variables are aliases for storage addresses? It would seem contrary to the tenets of programming as we know it. – Steve May 29 '23 at 08:53
@Steve: The ECMAScript spec is consistent with a model where variables are aliases for storage addresses, so I would think an implementation that was inconsistent with this model was also in violation of the spec. But an implementation could easily work differently under the hood, e.g. optimize away a temporary variable by inlining. – JacquesB May 29 '23 at 09:16
@JacquesB I totally agree. So variables point to values in an abstract/theoretical manner. That's what the binding stands for - not that let a = 4 actually puts 4 in a value attribute of the Environment Record. For example, let a = 4; let b = a; would result in a and b pointing to the same value of 4. Therefore it needs to be immutable. As for objects, there is no need for a reference variable, as the reference is basically set as part of the binding. In that sense, primitives and objects are both values that differ by nothing but immutability. – tweekz May 29 '23 at 09:38
@JörgWMittag While you could implement a JVM like that, I think it still violates the spec, as it is indeed very precise about certain things like references. – tweekz May 29 '23 at 09:40
@JacquesB, but if the spec doesn't *require* variable names to be aliases for storage addresses, then in principle something inconsistent with that is not necessarily inconsistent with the standard. Also I concede it can work differently beneath - and optimising compilers often do - but optimisations must still work *as if* the program was done in the manner conceived by the programmer. I'm curious whether any language exists in which it isn't valid for the programmer to treat variable names as aliases for storage locations. – Steve May 29 '23 at 09:43
@tweekz, what even *is* a "reference", if you attempt to argue that names are bound to values and not addresses? Let b = a doesn't mean "point both to the same value" in any language I know - it means put a *copy* of the value in both *places*, and it doesn't imply any change in the binding between the variable name and the place it represents. – Steve May 29 '23 at 09:48
@Steve That's because ECMAScript does not use memory concepts in its explanations. Bindings are names that refer to values. This "referring" allows for the "reference" behavior when passing objects, but also requires primitives to be immutable. I guess you can think of bindings as a kind of dynamic symbol table. The ECMAScript spec is so vague that it allows for all kinds of implementations. – tweekz May 29 '23 at 09:53
@Steve: That should probably be a separate question, but off the top of my head, I would say variables in Prolog cannot be treated as memory locations, since they may represent multiple different values *at the same time*. – JacquesB May 29 '23 at 09:55
@JacquesB: "The Java Virtual Machine Specification talks about memory locations, stack, heap, references etc." – The OP talks about Java, not JVM. Those are two different languages. – Jörg W Mittag May 29 '23 at 10:08
@tweekz: "While you could implement a JVM like that, I think it still violates the spec, as it is indeed very precise about certain things like references." – I never said anything about a JVM. – Jörg W Mittag May 29 '23 at 10:08
@JörgWMittag: You are both pedantic and wrong. "Java" refers to the platform composed of both JVM (Java Virtual Machine) and JLS (Java Language Specification). Only your own comment talks about the JLS in isolation. And in any case, the JLS also specifies a memory model. – JacquesB May 29 '23 at 10:23
@tweekz, the part I picked up on is exactly that it "doesn't use memory concepts". That sounds like one of the fundamental differences I know between mathematics and computer science, which is that the mathematicians have no concept of storage. In their discourse, they don't usually talk about it and they typically have no explicit system for it. Whereas for programmers, the issue of how data is organised and addressed is one of the fundamental concerns. – Steve May 29 '23 at 11:32
@Steve Yeah, I am coming from Java and tinkered around a little in C++. I didn't like that idea either. But if you read through the ECMAScript specification or the MDN statements, they only make sense under that abstract model of a binding between variable name and value. They assume that a conforming implementation will solve it in a common way. For example, in V8 literally everything except small integers is an object on the heap. – tweekz May 29 '23 at 11:55

score -2 · Answer 2 · answered May 28 '23 at 13:00

It sounds like the writer of the standard suffered from a conceptual muddle.

Mathematicians do things like "bind names to values" - or, if I understand their practices correctly, it's more accurate to say they bind values to symbols.

Computer scientists bind names to storage addresses - or to put it another way, storage addresses are aliased with names.

This "binding" is not the same process in each context, the counterpart to which a "name" is bound in each context is not the same, and (with all respect to @JacquesB's answer, with which I disagree) there is no equivalence between these concepts nor common hierarchy in which they exist at different levels.

answered May 28 '23 at 13:00

Steve


Hi Steve, I have a long journey of adapting to this more general view - but it makes sense. The model that JS applies does not make any assumptions about actual memory. So it is up to the implementor. – tweekz May 29 '23 at 09:25
@tweekz, it may not say it makes any assumptions, and it may even say it doesn't make assumptions, but I'll bet you it does make assumptions! – Steve May 29 '23 at 09:50
@Steve: There is no conceptual muddle in the ECMAScript standard - all concepts are unambiguously defined. Just don't mix concepts from different domains or different standards. – JacquesB May 29 '23 at 09:59
@JacquesB, the standard may make enough sense internally, it just won't correspond well with the conceptualisations used by those doing the actual implementations. In other words, the implementors will not just supplement the standard with implementation details, they will employ conceptualisations that don't reconcile with the standard in the first place. For example, they will bind names to storage addresses like usual, not to values. – Steve May 29 '23 at 11:24
@Steve An actual implementation does not bind names to storage addresses. It is just a conceptual model to explain how assignment works in some languages. In a typical compiler, names would first be translated into offsets relative to the stack frame, since you would have multiple storage addresses for the same variable name in case of recursive function calls. In some cases a short-lived variable could be optimized to only exist in a register, or it could be optimized away entirely by inlining. – JacquesB May 31 '23 at 06:08
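The recursion point can be observed from within JavaScript itself. In this minimal sketch (the function `collect` and its parameters are hypothetical names for illustration), each call gets its own binding for the same name `depth`, and closures keep those bindings alive side by side:

```javascript
// Each call creates a fresh binding for `depth`; closures let us observe
// that the bindings from different calls coexist.
function collect(depth, readers = []) {
  readers.push(() => depth);
  if (depth > 0) {
    collect(depth - 1, readers);
  }
  return readers;
}

const readers = collect(2);
console.log(readers.map((read) => read())); // [ 2, 1, 0 ]
```

How an engine stores these three bindings (stack slots, heap-allocated contexts, registers) is not visible to the program.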
@JacquesB, I have to disagree. I think there's an iceberg of difference between how expert programmers and mathematicians think about this. In a recursive call, the same name isn't "bound to multiple addresses", there is instead a change of context in which the same name does not represent the same variable in each context. This is no different than if the same name is used in two completely different functions (and each could have a different type even). Early programming languages allocated space for locals statically - they really were the same on each call, and recursion was impossible! – Steve May 31 '23 at 07:59
Another minor point, the behaviour of optimising compilers does not alter the conceptualisation which the programming language employs. A compiler does not always need to allocate main memory - bearing in mind that a register is still an addressable location amongst the repertory of storage the machine works with - but the compiled program still must behave in all important respects *as if* the allocation were made in main memory. The compiler can only optimise once the program is analysed and the variable found to be used in a way permitting a shortcut. – Steve May 31 '23 at 08:11
@Steve You have a point distinguishing mathematicians from *implementers* of computer systems, but the *language designers* can use the mathematical terms, and in doing so give freedom to have different implementations. – Caleth Jun 01 '23 at 08:49
@Caleth, no on the contrary, the problem in this regard is that mathematicians are less rigorous about some aspects of computation than necessary, perhaps because their field is primarily concerned with logical analysis of relatively static problems, and the amount of data they actually handle is usually minimal. What you mean by "different implementations" is "conflicting conceptualisations", and that's precisely my point that mathematicians consistently approach things in ways that do not accord with anything considered real in computing. (1/2) – Steve Jun 01 '23 at 10:01
I suspect the only languages that do accord with how mathematicians like to think, are languages which computer scientists design specifically for the use of professional mathematicians. In other words, when computer scientists apply computers to mathematics, instead of to data processing. (2/2) – Steve Jun 01 '23 at 10:04
@Steve The conceptualisations either *don't* conflict, in the sense that they are two different ways of achieving the same goal, or it doesn't matter that they conflict, because we don't require them to be consistent with each other, just consistent with the abstract definition. – Caleth Jun 01 '23 at 10:21
It's precisely because we only require the abstract "binding names to values" that we can have implementations which do optimisations that break the conception "a variable is a storage location" – Caleth Jun 01 '23 at 10:31
@Caleth, there is no such thing as a variable without a storage location - that's the blind spot here. What the mathematicians do is start with an implicit (and often not fully systematic) scheme of storage, and if they are really pressured they will eventually budge to something that reconciles *in its effect* with how computer scientists think, but employs odd/unreal concepts, or differing terminology for the same concepts. For example, the assignment of a value to a place, will be called "rebinding the name to a different value". And when you ask where pointers fit in, they're finished! – Steve Jun 01 '23 at 11:05
Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/146399/discussion-between-caleth-and-steve). – Caleth Jun 01 '23 at 11:05
