18

Considering this question and the most upvoted answer, and his specific example of

public static final int THREE = 3;

might it make sense to allow this sort of usage if we added units to the declaration? I mean like this:

public static final int THREE_MINUTES = 3;

or maybe this:

public static final int THREE_GALLONS = 3;

I'm thinking in terms of stuff I'd flag in a code review. I would definitely flag final int THREE = 3 but does it seem like a generally reasonable exception to allow numbers that add unit of measure?

Onorio Catenacci
  • 2,937
  • 3
  • 26
  • 37
  • 1
    If people who voted to close would care to indicate what they consider lacking in my question, I'll try to address their concerns. – Onorio Catenacci Sep 16 '21 at 18:33
  • 53
    Why bother putting the number in the variable name? Because if you find out the 3 is wrong, now you have `THREE= 4` which is even less meaningful. – user10489 Sep 17 '21 at 00:36
  • What is the advantage of writing `THREE_MINUTES` instead of `3 /*minutes*/`? – user253751 Sep 17 '21 at 10:00
  • 14
    Why should `int THREE_MINUTES = 180;` be wrong if the system does everything in seconds? The problem here is not the missing unit, but the wrong type - `THREE_MINUTES` isn't a number, it's a time span. – Haukinger Sep 17 '21 at 11:10
  • 1
    You raise a very good point @user10489! Yes it would be better to (as Vincent points out in his answer) to name the constant in terms of what it represents. But I do believe there is some value in attaching a unit to the name to help avoid unit conversion errors. And, as I said in my question THREE=3 would be the kind of thing I would flag in a code review. – Onorio Catenacci Sep 17 '21 at 13:04
  • 1
    @user10489 - because sometimes it's very useful to have named constants. Most time related libraries have constants like MINUTE, 5_MINUTES, HOUR etc. Sometimes a constant is an actual constant and no, no one will come and change it later. – Davor Ždralo Sep 17 '21 at 15:11
  • @user10489: Such a thing was common in c. 1980 home computer BASIC, on systems where tokenized variable names took up less memory than numeric literals. – dan04 Sep 17 '21 at 18:53
  • 3
    @user10489 If the 3 is wrong, you don't change `THREE = 3` to `THREE = 4`, you change it to `FOUR = 4` and then fix all the code that fails to compile because `THREE` no longer exists. Just kidding, `THREE` should have had a meaningful name in the first place. – Keith Thompson Sep 17 '21 at 19:05
  • Another tangential remark: @user10489 if you have `3` everywhere as a literal, future changes are harder than if it is a parameter or a variable that the IDE recognizes and can be renamed – lucidbrot Sep 17 '21 at 20:26
  • @KeithThompson You got that all wrong. You just redefine 3 to 4 and then change `THREE = 3` to `THREE_REALLY_FOUR = 3`. – emory Sep 18 '21 at 22:19
  • @emory Better yet, use an old Fortran compiler that lets you change the value of the literal `3` to 4. – Keith Thompson Sep 18 '21 at 23:18
  • I'd say that your question is a specific case of a more general problem - having consistent units of measure across different variables, not just magic numbers. I know that F# has a units of measure system, but as I've never used it I can't attest to its use. IMHO having a UoM attribute on a variable should be the proper way to deal with all of this. – Peter M Sep 18 '21 at 23:19
  • 1
    It’s probably worth noting that in the answer you referenced, “THREE = 3” was cited as the thing _not_ to do. – Lawrence Sep 19 '21 at 14:17

4 Answers4

80

The issue is not only with the lack of units, but the fact that it is not clear what three of those units represent. Do you only have three minutes to complete a task? Then the constant might be better named as MAXIMUM_TASK_DURATION. Is three gallons the capacity of some container? Then we could use the name CONTAINER_CAPACITY. Your original names only add precision to what the value is, but not how it is intended to be used, which is the crux of the issue.

The lack of units in those suggested constants might also be an issue, albeit a separate one. One possibility would be indeed to add the units in the constant name. Another would be to avoid primitive obsession and use a more appropriate type, such as Duration (which is already provided by the JDK), or Volume (which could be a value object created specifically for the domain of your application).

Vincent Savard
  • 1,906
  • 1
  • 17
  • 16
  • 18
    [Tiny Types FTW!](https://darrenhobbs.com/2007/04/11/tiny-types/) – Jörg W Mittag Sep 16 '21 at 17:06
  • 5
    Unit should be specified somewhere. I would add `// [min]` to your duration definition, for example. – Eric Duminil Sep 17 '21 at 01:01
  • 22
    Eric has good point. public static final MAXIMUM_TASK_DURATION_MINUTES = 3; would at least be nice and unambiguous – Robyn Sep 17 '21 at 04:40
  • 8
    @Robyn: "unambiguous" - yes. "Nice" - quite the opposite. – Doc Brown Sep 17 '21 at 06:14
  • 1
    Should tiny types be a performance or interop problem, Checker Frameworks Units Checker (or even just the Subtyping Checker) may help: https://checkerframework.org/manual/#units-checker – René Sep 17 '21 at 07:28
  • 4
    @JörgWMittag actually, proper-type-hierarchy-for-physical-quantities FTW. But yeah, often stupid tiny types are sufficient and simpler to use, and with a sensibly designed language they also shouldn't introduce much boilerplate. – leftaroundabout Sep 17 '21 at 12:52
  • worth noting that this (semantic wrapper types) is known as `newtype` in FP land. and branding in TS land – somebody Sep 17 '21 at 15:30
  • 1
    If the language has a timespan class, like Java's [`Duration.ofMinutes(3)`](https://docs.oracle.com/javase/8/docs/api/java/time/Duration.html#ofMinutes-long-), that would be more typesafe than an `int`, and remove the need to store the units in the name. – BlueRaja - Danny Pflughoeft Sep 18 '21 at 04:16
  • The issue with "magic numbers" is not when you write them: if you are calling an api with a timeout of milliseconds and you type "3" there then its because you think 3ms is just fine. The issue is troubleshooting/debugging/changing them later. How do you find the timeout that is affecting you? Search for all "3"s in a program? If you even know it is "3" and not, say, "4"? Ok, so give it a name, e.g., `THREE_MS`, and assume you know it is a 3ms delay you're looking for. When you find it: where do you change it and how? If you change the def'n you change _all_ uses of it. Etc. etc. – davidbak Sep 18 '21 at 17:23
  • @JörgWMittag Our project called them type “tags,” and used them to great success. Of course, Java is **awful** about them—ours was in Typescript, where we could write a generic in four lines that could be tagged with any string, and fully compile-time compatible with any underlying type, chosen on-the-fly at the point it was needed. We even extended those flags to a very basic arithmetic suite for physical units. – KRyan Sep 19 '21 at 13:16
  • So why don't any time packages implement your suggestion? I've always seen constants Nanosecond, Microsecond, Millisecond, etc. with no unit, and no reference... – tuskiomi Sep 20 '21 at 17:34
18

Please always use all relevant information:

int MAXIMUM_TASK_DURATION_MINUTES = 3;

Can't remember how often I had some strange undocumented API and was wondering if it is minutes, seconds or millis.

mister x
  • 197
  • 2
  • 1
    As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-ask). – Community Sep 17 '21 at 14:12
4

THREE = 3 makes no sense

I can't imagine a good justification for having such a constant as THREE = 3 or THREE_MINUTES = 3 EDIT: there is a case, though it's language specific and pretty obscure; see the supercat's comments. However, in come contexts, and especially in tests, it may be useful to have a constant that represents some amount of... something, represented in a specific unit.

What can make sense

For example, if you need to test some computation on durations, or you need to sleep for a while, you might want to have something like

const int THREE_MINUTES_IN_MS = 3 * 60 * 1000;
...
sleep(THREE_MINUTES_IN_MS);

or

const int MINUTE_MS = 60 * 1000;
...
sleep(3 * MINUTE_MS);

or even

const int MINUTE_MS = 60 * 1000;
const int THREE_MINUTES_IN_MS = 3 * MINUTE_MS;
...
sleep(THREE_MINUTES_IN_MS);

What happens here is that the constant captures both the amount and the unit conversion, making it both easier to read and to contain the unit conversions in a place where it's easier to catch mistakes. That last point is important in my experience: if you see code sleep(3 * 60 * 100) you may guess that this is a mistake, but can't be sure, whereas MINUTE_MS = 60 * 100 is clearly a mistake. It's just more semantic information.

When it may be justified to use names like THREE_MINUTES_IN_MS

If the number is defined just because in a test you need to sleep for a while several times, in slightly different contexts, it may be preferable to use this sort of meaningless naming. It conveys a meaning actually: it clarifies that this is an arbitrary number that is not related to anything specific. You just decided that 3 seconds would be a good number to wait for "a little while". Or perhaps you are actually testing some computation on numbers, and want to make assertion 4 * millis(MINUTE_MS) == minutes(4) easier to read.

Another way it may help is that such a name is clearly disposable. If you realise that there is an unused constant FROBNICATOR_REQUEST_DELAY_SMALL it may make you uneasy - can you remove it? Should it be used somewhere? A constant THREE_MINUTES_IN_MS is clearly disposable, if it's unused, you delete it and never look back. And even more importantly, when reading, you never need to decipher its meaning, it should never surprise you.

Prefer saying what the number represents

Obviously, if the number has a meaning such as MAXIMUM_TASK_DURATION, you should use that name instead (though sometimes it may make sense to define it in terms of another constant - MAXIMUM_TASK_DURATION_MS = THREE_MINUTES_IN_MS.

Frax
  • 1,844
  • 12
  • 16
  • Sometimes it may be useful to have an object whose purpose is to be an object of a particular type holding a particular constant value, to allow a pointer expression equivalent to what what e.g. `&(static const uint32_t){3}` should represent. For example, one might do something like `fwrite(&ui32_const_2, 4, 1, outFile); outThing(outFile, &thing1); outThing(outFile, &thing2);` The number two represents the number of calls to `outThing`, and it wouldn't make sense to change the value of the passed object without changing the code, in which case one could use an object with a different name. – supercat Sep 17 '21 at 15:09
  • If the language allowed it, I'd prefer to simply write code that says "give a `uint32_t*` holding the address of any suitably-aligned chunk of storage that will always and forevermore hold the bit pattern for the `uint32_t` value 2 in the absence of Critical Undefined Behavior" without having to create a named identifier for that purpose, but unfortunately compound literal syntax only allows the creation of such objects within expressions that are evaluated at file scope. – supercat Sep 17 '21 at 15:21
  • Oh, that's an interesting example. Yeah, a constant with the only purpose of being referenced might be named like that. Though if it's just a single int like here, I'd probably use a local variable, unless it's actually performance-critical. – Frax Sep 17 '21 at 15:22
  • Unless the same value was used widely throughout a program, I'd generally declare block-scope static const objects for that purpose within each function where they would be needed. In most cases, the limited lifetime of compound literals wouldn't matter, but I find offensive the idea of using a new syntax whose semantics and performance are both worse than the earlier way of accomplishing the same task. – supercat Sep 17 '21 at 15:31
  • Please put parentheses around your macros. How many minutes in a million milliseconds ? 1000000 / MINUTE_MS? Wrong with your macro. – gnasher729 Sep 17 '21 at 21:26
  • 2
    @gnasher729 I don't see any macros in this answer – CodesInChaos Sep 18 '21 at 13:28
  • Indeed when I saw `THREE_GALLONS = 3` I immediately thought that was a waste, and my first good example that came to mind was `THREE_GALLONS_IN_OZ = 384` – corsiKa Sep 19 '21 at 17:00
  • @gnasher729 You'd be right if these were macros, but they are constants. Anyway, this was meant as more of a pseudocode than actual C++, the point is about naming. Yeah, I should have skipped the `int` and make it proper JS instead, would have been simpler. – Frax Sep 20 '21 at 08:02
0

This is indeed inadequate and should be flagged in review. To understand why, let's step back.

What's in a variable?

A variable (or constant) is an instance (value) of a given type. It conveys a variety of pieces of information through both its name and its type.

A variable is, first and foremost, a value. In your case, the underlying value is 3.

A variable with a number as an underlying value generally represents a quantity. A quantity is often the association of a number and a unit, though there are unitless quantities such as the number of iterations in a loop.

The unit can be conveyed through either type or name:

public static int THREE_MS = 3;
public static Milliseconds THREE = new Milliseconds(3);

Although the unit itself is not as important as the dimension:

public static Duration X = Duration::from_milliseconds(3);

It is notable that just because 2 variables have the same dimension does not mean that they interchangeable, and therefore it can be valuable to further refine a type-hierarchy -- when such a thing exists -- to represent specific kinds of values. This refinement can also be applied to unitless quantities, such as indices on a board: a row index is not a column index, and vice-versa.

Finally, a variable typically has a purpose which further restricts its applicability. You would not want to use the heartbeat_interval instead of the polling_interval, for example.

Applying to your example.

In general, I'd recommend employing a mix of type and name.

I myself prefer to err on the side of types, wherever practical, and routinely create "wrapper" types. In the above example, should I create a Board, I would have:

class RowIndex(int);
class ColumnIndex(int);

class Cell(...);
class Board(...);

function Board.get(RowIndex row, ColumnIndex column) -> Cell;

And this would help avoiding accidentally swapping one index for another.

I do use names, still, and thus going back to your previous example:

public static Duration REQUEST_INTERVAL = Duration::from_milliseconds(3);

Would be my favorite:

  1. A dimension type precisely conveys the dimension of the quantity.
  2. By using a dimension type, we can also precisely convey the unit of the number, at its point of construction.
  3. Finally, the name conveys the purpose.

On the point of the name, note that I did not name it REQUEST_INTERVAL_MS nor REQUEST_INTERVAL_DURATION, there is not point in repeating in the name information provided by the type.

Applying to the original example.

The original (linked) example is:

int seconds = days * 24 * 60 * 60;

The error, here, is the Primitive Obsession. Using meaningful types allows having meaningful names.

As an example, C++ std::chrono::duration<...> can express both days and seconds resolution, it uses magic numbers in its formation, where the intent is clear, and never again:

using nanoseconds = duration<int64_t, std::ratio<1, 1'000'000'000>>;
using microseconds = duration<int64_t, std::ratio<1, 1'000'000>>;
using milliseconds = duration<int64_t, std::ratio<1, 1'000>>;
using seconds = duration<int64_t, std::ratio<1, 1>>;
using minutes = duration<int64_t, std::ratio<60, 1>>;
using hours = duration<int64_t, std::ratio<3600, 1>>;
using days = duration<int64_t, std::ratio<86400, 1>>;

And from there I can do:

auto const x = seconds{ days{ 3 } };

To have x be a duration expressed in seconds initialized by a number of days.

Matthieu M.
  • 14,567
  • 4
  • 44
  • 65