4

Why would so many repositories use additional — apparently unnecessary — vertical and horizontal space in coding styles?

I see this almost everywhere:

    public function getName() {
            return 'webauthn';
    }


    public function newKey( array $data = [] ) {
            if ( empty( $data ) ) {
                    return WebAuthnKey::newKey();
            }
            return WebAuthnKey::newFromData( $data );
    }

Whereas I personally would write this:

public function getName() {return 'webauthn';}

public function newKey( array $data = [] ) {
  if ( empty( $data ) ) {return WebAuthnKey::newKey();}
  return WebAuthnKey::newFromData( $data );
}

The more compact style lets me see much more of a component on one screen, so I can see much faster what the component is supposed to do.

However, the compact style seems to be rarely used, whereas the (over?)use of horizontal and vertical spacing is extremely popular. So I wonder: are there reasons for the extra spacing that I fail to understand?

Disclaimer: I am asking here because a comment on https://stackoverflow.com/questions/74159241/why-vertical-and-horizontal-white-space-verbosity-in-coding-styles pointed out that this could be the right place.

Christophe
  • 1
    If you need to fit that much on your screen, have you considered that the component might be too large? Additionally, there is a question of reading each statement as a separate semantic unit, and having them on separate lines makes that clearer. – user1937198 Oct 25 '22 at 19:48
  • @DocBrown it would be nice if such advice took the form of a peer reviewed answer. Might prevent a few holy wars at the code review table. And might even be good for this place. – candied_orange Oct 25 '22 at 20:42
  • @user1937198: Yes. My components have, say, 500 lines of code, compared to 2000+ in the other style. The numbers are comparable with widespread open source systems. Why would a simple if statement be clearer when spread over 3 or more lines? It is not that complex. – Nobody-Knows-I-am-a-Dog Oct 25 '22 at 20:57
  • 2
    In the first example, lexical blocks are indented, which means the visual layout reflects the logical structure better. This makes it easier to understand the logical structure at a glance. Whether to use 2 or 4 or 8 spaces for indents is basically arbitrary, though. – JacquesB Oct 25 '22 at 20:57
  • @DocBrown: I understand, I notice the downvotes and the early closing. I do not intend holy wars. I want to understand. Really understand. But when I ask I usually get this "holy war" stuff which does not help me understand. Sad somehow. – Nobody-Knows-I-am-a-Dog Oct 25 '22 at 21:00
  • @JacquesB: Yes. I see that argument. But it is not, like, 5 nested loops. It is a simple if statement. Do I really get an advantage from having a single curly bracket on a line? Sorry, not wanting to start holy wars. Wanting to understand. I simply do not see the advantage... When there are highly nested things I do see that, but with 1 if statement? Honestly?? Maybe ?! But it costs space. Otherwise I would be able to see entire components on one screen page... – Nobody-Knows-I-am-a-Dog Oct 25 '22 at 21:02
  • I don't see why answers must be inherently opinion-based. There must be some study concerning layout that would at least provide interesting perspectives on this question, especially given that layout is not unique to code, but also applies to natural language text and to graphic design. – Steve Oct 25 '22 at 21:03
  • @Nobody-Knows-I-am-a-Dog: So you indent properly if you have five levels of nesting, just not if you only have two? Most people prefer code formatting to be consistent since it makes it easier to read and understand. – JacquesB Oct 25 '22 at 21:20
  • I admit: When using a new piece of software, I often read it in the editor, shortening accessor functions as in my example from 3 lines of code to 1 line of code. As a result I often get 1/3 file size (together with similar optimizations) and then have a compact file to see all functions in it. To me (=opinionated) this seems so natural that I always wonder why 99.5% of the code I see out there uses 3+ lines for a simple get accessor. I want to understand. That's all. No war. Not even a holy one ;-). But usually I am crucified just for the question :-o – Nobody-Knows-I-am-a-Dog Oct 25 '22 at 21:22
  • @JacquesB: Exactly. I am not interested in making simple get accessors easier to read and understand, as I understand them straight away. Only when there is a "real" need to make code more readable do I use up (waste) space for this. Waste, as I see advantages in having code which needs only 2 pages to read instead of 10. It's a trade-off to me. – Nobody-Knows-I-am-a-Dog Oct 25 '22 at 21:24
  • 1
    @Nobody-Knows-I-am-a-Dog: "*I am not interested in making simple get accessors easier to read and understand*" - fair enough, but most developers and teams prioritize making the code easier to read and understand. – JacquesB Oct 25 '22 at 21:28
  • 4
    do you like paragraphs in books? – Ewan Oct 25 '22 at 21:30
  • It depends on the language, but many languages either a) just expose fields directly rather than writing properties, or b) have some concept of an auto-property. The C# auto-property style, for example, often puts the entire property on one line. If you have more than 5-10 getters though (and thus more than 20 lines of code for them), I would question whether there are too many getters in the module. – user1937198 Oct 25 '22 at 21:32
  • @Ewan: Yes. Provided a paragraph consists of more than one 3-word sentence. That's exactly my point. A book where every paragraph is 1 sentence is hard to read. @JacquesB: Honest question: How does a simple get accessor become "easier to read and understand" by spilling 1 line out to 3 lines? – Nobody-Knows-I-am-a-Dog Oct 25 '22 at 21:42
  • Maybe you're not a visual person and you're only reading the code line-by-line like a book? Often, one wants to be able to skim the code, and ignore the details. Of course, it helps if you write the code so that you *can* ignore everything but the immediate context. If it's written like a book page, a wall of text, it's hard to find stuff in it without actually reading through it, or to build a mental model of the structure at a glance. – Filip Milovanović Oct 26 '22 at 02:39
  • I slightly reworded your question, because “space verbosity” is a contradiction: verbosity is about using more words, not more space. I also tried to make it more neutral and open, because the initial wording encouraged opinion-based answers. – Christophe Oct 26 '22 at 05:53
  • Try coding in shell script without the proper spacing ... – Laiv Oct 26 '22 at 09:35

5 Answers

10

The real question is: why did you stop removing space where you did? Why not remove the remaining line breaks? Why not put every statement of a function on a single line and let the editor wrap the long line across several?

Another question is why musicians put so much silence in a piece of music, when they could just make each note follow the next without waiting.

It appears that horizontal and vertical space, much like silence in music, add structure and express intent. The visual structure helps you browse more quickly through large chunks of code. Whether to indent by 2 spaces or 8 is a matter of taste.

Always using the same layout, for example starting the if-block on the line following the if, also has the advantage of consistency. Saving space at all costs, by sometimes compressing the if and its block onto one line, is not in line with the principle of least surprise, and makes the code more difficult to read for everybody but the author.

Christophe
  • 2
    In some languages, they may add structure (see python), but in the one given, they just emphasize it, or automated formatters wouldn't work. Not that it makes the proper formatting any less important. – Deduplicator Oct 26 '22 at 06:19
  • 1
    @Deduplicator I think we agree. I meant adding a visual/cognitive structure (beyond the syntactic requirements of e.g. Python): who among us has never been misled by inconsistent indentation? ;-) – Christophe Oct 26 '22 at 06:55
3

TL;DR

Layout is part of the syntax of programming languages, even if it isn't defined as such and the compiler ignores it.

Long read

One thing to note is that the actual syntax of many programming languages, as defined in the book, does not match the perceived syntax as coders know it.

That is, in many programming languages the whitespace is not significant to the compiler. But whitespace is considered significant by coders.
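A small sketch of that claim, in Python (my choice for illustration, not the language of the question; `same_structure` is a made-up helper). Python does treat indentation as significant, but its parser still does not care whether a one-statement body sits on the `def` line or on its own line after a blank one. Both layouts produce the same syntax tree once positions are ignored:

```python
import ast

# The same function in a compact and in a spacious layout.
COMPACT = "def get_name(): return 'webauthn'\n"

SPACIOUS = (
    "def get_name():\n"
    "\n"
    "    return 'webauthn'\n"
)

def same_structure(source_a, source_b):
    """True when both sources parse to the same tree (line/column info ignored)."""
    # ast.dump() leaves out position attributes by default.
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))

print(same_structure(COMPACT, SPACIOUS))
```

So the extra lines carry no information for the machine; whatever they carry is for the reader.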

If you start from that idea, then the real question is not why the code isn't more compact, but why it contains all those extra braces and semicolons (which reiterate indentations and newlines).

What you often get in C-derived languages, is a belt-and-braces approach, where the whitespace exists predominantly for the human reader, and the punctuation actually exists as a supplement purely for the compiler.

Another thing to note is that humans don't by default read like computers do. Computers parse strings serially and have no intrinsic two-dimensional representation. Humans appreciate matters like layout before they have even started reading the characters.

To emphasise the point, the human reader probably gets the gist of what's going on here, even without the semicolons and braces and brackets:

public function getName()
    return 'webauthn'

public function newKey( array $data = [] )
    if empty( $data )
        return WebAuthnKey::newKey()   
    
    return WebAuthnKey::newFromData( $data )

In fact, all that additional punctuation only seems to become necessary when layout principles are violated.
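Python is one language that reads layout the way the punctuation-free version above does. As a rough sketch (the helper name `layout_tokens` and the sample function are mine, purely for illustration), the standard `tokenize` module shows that indentation is turned into explicit `INDENT` and `DEDENT` tokens, which play the role the braces play in C-derived languages:

```python
import io
import tokenize

# A Python rendering of the newKey example, with layout as the only block marker.
SOURCE = (
    "def new_key(data=None):\n"
    "    if not data:\n"
    "        return 'new'\n"
    "    return 'from-data'\n"
)

def layout_tokens(source):
    """Return the names of the INDENT/DEDENT tokens, in source order."""
    names = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.INDENT, tokenize.DEDENT):
            names.append(tokenize.tok_name[tok.type])
    return names

print(layout_tokens(SOURCE))
# ['INDENT', 'INDENT', 'DEDENT', 'DEDENT']
```

Each `INDENT` opens a block and each `DEDENT` closes one, so here the layout really is the block structure.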

As for why this particular style of layout is the most commonly used, I suggest it is because it is the most convenient and flexible way to represent trees or hierarchies.

Why are hierarchies important? Because basically every mainstream language follows the block-structure principle (the justification of which would be another question!).

In your "compact" version of the code, clearly a lot of layout principles have been discarded, and what exists is an anomalous layout that relies much more heavily on what I consider to be the "compiler crutches" - the things that are in the language mostly for the benefit of the compiler, not for any sensible human reader.

The example of compactification you give is simple enough, but quite obviously with more nested blocks and more statements per block, either you would have to revert to the normal style of layout (which you consider extravagant), or else the code would become very messy indeed. Messy for the human I mean - the compiler can read things that no human coder would consider reasonable.

Also, making more complicated additions to your code would require not only relevant statements to be added, but also for the overall layout to be uncompactified.

For most, the benefit of occasionally saving a line or two is just not worth considering.

Steve
  • I wish these fly-by-night downvoters would give some clue about their problem! – Steve Oct 26 '22 at 13:29
  • I didn't downvote you, but layout is not part of the syntax, one exception that comes to mind is Python, but even there it is only partially true. – greenoldman Oct 28 '22 at 06:00
  • 2
    @greenoldman, obviously I acknowledge it is not part of the defined syntax. My point is that it forms part of the implied syntax - which is why coders can still read my punctuation-free version, even though the compiler wouldn't. The fact that the whitespace and layout doesn't speak to the compiler, doesn't mean it doesn't speak to human readers. – Steve Oct 28 '22 at 06:21
  • Syntax has a pretty precise meaning; the fact that something is harder or easier for a human to read does not mean it is part of the syntax. I'd bet you would have difficulties reading your own code when displayed with font-size = 2px. So now, as I understand it, font size is part of the syntax? – greenoldman Oct 29 '22 at 04:38
  • @greenoldman, if font size was being *varied*, and if the variation of the font size was amongst the *main* ways of representing something important, then yes I'd have said it was part of the syntax. I don't think I'm using the word "syntax" beyond breaking point, since I'm suggesting that carriage returns and tabs perform the same role as braces (for C), and I'm suggesting that a preference for using layout in this role is universal amongst coders (in that no coder writes everything as one line, and no coder just newlines arbitrarily around the 80 character point). – Steve Oct 29 '22 at 08:41
  • How can whitespace be part of the syntax when the lexer ignores it on the spot? EOT from my side. – greenoldman Oct 30 '22 at 08:57
  • Because the human lexer doesn't ignore them, nor generally does the human writer. I've characterised this as the "implied syntax", insofar as it is not considered by the compiler but it is considered by the programmer in both the writing and reading process. I don't see what is so troublesome to grasp about this, especially as I included an illustrative example. – Steve Oct 30 '22 at 19:13
1

In the first example, the visual layout reflects the logical structure better, because the lexical blocks are visually offset. This makes it easier to understand the logical structure at a glance.

Furthermore, the first example is more consistent since all lexical blocks are indented. In the second example, some blocks are indented (e.g. the body of the second function) while others are not.

You personally prioritize compactness (allowing you to see more code on the screen at a time) above readability. If that works for you, this is fine, but most teams prioritize readability and consistency above compactness, which is not usually a big concern. So this is why you see the "spacious" style more often.

Whether to use 2 or 4 or 8 spaces for indents is basically arbitrary, though. Here consistency is the priority: use what is standard for the code base or for the language. The same goes for the placement of the opening brace: use what is standard for the code base and language.
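If you want to find out what "standard for the code base" means for a given file, a crude heuristic is enough. The sketch below is in Python and `indent_unit` is a hypothetical helper, not part of any tool: it takes the greatest common divisor of all leading-space counts. It only looks at spaces, so tab-indented files are out of scope.

```python
from functools import reduce
from math import gcd

def indent_unit(source):
    """Guess the indent width as the GCD of the leading-space counts."""
    widths = [
        len(line) - len(line.lstrip(" "))
        for line in source.splitlines()
        if line.strip()  # skip blank lines
    ]
    # Lines at column 0 say nothing about the indent width.
    return reduce(gcd, (w for w in widths if w > 0), 0)

four = "a {\n    b {\n        c\n    }\n}\n"
mixed = "a {\n  b {\n     c\n  }\n}\n"

print(indent_unit(four))   # 4
print(indent_unit(mixed))  # 1
```

A result of 1 (or any value that does not match the project's convention) is a hint that the file mixes indent widths.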

JacquesB
  • I agree. The thing I do not understand is: Where **is** readability improved, e.g. in the example of the get accessor or a simple if statement? Especially when it comes at the additional cost that you have to scroll through 8 pages instead of seeing the entire component on 2 pages. – Nobody-Knows-I-am-a-Dog Oct 26 '22 at 12:31
  • @Nobody-Knows-I-am-a-Dog, the problem with "readability" is finding good contrasting examples. I write a lot of SQL nowadays and I'm a one-column-per-line kind of man, but I often have to debug code written by inexpert developers, and I find usually my very first act is to uncompact their code just so that I can understand it. Your example, however, is not something I had to uncompactify to understand, so in that sense it is a straw man example of the problem, in that it is far too simple to be unreadable. – Steve Oct 26 '22 at 13:59
1

With any data structure, sparse vs dense is fundamental to the time-space trade-off. It's just as true when the data structure is source code. Empty space costs us space but saves us time. We consider code that takes little time to read more readable.

When you optimize source code for space you neglect time. As proof I offer all of Code Golf.

A particular example:

from math import*
C=sorted(input());l,h=1e-99,1/max(C)
while h-l>2e-16:m=(l+h)/2;a=[asin(c*m)for c in C[:-1]];f=pi-sum(a);l,h=[l,m,h][sin(f)/m>C[-1]:][:2]
print sum(sin(2*t)/l/l for t in a+[f])/8

vs

import math

def segment_angles(line_segments, invd):
    # Central angle subtended by each chord; invd is 1 / circle diameter.
    return [2*math.asin(c*invd) for c in line_segments]

def cyclic_ngon_area(line_segments):
    line_segments = sorted(line_segments)
    lo, hi = 1e-99, 1/max(line_segments)
    # Bisect on the inverse diameter until the longest side fits the leftover angle.
    while hi - lo > 2e-16:
        mid = (lo + hi) / 2
        angles = segment_angles(line_segments[:-1], mid)
        angles.append(2*math.pi - sum(angles))
        if math.sin(angles[-1]/2) / mid > line_segments[-1]:
            lo = mid
        else:
            hi = mid
    # Sum the triangles spanned by the circle's centre and each side.
    return sum(math.sin(a)/lo/lo/8 for a in angles)

These both do the same thing. But which one takes you longer to read? That's worth a little extra space.

As for your particular example, my preferred form is:

public function getName() {
    return 'webauthn';
}

public function newKey( array $data = [] ) {
    if ( empty( $data ) ) {
        return WebAuthnKey::newKey();
    }
    return WebAuthnKey::newFromData( $data );
}

I prefer this because I like being able to add lines without touching the structure.

public function newKey( array $data = [] ) {
    if ( empty( $data ) ) {
        error_log( 'Creating new key without data' );
        return WebAuthnKey::newKey();
    }
    return WebAuthnKey::newFromData( $data );
}

Adding that line with your form would have involved moving curly braces around. That is hard on my fingers, my brain, and source control diff tools.
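To put a rough number on the diff-tool part, here is a sketch using Python's `difflib` (Python only because it is handy for this; `touched_lines` is a made-up helper, and the PHP snippets are just text being compared). It assumes the compact one-liner gets expanded into a normal block once a second statement joins it:

```python
import difflib

SPACIOUS_BEFORE = """\
if ( empty( $data ) ) {
    return WebAuthnKey::newKey();
}
"""

COMPACT_BEFORE = """\
if ( empty( $data ) ) {return WebAuthnKey::newKey();}
"""

# Assumed end state for both styles after adding one statement.
AFTER = """\
if ( empty( $data ) ) {
    error_log( 'Creating new key without data' );
    return WebAuthnKey::newKey();
}
"""

def touched_lines(before, after):
    """Count the lines a line-based diff reports as removed or added."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # "- " marks a removed line, "+ " an added one; "? " hint lines are skipped.
    return sum(1 for line in diff if line.startswith(("- ", "+ ")))

print(touched_lines(SPACIOUS_BEFORE, AFTER))  # 1
print(touched_lines(COMPACT_BEFORE, AFTER))   # 5
```

One touched line versus five for the same logical change: that is the noise a reviewer has to read through.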

However, you are not the only detractor this form has. One of the most famous is Uncle Bob. Robert Martin has said, "One of my favorite things to do is getting rid of useless braces". He would say you still have too much because you left the useless braces around one line. I questioned his approach in one of my previous posts.

At the end of the day the most important thing is being appropriate to how it's used. Source code is used by your team. So the opinion that really matters is theirs. Write stuff they can read.

P.S. As for 2 vs 4 spaces for indenting, I actually appreciate 2. But I'm one of those weird ones that rotates their monitor.

candied_orange
0
  • horizontal space is/was limited (for example, 80 columns on text terminals without scrolling)

  • it is hard on the eyes to scan long lines, or to keep scrolling right and back for every line wider than the window/screen

  • one statement per line is useful for fast understanding

https://stackoverflow.com/questions/110928/is-there-a-valid-reason-for-enforcing-a-maximum-width-of-80-characters-in-a-code
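If you want to see how a file fares against such a limit, a few lines are enough. This is only a sketch in Python; `lines_over_limit` is a hypothetical helper and the 80-column default is just the traditional terminal width mentioned above:

```python
def lines_over_limit(source, limit=80):
    """Return (line_number, width) for every line wider than `limit` columns."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]

sample = "short line\n" + "x" * 81 + "\n" + "y" * 80 + "\n"
print(lines_over_limit(sample))  # [(2, 81)]
```

Compacting several statements onto one line pushes more lines over the limit, which is where the horizontal scrolling starts.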

gapsf