2

Suppose that we have a SQL relational database for, let's say, asset management system. It uses a table of assets (1 row per one real-world object). There can be various metadata etc. To allow the user add its own notes that are not needed to be managed by the system or provide information that the system is not able to store yet, a column for storing free-form text (called e.g. “note”) was added to the asset table. It is intended to be used sparingly, only in situations when all other means of entering the data are not sufficient.

How should be empty comment text (or “no comment” value) represented? Using NULL or an empty string? I personally think that NULL is the cleaner way to do this, but since the text will be shown in a text field that is empty when there is no note, empty string would be sufficient and allow simpler code.

jiwopene
  • 191
  • 6
  • 1
    I’d suggest narrowing down your scope when considering this. As far as the database is concerned, no value is best represented as NULL. Suppose you want a query for all empty notes. Intuitively, one would write this with a NULL. Rarely anyone would go through the line of reasoning, “this field is shown in a text area, so empty string implies no value.” Separate the representation of data from the way it is stored. – Stefan Rendevski Aug 21 '22 at 17:13
  • @StefanRendevski Please write that (particularly the last sentence) as an answer. – Philip Kendall Aug 21 '22 at 21:07
  • 2
    Does this answer your question? [SQL: empty string vs NULL value](https://softwareengineering.stackexchange.com/questions/32578/sql-empty-string-vs-null-value) – Doc Brown Aug 22 '22 at 08:42
  • @DocBrown, I would say that it does not answer my question. With e-mail addresses (example from the other question), empty string is an invalid address and NULL is no address. (At least in my opinion.) But with the comment/custom text field, I think that no comment (NULL) and empty comment (`''`) are the same. – jiwopene Aug 22 '22 at 09:37
  • @jiwopene: in the example from the other question, the OP wrote *"The problem is even when both values differ on the technical level, they are exactly the same on logical level."*. So they situation was pretty much the same as yours. – Doc Brown Aug 22 '22 at 13:29
  • 2
    ... However, the way I see it is: there are use cases like yours where a distinction between "no string" and "empty string" isn't required (and may not even make sense). For those cases, my preferred solution would be to use a NOT NULL constraint and use only the empty string. If that's not possible for some technical reason, allow both and treat both representations equally in your application. – Doc Brown Aug 22 '22 at 13:35
  • @DocBrown, I think that this approach is okay. – jiwopene Aug 22 '22 at 14:31
  • Are empty string allowed? If yes. Why? Note that min length can be enforced by DB constraints (vía CHECK). That can ease the handling of this column. Basically it will translate null into empty without any ambiguity (empty strings). – Laiv Aug 26 '22 at 17:06

3 Answers3

5

Null means there is no comment, an empty string means there's a comment with no characters. Think of it like having an empty bag (of letters, in this case) versus not even having a bag.

That being said, feel free to pick whatever is easiest for you. It's not wrong to use an empty string if that's what the form text field automatically returns when no comment was added.

Flater
  • 44,596
  • 8
  • 88
  • 122
  • *"It's not wrong to use an empty string if that's what the form text field automatically returns when no comment was added."* -- this works as long as the data is consistent. As soon as you get a `NULL` value, things get dicey. In that case, check for "is null or empty" before an OO language attempts to call an instance method on the resulting string pulled from the database. Then again, some database vendors treat an empty string as a `NULL` value (Oracle). – Greg Burghardt Aug 23 '22 at 14:57
  • @GregBurghardt: Proper null checking should be done on any nullable type. The issue isn't so much that a null might slip in in the future, but rather that one would have failed to account for possible nulls on a nullable type to begin with. So you're right, but I'm already advocating null checks regardless. – Flater Aug 23 '22 at 19:29
2

Null is the best representation for "no value" - don't let the current output requirements dictate the structure of the database.

The database design should be independent of a specific output - the requirements may differ depending on if it's being pushed out in an HTML UI, REST interface, report, or some other output mechanism. And (conversely) any sane interface built for use with a database will have an easy mechanism to translate null to the correct client representation (such as empty string).

Egret
  • 414
  • 2
  • 7
  • This is not necessarily just a question of "output requirements" - it can also be a question of semantics, when in a particular case a distinction between "no string" and "empty string" does not make any sense. – Doc Brown Aug 22 '22 at 13:41
2

It seems what the other answers are missing is that there are several real-world cases where a distinction between "no string" and "empty string" simply isn't required (and may not even make sense). Yours is probably one one those.

For those cases, my preferred solution would be to add a NOT NULL constraint to the database column and use only the empty string as the one and only representation of "no comment". Having to deal with only one representation of the same thing makes handling a lot simpler and avoids errors. For example, it avoids the necessity of adding special NULL treatment in SQL queries.

If adding the constraint is not possible for some technical reason, pick one of the two representations when saving the data, and treat both representations equally when querying them, or simply use IFNULL(comment,'') (or whatever equivalent function your DB systems offers).

Doc Brown
  • 199,015
  • 33
  • 367
  • 565
  • This is a first - I normally agree with @DocBrown answers. I agree with the first answer - if there is no comment then the field should be null. Otherwise standard database functions like COUNT will return wrong results and other queries are complicated compared with "COMMENT IS NULL". – kiwiron Aug 24 '22 at 06:40
  • @kiwiron: quite the opposite. If you allow NULL values and empty strings in a column, but both are semantically equal, the correct query will have to be `COMMENT IS NULL OR COMMENT=''`, or `ISNULL(COMMENT,'')=''`. That is definitely more complicated than `COMMENT=''`, which is sufficient with the NOT NULL constraint in place. – Doc Brown Aug 24 '22 at 10:00