
For instance, you let the user define the notorious path variable. How do you interpret apppath = C:\Program Files\App?

Programming languages conventionally ignore whitespace around the equality sign, and you put it there for readability. In a configuration file, however, that whitespace might be a legitimate part of the value inside the application (consider a value that is used as a suffix).

Even keys can contain whitespace, can't they?

What is the general best practice for my application? If I have:

key-example = value-example

should I interpret the key as "key-example" or "key-example " and the value as "value-example" or " value-example"?

Peter Mortensen
Val
  • I've read your question two times and still don't understand it. Could you rephrase it a little or show some code along with what you are trying to achieve? – Jay Feb 01 '15 at 12:47
  • What programming language are you using for parsing the file? – Tulains Córdova Feb 01 '15 at 15:59
  • @user61852: why should that matter? The requirements for the config file should be fully independent from the programming language. Think of five different programs written in five different languages all sharing the same configuration file. – Doc Brown Feb 01 '15 at 17:02
  • @user61852 It does not matter. I tried to use `:` in place of `=` because I use this exclusively in javascript (although my question is not limited to my app, therefore it is irrelevant to ask about its language at all) but this did not help me to read the config as JSON object. JSON requires all values quoted, which is unnecessary burden on the user. – Val Feb 01 '15 at 18:35
  • @valtih I asked because some languages already have an API for config files. For example Java has a Properties class that handles those kind of key/value files. – Tulains Córdova Feb 01 '15 at 23:35

2 Answers


As a user, I don't expect the whitespace on either side of the equals sign to change the key or the value. See this related question on unix.SE as to how confusing the situation can be.

Don't make it harder on your users: trim whitespace from both the key and the value. If leading or trailing whitespace has a real use case for either, then let the user wrap the key or value in quotes.
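
To make that concrete, here is a minimal sketch in Python of the trim-then-unquote rule. The names `parse_line` and `unquote` are made up for illustration, and the sketch assumes one pair per line, a split on the first `=`, and no escape sequences inside quotes.

```python
def unquote(text):
    """Strip one pair of surrounding double quotes, keeping the whitespace inside."""
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    return text


def parse_line(line):
    """Split 'key = value' on the first '=' and trim whitespace on both sides."""
    key, sep, value = line.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in line: {line!r}")
    # Trim first, then unquote, so quoted whitespace survives the trimming.
    return unquote(key.strip()), unquote(value.strip())


print(parse_line(r"apppath = C:\Program Files\App"))
print(parse_line('key-example = " value-example"'))
```

With this rule, whitespace around the equals sign never matters, and a user who really needs a leading space writes it inside quotes.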

dotancohen

It's up to you to define the rules for your app.

For instance, you may define that:

  • Whitespace before or after the equality sign is ignored,

  • Whitespace inside the key is forbidden,

  • Whitespace inside the value can be used only if the value is enclosed in quotes, so:

    say-hello = Hello, World!
    

    is forbidden, while:

    say-hello = "Hello, World!"
    

    is allowed, which also makes it possible to have whitespace prefixes:

    say-hello = "    Indentation is sweet."
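
The three example rules above could be enforced with a single regular expression. This is only a sketch: `LINE` and `parse_strict` are hypothetical names, and it assumes that quotes cannot be escaped and that an empty bare value is acceptable, both of which a real format would have to decide.

```python
import re

# Key: no whitespace, '=' or quotes.
# Value: either a double-quoted string or a bare token without whitespace.
LINE = re.compile(r'^\s*([^\s="]+)\s*=\s*(?:"([^"]*)"|([^\s"]*))\s*$')


def parse_strict(line):
    """Return (key, value), or raise ValueError if the line breaks a rule."""
    match = LINE.match(line)
    if match is None:
        raise ValueError(f"invalid line: {line!r}")
    key, quoted, bare = match.groups()
    return key, quoted if quoted is not None else bare


print(parse_strict('say-hello = "    Indentation is sweet."'))
```

Under these assumptions `say-hello = Hello, World!` and `my key = x` both raise `ValueError`, while the quoted forms are accepted.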
    

Defining a format may be a complex task. For instance:

  • How do you escape quotes?

  • How do you escape the escape character you use to escape quotes?

  • How do you handle empty values?

  • What is the maximum length of a key? What about a value?

  • How do you handle multiline values?

  • What about whitespace Unicode characters other than a space (such as a non-breaking space character)?

  • What about Unicode characters which are usually not displayed on the screen? For instance, how do you deal with Unicode categories Cf or Zl?

  • What are the characters allowed in the key? For example, is:

    '
    

    a valid key?

  • Should the following line work?¹

    say-hello ꘌ "Hello, World!"
    

    Hint: the equality sign is not an equality sign but the character U+A60C (Vai syllable lengthener). Few people would type this symbol instead of the equality sign; the more frequent case is text pasted from Microsoft Word (look closely at the quotation marks):

    say-hello = “Hello, World!”
    
  • etc.
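
For the Unicode questions in particular, one possible approach (a sketch, not a complete policy) is to report every non-ASCII character with its code point, category and name, so that a parser can reject the line or at least produce a readable error. The function name `non_ascii_report` is made up; `unicodedata` is part of the Python standard library.

```python
import unicodedata


def non_ascii_report(line):
    """List (index, code point, category, name) for every non-ASCII character."""
    return [
        (index, f"U+{ord(char):04X}", unicodedata.category(char),
         unicodedata.name(char, "<unnamed>"))
        for index, char in enumerate(line)
        if ord(char) > 127
    ]


# The Word-style quotation marks from the example above are reported as Pi and Pf.
for entry in non_ascii_report("say-hello = \u201cHello, World!\u201d"):
    print(entry)
```

A real parser would probably apply such a check only to the key and the separator, since non-ASCII text inside a quoted value can be perfectly legitimate.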

This is why, unless you are completely sure that you can define a format and describe it precisely and exhaustively, you should use a format which already exists.

JSON and XML are commonly used formats which you can use in nearly every programming language. You may even abstract the underlying format by using a database. Redis, for instance, is a popular key-value store.
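
As a small illustration of what an existing format buys you, here is the path example expressed as JSON and read with Python's standard `json` module. The keys are taken from the examples above; the variable names are arbitrary. Whitespace around the colon is insignificant, whitespace inside the quotes is preserved, and the price is that every string must be quoted and backslashes must be doubled.

```python
import json

# A raw string, so the JSON text really contains doubled backslashes.
text = r'''
{
    "apppath": "C:\\Program Files\\App",
    "say-hello": "    Indentation is sweet."
}
'''

config = json.loads(text)
print(config["apppath"])
print(repr(config["say-hello"]))
```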


¹ Chrome users on Windows would probably see a question mark in a square. With other browsers, or with Chrome on Linux, the character looks like an equality sign and can easily be misleading: the only visual difference is a slightly different gap between the horizontal bars.

Arseni Mourzenko
  • Well, I will specify it to the user. I had a simple line per equality-separated key/value pair format in mind. That is, every line is split by `=` and quotes are not the issue whereas it is unclear with whitespaces because I prefer to have them and this creates ambiguity. Saying that leading/ending whitespaces are trimmed is enough. Thanks. – Val Feb 01 '15 at 13:29
  • @valtih: Especially trimming trailing whitespace is good, IMO. Most users would not realize that it would be there as it is essentially invisible. – Bart van Ingen Schenau Feb 01 '15 at 16:35
  • @mainma That oversized equals sign renders perfectly on my win7 box in everything I've tried that's unicode aware. – Dan Is Fiddling By Firelight Feb 01 '15 at 18:52
  • @DanNeely: that's weird. On Windows 8.1, Chrome shows a question mark in a square. – Arseni Mourzenko Feb 01 '15 at 20:18
  • @MainMa Chrome has always had issues with Unicode fallback on Windows. That's been a known issue pretty much for its entire existence (~5 years?); [here](https://code.google.com/p/chromium/issues/detail?id=42984) is (one of?) the bug for it, and it might as well be wontfix at this point. It works fine in Firefox and IE. – Bob Feb 01 '15 at 20:39
  • @Bob: good to know. I already had surprises with Unicode when reviewing in Windows my answers written originally in Linux, and I was always convinced that the problems were inherent to OS, not the browser. – Arseni Mourzenko Feb 01 '15 at 21:05