0

Is there a reason why Python developers (both maintainers of Python itself and authors of modules) tend to pass special strings as arguments to functions instead of defining symbols for the same purpose? Rather than make other developers look up or guess the special values allowed for some arguments, why not define symbols to represent them?

The simplest example of this I can think of in base Python is the open() function. To specify the mode in which a file is opened, the mode argument can take one of several string values. For example, to open a file for writing:

fileObject = open('/tmp/tarfu.txt', 'w')

Why not define symbols to represent the values allowed for mode? That is, constants, enumerations, class members, or something similar.

A simple definition could be added to the same module that defines open() (io, I believe):

OPEN_MODE_WRITE = 'w'

Then it could be used with open():

fileObject = open('/tmp/tarfu.txt', OPEN_MODE_WRITE)

That would be similar to other languages, like PHP, Java, C, and others. Although the corresponding file open functions of those other languages may not be examples of the ideal, those languages have more examples of it than Python. They all commonly define symbols for such purposes. Using them helps make getting the right value easier, especially if the developer's editor has features to autocomplete or suggest symbols. If the symbols are of a specific type (as in enumerations or class members), then the types of the arguments can be checked, too.

By using strings for these special values, mistakes in code may only be detected at runtime, which may be difficult to trace back to the offending code. Using symbols would also make code easier to understand, which I believe would make it more Pythonic.

Everything I've written here in reference to strings applies to numbers as well. I was inspired to write this question after using pandas for a while. Several of its functions allow the developer to specify whether an operation applies to rows or columns of a DataFrame object by supplying the numbers 0 and 1 or the strings rows and columns for the axis argument. I'd rather it have symbols like AXIS_ROWS and AXIS_COLUMNS defined for this purpose.

  • 2
    I hope the Software Engineering site is the right place for a question like this. I know it'd be shot down on Stack Overflow because it's not a question about solving a specific problem. – Mr. Lance E Sloan Jul 19 '18 at 15:51
  • Unless there is a definitive documented Python convention from some authoritative group on the Python language, I fear this question will be closed as opinion-based. – Greg Burghardt Jul 19 '18 at 16:09
  • 1
    Very related (if not a duplicate): [Eliminating Magic Numbers: When is it time to say “No”?](https://softwareengineering.stackexchange.com/questions/56375/eliminating-magic-numbers-when-is-it-time-to-say-no) – Greg Burghardt Jul 19 '18 at 16:29
  • Sometimes there *are* symbols, e.g. [logging levels](https://docs.python.org/3/library/logging.html#levels) and flags [in `re`](https://docs.python.org/3/library/re.html). – jonrsharpe Jul 19 '18 at 16:56
  • @GregBurghardt, thanks for the link. I _think_ my question is not opinion-based. At least, it doesn't seem like it's my opinion if many other languages already do it. That's not to say that sometimes other languages don't use symbols and sometimes Python does. I'm trying to find out whether there's an advantage to using special string and numeric values over defined symbols. – Mr. Lance E Sloan Jul 19 '18 at 18:09
  • @jonrsharpe, thanks for examples of the exceptions. There are some in Python, but they don't seem to be common. – Mr. Lance E Sloan Jul 19 '18 at 18:10
  • 1
    Possible duplicate of [Eliminating Magic Numbers: When is it time to say "No"?](https://softwareengineering.stackexchange.com/questions/56375/eliminating-magic-numbers-when-is-it-time-to-say-no) – Peter K. Jul 20 '18 at 11:48

2 Answers2

3

This Python code:

fileObject = open('/tmp/tarfu.txt', 'w')

is nearly identical to this C function call:

file = fopen('/tmp/tarfu.txt', 'w');

So they apparently did it this way for historical reasons.

The mode switch is a single character. It's pretty hard to screw that up, and it's fully documented in the Python literature.

See Also
C library function - fopen()

Robert Harvey
  • 198,589
  • 55
  • 464
  • 673
  • The file opening function in other languages may not be examples of the ideal. However, many other C functions do use symbols this way, like the `mode` argument of its `chmod()` function. It requires values like `lS_IRUSR`, `S_IWUSR`, `S_IXUSR`, etc. – Mr. Lance E Sloan Jul 19 '18 at 18:03
2

In statically typed languages like C++ you can use enum to create a type which is restricted to a range of values. For example,

enum OpenMode {READ_ONLY, READ_WRITE};

creates a type that could be used as argument for some open function:

open(string filename, OpenMode mode);

Doing so does not only improve the readabilty: It is now impossible to pass anything other than an OpenMode object to open. For example, calling open("file.txt", "w") produces a compiler error.

Something like this cannot be done in Python, which is dynamically or duck typed. There are no enums (at least before version 3.4) and there is no compiler. Adding a constant OPEN_MODE_READ_ONLY "only" improves the readability. However, after adding such a constant, there are two ways to call open. You may now use

open("file.txt", OPEN_MODE_READ_ONLY)

but the old way

open("file.txt", "w")

is still valid, because OPEN_MODE_READ_ONLY and "w" have the same value. One can now argue that this is not pythonic it all, because There should be one-- and preferably only one --obvious way to do it.

Luckily, the Python language and community embraces useful changes and does not dismiss them just because "we always did it this way". Python 3.4 introduced the enum module and maybe the habit of magic string arguments disappears as soon as the enum module is established and accepted by a majority of people.

pschill
  • 1,807
  • 7
  • 13
  • Despite the “one and only one” mantra, Python has always done it both ways. Many modules are based on strings, but e.g. `re` and `socket` use named numeric constants. – amon Jul 20 '18 at 19:38