12

What are some flaws that drive you nuts in C APIs (including standard libraries, third-party libraries, and headers inside a project)? The goal is to identify API design pitfalls in C, so that people writing new C libraries can learn from the mistakes of the past.

Explain why the flaw is bad (preferably with an example), and try to suggest an improvement. Although your solution might not be practical in real life (it's too late to fix strncpy), it should give a heads-up to future library writers.

Although the focus of this question is C APIs, problems that affect your ability to use them in other languages are welcome.

Please give one flaw per answer, so democracy can sort the answers.

Joey Adams
  • Joey, this question is verging on being not constructive by asking to build up a list of things people hate. There's potential here for the question to be useful if the answers explain *why* the practices they're pointing out are bad and provide detailed information on how to improve them. To that end, please move your example from the question into an answer of its own and explain why it's a problem/how a `malloc`'d string would fix it. I think setting a good example with the first answer could really help this question thrive. Thanks! – Adam Lear Aug 13 '11 at 05:27
  • @Anna Lear: Thanks for telling me *why* my question was problematic. I was trying to keep it constructive by asking for an example and suggested alternative. I guess I really needed some examples to indicate what I had in mind. – Joey Adams Aug 13 '11 at 06:12
  • @Joey Adams Look at it this way. You are asking a question that is supposed to "automatically" solve C API issues in a general way. Where sites like StackOverflow were designed to work such that the more common issues with programming are easily found AND answered. StackOverflow will naturally result in a list of answers for your question but in a more structured easily searchable way. – Andrew T Finnell Aug 13 '11 at 20:24
  • I voted to close my own question. My goal was to have a collection of answers that could serve as a checklist against new C libraries. The three answers so far all use words like "inconsistent", "illogical", or "confusing". One can't objectively determine whether or not an API violates any of these answers. – Joey Adams Aug 17 '11 at 01:41

5 Answers

5

Functions with inconsistent or illogical return values. Two good examples:

1) Some Windows functions that return a HANDLE use NULL/0 for an error (CreateThread), while others use INVALID_HANDLE_VALUE/-1 for an error (CreateFile).

2) The POSIX 'time' function returns '(time_t)-1' on error, which is really illogical since 'time_t' can be either a signed or unsigned type.
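One common remedy is to return only a status code and hand the result back through a pointer, so no value of the result type has to double as an error marker. A minimal sketch follows; `get_time` is a hypothetical name, not an existing function, and because it merely wraps `time` it still inherits that function's ambiguity internally. An API designed this way from the start would not need the sentinel at all.

```c
#include <time.h>

/* Hypothetical wrapper, not part of any standard library.
 * Returns 0 on success and -1 on failure; the result travels
 * through the out-parameter, so every time_t value stays valid. */
int get_time(time_t *out)
{
    time_t now = time(NULL);

    if (now == (time_t)-1)
        return -1;      /* failure: *out is left untouched */

    *out = now;
    return 0;
}
```

The caller then checks the status the same way for every function in the library: `if (get_time(&t) != 0) { /* handle error */ }`.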

David Schwartz
  • Actually, time_t is (usually) signed. However, calling December 31, 1969 "invalid" is rather illogical. I guess the 60s were rough :-) In seriousness, a solution would be to return an error code, and pass the result through a pointer, as in: `int time(time_t *out);` and `BOOL CreateFile(LPCTSTR lpFileName, ..., HANDLE *out);` . – Joey Adams Aug 13 '11 at 06:48
  • Exactly. It's weird if time_t is unsigned, and if time_t is signed, it makes *one* time invalid in the middle of an ocean of valid ones. – David Schwartz Aug 13 '11 at 06:58
4

Functions or parameters with non-descriptive or affirmatively confusing names. For example:

1) CreateFile, in the Windows API, doesn't actually create a file; it creates a file handle. It can create a file, just like 'open' can, if asked to through a parameter. This parameter has values called 'CREATE_ALWAYS' and 'CREATE_NEW' whose names don't even hint at their semantics. (Does 'CREATE_ALWAYS' mean it fails if the file exists? Or does it create a new file on top of it? Does 'CREATE_NEW' mean it always creates a new file and fails if the file already exists? Or does it create a new file on top of it?)

2) pthread_cond_wait in the POSIX pthreads API, which despite its name, is an unconditional wait.
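For the first example, names that state the outcome would remove the guesswork. The sketch below is only an illustration on top of POSIX `open`: the enum, its constants, and `open_for_writing` are all hypothetical names, and the mapping to the Windows flags is approximate (CREATE_NEW fails if the file exists; CREATE_ALWAYS truncates an existing file or creates a new one).

```c
#include <fcntl.h>

/* Hypothetical names, not part of any real API. Each constant
 * says what happens when the file is already there. */
typedef enum {
    OPEN_FAIL_IF_EXISTS,     /* roughly what CREATE_NEW does    */
    OPEN_TRUNCATE_OR_CREATE  /* roughly what CREATE_ALWAYS does */
} open_disposition;

/* Opens `path` for writing; returns a file descriptor or -1. */
int open_for_writing(const char *path, open_disposition disposition)
{
    int flags = O_WRONLY | O_CREAT;

    /* O_EXCL makes open() fail if the file exists;
     * O_TRUNC empties an existing file instead. */
    flags |= (disposition == OPEN_FAIL_IF_EXISTS) ? O_EXCL : O_TRUNC;

    return open(path, flags, 0644);
}
```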

David Schwartz
  • The *cond* in `pthread_cond_wait` doesn't mean "conditionally wait". It refers to the fact that you're waiting on a [**condition variable**](https://en.wikipedia.org/wiki/Monitor_(synchronization)#Condition_variables_2). – Jonathon Reinhart Aug 27 '15 at 23:45
  • Right, it's an unconditional wait *for* a condition. – David Schwartz Aug 27 '15 at 23:49
4

Opaque types that are passed through the interface as type-erased handles. The problem is, of course, that the compiler can't check the user code for correct argument types.

This comes in various forms and flavors, including, but not limited to:

  • void* abuse

  • using int as a resource handle (example: the CDI library)

  • stringly typed arguments

The more distinct types (= types that cannot be used fully interchangeably) are mapped to the same type-erased type, the worse it gets. Of course, the remedy is simply to provide typesafe opaque pointers along the lines of (C example):

typedef struct Foo Foo;
typedef struct Bar Bar;

Foo* createFoo(void);
Bar* createBar(void);

int doSomething(Foo* foo);
void somethingElse(Foo* foo, Bar* bar);

void destroyFoo(Foo* foo);
void destroyBar(Bar* bar);
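
To show how the two halves might fit together, here is a minimal single-file sketch. The struct members (`counter`, `weight`) and the function bodies are assumptions made up for illustration; in a real library the struct definitions would live only in the implementation file, so callers could never look inside.

```c
#include <stdlib.h>

/* --- What the public header would expose: names only. --- */
typedef struct Foo Foo;
typedef struct Bar Bar;

Foo *createFoo(void);
Bar *createBar(void);
int  doSomething(Foo *foo);
void destroyFoo(Foo *foo);
void destroyBar(Bar *bar);

/* --- What the implementation file would contain. --- */
/* Hypothetical layouts; callers never see these. */
struct Foo { int counter; };
struct Bar { double weight; };

Foo *createFoo(void) { return calloc(1, sizeof(Foo)); }
Bar *createBar(void) { return calloc(1, sizeof(Bar)); }

/* Increments and returns the Foo's internal counter. */
int doSomething(Foo *foo) { return ++foo->counter; }

void destroyFoo(Foo *foo) { free(foo); }
void destroyBar(Bar *bar) { free(bar); }
```

With this in place, a call such as `doSomething(bar)` with a `Bar*` argument is an incompatible-pointer-type error that the compiler has to diagnose, whereas with `void*` or `int` handles it would compile silently.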
2

Functions with inconsistent and often cumbersome string returning conventions.

For example, getcwd asks for a user-supplied buffer and its size. This means an application either has to set an arbitrary limit on the current directory length, or do something like this (from CCAN):

/* *This* is why people hate C. */
len = 32;
cwd = talloc_array(ctx, char, len);
while (!getcwd(cwd, len)) {
    if (errno != ERANGE) {
        talloc_free(cwd);
        return NULL;
    }
    cwd = talloc_realloc(ctx, cwd, char, len *= 2);
}

My solution: return a malloc'd string. It's simple, robust, and no less efficient. Except on embedded platforms and older systems, malloc is actually quite fast.
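
As a rough sketch of what such an interface could look like, the retry loop can be written once and hidden behind a function that hands back an allocated string. `getcwd_alloc` is a hypothetical name, and the starting size of 64 bytes is an arbitrary choice.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns the current directory in a
 * malloc'd buffer that the caller must free(), or NULL on error. */
char *getcwd_alloc(void)
{
    size_t len = 64;            /* arbitrary starting size */
    char *buf = malloc(len);

    while (buf != NULL && getcwd(buf, len) == NULL) {
        if (errno != ERANGE) {  /* real failure, not "buffer too small" */
            free(buf);
            return NULL;
        }
        len *= 2;
        char *bigger = realloc(buf, len);
        if (bigger == NULL)
            free(buf);          /* realloc failed: release the old block */
        buf = bigger;           /* NULL here ends the loop */
    }
    return buf;
}
```

The caller's side shrinks to `char *cwd = getcwd_alloc();` followed by a NULL check and a later `free(cwd)`, with no length limit to choose.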

Joey Adams
  • I would not call this bad practice, I would call this good practice. 1) It is so utterly common that no programmer should be surprised by it. 2) It leaves the allocation to the caller, which excludes numerous possibilities of memory leak bugs. 3) It is compatible with statically allocated buffers. 4) It makes the function implementation cleaner, a function calculating some mathematical formula shouldn't be concerned with something entirely unrelated such as dynamic memory allocation. You think main gets cleaner but the function gets messier. 5) malloc isn't even allowed on many systems. –  Aug 17 '11 at 13:30
  • @Lundin: The problem is, it leads to programmers creating unnecessary hard-coded limits, and they have to try really hard not to (see the example above). It's fine for things like `snprintf(buf, 32, "%d", n)`, where the output length is predictable (certainly not more than 30, unless `int` is *really* huge on your system). Indeed, malloc isn't available on many systems, but for desktop and server environments, it is, and it works really well. – Joey Adams Aug 17 '11 at 14:45
  • But the problem is that the function in your example sets no hard-coded limits. Code like this is not common practice. Here, main knows things about the buffer length that the function should have known. It all suggests poor program design. Main doesn't seem to know what the getcwd function even does, so it is using some "brute force" allocation to find out. Somewhere the interface between the module in which getcwd resides and the caller is muddled. That doesn't mean that this way of calling functions is bad, on the contrary experience shows it is good for the reasons I already listed. –  Aug 18 '11 at 06:40
1

Functions that take/return compound data types by value, or that use callbacks.

Even worse if said type is a union or contains bit-fields.

From the perspective of a C caller, these are actually OK, but I do not write in C or C++ unless required to, so I am usually calling via an FFI. Most FFIs do not support unions or bit-fields, and some (such as Haskell and MLton) cannot support structs passed by value. For those that can handle by-value structs, at least Common Lisp and LuaJIT are forced onto slow paths -- Lisp's Common Foreign Function Interface must make a slow call via libffi, and LuaJIT refuses to JIT-compile the code path containing the call. Functions that may call back into the host also trigger slow paths on LuaJIT, Java, and Haskell, with LuaJIT not being able to compile such a call.
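
A library can stay friendly to such callers by offering a pointer-based entry point next to (or instead of) the by-value one. A small sketch, with `Vec2` and both function names invented for illustration:

```c
/* Hypothetical type and functions, for illustration only. */
typedef struct {
    double x;
    double y;
} Vec2;

/* By-value version: natural in C, but awkward or slow for many FFIs. */
Vec2 vec2_add(Vec2 a, Vec2 b)
{
    Vec2 sum = { a.x + b.x, a.y + b.y };
    return sum;
}

/* Pointer-based version: only pointers cross the boundary, so the
 * foreign caller just needs to allocate the structs somewhere. */
void vec2_add_ptr(const Vec2 *a, const Vec2 *b, Vec2 *out)
{
    out->x = a->x + b->x;
    out->y = a->y + b->y;
}
```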

Demi