Why use an opaque "handle" that requires casting in a public API rather than a typesafe struct pointer?

Question

I'm evaluating a library whose public API currently looks like this:

libengine.h

/* Handle, used for all APIs */
typedef size_t enh;


/* Create new engine instance; result returned in handle */
int en_open(int mode, enh *handle);

/* Start an engine */
int en_start(enh handle);

/* Add a new hook to the engine; hook handle returned in h2 */
int en_add_hook(enh handle, int hooknum, enh *h2);

Note that enh is a generic handle, used as a handle to several different datatypes (engines and hooks).

Internally, most of these APIs of course cast the "handle" to an internal structure which they've malloc'd:

engine.c

struct engine
{
    // ... implementation details ...
};

int en_open(int mode, *enh handle)
{
    struct engine *en;

    en = malloc(sizeof(*en));
    if (!en)
        return -1;

    // ...initialization...

    *handle = (enh)en;
    return 0;
}

int en_start(enh handle)
{
    struct engine *en = (struct engine*)handle;

    return en->start(en);
}

Personally, I hate hiding things behind typedefs, especially when it compromises type safety. (Given an enh, how do I know to what it's actually referring?)

So I submitted a pull request, suggesting the following API change (after modifying the entire library to conform):

libengine.h

struct engine;           /* Forward declaration */
typedef size_t hook_h;    /* Still a handle, for other reasons */


/* Create new engine instance, result returned in en */
int en_open(int mode, struct engine **en);

/* Start an engine */
int en_start(struct engine *en);

/* Add a new hook to the engine; hook handle returned in hh */
int en_add_hook(struct engine *en, int hooknum, hook_h *hh);

Of course, this makes the internal API implementations look a lot better, eliminating casts, and maintaining type safety to/from the consumer's perspective.

libengine.c

struct engine
{
    // ... implementation details ...
};

int en_open(int mode, struct engine **en)
{
    struct engine *_e;

    _e = malloc(sizeof(*_e));
    if (!_e)
        return -1;

    // ...initialization...

    *en = _e;
    return 0;
}

int en_start(struct engine *en)
{
    return en->start(en);
}

I prefer this for the following reasons:

Added type safety
Improved clarity of types and their purpose
Removed casts and typedefs
It follows the recommended pattern for opaque types in C

However, the owner of the project balked at the pull request (paraphrased):

Personally I don't like the idea of exposing the struct engine. I still think the current way is cleaner & more friendly.

Initially I used another data type for hook handle, but then decided to switch to use enh, so all kind of handles share the same data type to keep it simple. If this is confusing, we can certainly use another data type.

Let's see what others think about this PR.

This library is currently in a private beta stage, so there isn't much consumer code to worry about (yet). Also, I've obfuscated the names a bit.

How is an opaque handle better than a named, opaque struct?

_{Note: I asked this question at Code Review, where it was closed.}

I've edited the title to something that I believe more clearly expresses the core of your question. Feel free to revert if I've misinterpreted it. — Ixrec, Aug 28 '15 at 00:03
@Ixrec That is better, thank you. After writing up the whole question, I ran out of mental capacity to come up with a good title. — Jonathon Reinhart, Aug 28 '15 at 00:32

score 34 · Answer 1 · answered Aug 28 '15 at 00:54

The "simple is better" mantra has become too much dogmatic. Simple is not always better if it complicates other things. Assembly is simple - each command is much simpler than higher-level languages commands - and yet Assembly programs are more complex than higher-level languages that do the same thing. In your case, the uniform handle type enh makes the types simpler at the cost of making the functions complex. Since usually project's types tend to grow in sub-linear rate compared to it's functions, as the project gets bigger you usually prefer more complex types if it can make the functions simpler - so in this regard your approach seems to be the correct one.

The project's author is concerned that your approach is "exposing the struct engine". I would have explained to them that it's not exposing the struct itself - only the fact that there is a struct named engine. The user of the library already needs to be aware of that type - they need to know, for example, that the first argument of en_add_hook is of that type, and the first argument is of a different type. So it actually makes the API more complex, because instead of having the function's "signature" document these types it needs to be documented somewhere else, and because the compiler can no longer verify the types for the programmer.

One thing that should be noted - your new API makes user code a bit more complex, since instead of writing:

enh en;
en_open(ENGINE_MODE_1, &en);

They now need more complex syntax to declare their handle:

struct engine* en;
en_open(ENGINE_MODE_1, &en);

The solution, however, is quite simple:

struct _engine;
typedef struct _engine* engine

and now you can straightforwardly write:

engine en;
en_open(ENGINE_MODE_1, &en);

I forgot to mention that the library claims to follow the [Linux Coding Style](https://www.kernel.org/doc/Documentation/CodingStyle), which happens to be what I follow as well. In there, you'll see that typedefing structures just to avoid writing `struct` is expressly discouraged. — Jonathon Reinhart, Aug 28 '15 at 08:44
@JonathonReinhart he's typedefing the *pointer to struct* not the struct itself. — ratchet freak, Aug 28 '15 at 08:52
@JonathonReinhart and actually reading that link I see that for "totally opaque objects" it is allowed. (chapter 5 rule a) — ratchet freak, Aug 28 '15 at 09:12
Yes, but only in the exceptionally rare cases. I honestly believe that was added to avoid rewriting all of the mm code to deal with the pte typedefs. Look at the spin lock code. It is completely arch specific (no common data) but they never use a typedef. — Jonathon Reinhart, Aug 28 '15 at 09:44
I would prefer `typedef struct engine engine;` and using `engine*`: One less name introduced, and that makes it obvious it's a handle like `FILE*`. — Deduplicator, Aug 28 '15 at 11:38

score 16 · Answer 2 · answered Aug 28 '15 at 08:46

16

There seems to be a confusion on both sides here:

using a handle approach does not requiring using a single handle type for all handles
exposing the struct name does not expose its details (only its existence)

There are advantages to using handles rather than bare pointers, in a language like C, because handing over the pointer allows direct manipulation of the pointee (including calls to free) whereas handing over a handle requires that the client go through the API to perform any action.

However, the approach of having a single type of handle, defined via a typedef is not type safe, and may cause lots of griefs.

My personal suggestion would thus be to move toward type safe handles, which I think would satisfy the both of you. This is accomplished rather simply:

typedef struct {
    size_t id;
} enh;

typedef struct {
    size_t id;
} oth;

Now, one cannot accidentally pass 2 as a handle nor can one accidentally pass a handle to a broomstick where a handle to the engine is expected.

So I submitted a pull request, suggesting the following API change (after modifying the entire library to conform)

That is your mistake: before engaging in significant work on an open source library, contact the author(s)/maintainer(s) to discuss the change upfront. It will allow the both of you to agree on what to do (or not do), and avoid unnecessary work and the frustration that results from it.

answered Aug 28 '15 at 08:46

Matthieu M.

14,567
4
44
65

1

Thank you. You didn't go into what to *do* with the handles though. I've implemented an actual *handle* based API, where pointers are never exposed, even if through a typedef. It involved an ~expensive lookup of the data at the entry of every API call - much like the way Linux looks up the `struct file` from an `int fd`. This is certainly overkill for a user-mode library IMO. – Jonathon Reinhart Aug 28 '15 at 09:41
@JonathonReinhart: Well, since the library *already* provides handles, I did not feel the need to expand. Indeed there are multiple approaches, from simply converting the pointer to the integer to having a "pool" and using the IDs as keys. You can even *switch* approach between Debug (ID + lookup, for validation) and Release (just converted pointer, for speed). – Matthieu M. Aug 28 '15 at 09:53
Reuse of integer table index will actually suffer from the [ABA problem](https://en.wikipedia.org/wiki/ABA_problem), where an object (index `3`) is freed, then a new object is created and is unfortunately being assigned index `3` again. Simply put, it is difficult to have safe object lifetime mechanism in C unless reference counting (along with conventions on shared ownerships to objects) is made into an explicit part of the API design. – rwong Aug 28 '15 at 11:25
2

@rwong: It's only an issue in naive scheme; you can easily integrate an epoch counter, for example, so that when an old handle is specified, you'll get an epoch mismatch. – Matthieu M. Aug 28 '15 at 12:43
Also, re: *"using a handle approach does not requiring using a single handle type for all handles"* Note that `typedef`s alone provide no additional type safety. So your Release mode cast-to-`uintptr_t` has the same, original problem. https://gist.github.com/JonathonReinhart/e540ba61ae1a328a66a6 – Jonathon Reinhart Aug 28 '15 at 18:33
@JonathonReinhart: Note that I created two different `struct`, and specifically avoided plain `typedef`. – Matthieu M. Aug 28 '15 at 19:25
Yes, sorry I was focusing on your comment, and forgot that you'd put the `size_t` in the `struct`. It's still a good note for someone considering just a plain `typedef size_t handle`. – Jonathon Reinhart Aug 28 '15 at 20:15
1

@JonathonReinhart suggestion: you can mention "strict aliasing rule" in your question to help steer the discussion toward the more important aspects. – rwong Aug 28 '15 at 20:35
@rwong I've asked a [separate question about that](http://stackoverflow.com/q/32279720/119527). Thanks for the suggestion. – Jonathon Reinhart Aug 28 '15 at 21:02

score 3 · Answer 3 · 2016-01-14T16:53:16.260

Déjà vu

How is an opaque handle better than a named, opaque struct?

I encountered the exact same scenario, only with some subtle differences. We had, in our SDK, a lot of things like this:

typedef void* SomeHandle;

My mere proposal was to make it match our internal types:

typedef struct SomeVertex* SomeHandle;

To third parties using the SDK, it should make no difference whatsoever. It's an opaque type. Who cares? It has no effect on ABI* or source compatibility, and using new releases of the SDK requires the plugin to be recompiled anyway.

_{* Note that, as gnasher points out, there can actually be cases where the size of something like pointer to struct and void* may actually be a different size in which case it would affect ABI. Like him, I've never encountered it in practice. But from that standpoint, the second could actually improve portability in some obscure context, so that's another reason to favor the second, albeit probably moot for most people.}

Third Party Bugs

Furthermore, I had even more cause than type safety for internal development/debugging. We already had a number of plugin developers who had bugs in their code because two similar handles (Panel andPanelNew, i.e.) both used a void* typedef for their handles, and they were accidentally passing in the wrong handles to the wrong places as a result of just using void* for everything. So it was actually causing bugs on the side of those using the SDK. Their bugs also cost the internal development team enormous time, since they would send bug reports complaining about bugs in our SDK, and we'd have to debug the plugin and find that it was actually caused by a bug in the plugin passing the wrong handles to the wrong places (which is easily allowed without even a warning when every handle is an alias for void* or size_t). So we were needlessly wasting our time providing a debugging service for third parties because of mistakes caused on their part by our desire for conceptual purity in hiding away all internal information, even the mere names of our internal structs.

Keeping the Typedef

The difference is that I was proposing that we do stick to the typedef still, not to have clients write struct SomeVertex which would affect source compatibility for future plugin releases. While I personally like the idea of not typedefing away struct in C, from an SDK perspective, the typedef can help since the whole point is opacity. So there I'd suggest relaxing this standard just for the publicly-exposed API. To clients using the SDK, it should not matter whether a handle is a pointer to a struct, an integer, etc. The only thing that matters to them is that two different handles do not alias the same data type so that they don't incorrectly pass in the wrong handle to the wrong place.

Type Information

Where it matters most to avoid casting is to you, the internal devs. This kind of aesthetic of hiding away all internal names from the SDK is some conceptual aesthetic that comes at the significant cost of losing all type information, and requiring us to needlessly sprinkle casts into our debuggers to get at critical information. While a C programmer should largely be accustomed to this in C, to require this needlessly is just asking for trouble.

Conceptual Ideals

In general you want to watch out for those types of developers who put some conceptual idea of purity far above all practical, daily needs. Those will drive the maintainability of your codebase to the ground in seeking some Utopian ideal, making the whole team avoid suntan lotion in a desert out of fear that it's unnatural and may cause a vitamin D deficiency while half the crew are dying from skin cancer.

User-End Preference

Even from the strict user-end standpoint of those using the API, would they prefer a buggy API or an API that works well but exposes some name they could hardly care about in exchange? Because that's the practical trade-off. Losing type information needlessly outside of a generic context is increasing the risk of bugs, and from a large-scale codebase in a team-wide setting over a number of years, Murphy's law tends to be quite applicable. If you superfluously increase the risk of bugs, chances are you'll at least get some more bugs. It doesn't take too long in a large team setting to find that every kind of human mistake imaginable will eventually go from a potential into a reality.

So perhaps that's a question to pose to users. "Would you prefer a buggier SDK or one that exposes some internal opaque names that you'll never even care about?" And if that question seems to put forth a false dichotomy, I'd say more team-wide experience in a very large-scale setting is needed to appreciate the fact that a higher risk for bugs will ultimately manifest real bugs in the long term. It matters little how much the developer is confident in avoiding bugs. In a team-wide setting, it helps more to think about the weakest links, and at least the easiest and quickest ways to prevent them from tripping.

Proposal

So I'd suggest a compromise here which will still give you the ability to retain all the debugging benefits:

typedef struct engine* enh;

... even at the cost of typedefing away struct, will that really kill us? Probably not, so I recommend some pragmatism on your part as well, but more so to the developer who would prefer to make debugging exponentially harder by using size_t here and casting to/from integer for no good reason except to further hide information which is already 99% hidden to the user and cannot possibly do any more harm than size_t.

It's a _tiny_ difference: According to the C Standard, all "pointer to struct" have identical representation, so do all "pointer to union", so do "void*" and "char*", but a void* and a "pointer to struct" can have different sizeof () and/or different representation. In practice, I've never seen this. — gnasher729, Jan 14 '16 at 16:44
@gnasher729 Same, maybe I should qualify that part with respect to the potential loss of portability in casting to `void*` or `size_t` and back as another reason to avoid casting superfluously. I kinda omitted it since I've likewise never seen it in practice given the platforms we targeted (which were always desktop platforms: linux, OSX, Windows). — , Jan 14 '16 at 16:47
We ended up with [`typedef struct uc_struct uc_engine;`](https://github.com/unicorn-engine/unicorn/blob/d0125eb8bf0a8c4757c93cc1482c2ba5d5fa3a70/include/unicorn/unicorn.h#L23) — Jonathon Reinhart, Jan 14 '16 at 19:20

score 2 · Answer 4 · answered Aug 28 '15 at 01:24

I suspect the real reason is inertia, that's what they've always done and it works so why change it?

The main reason I can see is that the opaque handle lets the designer put anything at all behind it, not just a struct. If the API returns and accepts multiple opaque types they all look the same to the caller and there's never any compilation problems or recompiles needed if the fine print changes. If en_NewFlidgetTwiddler(handle **newTwiddler) changes to return a pointer to the Twiddler instead of a handle, the API doesn't change and any new code will silently be using a pointer where before it was using a handle. As well, there's no danger of the OS or anything else quietly "fixing" the pointer if it gets passes across boundaries.

The disadvantage of that, of course, is that the caller can feed anything at all into it. You have a 64 bit thing? Shove it into the 64 bit slot in the API call and see what happens.

en_TwiddleFlidget(engine, twiddler, flidget)
en_TwiddleFlidget(engine, flidget, twiddler)

Both compile but I bet only one of them does what you want.

score 2 · Answer 5 · answered Sep 02 '15 at 06:58

2

Here are a situation where opaque handle is needed;

struct SimpleEngine {
    int type;  // always SimpleEngine.type = 1
    int a;
};

struct ComplexEngine {
    int type;  // always ComplexEngine.type = 2
    int a, b, c;
};

int en_start(enh handle) {
    switch(*(int*)handle) {
    case 1:
        // treat handle as SimpleEngine
        return start_simple_engine(handle);
    case 2:
        // treat handle as ComplexEngine
        return start_complex_engine(handle);
    }
}

When the library has two or more struct types that have a same header part of fields, like "type" in the above, these struct types can be considered having a common parent struct (like a base class in C++).

You can define the header part as "struct engine", like this;

struct engine {
    int type;
};

struct SimpleEngine {
    struct engine base;
    int a;
};

struct ComplexEngine {
    struct engine base;
    int a, b, c;
};

int en_start(struct engine *en) { ... }

But it's a optional decision because type-casts are needed regardless of using struct engine.

Conclusion

In some case, there are reasons why opaque handles are used instead of opaque named structs.

answered Sep 02 '15 at 06:58

Akio Takahashi

31
1

1

I think using a union makes this safer instead of dangerous casts to fields that could get moved. Check out [this gist](https://gist.github.com/JonathonReinhart/643e463d754eb7bedb2c) I put together showing a complete example. – Jonathon Reinhart Sep 02 '15 at 16:32
1

But actually, avoiding the `switch` in the first place, by using "virtual functions" is probably ideal, and solves the whole issue. – Jonathon Reinhart Sep 02 '15 at 16:38
Your design at gist is more complex than I suggested. Surely, It makes casting less, type-safe and smart, but introduces more code and types. In my opinion, it seems to become too tricky to get type-safe. I, and maybe the library author decide to follow [KISS](https://en.wikipedia.org/wiki/KISS_principle) rather than type-safety. – Akio Takahashi Sep 02 '15 at 17:20
Well if you want to keep it really simple, you could completely omit error-checking too! – Jonathon Reinhart Sep 02 '15 at 22:19
In my opinion, design simplicity is preferred than some quantity of error-checkings. In this case, such error-checkings exist only in API functions. Furthermore, you can remove type-casts by using union, but remember that union is naturally type-unsafe. – Akio Takahashi Sep 03 '15 at 00:17

score 0 · Answer 6 · answered Aug 28 '15 at 01:25

0

I believe that the attitude stems from a long-time philosophy to defend a C library API from abuse by beginners.

In particular,

Library authors know it's a pointer to the struct, and the struct's details is visible to library code.
All experienced programmers who use the library also knows it's a pointer to some opaque structs;
- They had enough hard painful experience to know not to mess with the bytes stored in those structs.
Inexperienced programmers know neither.
- They will try to memcpy the opaque data or increment the bytes or words inside the struct. Go hacking.

The long-time traditional countermeasure is to:

Masquerade the fact that an opaque handle is actually a pointer to an opaque struct that exists in the same process-memory-space.
- To do so by claiming it is an integer value having same number of bits as a void*
- To be extra-circumspect, masquerade the pointer's bits as well, e.g.
  struct engine* peng = (struct engine*)((size_t)enh ^ enh_magic_number);

This is just to say it has long traditions; I had no personal opinion on whether it's right or wrong.

answered Aug 28 '15 at 01:25

rwong

16,695
3
33
81

3

Except for the ridiculous xor, my solution provides that safety as well. The client remains unaware of the structure's size or contents, with the added benefit of type safety. I don't see using how abusing a size_t to hold a pointer is any better. – Jonathon Reinhart Aug 28 '15 at 02:45
@JonathonReinhart it's extremely unlikely that the client will actually be unaware of the structure. The question is more: can they get the structure and can they return a modified version to your library. Not just with open source, but more generally. The solution is modern memory partitioning, not the silly XOR. – Móż Aug 28 '15 at 03:29
What are you talking about? All I'm saying is that you can't compile any code that attempts to dereference a pointer to said structure, or do anything that requires knowledge of its size. Sure, you could memset(,0,) over the entire process' heap, if you'd really like to. – Jonathon Reinhart Aug 28 '15 at 08:40
8

This argument sounds a lot like [guarding against Machiavelli](http://www.gotw.ca/gotw/076.htm). If the user _wants_ to pass garbage to my API, there's no way that I can stop them. Introducing a type-unsafe interface like this hardly helps with that, as it actually makes accidental misuse of the API easier. – ComicSansMS Aug 28 '15 at 08:43
@ComicSansMS thank you for mentioning "accidental", as that's what I'm *really* trying to prevent here. – Jonathon Reinhart Aug 28 '15 at 09:48

Why use an opaque "handle" that requires casting in a public API rather than a typesafe struct pointer?

6 Answers6

Conclusion