34

One of my major complaints about C++ is how hard it is in practice to pass std library objects across dynamic library (i.e. dll/so) boundaries.

The std library is often header-only, which is great for doing some awesome optimizations. However, dlls are often built with different compiler settings that can change the internal structure/code of std library containers. For example, in MSVC one dll may be built with iterator debugging on while another is built with it off. These two dlls can run into issues passing std containers between them. If I expose std::string in my interface, I can't guarantee that the std::string code the client uses exactly matches my library's std::string.
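
To make the concern concrete, here is a minimal sketch (the library name, macro, and function are hypothetical) of the kind of interface that causes the trouble described above:

```cpp
// logger.h -- hypothetical public header shipped alongside logger.dll
#pragma once
#include <string>

#ifdef LOGGER_EXPORTS
#  define LOGGER_API __declspec(dllexport)
#else
#  define LOGGER_API __declspec(dllimport)
#endif

// The std::string is constructed with the client's compiler settings but
// read and destroyed with the DLL's. If the two builds disagree on, say,
// _ITERATOR_DEBUG_LEVEL, the object layouts differ and this call can
// corrupt memory or crash.
LOGGER_API void log_message(const std::string& text);
```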

This leads to hard-to-debug problems, headaches, etc. You either rigidly control the compiler settings in your organization to prevent these issues, or you use a simpler C interface that won't have these problems, or you specify the compiler settings your clients are expected to use (which sucks if another library specifies different settings).
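
For comparison, a sketch of the C-style boundary mentioned above (same hypothetical library). Only plain types cross the DLL, so each side can use whatever std library and compiler settings it likes internally:

```cpp
// logger_c.h -- hypothetical C interface for the same library
#pragma once

#ifdef LOGGER_EXPORTS
#  define LOGGER_API __declspec(dllexport)
#else
#  define LOGGER_API __declspec(dllimport)
#endif

#ifdef __cplusplus
extern "C" {
#endif

// Only built-in types cross the boundary; std::string stays on each side.
LOGGER_API void log_message(const char* text);

#ifdef __cplusplus
}
#endif
```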

My question is: did C++11 try to do anything to solve these issues?

Doug T.
  • 11,642
  • 5
  • 43
  • 69
  • 3
    I don't know the answer to your question, but I can say that your concerns are shared; they're a key reason why I won't use C++ in my projects, as we value ABI stability over squeezing out every last cycle of potential efficiency. – Donal Fellows Nov 21 '12 at 14:31
  • 2
    Please distinguish. It's hard between `DLL`s. Between `SO`s it always worked just fine. – Jan Hudec Nov 21 '12 at 14:54
  • 1
    Strictly speaking, this is not a C++ only problem. It is possible to have this problem with other languages. – MrFox Nov 21 '12 at 16:00
  • 2
    @JanHudec I can guarantee that between SOs it does not work nearly so magically as you seem to indicate. Given symbol visibility and how name mangling often works, you may be more insulated from a problem, but compiling one .so with different flags/etc., and assuming you can link it in a program with other flags, is a recipe for disaster. – sdg Nov 21 '12 at 18:05
  • 3
    @sdg: With default flags and default visibility it works. If you change them and get in trouble, it's _your_ problem and nobody else's. – Jan Hudec Nov 22 '12 at 06:35
  • You should definitely not be linking things that are compiled with different flags together (you should not be mixing debug and release objects in the same application). If you are building everything from source then they should all use the same flags anyway. This is really a non-issue wherever I have worked. – Martin York Jan 18 '13 at 01:31
  • @LokiAstari It's a non-issue when you control the build process. It's a **huge** issue when trying to use third-party binaries built with settings incompatible with what you need to build your code (and other third-party libs). – Doug T. Jan 18 '13 at 02:44
  • @DougT: Agreed. It just seems that in non-MS environments I always have 3rd-party source. In MS environments we usually get 2 versions of the lib (debug/release). – Martin York Jan 18 '13 at 19:04
  • @MartinYork Wouldn't closed source libraries sharing debug versions defeat the purpose? – pooya13 Jun 14 '21 at 08:04

3 Answers

19

You are correct that anything STL - actually, anything templated from any third-party library - is best avoided in any public C++ API. You also want to follow the long list of rules at http://www.ros.org/reps/rep-0009.html#definition to inhibit ABI breakage, which makes programming public C++ APIs a chore.
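
As a hedged illustration of that rule (all names here are invented), a public class can keep every STL or templated type behind an opaque pointer so that only plain types appear in the exported interface:

```cpp
// widget.h -- hypothetical public API that hides all STL types
class WidgetImpl;                      // private; free to use std:: internally

class Widget {
public:
    Widget();
    ~Widget();
    void set_name(const char* name);   // plain types only at the ABI boundary
    const char* name() const;
private:
    WidgetImpl* impl_;                 // pimpl: layout changes stay internal
};
```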

And the answer regarding C++11 is no, that standard doesn't touch this. More interesting is why not. The answer is that C++17 very much does touch it: for C++ Modules to be implemented we need exported templates to work, and for that we need an LLVM-type compiler such as clang which can dump the full AST to disc and then do caller-dependent lookups to handle the many ODR-violating cases in any large C++ project - which, by the way, includes lots of GCC and ELF code.

Lastly, I see a lot of MSVC hate and pro-GCC comments. These are very misinformed - GCC on ELF is fundamentally, and irretrievably, incapable of producing valid and correct C++ code. The reasons for this are many and legion, but I'll quickly quote one case example: GCC on ELF cannot safely produce Python extensions written using Boost.Python where more than one extension based on Boost.Python is loaded into Python. That's because ELF, with its global C symbol table, is simply incapable by design of preventing ODR violations from causing segfaults, whereas PE and MachO - and indeed the proposed C++ Modules specification - all use per-module symbol tables, which incidentally also means vastly faster process init times. And there are plenty more problems: see a Stack Overflow answer I wrote recently at https://stackoverflow.com/questions/14268736/symbol-visibility-exceptions-runtime-error/14364055#14364055, for example, where C++ exception throws are irretrievably and fundamentally broken on ELF.

Last point: regarding interop between different STL implementations, this is a big pain for many large corporate users trying to mix third-party libraries which are tightly integrated with some STL implementation. The only solution is a new mechanism for C++ to handle STL interop, and while they're at it they might as well fix compiler interop too, so you can (for example) mix MSVC-, GCC- and clang-compiled object files and it all just works. I'd watch the C++17 effort and see what turns up there in the next few years - I'd be surprised if nothing does.

Niall Douglas
  • 338
  • 1
  • 3
  • Great response! I only hope Clang improves Windows compatibility, and it might set a good default standard compiler. The textual inclusion/header system of C++ is horrible; I'm looking forward to the day that modules simplify C++ code organization, infinitely speed up compile times, and improve compiler interoperability by catching ODR violations. – Alessandro Stamatto Jan 17 '13 at 17:25
  • 3
    Personally, I'm actually expecting a substantial *increase* in compile times. Traversing an intra-module AST quickly is very hard, and we'll probably need an in-memory shared memory cache of it. However, almost everything else that is bad gets better. BTW, header files are definitely staying around; the current C++ Modules proposal has interface files map 1-to-1 to header files. Also, auto-generated interface files will be legal C++, so a legacy header simply gets C macros filtered out and is spat out as an interface file. Nice eh? – Niall Douglas Jan 17 '13 at 18:39
  • Cool! I have so many questions about modules. Will the module system take into consideration Textual Inclusion vs Symbolic Inclusion? With the present include directive the compiler has to recompile tens of thousands of lines of code over and over again for every source file. Will the modules system someday allow code without forward declarations? Will it improve/ease building tools? – Alessandro Stamatto Jan 17 '13 at 20:25
  • 2
    -1 for suggesting that all third-party templates are suspect. Changing the configuration is independent of whether the thing being configured is a template. – DeadMG Jan 17 '13 at 20:37
  • 1
    @Alessandro: The proposed C++ modules explicitly disables C macros. You can use templates, or nowt. The proposed interfaces are legal C++, merely autogenerated, and can be optionally precompiled for speed of reparsing i.e. don't expect any speedup over existing precompiled headers. The last two questions, I actually don't know: it depends :) – Niall Douglas Jan 17 '13 at 21:55
  • @DeadMG: You're right that simple, C macro like, templates are safe-ish in public APIs. Things get trickier with templated member functions, and trickier again with templates taking templated parameters. And things get much trickier with templated friend declarations. If you've ever had to write a C++ mangled symbol parser, you'll see what I mean: there's lots of recursion and with the MSVC mangling especially, dozens of special corner case exceptions. Hence, I stand by what I said, best avoid the lot in public APIs. – Niall Douglas Jan 17 '13 at 22:01
  • Sorry just realised that comment doesn't explain why mangling complexity is important. It's because of ABI breakage testing tools, the ones your repository runs to ensure ABI breakage hasn't happened for some commit or branch. Complex templates confuse their subtly broken demanglers, thus making the breakage detection tool useless. This problem is far less severe on the well documented Itanium (GCC) ABI, but it still occurs. – Niall Douglas Jan 17 '13 at 22:06
9

The specification never had this issue. That's because it has a concept called the "one definition rule", which mandates that each symbol has exactly one definition in the running process.

Windows DLLs violate this requirement. That's why there are all these problems. So it's up to Microsoft to fix it, not the C++ standardization committee. Unix never had this problem, because shared libraries work differently there and by default conform to the one definition rule (you can explicitly break it, but you obviously only do so if you know you can afford it and need to squeeze out a few extra cycles).

Windows DLLs violate the one definition rule because:

  • They hardcode at static link time from which dynamic library a symbol will be used, and they resolve symbols statically within the library that defines them. So if the same weak symbol gets generated in multiple shared libraries and those libraries then get used in a single process, the dynamic linker has no chance to merge those symbols. Usually such symbols are static members or class impedimenta of template instances, and that then causes problems when passing instances between code in different DLLs.
  • They hardcode, already at compile time, whether a symbol will be imported from a dynamic library. Thus code linked with some library statically is incompatible with code linked with the same library dynamically (a minimal sketch of this follows the list).
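
A minimal sketch of the mechanism behind the second point (the macro names follow the usual convention and are not from any particular library):

```cpp
// thing.h -- how a typical Windows header bakes the linkage decision in
#if defined(THING_BUILD_DLL)
#  define THING_API __declspec(dllexport)   // we are building the DLL itself
#elif defined(THING_USE_DLL)
#  define THING_API __declspec(dllimport)   // client will link dynamically
#else
#  define THING_API                         // client will link statically
#endif

// The dllimport/dllexport choice is recorded in every object file at
// compile time, so objects built against the static library cannot later
// be linked against the DLL (and vice versa).
class THING_API Thing {
public:
    void frob();
};
```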

Unix, using the ELF format, has shared libraries implicitly import all the symbols they export, which avoids the first problem, and it does not distinguish between statically and dynamically resolved symbols until static link time, which avoids the second.
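
To see the first bullet above concretely, here is a hedged sketch of a header-only template with a static data member, a classic source of such weak symbols (names are invented):

```cpp
// counter.h -- hypothetical header-only template
template <typename T>
struct Counter {
    static int instances;   // a weak symbol, emitted by every binary
};                          // that instantiates Counter<T>

template <typename T>
int Counter<T>::instances = 0;

// If a.dll and b.dll both instantiate Counter<int>, each DLL resolves
// Counter<int>::instances within itself, so the two sides of the boundary
// see different counters. An ELF dynamic linker binds every reference to
// the first definition it finds, so the process ends up with one counter.
```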


The other issue is that of compiler flags. That issue exists for any program composed of multiple compilation units; dynamic libraries don't have to be involved. However, it's much worse on Windows. On Unix it does not really matter whether you link statically or dynamically, nobody links the standard runtime statically anyway (on Linux it might even be illegal), and there is no special debug runtime, so one build is good enough. But the way Microsoft implemented static and dynamic linking, debug and release runtimes and some other options means they caused a combinatorial explosion of needed library variants. Again, a platform issue rather than a C++ language issue.
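
As a hedged aside, MSVC does provide `#pragma detect_mismatch` to turn at least some of these mismatches into link-time errors instead of silent runtime corruption. The key name below is invented; MSVC's own headers apply the same trick to `_ITERATOR_DEBUG_LEVEL`:

```cpp
// mylib_abi_check.h -- hypothetical header included by the DLL and its clients
#if defined(_MSC_VER)
#  include <cstddef>                      // ensures _ITERATOR_DEBUG_LEVEL is defined
#  define MYLIB_STRINGIZE_(x) #x
#  define MYLIB_STRINGIZE(x)  MYLIB_STRINGIZE_(x)
   // The linker refuses to combine object files that recorded different
   // values for the same key, so mixing incompatible iterator-debugging
   // settings fails at link time instead of crashing at run time.
#  pragma detect_mismatch("mylib_iterator_debug", MYLIB_STRINGIZE(_ITERATOR_DEBUG_LEVEL))
#endif
```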

Jan Hudec
  • 18,250
  • 1
  • 39
  • 62
  • I'm curious to read more about how GCC gets around this. My understanding is that templates need to be header-only. How can a header-only class hope not to violate the ODR when compiled into multiple binaries? – Doug T. Nov 21 '12 at 15:13
  • 2
    @DougT.: GCC has nothing to do with it. The platform ABI has. In ELF, the object format used by most Unices, shared libraries export all visible symbols and import all symbols they export. So if something gets generated in multiple libraries, the dynamic linker will use the first definition for all. Simple, elegant and working. – Jan Hudec Nov 21 '12 at 15:17
  • @DougT.: Of course you have to ensure the libraries are actually compiled against the same version of the standard library, but with careful use of versions Linux distributions seem to handle that just fine. – Jan Hudec Nov 21 '12 at 15:19
  • does that imply that there's no inlining of header-only classes like `std::vector`? MSVC tends to aggressively inline these in release builds -- exacerbating the problem. – Doug T. Nov 21 '12 at 15:25
  • split off this discussion into its own question: http://programmers.stackexchange.com/questions/176700/how-do-sos-avoid-problems-associated-with-passing-header-only-templates-like-ms – Doug T. Nov 21 '12 at 15:43
  • "In ELF, ... shared libraries export all visible symbols and import all symbols they export. So if something gets generated in multiple libraries, the dynamic linker will use the first definition for all." -- except that, of course, for *inlined* (template) code, **there are no symbols**. There's only a (multiple) inlined instruction sequence(s) and if they were compiled incompatibly, it doesn't matter if you use DLL or SO. – Martin Ba Nov 22 '12 at 12:42
  • @MartinBa: Well, if you inline different functions into different objects of your application, you are in trouble even if the objects are static. But that's your problem that you can fix and does not seem to be much trouble in practice. The main problem discussed here is that even if you provide the same definitions, DLLs will contain separate copies of things that should be merged while ELF SOs will merge them correctly. And it's these symbols (static members and class impedimenta) that cause most problems. – Jan Hudec Nov 22 '12 at 12:48
  • @Jan - Not different functions. The same function. The same function, if inlined, will exist in multiple places in multiple object files, but it won't be a "function" anymore, it will just be a sequence of machine instructions. Iff two object files were compiled with different settings that would influence the correct behaviour of the inlined function, there would be *nothing to merge* for ELF/SO, as the "function" isn't a function in the resulting binary. – Martin Ba Nov 22 '12 at 13:49
  • @Jan - so my point really is, iff you use incompatible (layout, e.g.) compiler settings on an ELF platform, you are in the same mess as you are on Windows, as (some) code may be inlined and that can never merged. – Martin Ba Nov 22 '12 at 13:50
  • 1
    @MartinBa: There is nothing to merge, but it does not matter as long as it's the same and as long as it is not supposed to be merged in the first place. Yes, iff you use incompatible compiler settings on an ELF platform, you get the same mess as anywhere and everywhere. Even if not using shared libraries, so it's somewhat off-topic here. – Jan Hudec Nov 23 '12 at 09:20
  • 1
    @Jan - it is relevant to your answer. You write: "... one definition rule ... Windows DLLs violate this requirement ... shared libraries work differently [on UNix] ..." but the *question asked* pertains to problems with std-lib stuff (defined in headers) and the reason there's no problem on Unix has nothing to do with SO vs. DLL but with the fact, that on Unix (apparently) there is only one compatible version of the standard library while on Windows MS chose to have incompatible (debug) versions (with extended checking etc.). – Martin Ba Nov 23 '12 at 10:30
  • 1
    @MartinBa: No, the main reason there is a problem on Windows is that the export/import mechanism used on Windows can't properly merge static members and class impedimenta of template classes in all cases and can't merge statically and dynamically linked symbols. Then it's made much worse by the multiple library variants, but the primary problem is that C++ needs flexibility from the linker that the Windows dynamic linker does not have. – Jan Hudec Nov 23 '12 at 10:42
  • 4
    I think that this implication that the DLL specification is broken and the corresponding demand for Msft to 'fix it' are misplaced. The fact that DLLs don't support certain features of C++ is not a defect of the DLL specification. DLLs are a language-neutral, vendor-neutral packaging mechanism and ABI to expose entry-points to machine code ('function calls') and data blobs. They were never intended to natively support advanced features of any particular language. It's not Msft's, or the DLL specification's fault that some people want them to be something else. – Euro Micelli Dec 21 '12 at 06:29
  • 1
    @EuroMicelli: It's not DLL specification's fault. DLL specification does not need to know about C++. But it is Microsoft's fault that they didn't add a mechanism to merge weak symbols across DLLs when C++ introduced them and required them to be merged. – Jan Hudec Dec 23 '12 at 09:48
  • @JanHudec With your interpretation of ODR, a shared object _has_ to violate the ODR. A plug-in needs a set of entry points with names defined by the plug-in API. Now if the ODR is per process as you suggest, only one plug-in of the same kind can be loaded simultaneously. But from the manpage, `dlsym` behaves like `GetProcAddress`, except that the latter gives up if the symbol is not found in the specified library. – user877329 Jun 14 '15 at 19:16
  • @user877329: A _plugin_ indeed must. But most shared objects are _not_ plugins (shared object == dynamic library). And even for plugins, the exception to ODR really only applies to symbols loaded via `dlsym`/`GetProcAddress` and it only applies because nothing ever expects them to be the same. If plugins cause multiple versions of some symbol that is used via direct code reference like virtual method table of shared class or operator new, things are likely to get very ugly. – Jan Hudec Jun 14 '15 at 20:34
  • @user877329: Or maybe ODR should be explicitly phrased in a way that implies that whenever the code refers to `x`, the same definition for what `x` means must be used. Then it does not apply to symbols that are _only_ used by `dlsym`/`GetProcAddress`, so plugins don't have to break ODR. It still does apply to any other symbols in plugins and it better should, because having two instances of common class impedimenta is going to break stuff. – Jan Hudec Jun 14 '15 at 20:46
6

No.

There is a lot of work going on to replace the header system, a feature called Modules, which could have an impact on this, but certainly not a big one.

Klaim
  • 14,832
  • 3
  • 49
  • 62
  • 2
    I don't think the header system would have any impact on this. The problems are that Windows DLLs violate the one definition rule (which means they don't follow the C++ spec, so the C++ committee can't do anything about it) and that there are so many variants of the standard runtime on Windows, which the C++ committee can't do anything about either. – Jan Hudec Nov 21 '12 at 15:08
  • 1
    No, they don't. How could they, the specification does not even mention something of that kind. Other than that, when a (Windows) program is linked with Windows dlls, the ODR is satisfied: all visible (exported) symbols must obey the ODR. – Paul Michalik Jan 14 '13 at 10:54
  • @PaulMichalik C++ does cover linking (phase 9) and it seems to me that at least load-time linking of DLLs/SOs falls within phase 9. That means that symbols with external linkage (whether exported or not) should be linked and conform to the ODR. Dynamic linking with LoadLibrary/dlopen obviously does not fall under those requirements. – bames53 Jan 16 '13 at 21:29
  • @bames53: IMHO, the spec is far too weak to allow statements of that kind. A *.dll/*.so could be seen as a "program" on its own. Then the rules would be satisfied. Something like loading other "programs" at run-time is so underspecified by the standard that any statements regarding this are pretty arbitrary. – Paul Michalik Jan 17 '13 at 06:55
  • @PaulMichalik If an executable requires load-time linking then prior to load-time linking there are external entities left unresolved and information needed for execution is missing. LoadLibrary and dlopen are outside the spec but load-time linking pretty clearly must be part of phase 9. – bames53 Jan 17 '13 at 14:28