21

I've always thought that a "common library" was a good idea. By that I mean a library that contains the common functionality that is often needed by a few different applications. It results in less code duplication/redundancy.

I recently read an article (which I can't find now) that said this is actually a bad idea and went as far as to say it was an "anti-pattern".

While there are upsides to this approach, versioning and managing change mean regression-testing the whole suite of apps that use the library.

I'm kind of stuck in a rut on my new (Golang) project. Code deduplication has been hammered into me over the years, but I feel like I should try it this time around.

While writing this, I am beginning to think that this "common lib" approach is the result of skimping on architecture. Perhaps my design needs more thought?

Interested to hear thoughts.

jim
  • 2
    My 2 cents... If you have to change the common API whenever one of the systems using it needs that change, then that API or library is not a common library at all – Raja Anbazhagan Jan 02 '17 at 10:35
  • 1
    I find that there is a fundamental trade-off between code duplication and coupling. Your question is a prime example of that. The balance you strike between the two will probably depend on the environment your code will ultimately execute in. – Joel Cornett Jan 03 '17 at 07:24
  • Most of the C developers that I know have a collection of utilities that they refer to as their "toolbox". It's usually not collected into a single includable library, though. It's more "pick and choose". – Mark Benningfield Jan 27 '18 at 02:02
  • 3
    Someone call Apache and fix this madness. [Apache commons](https://commons.apache.org/) – Laiv Jan 28 '18 at 11:31

5 Answers

24

Libraries and re-use are absolutely a good thing. They have one giant downside, which is that if not carefully managed, they become the equivalent of the drawer in your kitchen that holds all of the odds and ends that don't go anywhere else.

I saw this in action when I became responsible for the first ports of an entire business unit's worth of code (mostly new to me) to 64-bit systems and doing a complete overhaul to our build and packaging, a lot of which was being done by hand and sometimes not very well.* We tended to build up what we shipped out of a stack of applications, where the client would say, "I'd like a system that does A, B, D and F, plus things M and N that you don't do yet and slightly different glue integrating them all." What all of it had in common was a junk-drawer library that had, over a couple of decades,** accumulated all of the things people thought should be re-usable. To make a long story short, a fraction of the code in the library wasn't used anywhere and was dragging a lot of dependencies into every project we shipped. We were expending a lot of valuable time building and maintaining those dependencies just so the common library would install, not because we actually needed them.

The moral is that libraries need to be treated like classes and not overloaded with too many responsibilities. Don't put your JSON parser in the same library with your linear algebra functions even if every program you're writing uses both.

Keeping them discrete has a lot of benefits, the biggest of which is that it forces your developers and packagers to come up with a detailed accounting of what their own products actually need instead of just including the junk drawer and the baggage that comes with it. When you set up a system using the built packages, the fine-grained dependencies ensure that only the necessary parts get installed. Even if you neglect your repository and continue compiling the old, crufty stuff, nothing that's no longer in use leaks into what you ship.
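For a Go project like the OP's, that separation can be as simple as giving each concern its own module with its own `go.mod`, so consumers pull in only what they declare. A hypothetical layout (the module paths and names here are illustrative, not from the answer):

```text
myorg/
├── jsonutil/        # JSON helpers only
│   ├── go.mod       # module example.com/myorg/jsonutil — no other deps
│   └── jsonutil.go
└── linalg/          # linear algebra only
    ├── go.mod       # module example.com/myorg/linalg — no other deps
    └── matrix.go
```

An application that needs only the JSON helpers then depends on `example.com/myorg/jsonutil` alone, and `go mod tidy` will never drag in the linear algebra code or its dependencies.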

There are, of course, exceptions such as libc that cram a whole lot of functions into one library. That's one of the cases where the benefits to doing it that way can be reasoned out instead of blindly heeding the zealot down the hall who insists that any other way than X is always bad practice.


*In the process, I uncovered a binary that had been passed around and had not been recompiled from scratch in six years.

**There's nothing wrong with decades-old code. We had a number of critical algorithms that had been so well proven that we'd have been fools to rewrite them solely in the interest of modernity.

Blrfl
10

Embarrassingly, I introduced a "common" library, named as such, in a team environment a couple of decades back. I didn't really understand back then the dynamics of what could happen in a loosely-coordinated team setting in just a matter of months.

When I introduced it I thought I made it clear and also documented that it's for things we'd all agree we find useful on a daily basis, that it's intended to be a minimalist library, and that the library should depend on nothing else besides the standard library so that it's as easy to deploy as possible in new projects. My thinking at the time was that it was our own little extension to the standard library for things that, in our particular domain, we found useful on a daily basis.

And it started off well enough. We began with a math library (common/math*) of routines we all used daily, since we were working in computer graphics, which is often heavy on linear algebra. And since we were often interoperating with C code, we agreed on some useful utility functions like find_index which, unlike std::find in C++, would return an index to an element found in a sequence instead of an iterator, mimicking how our C functions worked. Things of this sort -- a little eclectic, but minimalist and widely used enough to remain familiar and practical to everyone. Instant familiarity is an extremely important criterion, as I see it, when trying to make anything "common" or "standard": if it truly is common, it should have that familiar quality as a result of its wide adoption and daily use.
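In Go terms (the OP's language), a minimal sketch of such a find_index helper might look like the following -- the name and signature are illustrative, not taken from the answer's actual library:

```go
package main

import "fmt"

// FindIndex returns the index of the first element equal to target,
// or -1 if none is found -- mirroring C-style APIs that return indices
// rather than iterators.
func FindIndex[T comparable](s []T, target T) int {
	for i, v := range s {
		if v == target {
			return i
		}
	}
	return -1
}

func main() {
	xs := []int{3, 1, 4, 1, 5}
	fmt.Println(FindIndex(xs, 4)) // index of the value 4
	fmt.Println(FindIndex(xs, 9)) // not present
}
```

The point of such a helper is exactly the familiarity the answer describes: it does one obvious thing that everyone on the team already understands.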

But over time the design intentions of the library slipped out of my fingers as people started adding things they personally used that they merely thought might be of use to someone else, only to find no one else using them. Later, someone started adding functions that depended on OpenGL for common GL-related routines. Further on we adopted Qt, and people started adding code that depended on Qt, so the common library now depended on two external libraries. At some point someone added common shader routines that depended on our application-specific shader library, and by then you couldn't even deploy it in a new project without bringing in Qt, OGL, and our application-specific shader library and writing a non-trivial build script for your project. So it turned into this eclectic, interdependent mess. Later on people even added GUI-dependent code to it.

But I've also found by debating what should and shouldn't go into this library that what is considered "common" can easily turn into a very subjective idea if you don't set a very hard line rule that what's "common" is what everyone tends to find useful on a daily basis. Any loosening of the standards and it quickly degrades from things everyone finds useful on a daily basis to something a single developer finds useful that might have the possibility of being beneficial to someone else, and at that point the library degrades into an eclectic mess really fast.

But furthermore, when you reach that point, some developers can start adding things for the simple reason that they don't like the programming language. They might not like the syntax of a for loop or a function call, at which point the library starts getting filled with things that are just fighting the fundamental syntax of the language, replacing a couple of lines of straightforward code (which isn't really duplicating any logic) with a single terse line of exotic code familiar only to the developer who introduced the shorthand. Then such a developer might start adding more functionality to the common library implemented using those shorthands, at which point significant sections of the common library become interwoven with these exotic shorthands, which might seem beautiful and intuitive to the developer who introduced them but ugly, foreign, and hard to understand for everyone else. And at that point I think you know that any hope of making something truly "common" is lost, since "common" and "unfamiliar" are polar opposites.
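A hypothetical Go illustration of the kind of shorthand being warned about -- this saves one line over the language's own loop while forcing every reader to learn a private vocabulary:

```go
package main

import "fmt"

// Times merely wraps Go's built-in for loop. It is the sort of
// "fighting the language" helper the answer cautions against.
func Times(n int, f func(int)) {
	for i := 0; i < n; i++ {
		f(i)
	}
}

func main() {
	// The shorthand...
	Times(3, func(i int) { fmt.Println(i) })

	// ...versus the plain loop it replaces, which every Go
	// programmer already reads fluently.
	for i := 0; i < 3; i++ {
		fmt.Println(i)
	}
}
```

Both forms do the same thing, which is precisely the problem: the helper adds a layer of unfamiliarity without removing any real duplication of logic.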

So there's all kinds of cans of worms there, at least in a loosely-coordinated team environment, with a library with ambitions as broad and as generalized as just "commonly-used stuff". And while the underlying problem might have been the loose coordination above all else, at least multiple libraries intended to serve a more singular purpose, like a library intended to provide math routines and nothing else, probably wouldn't degrade as significantly in terms of its design purity and dependencies as a "common" library. So in retrospect I think it would be much better to err on the side of libraries which have much more clear design intentions. I've also found over the years that narrow in purpose and narrow in applicability are radically different ideas. Often the most widely applicable things are the narrowest and most singular in purpose, since you can then say, "aha, this is exactly what I need", as opposed to wading through an eclectic library of disparate functionality trying to see if it has something you need.

Also, I'm admittedly at least a little bit impractical and perhaps care a bit too much about aesthetics, but I tend to judge a library's quality (and maybe even its "beauty") more by its weakest link than its strongest, in the same way that if you presented me the most appetizing food in the world but put something rotting on the same plate, I'd want to reject the entire plate. And if you're like me in that regard and make something that invites all sorts of additions by calling it "common", you might find yourself looking at that proverbial plate with something rotting on the side. So likewise, I think it's good if a library is organized, named, and documented in a way that doesn't invite more and more additions over time. That can even apply to your personal creations, since I've certainly created some rotten stuff here and there, and it "taints" a lot less if it's not added to the biggest plate. Separating things out into small, very singular libraries also tends to better decouple code, if only by the sheer virtue that it becomes far less convenient to couple everything together.

> Code deduplication has been hammered into me over the years but I feel like I should try it this time around.

What I might suggest in your case is to start to take it easy on code deduplication. I'm not saying to copy and paste big snippets of poorly-tested, error-prone code around or anything of this sort, or duplicating huge amounts of non-trivial code that has a decent probability of requiring changes in the future.

But especially if you are of the mindset to create a "common" library, for which I assume your desire is to create something widely-applicable, highly reusable, and perhaps ideally something you find just as useful today as you do a decade from now, then sometimes you might even need or want some duplication to achieve this elusive quality. Because the duplication might actually serve as a decoupling mechanism. It's like if you want to separate a video player from an MP3 player, then you at least have to duplicate some things like batteries and hard drives. They can't share these things or else they're indivisibly coupled and cannot be used independently of each other, and at that point people might not be interested in the device anymore if all they want to do is play MP3s. But some time after you split these two devices apart, you might find that the MP3 player can benefit from a different battery design or smaller hard drive than the video player, at which point you're no longer duplicating anything; what initially started out as duplication to allow this interdependent device to split into two separate, independent devices might later turn out to yield designs and implementations that are no longer redundant at all.

It's worth considering things from the perspective of the one using a library. Would you actually want to use a library that minimizes code duplication? Chances are you wouldn't, because such a library will naturally depend on other libraries. And those other libraries might depend on still others to avoid duplicating their code, and so on, until you might need to import/link 50 different libraries just to get some basic functionality like loading and playing an audio file, and that becomes very unwieldy. Meanwhile, if such an audio library deliberately chose to duplicate a few things here and there to achieve independence, it becomes much easier to use in new projects, and chances are it won't need to be updated nearly as often, since it won't have to change whenever one of its external dependencies changes, each of which may be trying to fulfill a much more generalized purpose than what the audio library needs.
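A small Go sketch of this "duplication as decoupling" idea -- the package, function names, and the 16-bit-sample math are all hypothetical, chosen only to make the point concrete:

```go
package main

import "fmt"

// clamp is deliberately duplicated inside this hypothetical audio
// package rather than imported from a shared "common" library. The
// three duplicated lines buy complete independence from that library
// and all of its transitive dependencies.
func clamp(v, lo, hi float64) float64 {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// NormalizeSample scales a raw 16-bit audio sample into [-1, 1].
func NormalizeSample(raw float64) float64 {
	return clamp(raw/32768.0, -1.0, 1.0)
}

func main() {
	fmt.Println(NormalizeSample(16384)) // mid-range sample
	fmt.Println(NormalizeSample(99999)) // out-of-range, clamped
}
```

If the audio package later needs clamping behavior slightly different from everyone else's (saturation curves, say), the copy stops being duplication at all, just as the answer describes with the split MP3/video players.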

So sometimes it's worth deliberately choosing to duplicate a little bit (consciously, never out of laziness -- actually out of diligence) in order to decouple a library and make it independent because, through that independence, it achieves a wider range of practical applicability and even stability (no more afferent couplings). If you want to design the most reusable libraries possible that will last you from one project to the next and over the years, then on top of narrowing its scope to the minimum, I would actually suggest considering duplicating a little bit here. And naturally write unit tests and make sure it's really thoroughly tested and reliable at what it's doing. This is only for the libraries that you really want to take the time to generalize to a point that goes far beyond a single project.

  • "so already the common library was dependent on two external libraries" as soon as you have a new dependency (eg OpenGL) you should have a new library (or set of libraries) specifically for that dependency. "common" should be dependency-free, and then you can have "OpenGLCommon". otherwise anyone that wants a simple utility function has to depend on OpenGL to use it, that makes no sense. I guess I could say, the dependencies of a library must be well defined and must not change. it does get complicated though when you start combining dependencies. – Dave Cousineau Dec 10 '21 at 21:08
3

There are three different categories of functions you might consider putting into libraries:

  1. Stuff worth reusing for everyone.
  2. Stuff only worth reusing for your organization.
  3. Stuff not worth reusing.

Category one is something for which a standard library should exist, but for some reason nobody got around to making one (or did someone? Did you search thoroughly?). In that case, consider making your library open source. Sharing your work doesn't just help others; it also helps you, because you will receive bug reports and patches from other users. If you doubt that anybody would contribute to your library, you might be dealing with functionality that actually belongs in category 2 or 3.

The second category is things you need over and over again but nobody else in the world needs, for example, the implementation of the obscure network protocol used to communicate with your in-house-developed backend system. In that case it might make sense to put that stuff into an in-house library to improve development speed on new applications. Just make sure it doesn't suffer from feature creep and start to accumulate stuff that actually fits into categories 1 or 3. Additionally, the advice by Blrfl regarding modularization is very good: do not create one monolithic Conor Corp. Library. Create multiple separate libraries for separate functionalities.

Category three is functionality which is either so trivial to implement that moving it to a library isn't worth it, or which you aren't sure you will ever need again in exactly that form in another application. This functionality should stay part of the application it was developed for. When in doubt, it likely belongs in this category.

Philipp
1

Almost all languages have a common/standard library, so this is broadly recognized as a good idea. Using third-party libraries for various tasks rather than re-inventing the wheel is also generally considered a good idea, although cost/benefit and the quality of the library should obviously be evaluated in each case.

Then there is the "common utilities" libraries used by an individual developer or institution across otherwise unrelated projects. This is the kind of library which could be considered an anti-pattern. In the case I have seen, these libraries just replicate functionality from standard libraries or more well-known third party libraries in a non-standard and badly documented way.

JacquesB
  • `these libraries just replicate functionality from standard libraries` -- this is not entirely a bad thing: in JavaScript you have libraries that re-implement existing features to add support for old JS engines, and you also have the Android support libraries for older SDKs. – svarog Jan 02 '17 at 12:36
  • @svarog: Are you thinking of "polyfills", which emulate functionality from new standards for older engines that don't support it natively? I don't see any reason to write these yourself, since there are well-known open-source libraries available for these purposes. – JacquesB Jan 02 '17 at 12:45
0

Most libraries shared across teams cause more problems than they solve. "The road to hell is paved with good intentions."

Exceptions are libraries that satisfy most of the below:

  • Have secure long-term maintenance funding
  • Have a dedicated support team / community
  • Have a bugtracker
  • Have extensive test coverage
  • Have a single, well-defined purpose
  • Have no dependencies themselves (both at build and runtime), other than standard or quasi-standard libraries
  • Have a clean distinction between public api and internals
  • Have a communication channel and process for all/many users to agree on new features and releases

Inside typical (non-startup) companies, almost none of the above conditions are present for libraries shared across teams. That's because most company teams are paid to deliver products, not libraries. Some companies do have successful sharing strategies, like Google's monorepo, but this comes with very high investment in build and testing infrastructure.

tkruse