I will offer a slightly different take on this. A core library is, in many cases, an excellent idea!
If you have two separate projects, they should be in two separate code repositories. But suppose they depend on common functionality. Take packet processing applications as an example; the common functionality may include:
- Memory allocators
- Address resolution protocol
- AVL tree
- Serialization code for binary protocols
- Dynamic array
- Linux kernel style hash list with singly linked head and doubly linked middle nodes (see the sketch after this list)
- Hash table
- TCP/IP header processing code
- Regular linked list with doubly linked head and doubly linked middle nodes
- Logging library
- Miscellaneous (trust me, you need this for small and trivial stuff, or the number of different modules will grow towards 100!)
- Packet capture library
- Packet I/O interface library
- Packet data structure
- Blocking queue for inter-thread communication
- Random number generators
- Red-black tree
- Some kind of timer implementation
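To make the hash list item concrete, here is a minimal sketch, modeled on the kernel's hlist_head/hlist_node idea; the names and code are my own illustration, not taken from any particular library. The bucket head is a single pointer, which keeps large bucket arrays small, while each node stores the address of the pointer that points to it, so a node can be removed without knowing which bucket it lives in:

```c
#include <stddef.h>

struct hlist_node {
    struct hlist_node *next;    /* next node, or NULL at the end of the chain */
    struct hlist_node **pprev;  /* address of the pointer that points to us */
};

struct hlist_head {
    struct hlist_node *first;   /* singly linked head: one pointer per bucket */
};

/* Insert a node at the front of a bucket. */
static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Remove a node without needing a pointer to its bucket head. */
static inline void hlist_del(struct hlist_node *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
}
```

The reason for the asymmetric design is that a hash table has many buckets but each node belongs to only one, so halving the per-bucket overhead matters more than simplifying the node.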
Now, different packet processing applications may need different subsets of these. Should you implement one core library in one source code repository, or should you have 18 different repositories, one per module? Remember that these modules have inter-dependencies; for example, most of them may depend on the miscellaneous module.
I will claim that having one core library is the best approach. It avoids the overhead of maintaining many source code repositories, and it reduces dependency hell: a particular version of the memory allocator may need a particular version of the miscellaneous module. What if the memory allocator version 1.7 depends on miscellaneous 2.5, and the AVL tree version 1.2 depends on miscellaneous 2.6? You may not be able to link miscellaneous 2.5 and 2.6 into your program at the same time.
So, go ahead and implement the following structure:
- Core library repository
- Project #1 repository
- Project #2 repository
- ...
- Project #N repository
I have seen that switching to this kind of structure from a structure like this:
- Project #1 repository
- Project #2 repository
- ...
- Project #N repository
has led to reduced maintenance effort and increased code sharing through mechanisms other than copy-paste.
I have also seen projects using the following structure:
- Memory allocators repository
- Address resolution protocol repository
- AVL tree repository
- Serialization code for binary protocols repository
- Dynamic array repository
- Linux kernel style hash list with singly linked head and doubly linked middle nodes repository
- Hash table repository
- TCP/IP header processing code repository
- Regular linked list with doubly linked head and doubly linked middle nodes repository
- Logging library repository
- Miscellaneous repository (trust me, you need this for small and trivial stuff, or the number of different modules will grow towards 100!)
- Packet capture library repository
- Packet I/O interface library repository
- Packet data structure repository
- Blocking queue for inter-thread communication repository
- Random number generators repository
- Red-black tree repository
- Some kind of timer implementation repository
- Project #1 repository
- Project #2 repository
- ...
- Project #N repository
...and there, dependency hell and the proliferation of repositories have been genuine problems.
Now, should you use an existing open source library instead of writing your own? You need to consider:
- Licensing problems. Sometimes even the mere requirement to credit the author in the shipped documentation may be too much, as 20 libraries will usually have 20 distinct authors.
- Support for different operating systems and their versions
- Dependencies of the particular library
- Size of the particular library: is it too large for the provided functionality? Does it provide too many features?
- Is static linking possible? Is dynamic linking desirable?
- Is the interface of the library what you want? Note that in some cases writing a wrapper to provide the desired interface may be easier than rewriting the entire component yourself (see the sketch after this list).
- ...and many, many other things I have not mentioned in this list
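As an example of the wrapper point above: a thin wrapper can hide a third-party library behind exactly the interface your projects want, so swapping the library out later touches one file. Below is a rough sketch around libpcap; the cap_* names are invented here for illustration, only the pcap_* calls are the real libpcap API, and error reporting is kept minimal:

```c
#include <pcap/pcap.h>

struct cap_handle {
    pcap_t *p;  /* underlying libpcap handle */
};

/* Open a live capture on an interface; returns 0 on success, -1 on failure. */
int cap_open(struct cap_handle *h, const char *ifname)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    h->p = pcap_open_live(ifname, 65535 /* snaplen */, 1 /* promiscuous */,
                          100 /* read timeout, ms */, errbuf);
    return h->p != NULL ? 0 : -1;
}

/* Fetch one packet; returns the captured length, 0 on timeout, -1 on error. */
int cap_next(struct cap_handle *h, const unsigned char **data)
{
    struct pcap_pkthdr *hdr;
    int ret = pcap_next_ex(h->p, &hdr, data);
    if (ret == 1)
        return (int)hdr->caplen;
    return ret == 0 ? 0 : -1;
}

void cap_close(struct cap_handle *h)
{
    pcap_close(h->p);
}
```

The projects only ever see cap_open/cap_next/cap_close, so replacing libpcap with, say, a raw-socket or netmap backend later does not ripple through the code base.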
I usually apply the rule that anything under 1000 lines of code, and not requiring expertise beyond the programmer's, should be implemented on your own. Note: the 1000 lines include unit tests, so I certainly won't advocate writing 1000 lines of code on your own if they require 10,000 additional lines of unit tests. For my packet processing programs, this means the only external components I have used are:
- Everything provided by a standard Linux distribution, because that is so many lines of code that it makes no sense to reimplement Linux. Parts of it would also be beyond my expertise level.
- Bison/flex, because LALR parsing is beyond my expertise level and over 1000 lines of code. I could certainly write a recursive descent parser on my own, but Bison/flex are handy enough that I prefer them.
- Netmap, because it's over 1000 lines and beyond my expertise level
- The skip-list-based timer implementation from DPDK, because it's beyond my expertise level even though it is under 1000 lines of code (I also have alternative timer implementations that do not use skip lists)
Some things I have implemented on my own, because they are simple, include:
- MurMurHash
- SipHash
- Mersenne Twister
...because custom implementations of these can allow heavy inlining, leading to improved performance.
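To illustrate the inlining point: if the hash lives in your own header as a static inline function, the compiler can inline it straight into hot hash table lookups instead of paying a call into an external library. The sketch below shows only the 64-bit finalizer step of MurmurHash3 (a full hash of arbitrary-length keys adds the block-mixing loop around it); flow_bucket is a hypothetical caller, not part of any real API:

```c
#include <stdint.h>
#include <stddef.h>

/* 64-bit finalizer (fmix64) step of MurmurHash3; placed in a header as
 * static inline so callers can inline it completely. */
static inline uint64_t fmix64(uint64_t k)
{
    k ^= k >> 33;
    k *= 0xff51afd7ed558ccdULL;
    k ^= k >> 33;
    k *= 0xc4ceb9fe1a85ec53ULL;
    k ^= k >> 33;
    return k;
}

/* Hypothetical use: map a 64-bit flow identifier into a power-of-two table. */
static inline size_t flow_bucket(uint64_t flow_id, size_t table_size_pow2)
{
    return (size_t)fmix64(flow_id) & (table_size_pow2 - 1);
}
```

With an external, dynamically linked hash library, the same lookup would cost a function call per packet; inlined like this, the mixing steps fold directly into the lookup loop.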
I don't do cryptography; if I did, I would add some kind of crypto library to the list, because hand-written crypto algorithms may be susceptible to cache timing attacks even when thorough unit testing shows they produce the same results as the official implementations.