
I am looking into options for smoothing out our delivery and release pipeline, and would appreciate some advice on the best way to structure the source code.

This is a pretty large project, consisting of about 30 microservices. "Microservices" is a very loose term here, but for the sake of the question it is close enough. Currently all of this is stored in one repo.

The internal workflow is structured as follows:

One-month sprints where all code is committed to a dev branch. At the end of the month, all tasks that should move into testing are merged into the «next major release» branch, and all critical tasks/bug fixes are merged into the «active delivery» branch and tested before being selectively rolled out to customers.

Not all customers want the latest and greatest; they want as few changes as possible to mitigate the risk associated with change.

So at the end of the month we have the following:
1 up-to-date dev branch
1 partially up-to-date test/next-delivery branch
1 production branch

and this is then rolled out to customers like this:
Customer A
- Installed production branch version 11
* overridden microservice 3 from production branch 11.4
* overridden microservice 5 from production branch 11.1
* ...

Customer B
- Installed production branch version 11.6
* overridden microservice 7 from production branch 11.14
* overridden microservice 15 from production branch 11.7

Customer C
- Installed production branch version 11.3

Would changing to a multi-repo setup better cater to our clearly insane delivery pipeline? I am considering one repo for each service, and then one repo for each customer, roughly as sketched below.

Code repo:
-MicroService 1
-MicroService 2
-MicroService 3

Customer repos:
-Link to MicroService 1 - version 1.5
-Link to MicroService 2 - version 2.5
-Link to MicroService 3 - version 1.7
-Customer specific data files (they are not under version control at all now...)
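
In practice each customer repo would be little more than a pinned manifest plus the customer-specific files. A rough Python sketch of what I have in mind (the service names, versions and file names are all invented):

    # customer_a.py - rough sketch of what a customer repo would pin.
    # Service names, versions and file names are invented examples.

    CUSTOMER_A = {
        "services": {
            "MicroService1": "1.5",
            "MicroService2": "2.5",
            "MicroService3": "1.7",
        },
        # the customer-specific data files would finally be under version control as well
        "data_files": ["config/customer_a.cfg"],
    }

    if __name__ == "__main__":
        # a deployment script would read this and install exactly these versions
        for service, version in CUSTOMER_A["services"].items():
            print(f"deploy {service} at version {version}")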

I think this would greatly improve our workflow and move us back to the sanity of having just one branch. It should give us better control over what is actually released to a customer, give us the ability to actually reproduce a customer's environment, and most importantly it should remove the need for merge day once a month.

I would greatly appreciate some feedback on this.

Edit: The API is pretty stable between services, but the shared libs are a bit of a mixed bag. The product is roughly 25 years old, so the core is stable, but as you can imagine it is in great need of some clean-up, as technical debt has slowly been building up.

I basically have two issues which I am trying to solve/improve:

  1. Merge day is hell. The production branch can be up to a year older than the dev branch, with only a subset of what is in the dev branch merged into it. If a new feature is added in dev which should not be in the current version, it is not moved to production. All fixes in dev which use code from that feature will then cause huge issues if they need to be moved. My idea is that having multiple smaller repos would make this easier.

  2. Say there is a bug in a server for getting customer data. The bug is in some shared lib; I fix it, build CustDataServer and all other servers which I THINK might be influenced by this, then deploy them on the customers' systems. This means that the next time someone does a bug fix which happens to use my last bug fix, they basically release my fix into production without meaning to. Most of the time this is fine, but I feel like we have no control over what is actually running in production. ServerA is built with SharedLibA and SharedLibB, while ServerB is built with the same shared libs, but other versions. I am under the impression that a multi-repo approach would at least give more control in this case.
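
To illustrate the kind of traceability I am after, regardless of whether we end up with one repo or many: something recorded at build time that states exactly which commits each server was built from. A rough Python sketch; the paths, service names and the use of git here are assumptions about our setup:

    # build_stamp.py - rough sketch: record which commits a server was built from.
    # Paths and service names are invented; assumes the sources are git checkouts.
    import json
    import subprocess
    from datetime import datetime, timezone

    def head_commit(path: str) -> str:
        """Return the commit currently checked out at `path`."""
        return subprocess.check_output(
            ["git", "-C", path, "rev-parse", "HEAD"], text=True
        ).strip()

    def write_build_manifest(service: str, lib_paths: dict, out_file: str) -> None:
        """Write a small JSON file to ship alongside the server binary."""
        manifest = {
            "service": service,
            "built_at": datetime.now(timezone.utc).isoformat(),
            "sources": {name: head_commit(path) for name, path in lib_paths.items()},
        }
        with open(out_file, "w") as fh:
            json.dump(manifest, fh, indent=2)

    if __name__ == "__main__":
        # example for a CustDataServer build (paths are invented)
        write_build_manifest(
            "CustDataServer",
            {"SharedLibA": "libs/SharedLibA", "SharedLibB": "libs/SharedLibB"},
            "CustDataServer.build.json",
        )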

I am open to all kinds of ideas on how to clean up this mess.

Edit: I appreciate all the thought-out responses so far. I agree that selecting a repo style is not the issue here.

Your summary is pretty spot on, apart from the layout repos. That was something I was considering introducing. Currently, client-specific files (mostly DB and config files) are stored on the client's server and are not under any form of source control.

Regarding the release cycle, I agree with you that releasing more often is the best idea. Also, don't do partial releases: all or nothing. The problem is that the release cycle is outside of my control. I can influence many things, but only things within the company; externally it is hard to make changes. The reason for the partial and infrequent releases is to mitigate risk for clients. The idea is that the less that changes, the smaller the chance of new bugs. The software is very critical, and bugs could literally cost millions per day in damages.

What if we tried doing this the other way around: if I describe the system as something to be created within the constraints that exist, what would be the ideal way to structure it?

  • Large code base which is logically split up into several units that are (tightly) coupled
  • Yearly release of new features.
  • Features are a mix of local changes and system-wide new features
  • Bugs need to be fixed ASAP; if critical, within hours.
  • Customers do not want full releases; they want as little change as possible. In fact, they often don't want the yearly release at all and drag it out as much as possible
  • Critical software; bugs are expensive

What would be the best way to structure the internal workflow to limit the number of pain points?

user1038502
  • Is there a lot of shared code between different microservices? Or is all shared code a microservice on its own? – Doc Brown Jun 09 '18 at 08:44
  • Yes, there is a lot of shared code – user1038502 Jun 09 '18 at 08:51
  • Then keeping the code in one repo is probably your best option. Otherwise, you need to restructure the shared code into libraries of their own, with stable APIs, versioning and release management of their own. Rule of thumb: individual repos are best suited for strongly decoupled products. – Doc Brown Jun 09 '18 at 08:55
  • The shared code is already structured into libs. The libs are just used a bit too frequently by others. The product is on paper a series of decoupled servers, but in reality they have very strong dependencies on each other. I agree that a monorepo is ideal, but I don't see how to do the deployment efficiently with that. – user1038502 Jun 09 '18 at 09:04
  • You wrote "The shared code is already structured into libs" - and left out "with stable APIs, versioning and release management on its own" - as I guessed. But to focus on your problem: the key to more modular deployment is not more modular repos. The key is to have more modular deliveries. Your "microservice" approach seems to go in that direction; you need to focus on having stable APIs between them, then you can deploy them individually more easily. – Doc Brown Jun 09 '18 at 09:15
  • .. and since you are talking of "customer repos" - are you utilizing those repos for deployment? – Doc Brown Jun 09 '18 at 09:18
  • I edited the original post, as the reply was too long. In regards to your last comment, ".. and since you are talking of 'customer repos' - are you utilizing those repos for deployment?", I am not sure what you mean. – user1038502 Jun 09 '18 at 09:42
  • Possible duplicate of [Choosing between Single or multiple projects in a git repository?](https://softwareengineering.stackexchange.com/questions/161293/choosing-between-single-or-multiple-projects-in-a-git-repository) – gnat Jun 09 '18 at 09:44
  • "Merge day is hell, the production branch can be up to a year older than the dev branch" - please explain how this is possible? Why are you not merging the whole branch into production? – Ewan Jun 09 '18 at 09:52
  • The workflow is like this: 1. Get a bug report/feature request, try to reproduce the bug in the dev branch. 2. Fix the bug in the dev branch. 3. Merge the fix into the next release branch. 4. If the bug is critical, merge it into a production branch and deploy the affected servers to clients. Once in a while, the next release branch is promoted to the production branch. From what I understand this usually happens only once a year. As you can imagine, the difference between dev and production can be large. Why it is done like this, I have no idea. Sadomasochism? I am trying to change this – user1038502 Jun 09 '18 at 10:01
  • @gnat: you missed that OP has an X-Y problem. They wrote literally "choosing one or multiple repos", but I am pretty sure that's not the problem here. – Doc Brown Jun 09 '18 at 10:01
  • @gnat, while interesting, I don't feel like it really helps me get closer to a solution. Doc Brown is more onto something, I think – user1038502 Jun 09 '18 at 10:10
  • @user1038502: I am pretty sure you are approaching this from the wrong side. It seems your customers have too much control over what they can install in production and how they mix the different services together. What about telling them you won't fix bugs in older versions of the services, and that they always need to install the newest one if they want bugs to be fixed? – Doc Brown Jun 09 '18 at 10:13
  • I agree about customers having too much control, but they are huge international firms which pay a lot of money. I can't do anything about this, at least not in the short term. – user1038502 Jun 09 '18 at 10:16
  • At some point, version 2.0 may contain bug fixes for 1.9 along with potentially unwanted new features. A customer has to decide, stay on 1.9 or go to 2.0. You don't want multiple versions of 2.0. – JeffO Jun 13 '18 at 16:08

2 Answers

3

It seems to me that one vs many repos is not the root of your problem.

It's your branching strategy.

  • If you can, switch to a recognised branching strategy such as gitflow.

  • If you are writing a bug fix for a problem in production, make a (hotfix) branch from production, not your develop or next-release branch.

  • If your 'next release' branch hasn't been merged for a year, don't even attempt to merge it. Just let it become the new 'production' branch.

  • Stop cherry-picking merges. Merge a whole branch or not at all. If your various branches have diverged as much as you imply, then you probably need to write that bug fix separately for each branch.

As for customers deploying different versions of the microservices: this should not be a problem, and it certainly shouldn't be tied to your source control.

Save your versioned deployments separately from your source control. Even if you don't compile to binaries, you need to distinguish between the source code and the product.

Don't deploy by checking out code from source control. Make a zip file with the software + config and a deployment script of some kind. Imagine you have to post it on a CD.
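
For example (a rough sketch only; the directory names and version string are invented), the packaging step can be as simple as:

    # package_release.py - rough sketch of producing one shippable, versioned archive.
    # Directory layout and version string are invented examples.
    import zipfile
    from pathlib import Path

    def package(version: str, build_dir: str, config_dir: str, out_dir: str = ".") -> Path:
        """Bundle compiled binaries + config into a single versioned zip."""
        archive = Path(out_dir) / f"release-{version}.zip"
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for root in (Path(build_dir), Path(config_dir)):
                if not root.is_dir():
                    continue  # keep the sketch tolerant of missing folders
                for f in root.rglob("*"):
                    if f.is_file():
                        zf.write(f, f.relative_to(root.parent))
            # record the version inside the archive so the deployed system can
            # always be traced back to the tagged source that produced it
            zf.writestr("VERSION", version + "\n")
        return archive

    if __name__ == "__main__":
        print(package("2018.1.2.3", "build/bin", "config"))

The customer then runs whatever is in that archive, not whatever happens to be in a branch.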

I would expect live environments to have multiple versions deployed to enable zero downtime deployments and the like.

As long as each version is tested and you state which other versions it is compatible with, this should not be an issue.
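
For example (again only a sketch; the service names and version numbers are invented), that compatibility statement could itself be data that the deployment script checks before installing anything:

    # compat_check.py - rough sketch: declare and check which service versions
    # have been tested together. Service names and versions are invented.
    COMPATIBLE = {
        ("CustDataServer", "2018.3"): {"ReportServer": {"2018.1", "2018.2"}},
        ("ReportServer", "2018.2"): {"CustDataServer": {"2018.2", "2018.3"}},
    }

    def is_compatible(deployed: dict) -> bool:
        """Check every recorded compatibility statement against a deployment plan."""
        for (service, version), requirements in COMPATIBLE.items():
            if deployed.get(service) != version:
                continue
            for other, allowed in requirements.items():
                if other in deployed and deployed[other] not in allowed:
                    return False
        return True

    if __name__ == "__main__":
        plan = {"CustDataServer": "2018.3", "ReportServer": "2018.1"}
        print("OK to deploy" if is_compatible(plan) else "untested combination")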

Ewan
  • Exactly what I thought when rereading the question - the branching strategy seems to be the main problem. – Doc Brown Jun 10 '18 at 13:27
  • yes, hard to tell from the high level view as you say – Ewan Jun 10 '18 at 13:29
  • How would it help me handle this use case? Release 2018 is rolled out to a customer. Servers are A, B, C and D. Bug A is fixed, server A is updated and deployed at the client. Bug B is fixed, server B is updated and deployed at the client. For me to reproduce this environment on my computer, I need to first build the 2018 release, then get the bugfix version for bug A and build server A, then do the same for B. While debugging I just have to hope the bug is not related to any shared code, as the files might have changed for servers C and D as well, so the debug symbols might no longer be valid. – user1038502 Jun 10 '18 at 20:37
  • @user1038502 no you don't. you just get the v2018.1.2.3 zip file that the customer deployed. – Ewan Jun 10 '18 at 20:42
  • How would that work? Say all servers use the CustomMath.h file. At this point the system the client is running is like this: server A is built with CustomMath.h from version 2018.3, server B with version 2018.2 and the rest with version 2018.1. – user1038502 Jun 10 '18 at 20:56
  • no they are running with the compiled binaries from the deployment zip – Ewan Jun 10 '18 at 21:12
  • if you need to get the source code for that particular build then tag the branch/commit when you build it – Ewan Jun 10 '18 at 21:15
  • That's the problem: how do I get that branch? The system at this point would be composed of 3 different branches. – user1038502 Jun 10 '18 at 21:16
  • when you build a particular service you had to pull something. it might not be the same code you used to build some other service but you will be able to get the exact code for those binary files – Ewan Jun 10 '18 at 21:18
  • How is that not an issue? If the bug is in CustomMath.H and the bug is triggered in ServerC, I would need to get a new version of the whole repo. There is no way I can track a bug through the whole system with just one branch. – user1038502 Jun 10 '18 at 21:21
  • are you testing your microservices individually? you need to version each one and test it against other versions of the other services – Ewan Jun 10 '18 at 21:24
  • having separate repos for each service and using a package manager for the shared DLLs won't make the 'different versions of shared code' problem go away. You still need to treat each separate program as a separate testable entity – Ewan Jun 10 '18 at 21:26
  • They are tested as a group. I don't understand the last part of your comment. – user1038502 Jun 10 '18 at 21:27
  • you should really test each service against an environment of the previous version at least – Ewan Jun 10 '18 at 21:59
  • I still don't follow. I think part of the misconception is that when I say shared, I mean shared code. The shared code is not in a DLL; it is statically linked – user1038502 Jun 10 '18 at 22:06
  • it makes no difference. you know the version of the deployment; you tag the branch with that version when you build the service, e.g. serviceAv2018.1.2, the same version as the deployment, so you can track it back – Ewan Jun 10 '18 at 22:08
  • I agree that I can track it back, but it does not make for a neat development experience. In this simple example, in order to debug a request going through the whole system I would need to get branch 2018 to debug server C and D, branch 2018.2 to debug server A and branch 2018.3 to debug server B. I would argue that this is not great. – user1038502 Jun 10 '18 at 22:15
  • it's not great, but that's a factor of your customers wanting different versions. No method of building those versions can alter that – Ewan Jun 10 '18 at 22:18
1

As @DocBrown pointed out in the comments:

The original question

Monorepo vs multi repo for large project with multiple partial deliveries

may not be the real question here.

Further down, you write

Would changing to a multi repo setup better cater to our clearly insane delivery pipeline?

which leads to the question:

How to deal with our insane delivery pipeline.

First, I try to gather some facts to better understand your current situation:

  • You have one source repository which somehow represents the current state of your product

  • Your code is logically split up in several units, which are (tightly) coupled

  • You have different branches

    • dev
    • next
    • production
  • Several "layouts" (repos) for different customers
  • Sprints of 1 month's length

You have two major pain points:

1) Merge day

2) More control over what goes into production


Of course it is hard to give a one-size-fits-all / crystal-ball solution here, because even though you give a relatively good description of your status quo, it is perhaps more complicated than the 30,000 ft view you gave here.

What I could offer here are some thoughts / ideas:

Overall goal: Having one current codebase for all customers

I) Work on the release cycle

I do not know for what reasons there is such a big drift between the production version and the development version.

Is there any chance to release your product more often? The more often you release, the smaller the drift is and the easier the merge should be.

II) Work on the configurability of your application

From what you write it is not obvious what the differences between versions are. Is it somehow possible to make the application more configurable, so that the customer would always get the current version of the product - but perhaps with functionality scaled down to his needs?

Of course, I understand that the more configuration options you have, the more complexity you have to deal with. This doesn't necessarily make your life easier - but I see the upside of having a consistent, unified, easier-to-deal-with codebase as a win in your situation.
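
To illustrate with a minimal sketch (the flag names and file name are invented): such configuration could be as small as a per-customer overrides file applied on top of sane defaults.

    # features.py - minimal sketch of per-customer feature toggles (flag names invented).
    import json

    DEFAULT_FLAGS = {
        "new_reporting_engine": False,   # system-wide new feature, off by default
        "legacy_import_format": True,    # old behaviour kept for conservative customers
    }

    def load_flags(path: str) -> dict:
        """Merge a customer's overrides (a small JSON file) over the defaults."""
        flags = dict(DEFAULT_FLAGS)
        try:
            with open(path) as fh:
                flags.update(json.load(fh))
        except FileNotFoundError:
            pass  # no overrides file: this customer runs pure defaults
        return flags

    if __name__ == "__main__":
        flags = load_flags("customer_a_flags.json")
        if flags["new_reporting_engine"]:
            print("running the new reporting engine")
        else:
            print("running the old reporting engine")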

III) When you have one codebase which runs on all of your customers' servers, you know what is in production and what is not, so the second pain point should vanish.

But:

I honestly do not know how feasible these ideas are, how much effort you would have to put in, or whether the whole thing is worth the effort.

Since you have well-paying customers and your product has (basically) been running for 25 years - and perhaps will for another bunch of years - you should take small steps to win the race in the long run.

As a starting point, you could test how much effort it is to have every customer running one current version of one service with configurable options, and what the cost of the added configuration complexity is.


Regarding your original question:

Stay with a monorepo. If things are as tightly coupled as they seem to be, it will be better.


Edit:

Perhaps you have to interpret the word "release" a bit more creatively. A release is a fixed point in your repo where the release branch contains a working set of every component in as many configurations as you have.

That means that some customers get a release (bi-)yearly and others more frequently, but the customers within a release window get the same running code. Your goal is then to have a current release ready and waiting for roll-out at all times.

I understand that the customers are interested in stability and favour small changes because they assume the number of defects introduced is also small.

But on the other hand: the customer expects a working product, and as long as you can guarantee that - e.g. with a tight test suite - I see no problem in making big updates. The N°1 concern is to have as few (new) defects as possible and to fix all known bugs (up to the point of the release).

Aside: I worked for the last 2.5 years under conditions which were not as hard as yours, but similar: 6 customers, 6 different flavours of the underlying application. The underlying application itself consisted of over 7 separate applications, each in its own repo - but in hindsight, and if I had been in the project from the beginning, I would have voted for a monorepo for all of it because of the tight coupling. We had at least monthly releases, but things piled up and releases for everything became too hard. I suggested the strategy of having "releases" independent of having them rolled out or activated, so that all projects effectively always ran on "current".

I found that this strategy brought a bit of relief for the developers.

Thomas Junk
  • I will try to work on the release cycle, but this is probably way above my pay grade. More realistically, I think the internal process needs to be changed to better handle it. It is a long-term goal for sure. – user1038502 Jun 10 '18 at 20:47
  • Configurability would help some, but a lot of new features are really changes to old features. If it were a car, it would be like changing the engine to run on diesel instead of gasoline. I am not sure adding configuration would help the health of the code base. I might be wrong here and would love some feedback – user1038502 Jun 10 '18 at 20:49
  • This is something I cannot decide from this context. It is up to you to evaluate your product and how practical it is to have features which you can turn on and off. It would be like: if the config says "engine=diesel" it runs on diesel; if it says "engine=gasoline" it runs on that. As I mentioned: I do not know if it is feasible. – Thomas Junk Jun 11 '18 at 05:29