In my company, we run a distributed system on Kubernetes. Some microservices are shared among all customers, while others are upgraded by each customer on their own schedule. The system has to interact with a lot of customer-side services (VPNs, private APIs, databases, log aggregators, etc.), and each customer's environment can differ wildly.
We follow a very common software development lifecycle:
- Our engineering teams work on features or bug fixes; once done,
- our QA teams test the changes to ensure they work; and then
- we have a release process where new container images are built and more thorough tests are run.
The reasoning behind this development -> QA -> release pipeline is sound. It lets the engineering team focus on their tasks without being overwhelmed by an influx of customer requests. It also creates space for product evolution and delivers the quality customers expect from a thorough QA/release process.
However, when a customer needs a new feature or a bug fix, this lifecycle can backfire, because we cannot be sure our changes cover every aspect of that customer's environment. For example, we may reproduce an issue and write a fix, but once deployed, the customer's database may use a different encoding, or their VPN server may require a specific cipher.
Technically, testing changes earlier is not hard: we can publish separate images, use feature flags, and so on. Customers also always have development and UAT environments we could target.
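To illustrate the feature-flag route mentioned above: a change can be gated per customer, so only the customer who needs the fix sees the new behavior while everyone else keeps the released path. This is a minimal sketch with hypothetical names (`FLAGS`, `export_report`, `"new-csv-export"`, `"customer-42"`); a real setup would back it with whatever flag service or config store the platform already uses.

```python
# Minimal per-customer feature-flag sketch (all names are hypothetical).
# A real deployment would read flags from a flag service or config store,
# not an in-process dict.

FLAGS = {
    # flag name -> set of customer IDs enrolled in the early test
    "new-csv-export": {"customer-42"},
}

def is_enabled(flag: str, customer_id: str) -> bool:
    """Return True if this customer is enrolled in the flagged change."""
    return customer_id in FLAGS.get(flag, set())

def export_report(customer_id: str) -> str:
    """Route one customer to the in-progress code path, all others to the default."""
    if is_enabled("new-csv-export", customer_id):
        return "csv-v2"  # in-progress change, tested with one customer
    return "csv-v1"      # released behavior everyone else still gets
```

With this, `export_report("customer-42")` exercises the new path while every other customer is unaffected, which is what makes it safe to ship the flagged change through the normal pipeline before it is fully validated.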
The problem is the policy. I'd like to propose a different workflow for situations where we need to test against the real environment.
So, is there a known process that lets a development team test an in-progress change together with a customer, bypassing the default SDLC?