Data validation between microservices

Question

Consider a scenario (in the .NET Core world) where microservice A collects data from external sources and sends it asynchronously (via RabbitMQ) to microservice B (the reporting system), where the data is finally stored and analyzed.

As A has to reject invalid data from the sources, it must validate the data itself before sending it to the AMQP queue. Since microservice B just receives this data from a queue, which is essentially a boundary, it must of course validate the data again, as I understand it.

I could implement the data plus its validation logic in a shared module/library which both services (A and B) reference, but this seems like a huge smell and coupling to me.

Is there no way around implementing the validation rules twice, or am I looking at this too dogmatically, or am I missing something here completely?

Is the validation process identical at each stage, and if so, what's the functional requirement for passing identical validation checks twice? The need to run the same process twice seems a bit unclear -- on the surface it seems as if one or the other of those pass-throughs is redundant unless there's some use case where outcomes might differ between each pass. — Ben Cottrell, Oct 30 '21 at 08:07
Also, I wonder where your thinking is behind coupling and libraries? Given that you're using .NET Core, I wonder whether this is based upon a misunderstanding of libraries in .NET? A decision to load an assembly is always deferred until runtime. Referencing a library only affects the `.csproj` file so is only relevant at build-time (csproj is just an MSBuild file, so plays no role after a build is complete). Dependencies in .NET are entirely down to code structure so the library/assembly of any class/object your code depends on is irrelevant to coupling. — Ben Cottrell, Oct 30 '21 at 08:38
You should differentiate between data validation (string cannot be empty, e-mail address is valid) and business rule validation (orders over €100 should be approved by a manager). Business rule validation should only be performed by the microservice that belongs to that specific domain. — Rik D, Oct 30 '21 at 08:42
@BenCottrell, the services are separate components which are connected via an event bus. Although that connection is kind of internal, it is a boundary, in my view, and thus incoming data cannot be "trusted blindly" but has to be validated. — Andreas H., Oct 31 '21 at 10:04
@BenCottrell, with coupling I referred to using the same codebase for separate components as such and did not specifically address any particular technical implementation of referencing a library. — Andreas H., Oct 31 '21 at 10:09
@AndreasH. Code reuse isn't coupling, so I am rather puzzled as to why you consider using libraries to be a code smell. Coupling is about being unable to isolate code - which can be a problem when it interferes with automated testing or where you suffer problems as a result of inflexibility and brittle code -- Libraries don't prevent you from isolating code so they don't have any of these impediments. — Ben Cottrell, Oct 31 '21 at 19:43

score 4 · Accepted Answer · answered Oct 30 '21 at 20:59

TL;DR: Consider Copy and Paste

You are dealing with two different interfaces in your scenario:

An interface defined by an external source, which service A depends on
An internal interface for the communication between A and B through RabbitMQ

Both interfaces might include message specifications and the message specifications might currently be exactly the same, but they are still separate interfaces and this is very important.

What will happen when the external message format changes?
If the internal message format automatically changes as well, as soon as the external message format changes, then it doesn't matter how you implement validation, your services will be tightly coupled anyways.

You achieve loose coupling by allowing both interfaces to change independently. Let the external API change, but keep the internal API as it is. Service A now needs to translate from one message format to another, but service B doesn't need to worry about such details; it doesn't care how service A got to its message. The moment this happens, you will realise that using the same shared library for the validation of both messages won't work, since they are not the same interface at all.
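
A minimal sketch of that separation, written in Python only for brevity (the same shape would apply in C#). All names here are hypothetical and not taken from the question: the provider fields `sensorId` and `tempF`, the internal `Reading` message, and both validators are assumptions for illustration. Service A validates and translates the external format; service B validates only the internal format and knows nothing about the provider.

```python
from dataclasses import dataclass


class ValidationError(ValueError):
    """Raised when a message violates the contract of the interface it arrived on."""


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


@dataclass(frozen=True)
class Reading:
    """Internal message published by A and consumed by B (hypothetical)."""
    sensor_id: str
    temp_celsius: float


# --- Service A: owns the external interface (hypothetical provider format) ---

def validate_external(raw: dict) -> None:
    # Data validation only: required fields and basic shape.
    sensor_id = raw.get("sensorId")
    if not isinstance(sensor_id, str) or not sensor_id:
        raise ValidationError("sensorId must be a non-empty string")
    if not _is_number(raw.get("tempF")):
        raise ValidationError("tempF must be a number")


def translate(raw: dict) -> Reading:
    # Validate against the external contract, then map to the internal one.
    validate_external(raw)
    return Reading(
        sensor_id=raw["sensorId"],
        temp_celsius=round((raw["tempF"] - 32) * 5 / 9, 2),
    )


# --- Service B: owns validation of the internal interface ---

def validate_internal(payload: dict) -> Reading:
    # B checks only what it receives from the queue; it never sees "tempF".
    sensor_id = payload.get("sensor_id")
    temp = payload.get("temp_celsius")
    if not isinstance(sensor_id, str) or not sensor_id:
        raise ValidationError("sensor_id must be a non-empty string")
    if not _is_number(temp):
        raise ValidationError("temp_celsius must be a number")
    return Reading(sensor_id, float(temp))
```

If the provider later renames `tempF` or switches units, only `validate_external` and `translate` change; `validate_internal` and the queue message stay as they are. The two validators may look alike at first, but each is tied to a different contract.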

At this point you might challenge whether the two interfaces should look the same to begin with. Are all data points in the external message relevant to service B? Is the data formatted in a way that is most convenient for B? What would happen if you switched external data providers: would you still want to keep the same interface? Consider designing the interface between A and B independently of your external provider, based on your system's needs instead.

Assuming that you indeed start with two interfaces that are the same (but might diverge in the future), you still don't want to do the same implementation twice. This is a scenario where copy and paste, despite its bad reputation, is a valid solution.

Thanks Helena, you pointed me directly to what I was looking for: The questions I need to ask myself to evaluate the scenario. I was looking at it just too technical, not functional (domain perspective). — Andreas H., Oct 31 '21 at 10:23
