Multi Tenancy Aware Gateway routing

Question

Hello fellow programmers, we are converting our monolithic application into a microservice-based one. One challenge we are facing is that one of our components is stateful. We cannot make this component stateless for now (due to the high migration cost), so we have decided to scale it horizontally on a per-tenant basis (multi-tenant architecture; we already have a notion of tenant in our application). For example, if we have 5 tenants and 2 server instances, we can distribute the load by ensuring that tenants 1, 2 and 3 run on instance 1 while tenants 4 and 5 run on instance 2 (you can assume the load is evenly distributed among tenants for now). We are now facing some problems with the gateway that will route requests to the appropriate instances.

We need to make the gateway tenant-aware. The tenant-to-instance mapping might change at runtime (because a specific instance fails, or for load redistribution), which means we need to change the gateway's mappings at runtime. This could be done by regenerating the gateway configuration file and reloading the gateway, but I haven't seen any library that makes this generation easy for any gateway. That makes me think this might not be how a gateway is meant to be used, and that we might run into problems in the future.
This would be easier (no generation step needed) if gateways offered a way to resolve the route target dynamically instead of through a static mapping. For example, instead of a static mapping, the gateway would map a URL to an application instance that it resolves from the tenant via a function call. But no gateway (at least none of the ones I have explored) provides such a feature. I have a workaround in which a custom predicate in Spring Cloud Gateway dynamically resolves the mapping from a local cache, but I am not sure whether this is how the custom predicate feature is supposed to be used, as I haven't seen any examples or papers where a custom predicate is used this way.
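To make the idea of dynamic resolution concrete, here is a minimal sketch in plain Python. It is not Spring Cloud Gateway code and not our actual workaround; the class name `TenantRouter`, the tenant IDs, and the instance URLs are all made up for illustration. It only shows the shape of a tenant-to-instance lookup that can be changed while the gateway is running.

```python
import threading


class TenantRouter:
    """Thread-safe tenant -> instance lookup that can be changed at runtime.

    Hypothetical helper for illustration; not part of any gateway product.
    """

    def __init__(self, mapping):
        self._lock = threading.Lock()
        self._mapping = dict(mapping)

    def resolve(self, tenant_id):
        """Return the instance URL for a tenant, or None if it is unmapped."""
        with self._lock:
            return self._mapping.get(tenant_id)

    def reassign(self, tenant_id, instance_url):
        """Move a single tenant, e.g. for load redistribution."""
        with self._lock:
            self._mapping[tenant_id] = instance_url

    def fail_over(self, dead_instance, replacement):
        """Move every tenant on dead_instance to replacement.

        Returns the list of tenants that were moved.
        """
        with self._lock:
            moved = [t for t, inst in self._mapping.items()
                     if inst == dead_instance]
            for tenant in moved:
                self._mapping[tenant] = replacement
            return moved


# Five tenants on two instances, as in the example above (made-up URLs).
router = TenantRouter({
    "tenant1": "http://instance1:8080",
    "tenant2": "http://instance1:8080",
    "tenant3": "http://instance1:8080",
    "tenant4": "http://instance2:8080",
    "tenant5": "http://instance2:8080",
})

print(router.resolve("tenant4"))
```

The gateway would call `resolve` per request, and whatever detects failures or rebalances load would call `fail_over` or `reassign`, so no configuration file has to be regenerated.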
I searched the internet to see how people solve multi-tenant routing at the application level, but I can't find any blogs or papers on the topic. All I see is applications that are stateless, with multi-tenancy handled at the database level. Unfortunately, making the current application stateless would require a lot of changes, and we do not have the bandwidth for that.

I would like to know if someone has solved this problem before or can point me to where I can find a solution. It is possible that we are doing something fundamentally wrong, which would explain why I am struggling to find solutions on the internet; if so, I would like to know what it is. Thank you in advance, I would really appreciate it if someone could share their opinion on this.

score 2 · Accepted Answer · answered Apr 06 '21 at 12:22

The standard solution to this problem is to use DNS.

If you imagine your sharded service as being provisioned by multiple nodes, each of which could fail and be replaced independently, then it's normal to end up with the following approach:

A list of named nodes, used for configuration purposes. Each node is provisioned by a server; if that node fails, it is replaced by a newly created node, which may have a different IP address. Each named node has its own DNS A record to provide its IP address.
Each tenant has its own DNS CNAME entry, which refers to the node by name rather than by its underlying IP address.

This approach allows both for the failure and replacement of the actual nodes providing the service, and also for the rebalancing of tenants across the service nodes, without needing to change platform wide configuration settings.
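A toy in-memory model of this record layout may help show why only one record changes in each case. This is a sketch for illustration only: it does not talk to a real DNS server, and the host names, IP addresses, and function names are all invented.

```python
# One A record per named node (made-up names and addresses).
a_records = {
    "node1.svc.example.com": "10.0.0.11",
    "node2.svc.example.com": "10.0.0.12",
}

# One CNAME per tenant, pointing at a node name, never at an IP.
cnames = {
    "tenant1.svc.example.com": "node1.svc.example.com",
    "tenant2.svc.example.com": "node1.svc.example.com",
    "tenant3.svc.example.com": "node1.svc.example.com",
    "tenant4.svc.example.com": "node2.svc.example.com",
    "tenant5.svc.example.com": "node2.svc.example.com",
}


def resolve(name, max_hops=8):
    """Follow CNAMEs until an A record is reached, as a resolver would."""
    for _ in range(max_hops):
        if name in a_records:
            return a_records[name]
        if name not in cnames:
            raise KeyError(f"no record for {name}")
        name = cnames[name]
    raise RuntimeError("CNAME chain too long")


def replace_node(node_name, new_ip):
    """A failed node was re-created: only its A record changes."""
    a_records[node_name] = new_ip


def move_tenant(tenant_name, node_name):
    """Rebalancing: only that tenant's CNAME changes."""
    if node_name not in a_records:
        raise KeyError(f"unknown node {node_name}")
    cnames[tenant_name] = node_name


print(resolve("tenant4.svc.example.com"))
```

Calling `replace_node` redirects every tenant on that node at once, while `move_tenant` affects exactly one tenant; in a real deployment the record TTLs determine how quickly clients see either change.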

There are more involved approaches, which involve more complex service discovery, but in my experience, if you can use DNS to solve the problem, it is generally less hassle to do so, reserving the discovery service pattern for more complex situations.

Thanks Michael, this makes sense. I do have a couple of questions, if you could help me out here. 1) When using DNS, I am assuming we have to write a script to change the mapped IP address of the service when a failure is detected. 2) If you have used a DNS service before, can you tell me which software you are using? I can see Consul, which provides a DNS-based API for service registry. Thank you in advance :). — Tuhin Dey, Apr 13 '21 at 08:00
Ahh yes. This is really easy within one of the cloud providers' environments (e.g. AWS, with Route 53). Less so when you are managing the DNS server yourself. There is a standard protocol for interacting with DNS servers defined in RFC 3007. — Michael Shaw, Apr 16 '21 at 04:51
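
For the Route 53 case mentioned in the comment, the script from the first question could be sketched roughly as below. This is an assumption-laden illustration, not something taken from the thread: the function name, host names, and TTL are invented, and the block only builds the change batch without contacting AWS.

```python
def cname_upsert_batch(tenant_name, node_name, ttl=60):
    """Build a Route 53 change batch that points a tenant CNAME at a node.

    Hypothetical helper; names and the default TTL are illustrative only.
    """
    return {
        "Comment": f"route {tenant_name} to {node_name}",
        "Changes": [
            {
                # UPSERT creates the record or overwrites the existing one.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": tenant_name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": node_name}],
                },
            }
        ],
    }


batch = cname_upsert_batch("tenant4.svc.example.com", "node1.svc.example.com")

# With boto3 installed and credentials configured, the batch would be
# submitted roughly like this (zone ID is a placeholder, not run here):
#   boto3.client("route53").change_resource_record_sets(
#       HostedZoneId="<your-hosted-zone-id>", ChangeBatch=batch)
print(batch["Changes"][0]["ResourceRecordSet"]["Name"])
```

A short TTL keeps failover quick at the cost of more DNS queries; the value 60 here is only a placeholder.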

score 1 · Answer 2 · answered Apr 06 '21 at 01:02

I am currently working on a layout for a multi-tenant, multi-cloud-provider capable microservices architecture that might work for your needs as well. We're using internally maintained DNS servers to determine routing among our datacenters, nodes, and microservices. At the top level we have our own domain name (let's say example.com). Below that we have the tenant's name (joe.example.com). Below that we have the datacenters that joe is hosted in (china.joe.example.com, eastus.joe.example.com, etc.). Below that we have the hostname used to route to a specific node (PC254.eastus.joe.example.com). Below that we have the microservice itself (www.PC254.eastus.joe.example.com). We then have a routing scheme at the top level (www.joe.example.com) that forwards to the best instance of that microservice based on proximity information garnered from GeoIP data. PowerDNS's authoritative server has most of the functionality you would need for this setup. The hardest part for us was having the instances of the microservices dynamically update their own domain names. Hopefully that gives you enough information to spark some more ideas.
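
The naming hierarchy above can be sketched as two small helpers. This is only an illustration of the label order (service, node, datacenter, tenant, domain); the function names are invented, and the top-level alias such as www.joe.example.com is deliberately not handled by the parser, since it skips the node and datacenter levels.

```python
def service_fqdn(service, node, datacenter, tenant, domain="example.com"):
    """Compose a name such as www.PC254.eastus.joe.example.com."""
    return ".".join([service, node, datacenter, tenant, domain])


def parse_fqdn(fqdn, domain="example.com"):
    """Split a hierarchical name into its levels.

    Labels are read right to left after the base domain:
    tenant, datacenter, node, service. Shorter names yield fewer keys.
    """
    suffix = "." + domain
    if not fqdn.endswith(suffix):
        raise ValueError(f"{fqdn} is not under {domain}")
    labels = fqdn[: -len(suffix)].split(".")
    if not 1 <= len(labels) <= 4 or not all(labels):
        raise ValueError(f"unexpected number of labels in {fqdn}")
    keys = ["tenant", "datacenter", "node", "service"]
    return dict(zip(keys, reversed(labels)))


name = service_fqdn("www", "PC254", "eastus", "joe")
print(name)
print(parse_fqdn(name))
```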

Thanks Eric :) I was wondering whether updating the domain names was a manual activity, or whether you had to build a custom application to update the required gateway configurations? Also, I am assuming your application was stateless, so you didn't need any special handling on the gateway side when an application instance went down? — Tuhin Dey, Apr 06 '21 at 12:12
Ha! I just now noticed your response, Tuhin Dey. All state information is stored centrally in a distributed storage service. Different microservices have chosen to use different storage services, but just as one example: www.joe.example.com might have a shared datastore at session-www.joe.example.com. Because all instances of www under the joe.example.com tenant store their session data in that central storage, any of them can pick up the ball at any time without the server or client having to be any the wiser. It does disallow using local memory stores for state data, though. — Eric Evans, Jul 12 '21 at 00:05
