
I'm migrating my current application to a multi-tenant setup.

Now I have multiple RabbitMQ workers that process async jobs, publish and consume integration events, and handle other tasks. I'm planning to use vhosts in RMQ (one vhost per tenant per service; see the sketch below).
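
To make the vhost idea concrete, here's a minimal sketch with Kombu; the vhost naming convention, broker credentials, and the `tenant_connection` helper are placeholders I'm assuming, not anything we've settled on:

```python
from kombu import Connection

def tenant_connection(tenant_id: str, service: str) -> Connection:
    # Assumed convention: one vhost per tenant per service,
    # e.g. vhost "billing.acme" for tenant "acme" and service "billing".
    vhost = f"{service}.{tenant_id}"
    return Connection(f"amqp://guest:guest@rabbitmq:5672/{vhost}")

# The rest of the worker code stays tenant-agnostic; only the
# connection it is handed changes:
# conn = tenant_connection("acme", "billing")
```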

Right now, I'm still not convinced about which approach to take to implement this multi-tenancy setup. Here are the options I have:

  • One Docker container per tenant (each running one consumer process) -> This sounds like overkill to me. Also, I'd have to figure out how to read active_tenants from a central tenant-config service to spawn containers dynamically in docker-compose.

  • Use the multiprocessing module in Python to spawn one worker process per tenant. I've tried this out and have one parent process fork multiple tenant processes dynamically (see the sketch after this list). I'm confused about the monitoring side, though. Also, to handle signals for each tenant process (so that we can track and restart an individual tenant process on failure), is there some standard approach? Like some fixed set of signals to worry about? Should I even do it manually, or let something like supervisor take care of that?

  • Use supervisord inside Docker to manage multiple tenant processes. But this requires a pre-built supervisor config, so it probably can't read active tenant_ids from a central tenant service?
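
For context, here's roughly what my multiprocessing attempt (the second option) looks like; `fetch_active_tenants()` and `run_worker()` are hypothetical stand-ins for the tenant-config lookup and the actual consumer loop:

```python
import multiprocessing
import signal
import time

def fetch_active_tenants():
    # Hypothetical: would query the central tenant-config service.
    return ["tenant_a", "tenant_b"]

def run_worker(tenant_id):
    # Hypothetical: connect to this tenant's vhost and consume forever.
    ...

def supervise(poll_interval=5.0):
    procs = {}
    shutting_down = False

    def handle_signal(signum, frame):
        nonlocal shutting_down
        shutting_down = True

    # SIGTERM (docker stop) and SIGINT (Ctrl-C) seem to be the two
    # signals the parent needs to care about; dead children are
    # detected by polling is_alive() rather than trapping SIGCHLD.
    signal.signal(signal.SIGTERM, handle_signal)
    signal.signal(signal.SIGINT, handle_signal)

    while not shutting_down:
        for tenant_id in fetch_active_tenants():
            proc = procs.get(tenant_id)
            if proc is None or not proc.is_alive():
                if proc is not None:
                    proc.join()  # reap the dead child before replacing it
                proc = multiprocessing.Process(
                    target=run_worker, args=(tenant_id,), name=tenant_id
                )
                proc.start()
                procs[tenant_id] = proc
        time.sleep(poll_interval)

    for proc in procs.values():  # graceful shutdown on SIGTERM/SIGINT
        proc.terminate()
        proc.join()

if __name__ == "__main__":
    supervise()
```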

Now, out of these options, which would really be the most robust approach? We don't have much of a scaling problem: ours is a subscription-based tenant model, and the number of tenants probably won't exceed 15 or 20 in the near future.

I'm also worried about the number of processes now being spawned per service, which also increases our server load requirements.

Another approach, which I don't think I should take, is multi-threading. Is it a good idea to have a long-running thread per tenant within a single process? Of course, given that it's Python (with the GIL), it won't really scale well.

Also, is there already something available to have my RabbitMQ workers spawn multiple workers for a given set of connection strings? I haven't found anything in the last 3-5 days of searching.
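
The closest thing I can picture is a thin per-connection worker on top of Kombu's `ConsumerMixin` (a minimal sketch; the queue name and message handling are placeholders):

```python
from kombu import Connection, Queue
from kombu.mixins import ConsumerMixin

class TenantWorker(ConsumerMixin):
    """One consumer bound to one tenant's connection (i.e. vhost)."""

    def __init__(self, connection):
        self.connection = connection  # attribute ConsumerMixin expects
        self.queue = Queue("jobs")    # placeholder queue name

    def get_consumers(self, Consumer, channel):
        return [Consumer(queues=[self.queue], callbacks=[self.on_message])]

    def on_message(self, body, message):
        # Placeholder: process the job, then acknowledge it.
        message.ack()

# run() blocks while consuming, so each call would live in its own
# process, e.g. as the run_worker() body in the sketch above:
# TenantWorker(Connection(connection_string)).run()
```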

Languages and frameworks currently used: Python, Flask, Kombu (for RMQ), and Postgres, with Docker for containerization.

Let me also explain our current deployment approach, as it could be relevant to finding the optimal solution:

  • Every app defines its Dockerfile in its own repository.
  • When CI/CD kicks in, Jenkins builds the Docker images for the app and pushes them to Docker Hub.
  • We have environment-specific docker-compose files (these could possibly be merged).
  • We store the build_env (required to inject variables into docker-compose) separately and pass it to the production machines in order to run docker-compose.
  • With this, Jenkins connects to the prod machines, copies the deployment-related files over as a tar archive, and then runs the docker-compose command on the machine.
  • What is your reasoning for wanting a different process and RMQ workers for each tenant? If I were to visit the application from the context of two different tenants, how would my experiences be different (apart from the differences in content shown to me)? – Bart van Ingen Schenau May 26 '20 at 11:42
  • @BartvanIngenSchenau 1. I don't want to use the same queue to publish messages for multiple tenants, as that can let one tenant block messages for the other tenants. 2. The other approach would be to have different queues for different tenants, bound to the same exchange (using routing keys). This would require code changes everywhere to listen to the different queues. – Rohit Jain May 26 '20 at 12:14
  • With a `vhost` used for separating tenants, I just need to connect my application to a different RMQ connection, and I'll have it working for the other tenants. We can scale the workers for heavier tenants independently too. – Rohit Jain May 26 '20 at 12:15
  • Can you explain why you feel having a docker instance per tenant is overkill? – JimmyJames May 29 '20 at 15:30
  • @JimmyJames I thought having lots of Docker containers might add overhead on EC2 instances. So I read about that some more, and it seems it would really only have a very small overhead compared to a bare process. – Rohit Jain Jun 01 '20 at 08:02
  • I'm probably leaning towards having one docker-compose file and spawning multiple containers that take a tenant_id as a parameter (sadly, docker-compose doesn't take parameters, so I'd have to use env variables). – Rohit Jain Jun 01 '20 at 08:03
  • @RohitJain Docker containers will have minimal overhead here, especially with a lean image (e.g. alpine). Is this currently running in a K8s environment, or are you deploying to your Docker host directly? – Juxhin Jun 03 '20 at 09:50
  • @Juxhin I'm deploying Docker containers on EC2s directly right now. I got it working, but in a weird way: `docker-compose` doesn't take arguments, so I have to set an env variable for the current tenant_id. I also have to set `COMPOSE_PROJECT_NAME` to run the same docker-compose service for multiple tenants (even using a different container name didn't work without the project name). – Rohit Jain Jun 03 '20 at 11:49
  • Tightly coupling with Docker Compose in a production environment is definitely going to become problematic sooner rather than later. I'll see if I can offer some more concrete advice later today. – Juxhin Jun 03 '20 at 12:52
  • @Juxhin Sure, thanks. Just FYI, I've added some more details about our deployment steps. – Rohit Jain Jun 03 '20 at 13:20

0 Answers