I'm having trouble coming up with a solution to control the number of requests per minute to an external system in a microservices environment on Kubernetes.
The scenario
This external system is an e-mail marketing application (called Responsys) that permits only a limited number of requests per minute for each login. Some types of e-mails use two requests and some use just one*.
Currently, each system that needs to send an e-mail publishes a message to a RabbitMQ queue, and one of our microservices is responsible for consuming those messages, reading their contents, and communicating with Responsys while obeying a 40-requests-per-minute limit.
The current solution
The current working version of this integration fetches 20 messages per minute from the queue using a simple scheduled process. Why 20? In the worst case, those 20 e-mails will consume two requests each. Each e-mail is processed asynchronously, so all 20 communicate with Responsys at roughly the same time. E-mails that could not be processed (Responsys can return errors) are saved to a database table to be analyzed later.
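To make the mechanism concrete, here is a minimal sketch of that scheduled batch consumer. This is not our real code: `email_queue` is an in-memory stand-in for the RabbitMQ consumer, and `send_to_responsys` is a hypothetical name for the actual API call.

```python
import queue
import threading

# Hypothetical in-memory stand-in for the RabbitMQ queue; in the real
# service this would be a consumer pulling messages from the broker.
email_queue = queue.Queue()

# Worst case: 20 e-mails * 2 requests each = 40 requests per minute.
BATCH_SIZE = 20

def drain_batch(q, batch_size=BATCH_SIZE):
    """Pull up to `batch_size` messages from the queue without blocking."""
    batch = []
    for _ in range(batch_size):
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break  # fewer than batch_size messages were waiting
    return batch

def process_batch(batch):
    # Each e-mail is handled asynchronously in the real service; failed
    # ones are persisted to a database table for later analysis.
    for message in batch:
        pass  # send_to_responsys(message)  -- hypothetical API call

def run_every_minute():
    process_batch(drain_batch(email_queue))
    threading.Timer(60, run_every_minute).start()
```

The batch size, not the consumer, is what enforces the limit here, which is exactly why adding a second consumer instance breaks it.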
This works pretty well today, even though it's not optimal (some types of e-mails use only one request). But there is a problem with this solution that can break our request limit.
The problem
Kubernetes can decide at some point, based on its autoscaling algorithms, that one more instance of the microservice (the one that integrates with Responsys) is necessary. If that happens, our request limit will be broken, because two (or more) instances will be reading messages from the queue and sending e-mails through Responsys, surpassing the 40 requests per minute.
I had the idea of configuring the microservice on Kubernetes to never create replicas, guaranteeing a single instance, since this microservice is quite simple and specialized. I don't know exactly how to do that yet, but it seems straightforward from the Kubernetes documentation. My colleagues don't like the idea, though, because there may be some odd failure scenario in which two instances could exist.
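For reference, pinning the Deployment to a single replica would look roughly like this (all names are placeholders). Note that my colleagues' worry is real with the default `RollingUpdate` strategy, where the old and new pod briefly run at the same time during a deploy; `strategy: Recreate` tells Kubernetes to kill the old pod before starting the new one:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: responsys-worker        # placeholder name
spec:
  replicas: 1                   # never scale this worker out
  strategy:
    type: Recreate              # terminate the old pod before the new
                                # one starts, so two instances never
                                # run at the same time during a deploy
  selector:
    matchLabels:
      app: responsys-worker
  template:
    metadata:
      labels:
        app: responsys-worker
    spec:
      containers:
        - name: worker
          image: example/responsys-worker:latest   # placeholder image
```

This also requires making sure no HorizontalPodAutoscaler targets this Deployment, since an HPA would override `replicas`.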
So we are trying to think of a solution that doesn't depend on the instance count, using some kind of "ticket system" read from a cache (Redis) shared by any number of microservice instances. This seems like a heavy solution for a simple problem, so I would like some help finding an alternative.
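For completeness, the "ticket system" we have in mind is essentially a fixed-window counter keyed by the current minute: each instance atomically increments a shared counter by the cost of the e-mail (1 or 2 requests) and only proceeds if the total stays within the budget. A minimal sketch, with `FakeRedis` as an in-memory stand-in so the example is self-contained; with a real Redis client the same logic maps to atomic `INCRBY` plus `EXPIRE` on the window key:

```python
import time

class FakeRedis:
    """In-memory stand-in for the two Redis operations the sketch needs."""
    def __init__(self):
        self.store = {}  # key -> (value, expiry timestamp or None)

    def incrby(self, key, amount, now=None):
        now = time.time() if now is None else now
        value, expires = self.store.get(key, (0, None))
        if expires is not None and now >= expires:
            value = 0  # stale window key: counter effectively expired
        self.store[key] = (value + amount, expires)
        return value + amount

    def expire(self, key, seconds, now=None):
        now = time.time() if now is None else now
        value, expires = self.store.get(key, (0, None))
        if expires is None:
            self.store[key] = (value, now + seconds)

LIMIT = 40  # shared Responsys budget per minute (most restrictive endpoint)

def try_acquire(r, cost, now=None):
    """Reserve `cost` requests from the shared per-minute budget.

    Returns True if the caller may proceed; False means the window's
    budget is exhausted and the message should wait for the next minute.
    """
    now = time.time() if now is None else now
    window = int(now // 60)                 # one counter per minute
    key = f"responsys:{window}"
    used = r.incrby(key, cost, now=now)     # atomic in real Redis
    r.expire(key, 120, now=now)             # housekeeping: old keys decay
    return used <= LIMIT
```

Because the increment happens before the check and each instance sees the post-increment total, instances never over-admit even when they race; the only cost of a rejected attempt is a counter slot that expires with the window.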
* I simplified the problem: the per-minute request limits actually differ between two endpoints. One permits 200 requests per minute and the other 40. I will throttle using the limit of the most restrictive endpoint.