
Here is the design step by step:

  • User opens a webpage
  • Inputs a few details in the form
  • Clicks submit
  • Request goes to API server
  • API server creates a pod in Kubernetes
  • Pod executes a script and stores the output in shared storage
  • Another pod keeps running and attached to the shared storage
  • API server waits for pod execution to complete
  • API server copies the file back through the Kubernetes API using the always-running pod
  • Parses the file and returns the result to the UI
  • User sees a loading screen until all the above steps complete

The main challenge with this pattern is autoscaling. When a pod goes into the Pending state because no capacity is available, the user has to wait 2-5 minutes for autoscaling to kick in before the pod can execute.
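For concreteness, the "API server creates a pod" step from the design above might look roughly like the sketch below, assuming the official `kubernetes` Python client. The manifest is built as a plain dict so it can be inspected without a cluster; the image, script, and PVC names are placeholders, not anything from the original post.

```python
# Sketch of the pod-per-request step. All names/images are illustrative.
import uuid

def build_job_pod_manifest(request_id: str) -> dict:
    """Build a one-shot pod that runs the script and writes to shared storage."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"job-{request_id}"},
        "spec": {
            "restartPolicy": "Never",  # one-shot execution
            "containers": [{
                "name": "job-script",
                "image": "registry.example.com/job-runner:latest",  # placeholder image
                "command": ["/bin/sh", "-c", "run-script > /data/out.json"],
                "volumeMounts": [{"name": "shared", "mountPath": "/data"}],
            }],
            "volumes": [{
                "name": "shared",
                "persistentVolumeClaim": {"claimName": "shared-storage"},  # placeholder PVC
            }],
        },
    }

manifest = build_job_pod_manifest(uuid.uuid4().hex[:8])
# With a cluster available, you would submit it roughly like:
#   from kubernetes import client, config
#   config.load_kube_config()
#   client.CoreV1Api().create_namespaced_pod(namespace="default", body=manifest)
```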

karthikeayan
  • 135
  • 3
  • You would probably get better results if you give some comments about why it's needed to create a new pod for every request. On the surface it feels like the approach is to re-invent AWS Lambda but why? Also, if users are waiting for 2-5 minutes they will need something to stop their browser or their brains from timing out and moving on to something else. Just showing the loading screen with no info would make many users believe the service is broken. – joshp Mar 26 '23 at 16:17

1 Answer


On the face of it this is a terrible design.

  • User opens a webpage

  • Inputs a few details in the form

  • Clicks submit

  • Request goes to API server

    fine so far

  • API server creates a pod in Kubernetes

    This seems pointless. Have a worker process continually running and listening to a queue.

  • Pod executes a script and stores the output in shared storage

    Instead of shared storage, post the result back to another queue or database.

  • Another pod keeps running and attached to the shared storage

    You don't need a separate pod for everything.

  • API server waits for pod execution to complete

    If the API server is waiting the whole time, just get it to do the work itself. The API should return immediately after creating the offline job.

  • API server copies the file back through the Kubernetes API using the always-running pod

    Why pass this data around so much? Why have a pod for everything? Why files?

  • Parses the file and returns the result to the UI

    Why parse the file only to reserialise it again?

  • User sees a loading screen until all the above steps complete

    What if they kill the browser, click refresh, or hit back?

However, you don't really say anything about why you are using this design over:

  1. Just the API doing all the work.

    As you wait for the result anyway, there's no benefit to all this passing of data around.

  2. A queue + worker processes

    Have the API write to a queue and return an immediate "Processing your message" response.

    Have a constantly running worker app pick up from the queue and do the work. Post back to another queue when done.

    Have the API listen to the "done" queue and push the messages back to the correct user via a websocket.
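The queue + worker pattern described above can be sketched in-process with the standard library, which keeps it runnable anywhere. In production the two queues would be something like SQS, RabbitMQ, or Redis, and the final "push to user" step would be a websocket send; the job payload and the doubling "work" here are stand-ins for the real script.

```python
# Minimal in-process sketch of the queue + worker pattern.
import queue
import threading

work_q = queue.Queue()  # jobs submitted by the API
done_q = queue.Queue()  # results posted back by the worker

def api_submit(payload: dict) -> str:
    """API handler: enqueue the job and return immediately."""
    work_q.put(payload)
    return "Processing your message"  # immediate response to the browser

def worker() -> None:
    """Constantly running worker: pick up jobs, do the work, post results."""
    while True:
        job = work_q.get()
        if job is None:  # shutdown sentinel
            break
        result = {"id": job["id"], "output": job["value"] * 2}  # the "script"
        done_q.put(result)

threading.Thread(target=worker, daemon=True).start()

ack = api_submit({"id": "req-1", "value": 21})
result = done_q.get(timeout=5)  # the API would push this over a websocket
work_q.put(None)  # stop the worker
```

The key property is that `api_submit` returns before the work is done, which is exactly what removes the 2-5 minute browser wait from the original design.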

Also, there is a scenario where your design might be good, or at least the only thing that would work: that's where you have to run some third-party application which doesn't like running multiple instances at the same time, or starting and stopping cleanly (I'm looking at you, Microsoft Excel).

In that kind of scenario you need to effectively spin up a new clean machine with no leftover state, write and read files because that's the only thing the application understands, and then clean up everything afterwards. But even then your unit is a container, not a pod?
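For that "awkward third-party app" case, Kubernetes' usual unit is a Job: a fresh container per run, `restartPolicy: Never`, and `ttlSecondsAfterFinished` so the leftovers are cleaned up automatically. A hedged sketch of such a manifest, again as a plain dict with placeholder names, image, and file paths:

```python
# Sketch of a one-shot Job for a fussy legacy application. Placeholders only.
def build_cleanup_job_manifest(run_id: str) -> dict:
    """Build a Job that runs once on a clean container and self-deletes."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"legacy-app-{run_id}"},
        "spec": {
            "ttlSecondsAfterFinished": 300,  # auto-delete 5 min after completion
            "backoffLimit": 0,               # don't retry a flaky legacy app
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "legacy-app",
                        "image": "registry.example.com/legacy-app:latest",  # placeholder
                        "command": ["run-legacy-app",        # hypothetical entrypoint
                                    "--in", "/work/in.xlsx",
                                    "--out", "/work/out.csv"],
                    }],
                },
            },
        },
    }
```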

Ewan
  • 70,664
  • 5
  • 76
  • 161
  • This is great, appreciate your suggestions. Definitely will try to make use of these ideas. However, as an immediate solution, we just went ahead with the overprovisioning approach by deploying a low-priority pod. Basically the cluster will have n+1 nodes all the time. – karthikeayan Jun 20 '23 at 08:33
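The overprovisioning trick mentioned in the comment is usually implemented as a negative-priority PriorityClass plus placeholder "pause" pods that hold a spare node warm; when a real pod arrives, the scheduler preempts the placeholders, so the user never waits for a node to boot. A sketch of the two manifests as plain dicts, with the resource sizes and names being placeholders:

```python
# Sketch of cluster overprovisioning via preemptible placeholder pods.
def overprovisioning_manifests(replicas: int = 1) -> tuple:
    """Return (PriorityClass, Deployment) manifests for warm spare capacity."""
    priority_class = {
        "apiVersion": "scheduling.k8s.io/v1",
        "kind": "PriorityClass",
        "metadata": {"name": "overprovisioning"},
        "value": -1,  # lower than any real workload, so it gets preempted
        "globalDefault": False,
        "description": "Placeholder pods that real workloads may preempt",
    }
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "overprovisioning-placeholder"},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": "overprovisioning"}},
            "template": {
                "metadata": {"labels": {"app": "overprovisioning"}},
                "spec": {
                    "priorityClassName": "overprovisioning",
                    "containers": [{
                        "name": "pause",
                        "image": "registry.k8s.io/pause:3.9",
                        # Request roughly one node's worth of resources so a
                        # whole spare node stays warm (sizes are placeholders).
                        "resources": {"requests": {"cpu": "3", "memory": "12Gi"}},
                    }],
                },
            },
        },
    }
    return priority_class, deployment
```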