I have multiple instances of the same worker processing long-running tasks. Those tasks usually take between 30 minutes and 5 hours. Tasks are stored in RabbitMQ, and the workers are deployed as a single-container Kubernetes Deployment with multiple replicas.
The problem is deploying new changes. I see two strategies: interrupt the current processing, or deploy new workers alongside the old ones and let the existing ones drain and exit on their own.
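For the second strategy, the worker-side piece is a consumer that stops taking new messages on SIGTERM but lets the in-flight task finish. A minimal sketch, assuming a Python worker on pika with a queue named `tasks` and a placeholder `do_work` function (all three are my assumptions, not something from your setup):

```python
import signal
import pika

draining = False

def handle_sigterm(signum, frame):
    # Kubernetes sends SIGTERM when it replaces the pod during a rollout.
    global draining
    draining = True

signal.signal(signal.SIGTERM, handle_sigterm)

def do_work(body):
    pass  # stand-in for the real 30-minute-to-5-hour job

def on_message(ch, method, properties, body):
    do_work(body)
    ch.basic_ack(delivery_tag=method.delivery_tag)  # ack only after completion
    if draining:
        ch.stop_consuming()  # don't pick up another message during shutdown

connection = pika.BlockingConnection(pika.ConnectionParameters("rabbitmq"))
channel = connection.channel()
channel.queue_declare(queue="tasks", durable=True)
channel.basic_qos(prefetch_count=1)  # at most one in-flight task per worker
channel.basic_consume(queue="tasks", on_message_callback=on_message)

while not draining:
    # A running callback always completes before the flag is re-checked,
    # so SIGTERM never cuts a task short; it only stops the next pickup.
    connection.process_data_events(time_limit=1)
connection.close()
```

The Kubernetes side of this is `terminationGracePeriodSeconds` on the pod spec: it has to exceed your longest task, so on the order of 18000 seconds for the 5-hour ceiling, otherwise the kubelet SIGKILLs the pod mid-task.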
I chose the first strategy because it lets me ship changes quickly: once a deploy finishes, I can be sure every worker runs the same codebase. But there are downsides: I need to handle the exit signal, restart interrupted task processing, restore state, decide whether to insert or update records, and so on.
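The exit-signal and restart parts of that list can stay fairly small. The sketch below, again assuming pika and the same hypothetical `tasks` queue, aborts the in-flight task at a checkpoint when SIGTERM arrives and nacks it with `requeue=True`, so a worker on the new codebase picks it up; `TaskAborted` and `process_task` are illustrative placeholders:

```python
import signal
import time
import pika

shutting_down = False

def handle_sigterm(signum, frame):
    global shutting_down
    shutting_down = True

signal.signal(signal.SIGTERM, handle_sigterm)

class TaskAborted(Exception):
    pass

def process_task(body):
    # Stand-in for the real job: do the work in small units and check the
    # shutdown flag at every safe checkpoint (e.g. between batches).
    for _ in range(100):
        if shutting_down:
            raise TaskAborted()
        time.sleep(1)  # simulates one unit of work

def on_message(ch, method, properties, body):
    try:
        process_task(body)
        ch.basic_ack(delivery_tag=method.delivery_tag)
    except TaskAborted:
        # Hand the message back to the queue so a replica running the new
        # codebase can pick it up and restart it.
        ch.basic_nack(delivery_tag=method.delivery_tag, requeue=True)

connection = pika.BlockingConnection(pika.ConnectionParameters("rabbitmq"))
channel = connection.channel()
channel.queue_declare(queue="tasks", durable=True)
channel.basic_qos(prefetch_count=1)
channel.basic_consume(queue="tasks", on_message_callback=on_message)
while not shutting_down:
    connection.process_data_events(time_limit=1)
connection.close()
```

The requeue only works cleanly if the task's writes are idempotent, which is exactly the insert-or-update concern: in PostgreSQL, for example, `INSERT ... ON CONFLICT (id) DO UPDATE` makes a re-run of the same task converge on the same rows.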
So my question is: would you call interrupting current processing to deploy new changes a best-in-class solution? Are there other approaches?