This question is different from "What are best practices for testing programs with stochastic behavior?" because it is specific to regression testing.
We are building a chatbot. For regression testing, we plan to run the new system through all the conversational data we have received to date, then compare its output against the output of the old system to see whether it has stopped answering any questions it answered before.
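To make the comparison concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical: the two bots are stand-in functions, `FALLBACK` is an assumed "I don't know" reply, and `is_answered` is just one possible way of deciding whether a question was answered. Note that it only checks whether an answer was given, not whether the text is identical, since the wording may legitimately differ between versions.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical fallback reply the bot gives when it cannot answer.
FALLBACK = "Sorry, I don't understand."


def is_answered(reply: str) -> bool:
    """Assumed criterion: any non-empty reply other than the fallback."""
    return reply.strip() != "" and reply != FALLBACK


def find_regressions(
    utterances: Iterable[str],
    old_bot: Callable[[str], str],
    new_bot: Callable[[str], str],
) -> List[str]:
    """Return utterances the old system answered but the new one does not."""
    regressions = []
    for utterance in utterances:
        if is_answered(old_bot(utterance)) and not is_answered(new_bot(utterance)):
            regressions.append(utterance)
    return regressions


# Toy stand-ins for the two systems (lookup tables instead of real models).
OLD_ANSWERS: Dict[str, str] = {"hi": "Hello!", "what are your hours?": "9 to 5."}
NEW_ANSWERS: Dict[str, str] = {"hi": "Hey there!", "where are you?": "Downtown."}


def old_bot(utterance: str) -> str:
    return OLD_ANSWERS.get(utterance, FALLBACK)


def new_bot(utterance: str) -> str:
    return NEW_ANSWERS.get(utterance, FALLBACK)


logged_utterances = ["hi", "what are your hours?", "where are you?"]
regressions = find_regressions(logged_utterances, old_bot, new_bot)
```

Here "hi" is not flagged even though the reply text changed, while "what are your hours?" is flagged because only the old system answers it.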
In a discussion about this, I argued that standard regression testing, where I define unit tests targeting features/capabilities of the product, will not work for applications that are non-deterministic in nature, especially when new variations come in all the time.
Another approach would be to hand-create the test data so that it corresponds to each model enhancement that was made.
How is regression testing done in such cases, where a machine learning model is involved? What would be the ideal way of doing regression testing?