I need some advice on an approach for logging raw request/response data from a few web apps that I host, covering every call to every API exposed over HTTP (mostly POSTs).
Objective: capture the raw response body at the end of the request-response life cycle. The apps are built with Python/Java, and nginx is the web server fronting them.
Approach 1: Use the Lua extensions for nginx and capture the raw response body at the end of the request-response cycle in nginx itself. Either parse the response body and log it to a flat file while relaying the response back to the requester, or relay it to a remote logger in a non-blocking way.
Advantages:
- Very straightforward and easy to achieve.
- The additional logic coded in Lua runs in a tightly sandboxed environment, so safety is by design.
Disadvantages:
- Considerable performance hit, since buffer sizes on the web server's end are limited and large response bodies cause multiple buffer re-read operations.
- Programming is limited to Lua, which might not appeal to everyone.
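To illustrate where the buffering cost in Approach 1 comes from, here is a rough model of chunk-wise capture. It is written in Python rather than Lua only to match the app languages; as far as I understand, in nginx the equivalent hook would be a Lua body filter that is called once per response chunk together with an end-of-response flag. The class name `BodyCapture` and the `MAX_CAPTURE` cap are my own hypothetical choices, not anything nginx provides.

```python
import io

MAX_CAPTURE = 64 * 1024  # hypothetical cap on bytes kept per response


class BodyCapture:
    """Accumulates response chunks the way a streaming body filter sees them."""

    def __init__(self, limit=MAX_CAPTURE):
        self.limit = limit
        self.buf = io.BytesIO()
        self.total = 0          # bytes seen, including any that were not kept
        self.truncated = False  # True once a chunk did not fit under the cap

    def feed(self, chunk, eof=False):
        """Record one chunk; return the captured body once eof is reached."""
        self.total += len(chunk)
        room = self.limit - self.buf.tell()  # never negative: writes are capped
        if room > 0:
            self.buf.write(chunk[:room])
        if len(chunk) > room:
            self.truncated = True
        return self.buf.getvalue() if eof else None
```

Every chunk is copied once more on its way to the client, which is why large bodies hurt; capping the capture bounds memory use at the price of truncated log entries.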
Approach 2: Use a Python/Java-based middleware that relays each request from the web server to the web app and handles the manipulation of the request/response on its way to or from the web app. This could be built on something like Twisted or Tornado.
Advantages:
- Not too difficult to implement.
- Filtering/logging logic can be coded in any of several high-level languages/frameworks.
- The web server does not need to carry business logic or any additional dependency.
Disadvantages:
- Operational overhead of maintaining a separate middleware component.
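To make Approach 2 concrete, here is a minimal sketch assuming the Python apps are served over WSGI (that is an assumption; Twisted/Tornado or the Java apps would need their own equivalent, such as a servlet filter). The class name `RawBodyLogger` and the logger name `raw_http` are hypothetical. It relays each response chunk unchanged and logs the assembled body only after the response has been fully sent.

```python
import logging

log = logging.getLogger("raw_http")  # hypothetical logger name


class RawBodyLogger:
    """WSGI middleware that logs the raw response body after relaying it."""

    def __init__(self, app, logger=log):
        self.app = app
        self.logger = logger

    def __call__(self, environ, start_response):
        captured = {}

        def recording_start_response(status, headers, exc_info=None):
            # Remember the status line, then defer to the real callback.
            captured["status"] = status
            return start_response(status, headers, exc_info)

        chunks = []
        result = self.app(environ, recording_start_response)
        try:
            for chunk in result:
                chunks.append(chunk)
                yield chunk  # relay to the client unchanged
        finally:
            if hasattr(result, "close"):
                result.close()
            # Runs at the end of the request-response cycle.
            self.logger.info(
                "%s %s -> %s body=%r",
                environ.get("REQUEST_METHOD"),
                environ.get("PATH_INFO"),
                captured.get("status"),
                b"".join(chunks),
            )
```

Wrapping an existing app would then look like `application = RawBodyLogger(application)`. To keep the log write off the request path, the logger could be given a queue-based handler, though that is a separate design choice.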
Which of these do you feel is the more elegant design? The apps can be written in either Python or Java, but the web server will be nginx, on Linux.