Why would CPython logging use a Lock for each handler rather than one Lock per Logger?

Question

While developing my own logging library, I studied the source code of the standard logging module of CPython.

One of its features is that handlers are thread-safe. One can write logs to a file from multiple threads without the risk that the lines are corrupted.

I digged into the sources of the logging module and I figured out that to ensure thread-safety, each handler uses its own Lock that it acquire and release at each logging call. You can see this here.

What is the point of doing this rather than using a single lock surrounding the call of handlers globally? For example, why not acquire the lock in the callHandlers() function?

I see two main advantages:

Better performances as there is only one lock acquired and released
Ensure that logged messages are displayed in the same order in all handlers (thinks for example of two handlers H1 and H2 and two threads logging respectively M1 and M2, this avoid the possibility of the sequence H1.handle(M1) -> H1.handle(M2) -> H2.handle(M2) -> H2.handle(M1) to happen)

In my personal logging library, can I safely do something like this per logger:

with self.lock:
    for handler in self.handlers:
        handler.handle(message)  # No lock used in "handle()"

Or am I overlooking something else?

score 5 · Accepted Answer · answered Dec 24 '18 at 10:18

5

Logging handlers can be slow, especially if you do remote logging to another machine, using for example SocketHandler/SMTPHandler/HTTPHandler, or to third party service logging handlers. If the entire logging library uses a single lock, then logging to slow handlers (e.g. email to admin on exception) will block logging to fast handlers (e.g. local files logs).

Additionally, some logging handlers may not require locks at all, for example external logging may use timestamp to order logs instead of implicitly by serializing the handling.

answered Dec 24 '18 at 10:18

Lie Ryan

12,291
1
30
41

I don't mean using one lock for "the entire logging library", I mean using one lock per logger. If H1 is slow and H2 is fast, but both are attached to the same logger, then H1 will block H2 while doing `logger.log()` anyway, right? – Delgan Dec 24 '18 at 10:27
Also, locking on the loggers would probably not achieve thread safety as a single handler can be configured as a handle target for multiple loggers. If you lock on the loggers, then multiple threads logging to multiple logger that logs to the same handler will be indeterministic. – Lie Ryan Dec 24 '18 at 10:46
Oh, yeah, I did not think of the fact that handlers could be re-used across multiple loggers. – Delgan Dec 24 '18 at 11:09
1

And putting your first argument in others words: one lock per handler allows "parallelization" of logs handlings. For example, if using only one lock, if H1 and H2 are both slow, and if two threads need to log a message at the same time, the second thread to acquire the lock will need to wait for the first thread to emit on both H1 and H2, before logging itself to H1 + H2. Using one lock per handler, this allows second thread to emit on H1 as soon as possible, while in the same time, the other thread log to H2. – Delgan Dec 24 '18 at 11:13

Why would CPython logging use a Lock for each handler rather than one Lock per Logger?

1 Answers1