
tl;dr - I want to stream text data from one writer process to multiple reader processes. I'm thinking of using a file to achieve this. Is it a good idea?

Using a file would avoid having to maintain client connections in the server and send data on each of those connections. The file would have a single writer which would always append and multiple readers. Data should not be lost.

I have multiple questions:

  • In general, is this a good idea?
  • If not, what do you recommend for 1 to many IPC?
  • Do I need file locking if the writer is only ever appending?
  • What about performance? I assume the OS caches file data in memory anyway, so it shouldn't be that bad. Or do you recommend using an in-memory file? (A rough sketch of what I have in mind follows this list.)
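
For context, here's roughly the single-writer-appends / readers-tail pattern I have in mind (only a sketch; the file path, the one-error-per-line framing, and the polling interval are placeholders, not what the daemon actually does):

    import os
    import time

    ERRORS_FILE = "/tmp/typecheck-errors.log"  # placeholder path

    def append_error(message: str) -> None:
        # Writer: single process, append-only. O_APPEND positions every write
        # at the current end of file, so readers should only ever observe a
        # prefix of complete lines.
        fd = os.open(ERRORS_FILE, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        try:
            os.write(fd, (message.rstrip("\n") + "\n").encode())
        finally:
            os.close(fd)

    def follow(path: str) -> None:
        # Reader: print new lines as they are appended (like `tail -f`).
        with open(path, "r") as f:
            while True:
                line = f.readline()
                if line:
                    print(line, end="")
                else:
                    time.sleep(0.1)  # no new data yet; poll again

My hope is that with a single writer using O_APPEND and whole-line writes, readers never need locking, because they only ever see data the writer has already finished appending.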

Here's the more detailed context: I have a typechecker that runs as a daemon process and re-checks the codebase whenever it detects changes. To obtain the errors, the user invokes a client process on the command line which talks to the server. Currently, the client has to wait until the server has finished typechecking everything before it receives the errors; errors are passed to clients via a monitor process that maintains the client connections. I would like to change that and make errors available to clients as soon as they are found. Streaming errors through the monitor is difficult with the current code, and I think the file solution is simpler anyway. Whenever the server detects a change, it would unlink the file (on Unix, clients currently reading it can safely finish doing so, since the open file stays readable after the unlink) and create a fresh one.
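
Here's roughly how I imagine a client would notice that the server has replaced the file, by comparing inode numbers (again only a sketch, with the same placeholder path):

    import os
    import time

    ERRORS_FILE = "/tmp/typecheck-errors.log"  # placeholder path

    def follow_current(path: str = ERRORS_FILE) -> None:
        # Tail the file, reopening it whenever the server unlinks it and
        # creates a fresh one (detected by the inode number changing).
        while True:
            try:
                f = open(path, "r")
            except FileNotFoundError:
                time.sleep(0.1)  # server is between unlink and create
                continue
            opened_ino = os.fstat(f.fileno()).st_ino
            with f:
                while True:
                    line = f.readline()
                    if line:
                        print(line, end="")
                        continue
                    try:
                        # The name now points at a different inode: the server
                        # replaced the file, so reopen and read from the start.
                        if os.stat(path).st_ino != opened_ino:
                            break
                    except FileNotFoundError:
                        break
                    time.sleep(0.1)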

Catherine
  • Is it important that clients see every message? – JimmyJames Jun 08 '21 at 16:39
  • https://www.gnu.org/software/coreutils/manual/html_node/tee-invocation.html – SK-logic Jun 09 '21 at 07:37
  • @JimmyJames yes, clients should see every message. – Catherine Jun 11 '21 at 09:06
  • Is there a reason you don't just write a new file on each event? How often are these events occurring? – JimmyJames Jun 11 '21 at 16:36
  • @JimmyJames You'd have anywhere between 0 and 500,000 "events" (event = error in the codebase), and it would take anywhere between seconds to minutes to find those errors. – Catherine Jul 16 '21 at 12:53
  • I think you are suggesting that having independent files would make this slower but I'm not clear on why that would be. You can have 500,000 files (I would implement archiving BTW) or one big file with 500,000 events. It's trivial to list a directory and look for events that have not yet been processed. If you have one file, you now need some mechanism to find data within the file, e.g. an offset. It used to be that having a lot of files could cause issues with inodes but those limits are much higher now. – JimmyJames Jul 16 '21 at 15:10

0 Answers