
I have a Python client launching a subprocess in C++.

The C++ program runs several threads that need to report results to the Python client.

Knowing that both the Python client and the C++ subprocess are running on the same machine, what is the best way for them to communicate: through TCP or through files?

By communication through files I mean that the C++ side would write its results to several different JSON or XML files that the Python client would look for and parse.

Is it bad design to communicate through files? Is using TCP faster? What if my computer has a solid-state drive?

EDIT: I ended up using pipes (stdin, stdout). See this post: https://stackoverflow.com/questions/36748829/piping-binary-data-between-python-and-c

Chuque

3 Answers


What you are trying to do is define a method for IPC, or inter-process communication. There are many, many ways to do this.

In general, the best methods for IPC provide the following benefits:

  • Standardized, so other developers can look at your code and understand it.
  • Robust, including error-detecting mechanisms.
  • Supported by your platform (OS, programming language, etc).
  • Simple, requiring little to no framework-level code on your part.

For these reasons, I typically pick TCP/IP.

  • It is heavily standardized and everyone should know how it works.
  • It has built-in error-detection and ACK mechanisms.
  • Supported by virtually all modern platforms.
  • Most library implementations are simple: provide endpoint information, get a stream.
  • Your client and server are currently on the same machine. If future requirements change this, all you need to do is change your connection info and it will still work just fine.

As far as the payload of the TCP/IP communication goes, it can be anything you want: XML, JSON, serialized objects, whatever is convenient. When mixing languages, however, I would typically go with something like XML or JSON, which are platform-agnostic and human-readable (which helps with debugging).
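For illustration, here is a minimal sketch of the Python side as a localhost TCP listener; the port number and the one-JSON-document-per-line framing are assumptions I picked for the example, not anything required by TCP:

```python
import json
import socket

# Minimal sketch: Python listens on localhost, the C++ subprocess connects
# and writes one JSON document per line. Port and framing are assumptions.
HOST, PORT = "127.0.0.1", 5555

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((HOST, PORT))
    server.listen(1)

    conn, _ = server.accept()                  # the C++ side connects here
    with conn, conn.makefile("r", encoding="utf-8") as stream:
        for line in stream:                    # one result per line
            result = json.loads(line)
            print("got result:", result)
```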

I also highly recommend writing test cases where you can plug your stream into a mock that produces or consumes data. That way you can test your interface on both ends without needing a full client/server system up and running.
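As a rough example of that idea, the parser below accepts any file-like stream, so a test can feed it an in-memory mock instead of a live socket (the function name and payload fields are invented for the sketch):

```python
import io
import json

def read_results(stream):
    """Parse newline-delimited JSON results from any text stream."""
    return [json.loads(line) for line in stream if line.strip()]

def test_read_results():
    # The parser does not care whether the stream is a socket or a mock,
    # so this runs without a live C++ process. Payload fields are invented.
    fake = io.StringIO('{"thread": 1, "value": 42}\n{"thread": 2, "value": 7}\n')
    results = read_results(fake)
    assert results[0]["value"] == 42
    assert results[1]["thread"] == 2
```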


Do not communicate through files unless there is literally no other way. To keep this answer short and to the point: I have worked with file-driven interfaces, I hated it every time, and I always asked if there was a different way to do it. Files are error-prone and clunky as an IPC mechanism.

  • Why no mention of unix domain sockets? In fact, considering the OP only needs simplex communication, I would recommend pipes as the optimal IPC mechanism. – gardenhead Mar 05 '17 at 01:40
  • @gardenhead "I typically pick TCP/IP" that is my preference, for the reasons I enumerated. I do use sockets sometimes, but not as frequently as TCP/IP. –  Mar 05 '17 at 02:15
  • More frequently than standard io?? – svidgen Mar 05 '17 at 04:33
  • @svidgen yes, because unless I am piping executables together on the command-line, they normally do not communicate via stdin/stdout. –  Mar 05 '17 at 05:06
  • But, if one process is launching the other, like in the OP, wouldn't stdio kind of be the *default* option? ... – svidgen Mar 05 '17 at 05:11
  • @svidgen maybe, maybe not. Again, my preference is for TCP/IP: it has trivially more overhead than standard in/out, and is many times more reliable _especially_ when mixing languages/environments. –  Mar 05 '17 at 06:08
  • If you have a better idea, write your own answer! I am sure OP would love to hear why stdin/stdout is a viable option, go ahead and explain why. The question can only benefit from multiple answers that provide sound justification for alternative solutions. –  Mar 05 '17 at 06:09
  • Don't forget that, depending on the environment, there might be a very strict firewall that doesn't allow any program to act as a server and open listening ports. – Mario Mar 05 '17 at 07:43
  • @svidgen I also suggest you add your answer because pipes are indeed what I ended up using. – Chuque Mar 12 '17 at 18:12
  • @Chuque Alrighty ... done. – svidgen Mar 13 '17 at 17:45

Default to Standard IO.

Use sockets if you need to avoid the overhead of creating new processes or recreating state to do the work. And use TCP sockets if portability and extensibility are potential concerns. (I could argue that TCP sockets should be defaulted to if you need sockets at all, but I won't!)

But more importantly, don't forget that the "standard" in standard IO is not a pun! The standard streams are automatically present, easy to use, and almost universally available.


It's also noteworthy, since this sounds like it might be your use-case, that most languages provide libraries that make spawning child processes and communicating with them via STDIO almost equally simple.
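For example, on the Python side the whole wiring can be a few lines of `subprocess`; the binary name and the one-JSON-object-per-line convention here are assumptions for the sketch:

```python
import json
import subprocess

# Sketch of the stdio approach from the Python side. The binary name and
# the one-result-per-line protocol are assumptions for the example.
proc = subprocess.Popen(
    ["./cpp_worker"],                # hypothetical C++ executable
    stdout=subprocess.PIPE,          # the worker writes its results here
    text=True,
)

for line in proc.stdout:             # blocks until the worker emits a line
    result = json.loads(line)
    print("got result:", result)

proc.wait()
```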

svidgen

I suggest creating a SOCK_STREAM socket pair. The benefit of socket pairs is that unlike pipes, they are bidirectional. On some operating systems, pipes are also bidirectional, but you should not rely on that.

Another benefit of socket pairs is that you don't need a unique identifier such as a port number or a file name for them.

How to actually implement this: the C++ process needs to know the file descriptor of its end of the socket pair, so you could pass that integer as a command-line argument.
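A rough sketch of what that could look like on the Python side, assuming a hypothetical `./cpp_worker` binary that accepts the descriptor number via a `--socket-fd` option:

```python
import socket
import subprocess

# Sketch of the socket-pair approach from the Python side. The worker binary
# and its --socket-fd option are hypothetical, used only for illustration.
parent_sock, child_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

proc = subprocess.Popen(
    ["./cpp_worker", "--socket-fd", str(child_sock.fileno())],
    pass_fds=(child_sock.fileno(),),   # keep the child's end open across exec
)
child_sock.close()                     # the parent only needs its own end

parent_sock.sendall(b"hello")          # bidirectional: both ends can send
data = parent_sock.recv(4096)          # ...and receive
```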

If your requirements change and the C++ process is launched separately and not by the Python program, you could always have a name for the socket by using Unix domain sockets.

Also, if your requirements change even more and the C++ program is run on a separate machine, it should be relatively painless to change to using TCP.

Do note that SOCK_STREAM sockets may not necessarily preserve the sizes of messages. So, if your messages are not individual bytes but rather blocks of bytes, you need to prepend the block length in network byte order to the message. This ensures that the reader knows how many bytes need to be read.
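A minimal sketch of that framing in Python, using a 4-byte unsigned length header in network byte order (the header size is one common choice, not something mandated here):

```python
import struct

def recv_exactly(sock, n: int) -> bytes:
    # recv() may return fewer bytes than requested, so loop until we have n.
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def send_message(sock, payload: bytes) -> None:
    # Prefix the payload with its length as a 4-byte unsigned int ("!I"),
    # in network byte order.
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_message(sock) -> bytes:
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)
```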

juhist