Inter-language Communication

Question

If you have a program that has a front-end and a back-end written in two different languages, how do those two systems communicate with each other? I have no professional experience in programming; I'm still in school. I know how, for example, to communicate between two Java programs if I needed to. If I had two programs on two different machines, I could use sockets and streams. But what if those programs were written in different languages? You could conceivably use files, but are there any other ways of communicating?

I've been doing some research on this and can't quite pin down an answer. Thanks for your help.

"Your questions should be reasonably scoped. If you can imagine an entire book that answers your question, you’re asking too much..." ([help/dont-ask]) — gnat, Jun 28 '17 at 19:18
@gnat I don't think that a short introduction to different ways of communication b/w programs would take an entire book. — scriptin, Jun 28 '17 at 19:42
yeah sure short introduction, four answers in 40 minutes. Wonder how many attempts at short introduction is going to be there in 4 hours... [Where to start?](https://softwareengineering.meta.stackexchange.com/a/6367/31260) — gnat, Jun 28 '17 at 19:45
Firstly it depends what kind of *pattern* you need - e.g. Request/Response, RPC, direct notification, broadcast, multicast, update-polling, reliable-vs-unreliable.. Choose some kind of transport and/or storage mechanism; for example - Sockets, Pipes, Shared Memory, Message Queues, Message Brokers, Databases, FileSystem, NoSQL Stores, Key-Value (cache) stores, REST, SOAP.. In most cases you'll need to think about message/data format, in which case consider XML, JSON, Google Protobufs. There are many possible approaches; the first job is choosing the **right** tool(s) for your specific problem. — Ben Cottrell, Jun 28 '17 at 20:34
Which programming language you use doesn't matter. What does matter is that you have an inter-process communications protocol that both programming languages support (either in the language natively, or in libraries). — Robert Harvey, Jun 28 '17 at 20:48
@robert Understood. And that protocol is going to essentially establish the rules for how certain types of data are written and read, correct? — namarino41, Jun 28 '17 at 21:02
Rules that define how a language stores certain data so that the other language knows how to retrieve that data. — namarino41, Jun 28 '17 at 21:15
Yes, that would follow. Someone mentioned JSON below; the nice thing about JSON and XML is that those rules are standardized and well-understood, and you can get libraries that read and write those data formats in almost any programming language. That's not the entire protocol, of course; you still need some transport mechanism like TCP/IP or named pipes to move that data from one process to another. — Robert Harvey, Jun 28 '17 at 21:19
Quite related, but not a clear duplicate: [local communications between two apps](https://softwareengineering.stackexchange.com/q/262897/60357) — amon, Jun 28 '17 at 22:02

score 6 · Accepted Answer · answered Jun 28 '17 at 19:38

For communication between two programs (processes), we need Inter-Process Communication (IPC).

Sockets are often used even for processes on a single machine. For example, there is a lot of tooling around HTTP servers and REST APIs that can be used this way. The advantage of sockets is that they have a global address (port number), so any process can establish a connection.
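As a rough illustration of two local processes talking over a port, here is a minimal sketch in Python. Both ends run in one script only to keep it short; the one-JSON-object-per-line convention, the `numbers`/`sum` field names, and the helper names are assumptions made up for this example. Either side could be replaced by a program in any language that can open a TCP socket.

```python
import json
import socket
import threading


def serve_once(server: socket.socket) -> None:
    """Accept one connection, read one JSON line, reply with one JSON line."""
    conn, _ = server.accept()
    with conn, conn.makefile("rwb") as stream:
        request = json.loads(stream.readline())
        reply = {"sum": sum(request["numbers"])}
        stream.write(json.dumps(reply).encode("utf-8") + b"\n")
        stream.flush()


def ask(port: int, numbers: list) -> dict:
    """Client side: connect to the port, send a request, read the reply."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        with sock.makefile("rwb") as stream:
            stream.write(json.dumps({"numbers": numbers}).encode("utf-8") + b"\n")
            stream.flush()
            return json.loads(stream.readline())


def demo() -> dict:
    # Port 0 asks the OS for any free port; a real service would publish a fixed one.
    with socket.create_server(("127.0.0.1", 0)) as server:
        port = server.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(server,))
        worker.start()
        result = ask(port, [1, 2, 3])
        worker.join()
        return result
```

Calling `demo()` starts the listener, sends `[1, 2, 3]`, and returns the decoded reply.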

Communicating via files is sometimes used, especially if some directory is used like a mail box that will be processed at a later time. In practice this is fairly complicated: The writing process needs to use unique file names, and the receiving process needs to watch the directory. And there is no easy way to send an answer.
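A mailbox directory could look roughly like the sketch below. The function names and the `.tmp`/`.json` naming scheme are hypothetical. The writer uses a random unique name and renames the file into place only when it is complete; the reader here simply polls the directory once instead of watching it.

```python
import json
import os
import tempfile
import uuid
from pathlib import Path


def post_message(mailbox: Path, payload: dict) -> Path:
    """Write a message under a unique name, then rename it into place."""
    final = mailbox / f"{uuid.uuid4().hex}.json"
    partial = final.with_suffix(".tmp")
    partial.write_text(json.dumps(payload), encoding="utf-8")
    # A rename within one directory is atomic on POSIX systems, so the
    # reader never picks up a half-written .json file.
    os.replace(partial, final)
    return final


def collect_messages(mailbox: Path) -> list:
    """Read and delete every complete message currently in the mailbox."""
    messages = []
    for path in sorted(mailbox.glob("*.json")):
        messages.append(json.loads(path.read_text(encoding="utf-8")))
        path.unlink()
    return messages


def demo() -> list:
    with tempfile.TemporaryDirectory() as tmp:
        mailbox = Path(tmp)
        post_message(mailbox, {"job": "resize", "width": 640})
        return collect_messages(mailbox)
```

Note that nothing in this sketch lets the reader send an answer back, which is the limitation described above.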

For local processes, the primary communication mechanism is pipes. Pipes are anonymous, uni-directional streams provided by the operating system. Pipes are especially important on Unix-like operating systems like Linux where many tools are written in a way that consumes data from stdin and writes output to stdout. This allows many such tools to be chained together into a larger pipeline. To the application, pipes appear as normal input/output streams.

Since pipes are anonymous, an application cannot connect to an existing pipe. Instead, pipes are created by a parent process. When the parent spawns a child process, the pipes (and other file descriptors) are inherited from the parent. For bi-directional communication between a parent and a child or between two sibling processes, it is common to create a pair of pipes. After the processes have been spawned, each communication partner closes one end of each pipe, so each process has one pipe for reading and one pipe for writing.
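The parent/child arrangement can be sketched with Python's `subprocess` module, which creates the pair of pipes and closes the unused ends for you. The child here is a tiny Python one-liner standing in for a program written in any other language; `CHILD_SOURCE` and `shout` are names invented for this example.

```python
import subprocess
import sys

# Stand-in for a program in another language: it reads lines from stdin
# and writes the upper-cased text to stdout.
CHILD_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
)


def shout(text: str) -> str:
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,   # pipe 1: parent writes, child reads
        stdout=subprocess.PIPE,  # pipe 2: child writes, parent reads
        text=True,
    )
    # communicate() writes the input, closes the parent's write end so the
    # child sees end-of-file, then reads the child's output until it exits.
    output, _ = child.communicate(text)
    return output
```

The child only ever sees ordinary stdin and stdout, so it needs no knowledge of what language the parent is written in.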

A fairly low level method of communication is shared memory. The operating system has the ability to make some amount of memory visible to multiple processes, so now everyone can read or write data freely to that memory region. However, all processes now have to agree on some binary format, and have to use some synchronization protocol to avoid overwriting each other's changes. Despite these problems, shared memory may be a good solution if multiple processes need to modify the same data very efficiently. Otherwise, it would be more common to create a server that manages the data, and accepts requests (e.g. over a socket) for data access & modifications.
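To make the "agree on some binary format" point concrete, here is a small sketch using Python's `multiprocessing.shared_memory`. The record layout (a 32-bit counter followed by a 64-bit float) is an arbitrary assumption for the example. The second attachment is made in the same process purely for brevity, and the sketch leaves out the synchronization that real concurrent writers would need.

```python
import struct
from multiprocessing import shared_memory

# The agreed binary layout: one little-endian unsigned 32-bit counter
# followed by one 64-bit float. Every participant must use the same layout.
RECORD = struct.Struct("<Id")


def demo() -> tuple:
    writer = shared_memory.SharedMemory(create=True, size=RECORD.size)
    try:
        RECORD.pack_into(writer.buf, 0, 7, 2.5)

        # Another process would attach using this name; it is done
        # in-process here only to keep the sketch short.
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            return RECORD.unpack_from(reader.buf, 0)
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()
```

A C or Java program mapping the same region would have to read the same 12 bytes with the same layout to see the same two values.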

I was under the impression that sockets were used for client server communication over a network. I didn't realize that sockets could be used for communication between applications on the same machine. Is this a generally accepted solution or do programmers tend toward other solutions? — namarino41, Jun 28 '17 at 20:34
@namarino It is fairly common to run servers on localhost, but only if other IPC mechanisms like pipes are not suitable. Most drawbacks of network connections (high latency) don't apply for local connections. But right now I only have two servers running: an IDE backend, and a message queue, sometimes I also have a few webservers or a database. Admittedly, that is uncommon for personal computers. But the general technique (localhost sockets) is the easiest way to make some service available to all processes, and is widely used. — amon, Jun 28 '17 at 21:59
@namarino It's perfectly valid, especially in the case when two programs may or may not be on the same machine. For example, the [X window server](https://en.wikipedia.org/wiki/X_Window_System) on Linux uses sockets for connections between the server and client. 99.999% of the time, the server and client are on the same machine, but this system affords the flexibility for this to not be the case. You can use SSH to tunnel this socket, allowing for first-class remote desktop just by having the server and client communicate across the network using a socket. — Alexander, Jun 28 '17 at 23:09
Thanks to both of you. All of this really clears everything up. — namarino41, Jun 29 '17 at 00:45

RibaldEddie · Answer 2 · 2017-06-28T23:28:24.643

You've already identified one possible solution: streams of data over sockets. They would work just as well for two processes written in different languages as if both were written in Java.

Sockets have traditionally been provided by the specifications of the underlying operating system and/or standard libraries of the OS, and traditionally most programming language runtimes provide a library which can interface with the OS provided socket implementation.

So let's assume that's the case: that you can connect a running C process and a running Java process together using local or network sockets.

Your real question is, what gets passed over the socket?

The answer is: whatever you want. The operating system will happily pass bits back and forth; you have to decide what those bits mean to your processes.

Examples might include BSON, XML, protocol buffers, Unicode, compressed word documents, PNG images, and whatever else. It's up to you as the programmer to code your Java and C programs that are reading and writing the data on the socket to know what to do with the data and what it means.
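One possible way to give the bits a meaning is a simple framing rule. The sketch below assumes a made-up convention (a 4-byte length header followed by UTF-8 JSON) and uses a connected socket pair inside one Python process; a C or Java peer would only need to follow the same rule.

```python
import json
import socket
import struct

# Assumed framing rule: a 4-byte big-endian length, then that many bytes
# of UTF-8 encoded JSON.
HEADER = struct.Struct("!I")


def send_message(sock: socket.socket, message: dict) -> None:
    body = json.dumps(message).encode("utf-8")
    sock.sendall(HEADER.pack(len(body)) + body)


def read_exactly(sock: socket.socket, count: int) -> bytes:
    """Keep reading until count bytes arrive; a stream has no message boundaries."""
    data = bytearray()
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        data.extend(chunk)
    return bytes(data)


def receive_message(sock: socket.socket) -> dict:
    (length,) = HEADER.unpack(read_exactly(sock, HEADER.size))
    return json.loads(read_exactly(sock, length))


def demo() -> list:
    left, right = socket.socketpair()
    with left, right:
        send_message(left, {"op": "add", "args": [2, 3]})
        send_message(left, {"op": "quit"})
        return [receive_message(right), receive_message(right)]
```

The length header is what lets the receiver split the byte stream back into the two messages that were sent.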

score 1 · Answer 3 · answered Jun 28 '17 at 19:17

Taking your two Java programs as a starting point, both of those programs have implicitly agreed on the format of the messages: e.g. 8 bits to a byte, so many bytes to a word, so many words to a packet, and what a specific word in a packet means.

However once you have that agreement then it doesn't matter what language the endpoints are written in as long as they both fill in and interpret the data in the same fashion. Thus programs written in different languages can exchange packets and everyone is happy.

Now the content of those packets can also have a higher-level meaning that both ends agree on. The content could be raw binary data, simple text strings, or even be formatted as higher-level structures, for example HTTP messages or XML or JSON data. But agreement on the meaning by both ends is the key thing.

About the only tricky part is when two machines with different endianness communicate. This only matters once the data gets to the physical medium of transmission, at which point there are standard solutions based on agreeing on a network byte order (see that previous link).
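A short illustration of the byte-order issue, using Python's `struct` module (the value `0x12345678` is arbitrary): the same 32-bit integer is laid out differently in little-endian and network (big-endian) order, and reading it with the wrong order gives a different number without any error.

```python
import struct

value = 0x12345678

little = struct.pack("<I", value)   # least significant byte first
network = struct.pack("!I", value)  # network byte order: most significant byte first

assert little == bytes([0x78, 0x56, 0x34, 0x12])
assert network == bytes([0x12, 0x34, 0x56, 0x78])

# Reading with the wrong order silently yields a different number.
(misread,) = struct.unpack("<I", network)
assert misread == 0x78563412

# Reading with the agreed order recovers the value on any machine.
(decoded,) = struct.unpack("!I", network)
assert decoded == value
```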

So are there tools that help establish that agreement between languages? Or does that agreement already exist? — namarino41, Jun 28 '17 at 19:33
Those agreements are called *[communication protocols](https://en.wikipedia.org/wiki/Communications_protocol)* — scriptin, Jun 28 '17 at 19:39
@namarino If you are using a well-defined agreement (i.e. a protocol) then you typically have library calls that do the heavy lifting for you. E.g. to post JSON back to a server see this C# example https://stackoverflow.com/a/10027534/31326 which is coded at a high level (Java would be similar). But this implicitly assumes that there is a server at the other end that speaks HTTP and accepts POST messages. That sort of agreement is something you have to specify at a system design level. It's only if you are writing a new language or a new protocol that you have to do it all yourself. — Peter M, Jun 28 '17 at 19:51

Basile Starynkevitch · Answer 4 · 2017-06-28T19:23:34.647

It is implementation specific (notably ABI & calling convention specific).

Many language implementations provide a way to call a function (in the same program and process) written in some other language (often C). This is usually called a foreign function interface (FFI).

For Java, read about JNI.

For Python, read about Extending & Embedding the Python Interpreter.

For OCaml, read the chapter about Interfacing C with OCaml.

etc....
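As a small taste of what an in-process foreign call looks like, here is a sketch using Python's standard `ctypes` module. Note this is a different mechanism from the Extending & Embedding API mentioned above, chosen only because it needs no compilation step. It assumes a POSIX system where the C library can be loaded this way, and `c_strlen` is a name invented for the example.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX system. If find_library returns None, CDLL(None)
# exposes the symbols already loaded into this process, libc included.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Call C's strlen on the UTF-8 encoding of text."""
    return libc.strlen(text.encode("utf-8"))
```

Declaring `argtypes` and `restype` is the part that corresponds to the ABI and calling-convention agreement: both sides must agree on how arguments and results are represented.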

But what if those programs were written in different languages? You could conceivably use files, but are there any other ways of communicating?

Yes, it is called inter-process communication. Linux provides many ways to do that through its system calls (listed in syscalls(2)): pipe(7), socket(7), signal(7), shm_overview(7), sem_overview(7), etc.

Read also Operating Systems: Three Easy Pieces.

John Wu · Answer 5 · 2017-06-28T22:33:38.637

"Written in" != "Can communicate with"

I know how, for example, to communicate between two Java programs if I needed to. If I had two programs on two different machines, I could use sockets and streams. But what if those programs were written in different languages? You could conceivably use files, but are there any other ways of communicating?

You'd use the same techniques whether the two programs are written in the same language or different languages.

The language that a system can communicate with has nothing to do with the language it is written in. Your Chrome browser, for example, is written in C++, but the language that it "talks" is HTTP. The web server, as long as it conforms to the HTTP specification, can be written in any language, and it'll still work. And many languages have libraries that allow them to communicate via HTTP.

The same goes for other communication protocols, including EDI, ODBC, SMTP, you name it. If you were to take Wireshark and look at what is being sent over the network, you wouldn't see any Java or C++ code. The language the program is written in is totally irrelevant.

In fact, if you did have a protocol that passed C++ or Java over the wire, there would be two huge issues:

(1) It would be incredibly difficult to deal with, because most programs, at runtime, don't even understand the language they're written in; the compiler can read that language, but it emits either machine language or some sort of intermediate language (bytecode for Java or IL for C#, for instance). It is often the case that the run time does not understand source code at all.

(2) There would be a huge security exposure, since anything sent over such a protocol would be wide open to an injection attack.

score 0 · Answer 6 · answered Jun 28 '17 at 19:10

The generally accepted and most basic approach to this would be creating a REST API. Using JSON as your data format would make things easy for you.
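A bare-bones version of that idea, sketched with Python's standard library only: the `/length` path, the `text` and `length` fields, and the handler name are hypothetical, and a real back-end would more likely use a web framework. The client half could just as well be JavaScript, Java, or anything else that can send an HTTP POST.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class LengthHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: POST a JSON object with "text", get back its length."""

    def do_POST(self) -> None:
        size = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(size))
        body = json.dumps({"length": len(request["text"])}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        pass  # keep the demo quiet


def demo() -> dict:
    server = HTTPServer(("127.0.0.1", 0), LengthHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request(
            "POST",
            "/length",
            body=json.dumps({"text": "hello"}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        result = json.loads(conn.getresponse().read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
        thread.join()
```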

TheCatWhisperer

Since you probably mostly write web applications, you may have a biased perception that RESTful APIs are a very common way of communication, but I think that most programs running in the world at any given moment are communicating via non-web-based protocols, since most programs are communicating *within* their respective systems, as opposed to *between* those systems. – scriptin Jun 28 '17 at 19:37
Why does JSON make inter-process communication easier? – Laiv Jun 28 '17 at 20:01
@Laiv: From an infrastructure perspective, it doesn't. You still need some mechanism to get data from one process to another. But once you have that mechanism, you can choose whatever data protocol you want. JSON (and XML) are convenient in the sense that there are off-the-shelf parsers available for just about any programming language. – Robert Harvey Jun 28 '17 at 20:46
Well, it's fair to say that both formats are broadly supported by many programming languages. – Laiv Jun 28 '17 at 21:00
