Hopefully an expert in graph theory - which I'm not - will swing by at some time to answer this. In the meantime, here are some thoughts that are hopefully of some help.
Naive approach
Obviously, you could try to enumerate all possible paths between all relevant pairs of nodes, and then try to find a valid combination via some commonly used method (graph search over possible configurations/constraint satisfaction/evolutionary algorithms).
However, both the enumeration and the evaluation of a configuration are likely very expensive for non-trivial graphs.
Simple approach
The most simple, and certainly very efficient approach (in terms of runtime) is to just place toll booths next to - or right onto - start/goal nodes. Because you only need to place them before the start or the goal node, you will have at most 4*N booths, where N is the number of node-pairs. The 4 comes from your comment that no node has more than 4 edges.
You can take into account
- how many node-pairs a particular node is part of and
- how many edges the node has
to reduce the number of total booths. From there, you might be able to optimize further (e.g. using some BFS-variation to check if all paths end up being covered already by other node's toll booths - up to a certain depth).
If this is good enough for you, I'd certainly recommend this approach. But if it was, you probably wouldn't be asking.
(A,B) Separators
Consider only the path from A
to B
in the graph below. Clearly, a good solution would be the red nodes or the blue nodes.

Those are called minimal (a,b) separators and they can be found in polynomial time. A very quick search revealed this, though there may well be better algorithms.
Once you have all minimal (a,b) separators of all pairs, it's time to find a good combination that covers all paths. This should be much easier.
Note that while you should be able to do this in polynomial time, it may still take very long, depending on the size of the graph and the number of pairs.
Other approaches
I have not been able to leverage the fact that you have at most 4 edges per node, but it feels like the kind of thing that would allow for a more efficient approach.
Also, you should look at any further structural information you have about the graph. For example, if you can eliminate dense subgraphs that contain at most one relevant node (that is part of a pair), you might end up with a simpler graph that allows you to make strong assumptions, and hence, use other approaches. Again, I assume you would have mentioned this, but it may be worth another look.