Background

I am writing a simulator in which multiple AI agents compete and/or collaborate to maximize some utility function.

Each agent can interact with the world and may alter the state of the environment through its actions. As a result of each action, the environment sends a reward signal back to the actor (the agent that performed the action).

Some agents are designed to spectate other agents' actions and rewards, so that they do not have to suffer all the consequences themselves while they learn optimal moves.

What I did initially was define the following methods on the environment class:

  • Interact(action, actor) returns a tuple of the reward signal and the new state
  • GetState() returns current state
  • Spectate() returns a collection of what happened in terms of actor, action, original state, new state, reward obtained.
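For concreteness, the three methods above might look roughly like this in Python. This is only a sketch: the integer state, the additive dynamics, and the "distance from 10" reward are placeholders invented for illustration, as is the `Transition` record.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Transition:
    """One recorded interaction, as returned by spectate()."""
    actor: str
    action: int
    old_state: int
    new_state: int
    reward: float


class Environment:
    def __init__(self) -> None:
        self._state = 0
        self._history: List[Transition] = []

    def interact(self, action: int, actor: str) -> Tuple[float, int]:
        """Apply the action and return (reward, new_state)."""
        old_state = self._state
        # Placeholder dynamics: the action is simply added to the state.
        self._state = old_state + action
        # Placeholder reward: the closer the state is to 10, the better.
        reward = float(-abs(10 - self._state))
        self._history.append(
            Transition(actor, action, old_state, self._state, reward)
        )
        return reward, self._state

    def get_state(self) -> int:
        return self._state

    def spectate(self) -> List[Transition]:
        """Return a copy of everything that has happened so far."""
        return list(self._history)
```

Note that the caller has to pass its own identity (`actor`) into `interact`, and spectators have to know about and poll `spectate()`.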

But this seems to complicate my design and prevent me from scaling the system afterwards.

I was seeking some general way for different agents and environment(s) to interact without explicitly calling methods of certain type or sending out an identifier to the actor.

Proposed Solution

So I thought of having some kind of mailing system, where an actor (agent) sends a message to the environment through a mailbox, and the environment reads the incoming message, acts on it, and returns a message to the sender (the actor).

Meanwhile, curious agents would read a copy of the returned message that is published for whomever is interested.

This might sound like an observer pattern, except that each agent and the environment(s) are eligible for observing interactions, both for direct interaction and for learning from others' mistakes.

That means notifications are bidirectional, so it would be an overhead for each object to maintain its own list of subscribers to notify when some event occurs. Also, since this is an AI simulation, some processes might be stochastic, e.g. a curious agent might get to observe only a fraction of the interactions rather than 100% of them.

So we have multiple client classes (not sharing the same super-class) that are capable of messaging one another via something similar to an Enterprise Service Bus.

What I called PostOffice would have a factory method that spawns a mailbox object for each object attempting to message other objects.

So whenever a client object wants to mail some other object, it would look up the receiver through some sort of directory method and send a message to the identified mailbox address through its own mailbox object.

The mailbox object, in turn, notifies the post office object, which forwards the message to the receiver's mailbox, which holds the message until the receiving client checks for mail and reads it.

It is a form of message queuing, but at the object level rather than the enterprise level.
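A rough Python sketch of the flow described above. The method names (`open_mailbox`, `subscribe`, `route`, `deliver`, `read`) and the per-spectator probability are assumptions made for illustration; they do not come from any existing library.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Any, Deque, Dict, Optional


@dataclass(frozen=True)
class Message:
    sender: str
    receiver: str
    body: Any


class Mailbox:
    """Per-client endpoint; holds incoming mail until the owner reads it."""

    def __init__(self, address: str, post_office: "PostOffice") -> None:
        self.address = address
        self._post_office = post_office
        self._inbox: Deque[Message] = deque()

    def send(self, receiver: str, body: Any) -> None:
        # The mailbox only notifies the post office; it never touches the receiver.
        self._post_office.route(Message(self.address, receiver, body))

    def deliver(self, message: Message) -> None:
        self._inbox.append(message)

    def read(self) -> Optional[Message]:
        """Return the oldest unread message, or None if the inbox is empty."""
        return self._inbox.popleft() if self._inbox else None


class PostOffice:
    def __init__(self, rng: Optional[random.Random] = None) -> None:
        self._mailboxes: Dict[str, Mailbox] = {}
        # address -> probability of receiving a copy of any given message
        self._spectators: Dict[str, float] = {}
        self._rng = rng or random.Random()

    def open_mailbox(self, address: str) -> Mailbox:
        """Factory method: create (or return) the mailbox for an address."""
        if address not in self._mailboxes:
            self._mailboxes[address] = Mailbox(address, self)
        return self._mailboxes[address]

    def subscribe(self, address: str, probability: float = 1.0) -> None:
        """Register a curious agent that sees copies of other clients' mail."""
        self.open_mailbox(address)
        self._spectators[address] = probability

    def route(self, message: Message) -> None:
        # Raises KeyError if the receiver never opened a mailbox.
        self._mailboxes[message.receiver].deliver(message)
        for address, probability in self._spectators.items():
            if address in (message.sender, message.receiver):
                continue  # participants already have the message
            # Stochastic spectating: deliver a copy only some of the time.
            if self._rng.random() < probability:
                self._mailboxes[address].deliver(message)
```

In this sketch clients depend only on their own `Mailbox`, the subscriber list lives in one place (the post office), and spectators receive copies of both requests and replies; restricting copies to replies would need an extra filter.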

Question

  • Is there such a design pattern?
  • Are there drawbacks to such an approach?
A.Rashad
  • "Name that thing" questions [are controversial here](https://softwareengineering.stackexchange.com/q/356397/1204). Do you have *a specific problem* that you need help with? – Robert Harvey Aug 28 '17 at 16:46
  • It is a simulator where multiple AI agents are expected to interact with the environment. An agent interacting directly with the environment can get the current state and the score of whatever action it took, while spectating agents cannot. I can write a method to pull such information, but this cannot be generalized. – A.Rashad Aug 28 '17 at 17:03
  • What prevents your client classes from taking or creating a dependence to a mailbox, if in fact messaging is one of their required capabilities? There's nothing in your diagram above to suggest that there's anything at all remarkable about these mailboxes, or that their design must differ under certain specific circumstances. – Robert Harvey Aug 28 '17 at 17:22
  • I was trying to break the dependency between the post office and the clients, so that they consume the messaging service of this post office through a personalized mailbox. It would be more or less an observer pattern, except that the client can both send and receive notifications. – A.Rashad Aug 28 '17 at 17:32
  • That sounds like a viable question. Perhaps you can edit your post accordingly? – Robert Harvey Aug 28 '17 at 18:00
  • Sounds like an event-based architecture to me. – Paul Aug 28 '17 at 21:27
  • As for the drawback, you seem to be doing what JMS can already do for you, so you're reinventing the wheel. Otherwise it looks to me like a publish/subscribe pattern, possibly with two levels of subscription: one for actions, one for information. – Walfrat Sep 08 '17 at 12:39

1 Answer

But this seems to complicate my design and prevent me from scaling the system afterwards.

I was seeking some general way for different agents and environment(s) to interact without explicitly calling methods of certain type or sending out an identifier to the actor.

Maybe the Mediator pattern could simplify your design? Its role is to contain the logic for interaction between different objects, and it is often used in scenarios where many different objects need to interact with many other objects (complex relationships).

See the "Java" section of the Wikipedia article for a concrete example: https://en.wikipedia.org/wiki/Mediator_pattern

Also, use the Command pattern to model the possible "actions" that an actor might perform in a specific environment. Don't implement the interactions between actor and environment directly in your environment class. Keeping them separate should help your project scale better when new kinds of interaction are added later.

To summarize:

  • Actor: uses the Mediator to interact with the world. The Mediator calls a specific Command instance, which performs the action on the Environment and generates the reward. The Mediator can then notify all spectators of the action's result.
  • Spectator: subscribes to Mediator events in order to see actors' actions and learn from them.
  • Mediator: knows all the specific Commands that can change the current Environment, and allows Actors to start an action. It also holds a list of past events, which is available to Spectators.
  • Environment: contains its current state and allows specific actions, which are implemented in specific Command classes (used by the Mediator).
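The roles summarized above could be sketched in Python roughly as follows. All names here (`Move`, `Event`, `register`, `subscribe`, `act`) and the toy reward are hypothetical, chosen only to illustrate how a Mediator and Commands might fit together.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol


class Environment:
    """Holds state only; interaction logic lives in the commands."""

    def __init__(self) -> None:
        self.state = 0


@dataclass(frozen=True)
class Event:
    actor: str
    command: str
    old_state: int
    new_state: int
    reward: float


class Command(Protocol):
    def execute(self, env: Environment) -> float:
        """Mutate the environment and return the reward."""
        ...


class Move:
    """Hypothetical command: shift the state by a fixed step."""

    def __init__(self, step: int) -> None:
        self.step = step

    def execute(self, env: Environment) -> float:
        env.state += self.step
        # Placeholder reward: the closer the state is to 10, the better.
        return float(-abs(10 - env.state))


class Mediator:
    def __init__(self, env: Environment) -> None:
        self._env = env
        self._commands: Dict[str, Command] = {}
        self._spectators: List[Callable[[Event], None]] = []
        self.events: List[Event] = []  # history available to spectators

    def register(self, name: str, command: Command) -> None:
        self._commands[name] = command

    def subscribe(self, spectator: Callable[[Event], None]) -> None:
        self._spectators.append(spectator)

    def act(self, actor: str, command_name: str) -> Event:
        """Run a command on behalf of an actor and notify spectators."""
        old_state = self._env.state
        reward = self._commands[command_name].execute(self._env)
        event = Event(actor, command_name, old_state, self._env.state, reward)
        self.events.append(event)
        for notify in self._spectators:
            notify(event)
        return event
```

An actor would call something like `mediator.act("agent-1", "forward")` after `mediator.register("forward", Move(2))`, and a spectator could subscribe with any callable, for example `seen.append` on a plain list. Adding a new kind of interaction then means adding a new Command class rather than changing the Environment.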
Emerson Cardoso