How to implement a rule-based decision maker for an agent-based model?

Question

I am having a hard time understanding how to implement a rule-based decision-making approach for an agent in an agent-based model that I am trying to develop.

The interface of the agent is a very simple one.

public interface IAgent
{
   public string ID { get; }

   public Action Percept(IPercept percept);
}

For the sake of the example, let's assume that the agents represent Vehicles which traverse roads inside a large warehouse in order to load and unload their cargo. Their route (a sequence of roads from the start point to the agent's destination) is assigned by another agent, the Supervisor. The goal of a vehicle agent is to traverse its assigned route, unload the cargo, load a new one, receive another route from the Supervisor and repeat the process.

The vehicles must also be aware of potential collisions, for example at intersection points, and give priority based on some rules (for example, the one carrying the heaviest cargo has priority).

As far as I can understand, this is the internal structure of the agents I want to build:

So the Vehicle Agent can be something like:

public class Vehicle : IAgent
{
  public string ID { get; }

  public VehicleStateUpdater VehicleStateUpdater { get; set; }

  public RuleSet RuleSet { get; set; }

  public VehicleState State { get; set; }

  public Action Percept(IPercept percept)
  {
    VehicleStateUpdater.UpdateState(State, percept);
    Rule validRule = RuleSet.Match(State);
    VehicleStateUpdater.UpdateState(State, validRule);
    Action nextAction = validRule.GetAction();
    return nextAction;
  }
}

For the Vehicle agent's internal state I was considering something like:

public class VehicleState
{
  public Route Route { get; set; }

  public Cargo Cargo { get; set; }

  public Location CurrentLocation { get; set; }
}

For this example, 3 rules must be implemented for the Vehicle Agent.

1. If another vehicle is near the agent (e.g. less than 50 meters), then the one with the heaviest cargo has priority, and the other agents must hold their position.
2. When an agent reaches their destination, they unload the cargo, load a new one and wait for the Supervisor to assign a new route.
3. At any given moment, the Supervisor, for whatever reason, might send a command, which the recipient vehicle must obey (Hold Position or Continue).
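
As a rough illustration of the first rule only, the priority check could look like the following Python sketch. The names `VehicleInfo` and `has_priority`, the 2D coordinates, and the tie-break on `vehicle_id` are assumptions made for the example; they are not part of the model described above.

```python
import math
from dataclasses import dataclass
from typing import Iterable

NEAR_DISTANCE_M = 50.0  # "near" threshold taken from the first rule


@dataclass(frozen=True)
class VehicleInfo:
    """Hypothetical snapshot of what one vehicle knows about a vehicle."""
    vehicle_id: str
    x: float
    y: float
    cargo_weight: float


def has_priority(me: VehicleInfo, others: Iterable[VehicleInfo]) -> bool:
    """Return True if `me` may proceed, False if it must hold position."""
    for other in others:
        if math.hypot(other.x - me.x, other.y - me.y) >= NEAR_DISTANCE_M:
            continue  # too far away to matter
        # Heaviest cargo wins; equal weights are broken by ID (an assumption)
        # so that exactly one of two nearby vehicles keeps moving.
        if (other.cargo_weight, other.vehicle_id) > (me.cargo_weight, me.vehicle_id):
            return False
    return True
```

Comparing `(weight, id)` tuples keeps the decision symmetric: if both vehicles evaluate the same check, exactly one of them gets `True`.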

The VehicleStateUpdater must take into consideration the current state of the agent and the type of the received percept, and change the state accordingly. So, in order for the state to reflect that, e.g., a command was received from the Supervisor, one can modify it as follows:

public class VehicleState
{
  public Route Route { get; set; }

  public Cargo Cargo { get; set; }

  public Location CurrentLocation { get; set; }

  // Additional Property
  public RadioCommand ActiveCommand { get; set; }
}

Where RadioCommand can be an enumeration with values None, Hold, Continue.

But now I must also register in the agent's state if another vehicle is approaching. So I must add another property to the VehicleState.

public class VehicleState
{
  public Route Route { get; set; }

  public Cargo Cargo { get; set; }

  public Location CurrentLocation { get; set; }

  public RadioCommand ActiveCommand { get; set; }

  // Additional properties
  public bool IsAnotherVehicleApproaching { get; set; }

  public Location ApproachingVehicleLocation { get; set; }
}

This is where I have a lot of trouble understanding how to proceed, and I get the feeling that I am not really following the correct approach. First, I am not sure how to make the VehicleState class more modular and extensible. Second, I am not sure how to implement the rule-based part that defines the decision-making process. Should I create mutually exclusive rules (which means every possible state must correspond to no more than one rule)? Is there a design approach that will allow me to add additional rules without having to go back and forth to the VehicleState class and add/modify properties in order to make sure that every possible type of Percept can be handled by the agent's internal state?

I have seen the examples demonstrated in the Artificial Intelligence: A Modern Approach coursebook and other sources but the available examples are too simple for me to "grasp" the concept in question when a more complex model must be designed.

An example of a rule I tried to incorporate:

public class HoldPositionCommandRule : IAgentRule<VehicleState>
{
    public int Priority { get; } = 0;

    public bool ConcludesTurn { get; } = false;


    public void Fire(IAgent agent, VehicleState state, IActionScheduler actionScheduler)
    {
        state.Navigator.IsMoving = false;
        //Use action scheduler to schedule subsequent actions...
    }

    public bool IsValid(VehicleState state)
    {
        bool isValid = state.RadioCommandHandler.HasBeenOrderedToHoldPosition;
        return isValid;
    }
}

A sample of the agent decision maker that I also tried to implement.

public void Execute(IAgentMessage message,
                    IActionScheduler actionScheduler)
{
    _agentStateUpdater.Update(_state, message);
    Option<IAgentRule<TState>> validRule = _ruleMatcher.Match(_state);
    validRule.MatchSome(rule => rule.Fire(this, _state, actionScheduler));
}

I would be grateful if someone can point me in the right direction concerning the implementation of the rule-based part.

I am writing in C# but as far as I can tell it is not really relevant to the broader issue I am trying to solve.

Just a friendly heads up here to check out "finite state machines" in Akka.NET, specifically [switchable behavior](https://petabridge.com/blog/akka-actors-finite-state-machines-switchable-behavior/#what-is-switchable-behavior), where an actor/agent can `Become` a different state which will `Receive` and react to messages differently. — RJB, Jul 30 '21 at 23:26
@RJB Thank you very much, I was not aware of the Akka.NET framework. I'll definitely look into it! — Vector Zita, Aug 01 '21 at 10:14

Shadows In Rain · Answer 1 · 2021-07-30T12:27:15.690

I am not sure how to make the VehicleState class more modular and extensible. (...) Is there a design approach that will allow me to add additional rules without having to go back and forth to the VehicleState class and add/modify properties in order to make sure that every possible type of Percept can be handled by the agent's internal state?

So, you want something like a plugin-oriented architecture, where new components can be added and communicate with each other without any need to modify the "core"? The most straightforward way to achieve this is to implement an agent-local DI container. (In C# land, the Microsoft.Extensions.DependencyInjection container used below should be sufficient.)

class RadioCommunicationState
{
    public RadioCommand CurrentCommand { get; set; }
}

class RadioCommandPerceptor : IPerceptor
{
    public RadioCommandPerceptor(IRemoteRadioCommandSource source, RadioCommunicationState radioState) { /*...*/ }
    public void Update() => _radioState.CurrentCommand = _source.RecentCommands.LastOrDefault();
}

class HoldPositionRule : IRule
{
    public HoldPositionRule(RadioCommunicationState radioState) { /*...*/ }
    public float PriorityScale { get; set; } = 42.314f; // for fine-tuning, read it from the config
    public float RateSelf() => (_radioState.CurrentCommand == RadioCommand.Hold) ? 1f : 0f;
    public void Apply() => _vehicleControlsState.DesiredSpeed = 0;
}

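class VehicleAgent : IAgent
{
    public void Update()
    {
        foreach (var perceptor in _perceptors)
            perceptor.Update();
        // Select the rule itself, not just its score (Max would return a float).
        // Requires System.Linq.
        var rule = _rules.OrderByDescending(r => r.RateSelf() * r.PriorityScale).First();
        rule.Apply();
    }
}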

class Program
{
    static VehicleAgent BuildVehicleAgent()
    {
        var collection = new ServiceCollection();
        collection.AddSingleton<RadioCommunicationState>();
        collection.AddSingleton<RadioCommandPerceptor>();
        collection.AddSingleton<HoldPositionRule>();
        collection.AddSingleton<VehicleAgent>();
        // ...
        return collection.BuildServiceProvider().GetService<VehicleAgent>();
    }
}
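
For readers who prefer Python, the same idea can be sketched without a container library by wiring the shared state objects by hand. This is only a minimal sketch under assumptions: `ControlsState`, `CruiseRule`, the string commands and the weights are hypothetical names and values introduced for illustration, not part of the C# example above.

```python
class RadioState:
    """Shared state slice: written by a perceptor, read by rules."""
    def __init__(self):
        self.current_command = None  # e.g. "hold" or "continue"


class ControlsState:
    """Hypothetical shared state slice holding the actuator targets."""
    def __init__(self):
        self.desired_speed = 1.0


class HoldPositionRule:
    priority_scale = 10.0  # illustrative weight; would come from config

    def __init__(self, radio, controls):
        self._radio = radio
        self._controls = controls

    def rate_self(self):
        return 1.0 if self._radio.current_command == "hold" else 0.0

    def apply(self):
        self._controls.desired_speed = 0.0


class CruiseRule:
    """Hypothetical low-weight fallback so that some rule always applies."""
    priority_scale = 1.0

    def __init__(self, controls):
        self._controls = controls

    def rate_self(self):
        return 1.0

    def apply(self):
        self._controls.desired_speed = 1.0


class VehicleAgent:
    def __init__(self, perceptors, rules):
        self._perceptors = perceptors
        self._rules = rules

    def update(self):
        for perceptor in self._perceptors:
            perceptor.update()
        # Pick the rule object with the highest weighted score.
        best = max(self._rules, key=lambda r: r.rate_self() * r.priority_scale)
        best.apply()
        return best


def build_vehicle_agent():
    """Manual 'composition root': every component gets only the state it needs."""
    radio, controls = RadioState(), ControlsState()
    rules = [HoldPositionRule(radio, controls), CruiseRule(controls)]
    return VehicleAgent(perceptors=[], rules=rules), radio, controls
```

Adding a new rule or perceptor then means adding a class and one line in `build_vehicle_agent`; no existing state class has to change.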

(If you are using ML, be aware that extending an ML model [with new inputs and outputs] usually warrants full retraining. If that's not acceptable for performance reasons, select/design a model that supports "continual learning".)

I am not sure how to implement the rule-based part that defines the decision making process. Should I create mutually exclusive rules (which means every possible state must correspond to no more than one rule)?

The FSM-like approach (i.e. one rule per state) is simple and composable. Multiple FSMs may be used in parallel to provide better flexibility, but that only makes sense for fully independent features that require no coordination. For complex rules that require sub-steps and coordination, just use behavior trees; they also remove the need for a scheduler.
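
To make the behavior tree suggestion a little more concrete, here is a minimal Python sketch. It is a simplified, memoryless tree that is re-evaluated from the root on every tick; the node classes, the dictionary-based state and the leaf functions are all assumptions made for illustration.

```python
SUCCESS, FAILURE, RUNNING = "success", "failure", "running"


class Action:
    """Leaf node wrapping a function that takes the state and returns a status."""
    def __init__(self, fn):
        self._fn = fn

    def tick(self, state):
        return self._fn(state)


class Sequence:
    """Runs children in order; stops at the first child that does not succeed."""
    def __init__(self, *children):
        self._children = children

    def tick(self, state):
        for child in self._children:
            status = child.tick(state)
            if status != SUCCESS:
                return status
        return SUCCESS


class Selector:
    """Tries children in order; stops at the first child that does not fail."""
    def __init__(self, *children):
        self._children = children

    def tick(self, state):
        for child in self._children:
            status = child.tick(state)
            if status != FAILURE:
                return status
        return FAILURE


def at_destination(state):
    return SUCCESS if state["position"] >= state["destination"] else FAILURE


def unload(state):
    if state["cargo"] > 0:
        state["cargo"] -= 1  # unloading takes one tick per cargo unit
        return RUNNING
    return SUCCESS


def drive(state):
    state["position"] += 1
    return RUNNING


# "If at the destination, unload; otherwise keep driving."
tree = Selector(
    Sequence(Action(at_destination), Action(unload)),
    Action(drive),
)
```

Calling `tree.tick(state)` once per simulation cycle plays the role of the scheduler: a multi-step activity such as unloading simply keeps returning `RUNNING` until it is done.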

If there was another possibility of a perceptor making the vehicle agents to change their routes, would I have a IRouteAssignmentCommandSource injected in a RouteAssignmentCommandPerceptor

IRemoteRadioCommandSource is just an abstraction to plug a few holes in my examples. It does not matter that much where a command comes from. But I presume all commands go through the same radio channel, so putting all the commands into the same source makes more sense.

And while we're here, let's change the perceptor to interpret a command via an intermediate state. This way more perceptors may be added without touching the rules.

class TemporaryStopIntentState
{
    // Must be settable after construction, because the perceptor updates it.
    public DateTime Target { get; set; }
}

class HoldPositionCommand
{
    public DateTime ReleaseTime { get; init; }
}

class RadioCommandPerceptor : IPerceptor
{
    /* ... */
    public void Update()
    {
        if (_source.PendingCommand is HoldPositionCommand command)
            _temporaryStopIntentState.Target = command.ReleaseTime;
    }
}

class HoldPositionRule : IRule
{
    /* ... */
    public float RateSelf() => (_temporaryStopIntentState.Target > DateTime.Now) ? 1f : 0f;
}

Also, with the Finite-state machine approach only one rule must be selected and applied at a time, correct? So, in our example, if both rules are "valid" (both a RadioCommand.Hold and a NewRoute exist), only one of these will happen during the first Update cycle and the other one during the next cycle. Have I understood this correctly?

It's always the rule with the highest weight. You want to tweak the rules carefully to ensure that there is no stalemate, otherwise the behavior will be dependent on the phase of the Moon. But again, you don't need to rely on a single FSM, we prefer them because they are cheap and composable.
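
One way to catch a stalemate early, rather than let the selection order decide silently, is to fail loudly when the two best scores are equal. The following Python sketch assumes a hypothetical `pick_rule` helper that receives a non-empty mapping from rule name to weighted score; it is an illustration, not part of the code above.

```python
def pick_rule(scores, tolerance=1e-9):
    """Return the name of the highest-scoring rule, or raise on a tie.

    `scores` maps a rule name to its weighted score (rating * priority scale)
    and must not be empty.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) > 1 and abs(ranked[0][1] - ranked[1][1]) <= tolerance:
        raise ValueError(
            f"stalemate between {ranked[0][0]!r} and {ranked[1][0]!r}"
        )
    return ranked[0][0]
```

Running such a check in tests or in a debug build makes badly tuned weights visible before they show up as erratic behavior.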

Commands are confusing, because they are more similar to events ("I saw a cat") rather than perceptible state ("True, I see a red traffic light right now"). Moving command handling into perceptors (as demonstrated in the code above) may solve this conceptual misalignment.
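
The event-versus-state distinction can be sketched in Python as follows. The `deque` inbox, the class names and the use of a numeric simulation time instead of the wall clock are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class HoldPositionCommand:
    """A one-off event: 'hold until this simulation time'."""
    release_time: float


class StopIntentState:
    """Persistent state that rules can query on every cycle."""
    def __init__(self):
        self.hold_until = 0.0


class RadioCommandPerceptor:
    """Drains command events and folds them into persistent state."""
    def __init__(self, inbox, stop_intent):
        self._inbox = inbox
        self._stop_intent = stop_intent

    def update(self):
        while self._inbox:
            command = self._inbox.popleft()
            if isinstance(command, HoldPositionCommand):
                self._stop_intent.hold_until = command.release_time


class HoldPositionRule:
    """Knows nothing about radios or commands, only about the intent."""
    def __init__(self, stop_intent):
        self._stop_intent = stop_intent

    def rate_self(self, now):
        return 1.0 if self._stop_intent.hold_until > now else 0.0
```

Because the rule only reads `StopIntentState`, a second perceptor (say, one reacting to a blocked road) could set the same intent without any change to the rule.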

Thank you for your input and the effort you took in demonstrating your answer with a code example. So in your implementation, is the communication/interaction between agents taking place via interfaces such as the IRemoteRadioCommandSource? If there was another possibility of a perceptor making the vehicle agents to change their routes, would I have a IRouteAssignmentCommandSource injected in a RouteAssignmentCommandPerceptor, with a corresponding RouteAssignmentRule to cause the agents to change their routes with something like navigator.ChangeRoute(newRoute)? — Vector Zita, Jul 29 '21 at 21:57
Also, with the Finite-state machine approach only one rule must be selected and applied at a time, correct? So, in our example, if both rules are "valid" (both a RadioCommand.Hold and a NewRoute exist), only one of these will happen during the first Update cycle and the other one during the next cycle. Have I understood this correctly? — Vector Zita, Jul 29 '21 at 22:01
Shadows In Rain I greatly appreciate the effort you put in answering the question in details and with a well thought code demonstration. I am deeply sorry for not awarding to you the bounty :-( , since after some contemplation I concluded that the answer provided by JimmyJames is more in line with the approach followed by the traditional agent-based implementations (e.g. the examples from AI: A Modern Approach seminal course book) and the model architecture I'm striving for my current project. I'm looking forward to interacting with you again! :-) I am very grateful for your assistance. — Vector Zita, Jul 31 '21 at 13:23

JimmyJames · Accepted Answer · 2021-07-30T19:03:56.487

This is where I have a lot of trouble understanding how to proceed, and I get the feeling that I am not really following the correct approach. First, I am not sure how to make the VehicleState class more modular and extensible.

While you have a number of specific questions here, what I see is a lot of general issues with the code you have shown. A lot of the struggles you are having seem to be related to not using OO concepts properly. The point of creating interfaces and classes is to manage complexity. The main problem is that there's almost no encapsulation here and your objects are largely 'property bags', i.e. glorified maps/dictionaries. To start with, I don't think you need a VehicleState or a VehicleStateUpdater on Vehicle. Vehicle objects already have state, and the updater of that state should be the vehicle itself. I'll come back to that later; for now, let's look at your last example, VehicleState:

public class VehicleState
{
  public Route Route { get; set; }

  public Cargo Cargo { get; set; }

  public Location CurrentLocation { get; set; }

  public RadioCommand ActiveCommand { get; set; }

  // Additional properties
  public bool IsAnotherVehicleApproaching { get; set; }

  public Location ApproachingVehicleLocation { get; set; }
}

This is functionally equivalent to a hardcoded dictionary with types. Moreover, a lot of this doesn't make sense to me. Whether another vehicle is approaching is not part of the vehicle state; I would expect that to be part of the Percept (perceptor) state. The approaching vehicle's location is part of the state of the approaching vehicle, which would already have a Location as part of its state.

I don't understand the point of exposing the setter for Location. Something more like an update method, which tells the vehicle to update its own state (such as the Location), would be typical here. Likewise, a setter for Cargo should be replaced with Load and Unload methods.
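
A minimal Python sketch of that kind of encapsulation might look like this. The attribute names and the error handling are assumptions made for illustration.

```python
class Vehicle:
    def __init__(self, position=0, rate=1):
        self._position = position
        self._rate = rate
        self._cargo = None

    @property
    def position(self):
        """Readable from outside, but only the vehicle itself changes it."""
        return self._position

    @property
    def cargo(self):
        return self._cargo

    def load(self, cargo):
        if self._cargo is not None:
            raise RuntimeError("vehicle is already loaded")
        self._cargo = cargo

    def unload(self):
        if self._cargo is None:
            raise RuntimeError("nothing to unload")
        cargo, self._cargo = self._cargo, None
        return cargo

    def update(self):
        """Advance one simulation step; the vehicle moves itself."""
        self._position += self._rate
```

With `load` and `unload` as the only way to change the cargo, invalid transitions (such as loading twice) are rejected in one place instead of being checked by every caller.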

RadioCommand seems overly specific and you should probably have a Command interface and a Radio type with a GetCommand method. The HoldPositionCommandRule then becomes a HoldPositionCommand. Command and Rule might look like this:

public interface IRule
{
    bool canMove();

    // ...
}

public class Command
{
    public List<IRule> Rules { get; } = new List<IRule>();

    public void addRule(IRule rule)
    {
        Rules.Add(rule);
    }
}

Then you can have:

public class HoldPositionRule : IRule
{
    public bool canMove()
    {
        return false;
    }
}

I put together a small working example using Python. Before anyone gets too excited, I'm not proposing this as the best code ever written or as a really interesting agent example. My priority here is to keep things simple and short. To that end, I've eliminated the route and hardcoded it as a straight line 30 units long. There are two rules: follow with a minimum distance of 2 units, and hold position. I put three vehicles on the road and stop the simulation when the first one ('A') gets to the end. On every 10th cycle, truck 'B' is given a 'hold' command, and on every 3rd cycle that is not also a 10th, it is given the 'follow' command. The perceptor can only see vehicles that are ahead of it (or next to it, which shouldn't occur in this example).

class HoldPositionRule:
    def can_move(self, distance_to_next):
        return False


class FollowingRule:
    def can_move(self, distance_to_next):
        return distance_to_next is None or distance_to_next > 2


HOLD = HoldPositionRule()
FOLLOW = FollowingRule()


class Command:
    def __init__(self, *rules):
        self._rules = rules

    def can_move(self, distance_to_next):
        for rule in self._rules:
            if not rule.can_move(distance_to_next):
                return False

        return True


class Radio:
    def __init__(self):
        self.command = Command(FOLLOW)


class Location:
    def __init__(self, position):
        self.position = position

    def update(self, rate):
        self.position += rate

    def arrived(self):
        return self.position >= 30

    def __repr__(self):
        return str(self.position)

    def distance(self, other):
        return other.position - self.position


def distance(a, b):
    return a.location.distance(b.location)


class Environment:
    def __init__(self):
        self.vehicles = []

    def add(self, vehicle):
        self.vehicles.append(vehicle)

    def visible(self, perceptor):
        return [v for v in self.vehicles if distance(perceptor, v) >= 0]


ENVIRONMENT = Environment()


class Perceptor:
    def __init__(self, vehicle):
        self.vehicle = vehicle
        self.location = vehicle.location
        self.visible = []

    def update(self):
        self.visible = [v for v in ENVIRONMENT.visible(self) if v is not self.vehicle]


class Vehicle:
    def __init__(self, id, location):
        self.id = id
        self.radio = Radio()
        self.location = location
        self.perceptor = Perceptor(self)
        self.rate = 1

        ENVIRONMENT.add(self)

    def arrived(self):
        return self.location.arrived()

    def look(self):
        self.perceptor.update()

    def update(self):
        visible = [distance(self, v) for v in self.perceptor.visible]
        distance_to_next = min(visible) if len(visible) else None

        if self.radio.command.can_move(distance_to_next):
            self.location.update(self.rate)

    def __repr__(self):
        return f"{self.id}: {self.location}"


truckA = Vehicle("A", Location(0))
truckB = Vehicle("B", Location(-1))
truckC = Vehicle("C", Location(-2))

vehicles = [truckA, truckB, truckC]

for cycle in range(100):
    if truckA.arrived():
        break

    for v in vehicles:
        v.look()

    if cycle % 10 == 0:
        truckB.radio.command = Command(HOLD)
    elif cycle % 3 == 0:
        truckB.radio.command = Command(FOLLOW)

    for v in vehicles:
        v.update()

    print(cycle, vehicles)

This is probably simpler than what you want to do, but if you start with something along these lines and get it working, you can modify it to be more complicated. The two parts that will likely be the most challenging are how your perceptor pulls in environment details and how you apply rules based on the output of the perceptor. A lot of this depends on the complexity of the simulation. I would recommend starting simple and reworking it only when it proves inadequate.
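
One possible way to grow the example without changing every rule's signature each time the perceptor learns something new is to pass the rules a single percept object. This is only a sketch under assumptions: `Percept`, its `at_cargo_deposit` field and `StopAtDepositRule` are hypothetical and not part of the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Percept:
    """Hypothetical bundle of everything the perceptor saw this cycle."""
    distance_to_next: Optional[float] = None
    at_cargo_deposit: bool = False  # newer field; older rules simply ignore it


class FollowingRule:
    def can_move(self, percept):
        d = percept.distance_to_next
        return d is None or d > 2


class StopAtDepositRule:
    def can_move(self, percept):
        return not percept.at_cargo_deposit


class Command:
    def __init__(self, *rules):
        self._rules = rules

    def can_move(self, percept):
        # Any single rule can veto movement, so rules need not be
        # mutually exclusive.
        return all(rule.can_move(percept) for rule in self._rules)
```

Adding a field to `Percept` leaves `FollowingRule` untouched, which keeps each new perceptor capability a local change.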

Thank you for your thorough response. So the Vehicle agent should implement its capabilities not via a state class but via public methods (i.e. getRadio)? So if I want to incorporate a perceptor that notifies vehicle if a cargo deposit is nearby, the update method should be something like update(bool canMove, bool isAdjacentToCargoDeposit)? If another perceptor notifies vehicle about a new route it must take (i.e. sent by the Supervisor), should I rewrite the update method as update(bool canMove, bool isAdjacentToCargoDeposit, Option newRoute)? And code respective rules? — Vector Zita, Jul 29 '21 at 21:16
Should the IRule interface define new methods (something like IRule { bool canMove();, bool isAdjacentToCargoDeposit();, Option IsAssignedNewRoute; } ? I am trying to understand how to expand this with more rules and perceptors and this is where I am a little unsure, so I would like to make sure I have understood this properly. — Vector Zita, Jul 29 '21 at 21:22
About the rules themselves, should they be mutually exclusive? I mean, in your example if at least one rule states that vehicle can't move, then canMove is set to false. Should I make sure that the rules do not contradict themselves, i.e. a rule can not state canMove is false but another one that canMove is true? I think I am starting to understand the design principles behind your example, it's just those little details which make me nervous that, due to my lack of a deeper understanding, I might interpret them the wrong way, so I would be grateful if you could clarify them a bit more. — Vector Zita, Jul 29 '21 at 21:29
@VectorZita I just want to be clear that I have very little practical experience with agent design but I have a lot of interest. I'm also a little impaired because my C# skills are extremely weak. Would you be OK with Python? — JimmyJames, Jul 29 '21 at 21:32
OP code vacillates between being an abstract framework and a specific model. It is only creating confusion. For now, forget about implementing a general, abstract, hollow `I`nterface framework. Not needed to adhere to the Open/Closed principle; so no `IAgent` just `Vehicle` for example. Extracting a more general framework will be easy enough given a working "vehicle" simulation. — radarbob, Jul 30 '21 at 00:52
@radarbob I agree. My first cut was an attempt to not stray too far from the OPs code. I think I can give better (working even) example with Python. — JimmyJames, Jul 30 '21 at 16:19
I'm sure there's missing "state" definitions. `Rule.canMove()` is a symptom; why not also `canStop`, `cannotStop`, ... OP was wise to recognize where that was going. A vehicle can be `stopped`, `moving`, `loading`, `unloading`, etc. `Vehicle` may have other object aspects enumerated ('loaded', 'empty'). "Movement" is 1 property of the object's complete state - the values of all object properties (public and not) at a given moment; it is not part of a vehicle's state if it is not in `Vehicle`. Ditto for all "shuttling cargo" domain classes. That's Single Responsibility thinking. — radarbob, Jul 30 '21 at 18:19
OP, generally think of individual objects exposing (via a method or property call) its own, and only its own, state. Rules/algorithms will analyze this "bigger picture" state and act. A given Rule API, method signature, is the same no matter who executes it - vehicle or supervisor. Consistent Rule API allows for moving the code if needed. This makes me think about the [Visitor Design Pattern](https://www.oodesign.com/visitor-pattern.html). — radarbob, Jul 30 '21 at 18:48
@radarbob "A vehicle can be stopped, moving, loading, unloading, etc." Perhaps, but it all depends on what the simulation is intended to model. Just like a driving simulation is unlikely to account for the relative softness of asphalt in the sun versus in the shade. The model is as simple or as complex as the modeler chooses to make it. The model is only wrong if it doesn't meet the designer's needs. It doesn't even need to align with reality. I've played plenty of video games where you can turn around and go backwards in mid-air after jumping. — JimmyJames, Jul 30 '21 at 19:09
@JimmyJames I greatly appreciate the complete working example you included in your answer. It is close to what I had in mind, although I was struggling to put into code due to my lack of experience and relevant knowledge. The bounty is yours :-) Having said that, I can understand the objections raised by radarbob, since, for example, if there was another ability for the rule to check (e.g. LoadCargo), then I would have to include both CanMove and LoadCargo methods in the definitions of the rules and the commands. Hopefully I'll soon be able to wrap my head around this design approach. — Vector Zita, Jul 31 '21 at 13:32
@VectorZita Awesome thanks. I really had to pare this down to make it a reasonable length for an SE answer. Given I used Python and bare access when I would probably prefer methods, I think that's a fair demonstration of how much code can go into creating something like this. I just wanted to give you a working base to build upon. Use versioning and start hacking on it. — JimmyJames, Jul 31 '21 at 22:51
