Cognitive Modelling with Mathematical Logic

Related pages:

Some Background Assumptions

Data Structures

The basic idea behind the AI for a game is to look at what's known about the current world situation, and then find sequences of actions that can be expected to lead to desirable new situations that fulfill or further a character's goals (goal-directed behavior). Oz's constraint solving tools can help with many parts of this.

Situation (Cognitive World Model)

We model the world with a set of variables, which we call a "situation record". (In Oz, a record is a mapping of tags to variables -- like a struct.) Fields of a situation record can themselves be records, or even varying-length lists. We'll assume that the members of our situation record are restricted to having values that are natural numbers (zero or positive). We will further restrict each variable to a finite subset of the natural numbers, because this is adequately expressive and fits Oz's finite domain constraint tools. We'll follow Funge's terminology and call these variables the fluents of the model. Oz allows us to define the domain of a variable directly, and then narrow the domain by applying constraints. If the domain of a variable can be narrowed down to a single value, then that variable is said to be fully determined; otherwise it is indeterminate but (hopefully) constrained.
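
To make this concrete, here is a minimal sketch of a situation record using Mozart/Oz's FD module. The fluent names (fuel, engineTemp, shieldsUp) and their ranges are invented for illustration; a wide domain simply means "not yet determined".

  declare S0 in
  S0 = sit(fuel:       {FD.int 0#100}   % unknown: anywhere from empty to full
           engineTemp: {FD.int 0#200}   % unknown
           shieldsUp:  {FD.int 1#1})    % known: shields are up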

If a situation record's fluents are all determined, then the record describes a specific situation. If some of them are undetermined, then the situation record actually represents a set of situations. This is a flexible way of dealing with uncertainty by representing large sets of possible situations compactly.

We define fluents by describing their name, domain, and an optional inference constraint. The domain is simply a statement of the range of numbers that the fluent can take. The inference constraint tells us how to infer the value of a fluent in a new situation from its value in a previous situation. In the simplest case, it simply states an equality between the two values; in other cases, the constraint is more general, reflecting uncertainty about the character's ability to extrapolate the evolution of the fluent.
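
As a sketch (the name, domain, and drift bound are all invented), a fluent description could itself be an Oz record holding the name, the domain spec, and the inference constraint as a procedure relating the old and new values:

  declare EngineTempFluent in
  EngineTempFluent =
     fluent(name:   engineTemp
            domain: 0#200
            infer:  proc {$ Old New}   % temperature drifts by at most 10 per step
                       New =<: Old + 10
                       Old =<: New + 10
                    end)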

Action

Actions are described by a name, a precondition axiom, an effect axiom, a set of input variables, and an effector.

The precondition axiom is a boolean expression on the situation fluents, evaluating to true when an action is possible and false otherwise. Although in general constraints are not functions but rather statements of relationships, you can think of the precondition axiom as a function on the situation, producing a boolean value.

The effect axiom is a set of constraints tying (in general) a previous situation's fluents to a new situation's fluents. The previous situation is the one in which the preconditions for the action were evaluated; the new situation is the one that we are postulating as a result of the action.

The input variables allow for parameterized actions. For instance, the action move might have the variables X and Y. Or, more likely, at a higher level, attack might have an input variable target. When searching for new hypothetical situations, we try different combinations of inputs for each action, letting the inputs vary over their domains. In effect, one parameterized action can actually spawn multiple possible concrete actions.

The effector is simply a procedure that is called, with the selected input variable values, to send execution requests to the game engine. (For testing purposes, the effector will store new fluent values for the next simulated situation.) Effectors play no role in the planning process; they are only invoked after we have selected a plan.
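
Putting the pieces together, one possible shape for an action description is sketched below. Every name here is illustrative, including SendToEngine, which merely stands in for whatever request interface the game engine actually exposes.

  declare Burn in
  Burn =
     action(name:     burn
            inputs:   [thrust#(1#3)]            % one input and its domain
            pre:      proc {$ S}                % precondition axiom over S's fluents
                         S.engineTemp <: 150
                         S.fuel >: 0
                      end
            effect:   proc {$ S0 In S1}         % effect axiom relating S0 to S1
                         S1.fuel =: S0.fuel - In.thrust
                         S1.engineTemp >=: S0.engineTemp
                      end
            effector: proc {$ In}               % only called once a plan is chosen
                         {SendToEngine burn(thrust:In.thrust)}
                      end)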

Decision

TBS

Goal

TBD

Thinker Server

The top-level algorithm has these elements:

Situation/Decision Space Search

The search tree is rooted in the current situation, and consists of hypothetical situations that are generated by selecting actions and action inputs. Action inputs should be fairly restricted in domain, and can be chosen freely, although some choices will conflict with the situation and thus be rejected. For example, it is impossible to walk east when standing next to a wall to the east. TODO: This means that action inputs cannot be chosen freely; by parameterizing actions, I have simplified the model, but now have to introduce complexity to constrain the inputs based on the fluents. Selecting actions and inputs leads to new hypothetical situations. (In fact, indeterminate situation records represent sets of possible situations rather than individual situations. TODO: Should contained sets be combined in some way?)

The search tree can thus be visualized as consisting of situation nodes, connected by edges that each represent a particular choice of action and inputs.

The search algorithm can probably be implemented using Oz's predefined search engines, making it easy to experiment with different search methods. Potential solutions are those that satisfy the character's current goal(s), which must be stated in terms of fluents. By measuring the degree of satisfaction, best-solution (branch-and-bound) search algorithms can be employed, which are probably much better suited to this problem than simpler methods. In fact, satisfaction can itself be considered a fluent, representing a state of the character's mind. Thus, goals are modelled by constraints that relate a satisfaction measure to fluents.
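
As a toy sketch of this idea (the fluent names, numbers, and goal are all invented), satisfaction can be computed as a sum of reified goal constraints on the hypothetical next situation, and Oz's SearchBest can then look for plans with ever-higher satisfaction:

  declare Script Order in
  proc {Script Sol}
     Thrust Fuel1 Temp1 Satisfaction in
     Sol = plan(thrust:Thrust satisfaction:Satisfaction)
     Thrust :: 1#3                         % action input
     Fuel1  :: 0#100                       % fluents of the hypothetical next situation
     Temp1  :: 0#200
     Satisfaction :: 0#2                   % number of goal terms satisfied
     Fuel1 =: 40 - Thrust                  % effect axioms, from a known S0 (fuel=40, temp=90)
     Temp1 =: 90 + 10*Thrust
     Satisfaction =: (Fuel1 >=: 38) + (Temp1 =<: 110)   % goals as reified constraints
     {FD.distribute ff [Thrust]}
  end
  proc {Order Old New}
     Old.satisfaction <: New.satisfaction  % each new solution must be strictly better
  end
  {Browse {SearchBest Script Order}}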

Starting with the current situation, which reflects sensory input from the game engine: infer current situation knowledge; infer possible actions; distribute on actions; distribute on action inputs; infer effects of actions; infer the remainder of the next state; recurse to determine the next action; direct the search based on value; and limit the depth of search.

These elements can be thought of as occurring in approximately the order listed, although in fact some of them occur in an interleaved fashion due to the way that Oz's constraint propagation works. The following subsections discuss each element in turn, sometimes describing different possible levels of complexity for that step.

Throughout this algorithm discussion, we assume that we're planning a sequence of steps based on our current knowledge of the world's situation, without further input. The entire algorithm of course needs to be performed repeatedly, and in between these iterations we receive new input from the game engine to update our notion of the world's state. Each search is independent of the previous ones, searching from what's known of the latest situation.

When I discuss specific examples, I'll assume we're modeling the pilot of a spaceship.

Infer current situation knowledge

We start with the current situation as input. This does not have to be a specific situation, because we might not know everything we possibly could about the world. The state of our computation space is determined by this current situation record. We'll refer to this record as S0.

As a first step, we propagate constraints that model relationships which the character believes must be true in a consistent world. These constraints relate fields of S0 to each other. They can be useful because we might be able to infer information that we don't otherwise have. However, we have to be careful with these -- if one of these constraints fails, then we're essentially saying that the character can't think about this situation because it seems impossible! Note: Such constraints might be an embellishment better left alone initially. If used, they must be applied speculatively.
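
For example (fluent names and numbers are invented), a single consistency constraint can narrow a fluent we never sensed directly:

  declare Reactor Shields Weapons in
  Reactor = {FD.int 0#100}        % never sensed directly
  Shields = {FD.int 60#60}        % known exactly
  Weapons = {FD.int 30#40}        % roughly known
  Shields + Weapons =<: Reactor   % power drawn cannot exceed reactor output
  % Propagation narrows Reactor to 90#100 without any new sensor reading.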

JDH: It strikes me that you need some form of 'fail safe' mechanism to allow a character to not become paralyzed by a world state that they believe impossible - one could model how comfortable a character is with such a situation by the action performed by the 'fail safe'. It seems to me that this first step will be of vital importance to the performance of the model - you'd either have to spend a long time gathering information from the game world so that you have a complete enough information set to proceed or you'll have to spend a long time searching the decision space because you'll have lots of solutions (because each unknown must be distributed across all possible values). That last bit in parentheses is my understanding of what you've said so far - and it made me think of something: can we introduce probabilities for the likely values of unknown fields and thus reduce the search space?

Infer possible actions

Next, we propagate an important set of constraints which determine what actions are possible. We have precondition axioms that tell us the preconditions for all the actions that we know about. These test fields in S0 and, for each action, determine whether it is possible to perform it. Since S0 might not be completely determined, the answer might be "maybe" for some actions. (One way to think of this is that there's a boolean-valued field for each action, stored in S0, i.e. a variable with initial domain 0#1. The constraints try to narrow this domain, but may not be able to if their input variables are insufficiently determined.) For instance, if our current engine temperature is too high to allow use of the engine, then any action that requires use of the engine can be ruled out right away.
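
A small sketch of the "maybe" case using a reified constraint (the fluent name and the temperature limit are invented):

  declare Temp CanUseEngine in
  Temp = {FD.int 60#180}          % engine temperature, only roughly known
  CanUseEngine = (Temp <: 150)    % reified precondition: 0, 1, or still 0#1
  % With Temp this uncertain, CanUseEngine stays at 0#1 ("maybe");
  % learning, say, Temp <: 100 would force it to 1.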

When the constraints have all done their utmost (the computation space has become "stable"), then we move on to the distribution step. You might think I'm hiding a lot of complexity in that sentence, but Oz makes that step very easy. Defining the constraints may be hard work, but propagating them is done entirely automatically. In fact, we'll be adding a lot more constraints later as we elaborate on our algorithm.

Distribute on actions

Now we have to make a decision -- what action do we take?

What we want to do is select from the possible actions, and determine the next state record that results for each one. We have a set of effect axioms that tell us how we expect the world to change when we take a particular action. We want to try some or all of these, and in each case find the resulting next situation record. There's room for endless creativity in guiding the order of action selection for efficient reasoning.

(More to come. For now, assume that we blindly try each action that hasn't been ruled out, in random order.)
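
For what it's worth, the choice of action can itself be treated as one more choice point in the search; a trivial sketch (the integer encoding of actions is invented):

  declare ChooseAction in
  proc {ChooseAction Which}
     Which :: 1#3                   % 1=burn, 2=turn, 3=wait
     Which \=: 1                    % suppose burn was already ruled out by its precondition
     {FD.distribute naive [Which]}
  end
  {Browse {SearchAll ChooseAction}} % tries each remaining action in turn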

Distribute on action inputs

This step can use standard Oz distribution functions, since we're working with finite domain variables and we're free to select them arbitrarily. (Contradictions with constraints on inputs will automatically cause failure of a branch.)
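
A sketch of that step (the inputs and the conflict constraint are invented): each solution of the script is one concrete input combination, and conflicting choices simply fail.

  declare InputScript in
  proc {InputScript Sol}
     Thrust Heading in
     Sol = inputs(thrust:Thrust heading:Heading)
     Thrust  :: 1#3
     Heading :: 0#3                 % say 0=N, 1=E, 2=S, 3=W
     Heading \=: 1                  % e.g. a wall to the east rules out heading east
     {FD.distribute ff [Thrust Heading]}
  end
  {Browse {SearchAll InputScript}}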

Infer effects of actions

Here's what we do for each action that we try out. We create a new situation record, S1. Initially, all variables in S1 are completely undefined, i.e. nothing is determined about the next situation yet. We apply the effect axioms for our selected action, which are just more constraints. These constraints relate variables in S0 to variables in S1, so they should have the effect of defining domains for some variables in S1.
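
Sketched with a single fluent (names and numbers invented):

  declare S0 S1 in
  S0 = sit(fuel:{FD.int 40#40})    % known: 40 units of fuel
  S1 = sit(fuel:{FD.int 0#100})    % nothing known about the next situation yet
  S1.fuel =: S0.fuel - 3           % effect of a burn that consumes 3 units
  % Propagation narrows S1's fuel to exactly 37.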

side note: The quality, i.e. narrowness, of these definitions in S1 will probably be dependent on how much we knew about the initial situation, and maybe also how certain we are about the action's effects. Hmm, this goes beyond anything Funge wrote about. :->

Infer remainder of next state

TODO: This step should happen only after effect axioms are applied, and only speculatively, in preparation for further search from a hypothetical situation.

Next, we have to figure out whatever we can about all the other variables in S1. In the simplest case, we assume that the only thing that can change in the world is due to our own actions. This means that, for all variables in S1 that weren't constrained by our effect axioms, we copy their current states from S0.

A more sophisticated approach is to have additional constraints that relate S0 variables to S1 variables. An example: We know our engine temperature can only change at a certain rate in either direction. If we know something about our engine temperature in S0, then we can restrict the domain of the engine temperature appropriately in S1. If we know the temperature exactly in S0 (well, within the granularity of our model), and the state of other systems that affect it, then we might be able to predict the new temperature exactly. Since we may be doing further action planning starting from S1 very soon, this could be important information!
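
A sketch of such a rate-of-change constraint (the bound of 10 per step is invented):

  declare Temp0 Temp1 in
  Temp0 = {FD.int 90#100}   % engine temperature, roughly known in S0
  Temp1 = {FD.int 0#200}    % unknown in S1
  Temp1 =<: Temp0 + 10      % it cannot change by more than 10 per step...
  Temp0 =<: Temp1 + 10      % ...in either direction
  % Propagation narrows Temp1 to 80#110 without any new sensor data.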

We also might have rules that tell us what we expect other objects in our world to do. For instance, the nearby space station will continue hurtling along its orbit. The enemy behind us will continue pursuing us.

We could have sophisticated sub-systems of constraints to infer what an opponent will do, based on our knowledge (or guesses) of their intentions. This starts to get pretty heavy-duty, though, because we're now talking about our cognitive model's model of our opponent's cognitive model. :-)

In any case, the default constraint for a variable in S1 should be that it is the same as in S0. (Note: Don't use Oz's equals operator, which really unifies the two variables. Use the equality constraint propagator instead!)
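
An illustration of that note (record shapes invented): post the frame default with the =: propagator, not with unification.

  declare S0 S1 in
  S0 = sit(shieldsUp:{FD.int 0#1})
  S1 = sit(shieldsUp:{FD.int 0#1})
  S1.shieldsUp =: S0.shieldsUp   % equality propagator keeps the two fluents in step
  %S1.shieldsUp = S0.shieldsUp   % unification would really merge the variables; avoid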

Constraints imposed by effect axioms in a new hypothetical situation can be in conflict with inference constraints. Thus inference constraints have to be considered optional. If we know an action's output fluents, we could simply avoid applying inference constraints to those outputs. However, this is difficult without a front-end processor for the effect axiom expressions. Alternatively, inference constraints can be applied speculatively. This will be expensive, but also holds out the possibility that the constraints could combine (in cases where an action's effects are not fully determined) to further narrow the domain of a fluent, improving the quality of our situation prognosis. (TODO: Is this logically valid for all actions?)

Recurse to determine next action

Now that we have S1, an assumed next state based on an action that we might take, we can start looking farther ahead. We use S1 as a new starting state for a further action to take.

This is where our model diverges somewhat from the constraint solving described in Oz's tutorials. (I.e., this is where I keep getting stuck in my head. Finally I've written down enough stuff to have exposed the culprit explicitly!) The examples I've studied all try to narrow the range of a set of variables down to a solution. In our case, we just deal with uncertainty, and generate new sets of variables describing possible situations at a different time in the future.

Note: Is it possible or desirable to use the more conventional model to direct the top level of the planner? We break out from S the variables that represent goal fulfillment, "solve" for these, continually narrowing their domains as we approach solutions. All this other stuff is subordinate to the actual variables of the system being solved. Hmm.

Direct search based on value

Each situation Sn can be evaluated for its desirability. Comparisons between them can be used to decide which one to explore next.

Limit depth of search

We need to limit the time spent searching for good action sequences. Each time that we are about to select an action for further investigation, we can check whether we should instead backtrack. In general, our model won't have "solutions", so this is essential.

Funge to Oz: Mapping concepts to mechanisms

Preliminary thoughts. For now, assume that each fluent's value is from a finite domain, representable by a non-negative integer. This allows us to approach the problem using Oz's finite domain problem solving tools.

;Situation: Solved root of a constraint problem. In Oz, solving a constraint problem produces a single output, which is a record. The record may contain nested records. The leaf fields of the root record are the variables of the problem. If all of the fields are determined (i.e. have specific, single values), then the record denotes a specific situation. If a leaf field is not determined, then in Oz's finite domain constraint system, it has a value which denotes the set of specific values that it could possibly have. (This could be a min-max interval, a set of specific values, or a combination of the two.) If one or more leaf fields are not determined, then the situation record represents a (possibly very large!) set of possible world situations. Funge's "interval-valued epistemic fluents" concept thus maps perfectly to the undetermined variables in a constraint store.

;Fluents: The leaf elements of the situation record.

;Defined fluents: A defined fluent is a fluent that is defined in terms of other fluents. Having these in the situation is obviously redundant, and should probably be avoided. However, they can be represented as variables in the local space of the problem-solving procedure, because they're convenient for computation. Funge cautions against circular dependencies, and insofar as we treat these definitions as temporary variable introductions rather than constraints, he is correct. See the next item for constraints.

;State constraints: These are modelled exactly as constraints on fluents in Oz. In other words, a state constraint states a required relationship between fluents. The propagation step of problem solving applies constraints to narrow the possible domains of undetermined variables. In a hypothetical situation that is being considered by the planner, failure of a constraint indicates that the situation is considered impossible. (It could also mean that the situation is completely undesirable, and thus not open to further consideration, but this should be done only cautiously to limit computation. Over-use of this technique could lead to a character's inability to deal with unpleasant situations!)

;Precondition axioms: TBD

;Effect axioms: TBD

Some open questions

History

More to come!