[MUD-Dev] Solutions for smarter NPCs (long)

Robert Zubek rob at cs.northwestern.edu
Sun Feb 1 14:19:56 New Zealand Daylight Time 2004


From: Björn Morén

> I'm curious about what technical solutions you have tried in order
> to get NPCs act more interactively and smart. Better AI to support
> a more vivid world. Both regarding combat behavior for NPC
> monsters and conversation behavior for NPC merchant / quest
> assignment types.

> Have anyone found it fruitful to use a functional/logic
> programming language like Prolog or Mercury? I guess most use a
> state machine abstraction in regular C/C++? I also guess a lot has
> been done by scripting (Python, Lua, Javascript), but how far can
> you get with that? What about neural nets? Fuzzy logic?

Ah, character AI, a topic near and dear to my heart. :) I'll happily
go into some of my experiences with that; and I'd be curious to see
how they relate to everyone else's.

First, smart behavior in the world. I've tried simple finite state
machines, behavior-based systems, and combinations of behavior-based
and symbolic reasoning.

FSMs are notorious for their limitations. I'm especially not a fan
of what I call a "parliamentary behavior" - only one state in the
FSM has the floor at any given time. The selected state takes care
of handling all perception and controlling all action, and it has to
decide when to give it up.

This causes obvious problems, because now every state has to know
about all the possible conditions in which it should yield the floor
- and those are usually numerous. Suppose I have a ConversationState
that's happily active, talking away with some other character, when
suddenly, a tiger attacks! The ConversationState should immediately
yield the floor to the RunAwayState, or the agent will get
eaten. But in order to do that, the ConversationState has to
*always* be looking out for tigers and other dangerous
things. Indeed, ConversationState has to know about all the triggers
to all the other states, and needs to be constantly monitoring
them. Otherwise, a filibuster occurs, and the agent becomes tiger
food.

Given a large number of states in a typical system, this doesn't
scale - because every state has to be on the lookout for every
possible condition that requires transition to every other state. So
behavior engineering becomes brittle - when we go in to change a
behavior, we are faced with the danger of having to adjust all the
transitions related to it, which increases the danger of introducing
bugs.

An attractive alternative to this is to use behavior-based systems,
which I've done in the form of action behaviors. An action behavior
networks is made up of behavior modules and arbitration
modules. Each behavior module works in parallel, analyzing
perceptory inputs, and suggesting activity (e.g. keep moving south
at 1 m/s) along with an activation level for this activity on a
scale from 0 to 1. Outputs of the behaviors then get funneled into
arbiters, who reconcile the different suggested activities. The
arbitration depends on the kind of an activity; for example,
movement directions can be translated into 3D vectors and summed up,
other actions can be selected based on most-active or first-active
criteria, actions can be allowed to subsume or inhibit each other's
outputs, and so on. This entire network gets run many times a second
- 10 Hz minimum for real-world applications - so that behaviors and
arbiters can re-evaluate even the most quickly changing situations
and produce the appropriate response.

This action behavior network works more like a rowdy debate than a
parliamentary session. :) Each behavior continuously monitors the
inputs it cares about, and produces an action whose activation is
proportional to its importance. So, for example, the
ConversationBehavior can become medium-active when the agent is
bored, and the RunAwayBehavior could be always looking out for
enemies, and produce a run-away action that's not very active until
the enemy comes to within 50m, at which point the activation level
spikes.

In my experience this works much better than simple FSMs
themselves. Most importantly it prevents the filibuster problem, as
mentioned before. In behavior networks, each behavior always says
what it wants to do, and it speaks more loudly the more important it
considers the situation; then there's a set of arbiters continually
re-evaluate which behavior to listen to.

---

Unfortunately, both FSM and behavior-based systems are notoriously
bad at handling complex decomposable tasks - such as decomposing a
high-level goal into a sequence of sub-goals, and those into action
steps. That's where explicit reasoning systems come into play.

Reasoning systems are often combined with behavior-based systems in
the form of two-tiered architectures: the reasoner (or planner) sits
on top, analyzes the situation and decomposes it into
immediately-achievable goals, and the behavior system sits at the
bottom, running asynchronously, and simply tries to fulfill whatever
goals the reasoner decided.

Hybrid architectures are very common in robotics, and get built in a
large variety of tools and languages. My own hybrid architecture for
a MUD agent was written in Lisp, using a truth-maintenance system
for the reasoning engine, and a simple behavior network for the
reactive layer.

But in the end, it doesn't seem like most game situations are
complex enough require explicit reasoning. Usually a static
inference network, built at design time, should do just fine. :)

---

As for conversations, they're complicated. :) I've never liked the
Adventure-style fixed-grammar text interfaces - they feel about as
natural as marching in lockstep. But if we abandon fixed grammars,
we're left with Eliza-style pattern matchers, which are way too
simple to be useful.

I've had some success fooling people with Eliza, by taking advantage
of domain limitations. The agent was a bot for Counter Strike, and
conversations in this multiplayer shooter tended to be disjointed,
schizophrenic, and paying as little attention to the conversation
history as they do to grammar. In other words, it was an environment
ideally suited for Eliza. :)

As for 'real' conversation, it's an area of active research. There's
some interesting AI work on finite-state dialog models that might be
applicable to games - although games have a peculiar constraint in
that they concentrate on believability and suspension of disbelief,
which doesn't have to go together with complex linguistic
abilities. :)

I'm personally a big fan of probabilistic finite-state models, which
seem to fit very nicely in this domain. In particular, for my degree
(Ph.D. in AI), I'm building a system for natural language engagement
in stereotyped social interactions in games. I'm using Markov
models, which are both quite powerful and exceedingly inexpensive
(compared to more traditional reasoning methods), to model
interaction protocols. The idea is that a hierarchy of a large
number of these independent process descriptors can lead to a more
believable conversation - the specialized protocols help it advance
like a real conversation (so it's much better than Eliza), and the
hierarchy allows for gentle failure at the edge of
competence. Unfortunately, I don't have much on the web about it,
but if you send me an email I can send you a copy of my thesis
proposal, with all the meaty details. :)

---

Finally, languages. All the behavior-network work was done in a
custom language called GRL (the generic robot language), which got
compiled down to Scheme, C, or Basic, or UnrealScript, depending on
the project. One could write behaviors in a language like C++ from
scratch, of course. But it wouldn't be pleasant.

I tend to evaluate languages on two dimensions. One, how does it
function as the description of computation - is it fast, what kind
of memory management does it support, and all that standard CS
stuff. Two, and what I think is more important in game design, is
how easy does it make it to describe system behavior. And this is
the Achilles' heel of many general programming languages.

The problem is, mechanics of a language may obscure the workings of
the system. When I'm writing conversation protocols for my agent, I
have a difficult enough time working within the constraints of my
engine; I do not want to be bothered with pointer chasing, memory
management, OOP limitations, and all those details. I want to be
writing the interesting stuff, and let the computer take care of
everything that's mundane or error-prone. My philosophy is - build a
language that will make it easy to describe the desirable system,
and then build my system in that language. :)

In the end, for my conversation engine I have a two-language
setup. Conversation protocols are written in a compact
domain-specific language (definition language, not Turing-complete)
built on top of Lisp, and those gets translated into C++ and
compiled with the rest of the engine. And because Lisp knows all
about the protocols at definition time, it can produce code that
does all the obvious optimizations, such as laying out objects in
memory ahead of time and converting all object lookups into O(1)
operations.

By the way, this is where I think Lisp-based languages excel - in
building a domain-specific language, because they make it so easy to
redefine Lisp itself. It's not necessarily a benefit of functional
programming per se, but rather the particular plasticity of Lisp
itself, with its ability to treat code as data and vice versa.

---

Anyway, that's my story. :) All of the systems above (except for the
conversation system) I've described in my papers - they're available
for download at http://www.cs.northwestern.edu/~rob/publications/ if
anyone's interested. :)

Best,

Rob

--
Robert Zubek
http://www.cs.northwestern.edu/~rob
_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
https://www.kanga.nu/lists/listinfo/mud-dev



More information about the MUD-Dev mailing list