[MUD-Dev] Re: DevMUD considerations and the Halloween article

Chris Gray cg at ami-cg.GraySage.Edmonton.AB.CA
Wed Nov 4 22:37:31 New Zealand Daylight Time 1998

[Jon Leonard:]

 >What sort of environment does it need?  That's probably a good place to
 >start with for figuring out what the "standard" DevMUD VM interface is.

The MUD language it is for is mostly just a real simple strongly typed
structured language. References from a chunk of code to other things
are done via 32 bit "references", which are typed database IDs. A few
of those are reserved for "builtin" functions contained in the MUD
server, like 'SubString', 'Length', etc. There are over 300 of them,
since a lot are used to do the multimedia stuff. They are used for all
"input/output" that the language has. I did some simple optimizations
like having special bytecodes to push/pop locals at small offsets, and
to push small constants. As I mentioned in an earlier post, it does
run-time code modification. E.g. the first time a call to a builtin is
executed, that call is replaced by one with a different opcode, and the
'reference' to the builtin is replaced by the address of the native
code for the builtin. Such changes are never saved to the database -
the original code is reloaded if the in-memory copy is ever purged, or
needs to be redone for some reason. There are a bunch of branch
instructions, a bunch of arithmetic/logical ones, some string comparison
ones, two special forms of case (switch) statements, and a dozen codes
for database access. Delete the codes for database stuff, change the
string stuff to whatever semantics is needed (mine is copy-on-reference,
with bunches of special cases to reduce it), and do something with the
references, and it could run anywhere.
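
To make the run-time patching of builtin calls concrete, here is a rough
sketch in Python (not the original Draco). The opcode names, the reference
ID 101, and the BUILTINS table are all made up for illustration; only the
idea of rewriting the call the first time it executes, and never saving
the patched copy, comes from the description above.

```python
# Hypothetical opcodes; the real instruction set is not given in the post.
OP_PUSH = 0
OP_CALL_BUILTIN_REF = 1     # operand: database reference of a builtin
OP_CALL_BUILTIN_DIRECT = 2  # operand: the native callable itself
OP_HALT = 3

# Stand-in for the server's builtin table, keyed by invented reference IDs.
BUILTINS = {
    101: lambda stack: stack.append(len(stack.pop())),  # plays 'Length'
}

def load_from_database():
    """Return a fresh, unpatched copy of the code (what a reload would give)."""
    return [[OP_PUSH, "hello"], [OP_CALL_BUILTIN_REF, 101], [OP_HALT, None]]

def run(code):
    """Execute a list of [opcode, operand] pairs, patching builtin calls in place."""
    stack = []
    pc = 0
    while True:
        instr = code[pc]
        op, arg = instr
        pc += 1
        if op == OP_PUSH:
            stack.append(arg)
        elif op == OP_CALL_BUILTIN_REF:
            native = BUILTINS[arg]             # slow path: resolve the reference
            instr[0] = OP_CALL_BUILTIN_DIRECT  # rewrite the opcode...
            instr[1] = native                  # ...and replace the reference
            native(stack)
        elif op == OP_CALL_BUILTIN_DIRECT:
            arg(stack)                         # fast path: no lookup needed
        elif op == OP_HALT:
            return stack
```

Only the in-memory list is modified; load_from_database() always hands
back the original form, which mirrors the "never saved to the database"
rule.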

Execution is just a big case (switch) statement, with icky hand-done
code for the various instructions. There is a symbolic disassembler
that can chase down 'references', and can translate local/parameter
references into symbols if the header for the function hasn't been

Currently, my byte-code stuff hasn't been translated to C. It's still in
the original Draco from my AmigaMUD system. I don't expect that work
to be hard - probably a day or two.

I'm not sure I've answered your question here.

Note that my system's "object model" is entirely within the database -
the language itself has no specific object-oriented things in it, and
so none are seen in the byte-code. Perhaps I should briefly go through
that model so folks will know what I'm talking about. Those familiar
with it can skip this. I've posted small pieces of my MUD language
code before, but consider this example:

    Me()@hits := Me()@hits + 1;

'Me' is a builtin function that returns a reference to the database
record for the active character (PC or NPC). That reference is of
type 'thing' (yes, that's the official type name!). 'hits' is here
assumed to be another database reference, this one of type 'property int'.
A 'thing' is stored in the DB as a small fixed structure (includes a
reference to any single parent) and a varying size array of property-
value pairs. Thus, if a 'thing' in the database has a property 'hits',
then that array will contain 'hits' as the property reference, and some
integer number as the value. This array is flexible, in that property-value
pairs can be added and removed at any time. So, the above code, running
on my byte-code machine, does:

    - run builtin 'Me', save value on stack
    - push reference to property 'hits'
    - run builtin 'Me', save value on stack
    - push reference to property 'hits'
    - execute 'getproperty' code
	This will call on the database routines, which will retrieve the
	'thing' referenced by 'Me' (might need reading from disk), and
	then search for the 'hits' property in the property-value list
	of that thing. If the property is not found, the parent pointer
	of the thing is followed, and this is repeated until a value
	for 'hits' is found. This is my inheritance. If no value is
	found, a default value (0 for ints) is returned. The returned
	value is pushed on the stack in place of the two arguments.
    - push constant 1
    - execute add: pop two ints, push the sum
    - execute 'putproperty' code
	This is another database call. The top 3 stack elements identify
	the 'thing' to put the property to, the property to put, and the
	value for that property. If the property already exists in the
	thing, then its value is updated. If not, it is added to the
	array of property-value pairs. Note that this will break any
	inheriting of the property from ancestors. Such properties can
	also be deleted at will, thus re-inheriting from ancestors.

I think it comes to 24 bytes of byte-code.
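
The steps above can be sketched in Python as follows. This is only an
illustration of the described behaviour, not the actual database code:
the Thing class, the function names, and the use of a plain string for
the 'hits' property reference are assumptions made for the example.

```python
class Thing:
    """A 'thing': one optional parent plus a list of property-value pairs."""
    def __init__(self, parent=None):
        self.parent = parent
        self.props = []  # list of [property, value] pairs

def get_property(thing, prop, default=0):
    """Search the thing, then its ancestors; fall back to the default (0 for ints)."""
    while thing is not None:
        for p, value in thing.props:
            if p == prop:
                return value
        thing = thing.parent
    return default

def put_property(thing, prop, value):
    """Update the pair if present on this thing, otherwise add it (breaks inheritance)."""
    for pair in thing.props:
        if pair[0] == prop:
            pair[1] = value
            return
    thing.props.append([prop, value])

def delete_property(thing, prop):
    """Remove the local pair, so the value is inherited from ancestors again."""
    thing.props = [pair for pair in thing.props if pair[0] != prop]

def increment_hits(me, hits="hits"):
    """Mimic the byte-code sequence for: Me()@hits := Me()@hits + 1;"""
    stack = []
    stack.append(me)      # run builtin 'Me' (target of the later putproperty)
    stack.append(hits)    # push reference to property 'hits'
    stack.append(me)      # run builtin 'Me' again (for getproperty)
    stack.append(hits)    # push reference to property 'hits'
    prop = stack.pop()    # getproperty: two arguments replaced by the value
    thing = stack.pop()
    stack.append(get_property(thing, prop))
    stack.append(1)       # push constant 1
    b = stack.pop()       # add: pop two ints, push the sum
    a = stack.pop()
    stack.append(a + b)
    value = stack.pop()   # putproperty: consumes thing, property, value
    prop = stack.pop()
    thing = stack.pop()
    put_property(thing, prop, value)
    return stack          # empty once the statement is done
```

With a parent holding hits = 10, calling increment_hits on a child that
has no 'hits' of its own reads 10 through the parent and stores 11 on the
child only; deleting the child's pair makes it read 10 again.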

So, my object model is quite different from, say, C++'s or Java's. It
is more expensive at run-time, but is completely run-time flexible,
and saves huge amounts of space, by inheriting *values* rather than

 >Discussion for what we want in a VM is called for.  Discussion of how a
 >VM module should interact with the rest of DevMUD is even more important.


As seen, my VM doesn't have much in it - it's mostly done by builtin
calls and database calls. This works for me, but more VM support will
likely be needed for other object models. How much will depend on
what that object model is like. If the support is all done with calls
to native-code functions, then it's trivial to add. If we end up like
C++/Java, then byte-codes with constant offsets to the required fields
are also trivial. If we want their style of virtual functions, then
we likely need a byte-code which takes an object reference, and some
kind of identification of the required function, and does the call.
That identification might be an inheritance level and an offset. If
we want to do run-time lookup based on names, then all offsets get
replaced by references to string constants (which can be done in-line
in the byte-code).
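
One possible reading of the "inheritance level and an offset" idea, next
to the lookup-by-name alternative, is sketched below in Python. Nothing
here is from an existing system: ClassInfo, call_virtual and call_by_name
are invented names, and an object is just a dict with a "class" entry.

```python
class ClassInfo:
    """Per-class function tables, one table per inheritance level (root = 0)."""
    def __init__(self, parent=None, methods=()):
        # methods: sequence of (name, function) pairs introduced at this level.
        inherited = parent.tables if parent is not None else []
        # Copy ancestor tables so overrides here do not affect the ancestors.
        self.tables = [list(t) for t in inherited] + [[fn for _, fn in methods]]
        level = len(self.tables) - 1
        # Name table, only needed if calls are resolved by name at run time.
        self.names = dict(parent.names) if parent is not None else {}
        for offset, (name, _) in enumerate(methods):
            self.names[name] = (level, offset)

    def override(self, level, offset, fn):
        """Replace an inherited slot with this class's own function."""
        self.tables[level][offset] = fn

def call_virtual(obj, level, offset, *args):
    """What a 'virtual call' byte-code might do: index, then call."""
    fn = obj["class"].tables[level][offset]
    return fn(obj, *args)

def call_by_name(obj, name, *args):
    """Run-time lookup variant: the byte-code carries a string constant."""
    level, offset = obj["class"].names[name]
    return call_virtual(obj, level, offset, *args)
```

The offset form costs two index operations per call but needs the
compiler to know the class layout; the name form defers everything to
run time at the price of a table lookup.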

    - does the VM definition include function headers?
    - does the VM definition include the object model?
    - does the VM definition specify all builtin functions?
    - what datatypes does the VM support?
    - how much checking of byte-code does the VM do before it tries
	to execute it? (mine does none)
    - how much checking is done as byte-code is executed (e.g. things
	like stack overflow, divide-by-zero, invalid references, etc.)

To start some discussion, here is a list of external references from
my byte-code engine to other parts of the system:

    - memory allocation/free (for stacks)
    - lookup of function header by PC value (for tracebacks)
    - output to the active client, if any (for tracebacks, errors)
    - lookup of references (for tracebacks)
    - lookup of builtin functions by reference
    - checks for run-time exceeded
    - references to active agent status for security issues
    - reference to other interpreter entry point for non-byte-code calls
    - reference to a system abort routine for conchecks
    - lookup of functions by reference, for calls
    - references to various utility routines, like string operations,
	'fixed-point' arithmetic, etc.
    - references to the dozen or so database routines

The file exports only one function - the top-level entry point to do
byte-code execution of a given function. The state in the file (my
system isn't multi-threaded, so they are just static variables) consists
of a pointer to the current function, pointers to the top and bottom of
the byte-code stack, and the SP, FP and PC byte-code registers.

Whew! Sorry for the length, but I hope this gives folks something
solid that we can talk about, in terms of what a VM is all about.

Don't design inefficiency in - it'll happen in the implementation. - me

Chris Gray     cg at ami-cg.GraySage.Edmonton.AB.CA
