[MUD-Dev] Re: Missing the point: OpenMUD, Gamora, Casbah, etc.

Jon A. Lambert jlsysinc at ix.netcom.com
Mon Oct 26 16:27:39 New Zealand Daylight Time 1998

> From: Cynbe ru Taren <cynbe at muq.org>
> Subject: [MUD-Dev] Re: Missing the point:  OpenMUD, Gamora, Casbah, etc.
> Date: Monday, October 26, 1998 4:18 PM
> [NB: mud-dev's spam-filter is gonna bounce this message.  One of you
>      might want to forward it, if it seems worth the effort. ]
> | Duly noted. There are probably *lots* of lessons to be learned from all
> | these projects, and others which were not mentioned. I also think we
> | should try to talk with these people about mistakes, tips and other
> | pointers. Why don't we start with you, Bruce? :) 
> | 
> |  1 Which are the most important features DevMUD should support, judging
> | 	from your background?
> |  2 What are the key abstractions of the problem domain, in your opinion?
> | 	(hmm, fuzzy question)
> |  3 What to watch out for?
> |  4 What did you do wrong?
> |  5 What did you do right?
> |  6 Any random wisdom you can share with us?
> Heh, that sounds like an -invitation- to talk about Muq.  Normally
> I try to keep from boring people with my private obsessions. :)
> Re (1): "Which are the most important features DevMUD should support, judging
>         from your background?"
> I haven't been following MudDEV closely enough to feel qualified to
> prescribe.  Here are some things I've done for Muq which I'm happy
> with, and which might be worth considering:
> *  I've gone to a 64-bit architecture internally, on all platforms,
>    using gcc "long long" on 32-bit machines.  This eases interoperation
>    between mixed 32- and 64-bit machines on the net, eases porting of
>    dbs between 32 and 64 bit machines, and avoids portability issues
>    related to different arithmetic precisions on different platforms.
> *  I've implemented transparent integration of bignums and fixnums:
>    The app programmer doesn't need to switch to an incompatible library
>    of bignums to get precision beyond 64 bits, it just happens automatically.
>    This is traditional in high-level languages, but involves an efficiency
>    hit which makes it unpopular in low-level languages.  You have to decide
>    whether you'd rather have a small increment in speed at the price of
>    silently returning nonsense for many arithmetic operations.
> *  I've implemented a variety of distributed-programming support hacks
>    aimed at allowing a WAN-distributed set of Muq servers to look like a
>    single compute engine with a single address space:
>    *  Some critical operations are automatically rerouted over the
>       Net if done on a remote object.  Parameters to such operations
>       are automatically serialized, and at the far end, references to
>       remote objects automatically become proxies.  Eventually, I'd
>       like to have -all- operations work identically on local and
>       remote objects, but implementing just a handful such as getProperty
>       makes a world of difference.
>    *  For sanity, all users get automatically and transparently assigned
>       Diffie-Hellman public and private keys, and all communications over
>       the Net are automatically authenticated using exponential key
>       exchange and a MAC message hash.  (Note that the Diffie-Hellman
>       patent ran out last year.  Yay!!)  In addition, all network traffic
>       between users is automatically encrypted with 256-bit twofish
>       encryption.  (For twofish, see http://www.counterpane.com: Twofish
>       is Bruce Schneier's entry in the NIST competition for the "next DES",
>       and I'd guess it's the front-runner, given that he literally Wrote
>       The Book on Applied Cryptography.)
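
The authentication step described above (exponential key exchange plus a MAC) can be sketched roughly as follows. This is a minimal illustration, not Muq's code: the group parameters P and G are toy values, the function names are hypothetical, HMAC-SHA256 stands in for whatever MAC Muq actually uses, and the Twofish encryption layer is left out.

```python
import hashlib
import hmac
import secrets

# Toy group parameters, for illustration only (assumption): a real system
# would use a standardized group with a much larger prime.
P = 2**127 - 1   # a Mersenne prime
G = 3

def make_keypair():
    """Return (private, public) for exponential key exchange."""
    private = secrets.randbelow(P - 3) + 2   # in the range 2..P-2
    public = pow(G, private, P)
    return private, public

def shared_key(my_private, their_public):
    """Derive a symmetric key from my private and the peer's public value."""
    secret = pow(their_public, my_private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()

def sign(key, message):
    """Compute a MAC over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key, message, tag):
    """Check a MAC in constant time."""
    return hmac.compare_digest(sign(key, message), tag)

alice_priv, alice_pub = make_keypair()
bob_priv, bob_pub = make_keypair()
k_alice = shared_key(alice_priv, bob_pub)
k_bob = shared_key(bob_priv, alice_pub)
tag = sign(k_alice, b"page Bob: hello")
```

Both sides end up with the same key without ever sending it, so Bob can check the tag on anything Alice sends him.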
> *  I've implemented reasonably fair scheduling, so that a single user doesn't
>    take over the system if s/he spawns lots of threads:  Each user gets a
>    roughly equal number of machine cycles, which are then divided between
>    her/his threads.  This might or might not be an issue with DevMUD, depending
>    just where it is going.
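
One way to picture the per-user fairness described above is a two-level round robin: first pick the next user, then pick that user's next thread. This is only a sketch under that assumption; the class and method names are hypothetical and Muq's real scheduler may work quite differently.

```python
from collections import deque

class FairScheduler:
    def __init__(self):
        self.users = deque()   # round-robin order of user names
        self.threads = {}      # user name -> deque of that user's threads

    def spawn(self, user, thread):
        """Register a new thread under its owning user."""
        if user not in self.threads:
            self.threads[user] = deque()
            self.users.append(user)
        self.threads[user].append(thread)

    def next_slice(self):
        """Return (user, thread) for the next time slice."""
        user = self.users[0]
        self.users.rotate(-1)       # this user goes to the back of the line
        queue = self.threads[user]
        thread = queue[0]
        queue.rotate(-1)            # so does the thread, within its user
        return user, thread

sched = FairScheduler()
sched.spawn("alice", "a1")
for i in range(5):
    sched.spawn("bob", f"b{i}")
slices = [sched.next_slice() for _ in range(10)]
alice_share = sum(1 for user, _ in slices if user == "alice")
```

Bob has five threads to Alice's one, yet each user still receives half of the slices; Bob's share is simply spread across his threads.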
> *  I've implemented a secure bytecode-generation API which allows untrusted
>    people to generate binary executables without endangering the
>    security or reliability of the system:  This allows admins to allow
>    random users to implement new programming syntaxes by writing a simple
>    compiler.  The standard system syntax compiler is implemented in softcode
>    as an example.  The API hides the difference between softcoded functions
>    and hardcoded bytecodes, so the softcoded compilers don't have to know
>    which is which, which simplifies them enormously -- and which also makes
>    it much simpler to move functionality from softcode to hardcode and back
>    during development.
> *  I've implemented a soft MMU so that objects can transparently swap from
>    disk to ram without user attention, rather than requiring them to all
>    sit in ram at all times;  They can also be freely moved around in ram.
>    At present, the utility of this is vastly reduced by the fact that the
>    garbage collector is a simple mark-and-sweep which touches all objects,
>    forcing them all into ram:  Muq needs a generational garbage collector
>    to make the soft MMU ("diskbase") code as useful as intended.  The diskbase
>    code is designed and implemented with extreme attention to efficiency
>    issues.  (And, incidentally, is also a separate module designed to be
>    usable in other systems.)
> *  I've implemented a tag-based architecture:  The system can distinguish
>    ints from chars from floats from objects by inspection.  This is in
>    contrast to the Java (say) approach where they can be distinguished
>    only by location -- variables that can hold ints can never hold objects
>    &tc.  The Java restriction buys efficiency at the cost of making life
>    harder for the application programmer:  When passing integers via
>    generic mechanisms, they must be specially wrapped in protective
>    objects, a nuisance task which does nothing to improve the app
>    programmer's life.  You have to decide whether you want simplicity
>    or efficiency the most.
>      Tags also allow me, when adding two fixnums and getting arithmetic
>    overflow, to transparently return a bignum containing the correct
>    result.  Java-style architectures can never do this because the type
>    of the result must always be uniquely determined by the types of the
>    inputs, hence they are forced to return incorrect results or an
>    exception in these cases, never the correct answer.
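
The fixnum-to-bignum promotion described above can be sketched with explicit tags. Python integers are already arbitrary precision, so the 64-bit limits, the tag names and the helper functions below are all assumptions made purely to show the idea of a result whose type depends on the value rather than on the input types.

```python
FIXNUM_MIN = -(2**63)
FIXNUM_MAX = 2**63 - 1

def make_int(value):
    """Tag an integer as a fixnum if it fits in 64 bits, else as a bignum."""
    if FIXNUM_MIN <= value <= FIXNUM_MAX:
        return ("fixnum", value)
    return ("bignum", value)

def add(a, b):
    """Add two tagged integers; the result tag is chosen by inspection."""
    tag_a, val_a = a
    tag_b, val_b = b
    for tag in (tag_a, tag_b):
        if tag not in ("fixnum", "bignum"):
            raise TypeError(f"cannot add a {tag}")
    return make_int(val_a + val_b)

def wrapping_add(a, b):
    """What untagged 64-bit two's-complement addition gives: silent wraparound."""
    total = (a + b) & (2**64 - 1)
    return total - 2**64 if total >= 2**63 else total

big = add(make_int(FIXNUM_MAX), make_int(1))
wrapped = wrapping_add(FIXNUM_MAX, 1)
```

Here `big` carries the correct sum under a bignum tag, while `wrapped` is the large negative number that silent overflow would hand back.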
> *  At the softcode level, I've assigned every user a thread responsible
>    for animating her/his possessions (plus another to handle the user
>    shell, if the user is logged in).
>      I consider this better than having a single thread animate all
>    objects because it lets the user rehack her/his daemon thread to
>    taste.  If it crashes, all that happens is one user's possessions
>    cease to respond.
>      I consider this better than having a thread for every object
>    partly for efficiency reasons (users can have thousands of objects
>    each), but mostly because having thousands of threads per user
>    introduces tremendous problems with doing correct locking, fixing
>    hung locks, locating runaway/hung threads &tc which the typical
>    novice user isn't going to be up to with today's software
>    technology.  (After using Netscape's IFC Java classes for awhile,
>    I'm convinced most -professional- programmers can't handle multiple
>    threads effectively at this point. :( )
>      I do all user communication/interaction via messages sent
>    daemon-to-daemon.
>      These messages are NOT handled as remote procedure calls, with
>    the caller blocking until response is received:  This would too
>    often lock up the user when something doesn't respond, or else
>    force us to spawn one thread per outstanding RPC, which would get
>    us back into a nightmare of locking, synchronization and cleanup
>    of hung threads and locks.
>      Instead, I introduce the idea of a TASK, more lightweight than
>    a thread:  A task is essentially an RPC plus a continuation to be
>    called when reply is received.  The per-user daemons have a little
>    db of outstanding tasks, which get executed in the daemon thread
>    as appropriate replies are received.  (Replies are matched to tasks
>    using automatically assigned integer IDs.)  Syntax is provided to
>    make this about as concise and efficient as a RPC.  (The daemon
>    also transparently takes care of retries, since muqnet is built
>    on UDP, and calls a separate error continuation upon timeout.)
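
The TASK idea above (a request plus a continuation, matched to its reply by an integer ID, with retries and an error continuation on timeout) might look something like the sketch below. The Daemon class, its method names and the `send` callback are hypothetical stand-ins; a real daemon would be driven by network events and a timer rather than by direct calls.

```python
import itertools

class Daemon:
    def __init__(self, send):
        self.send = send                    # callable(task_id, request): hypothetical transport
        self.next_id = itertools.count(1)   # automatically assigned integer IDs
        self.pending = {}                   # task_id -> [request, on_reply, on_error, retries_left]

    def start_task(self, request, on_reply, on_error, retries=2):
        """Send a request and remember its continuations; never blocks."""
        task_id = next(self.next_id)
        self.pending[task_id] = [request, on_reply, on_error, retries]
        self.send(task_id, request)
        return task_id

    def handle_reply(self, task_id, reply):
        """Run the continuation for a reply; duplicate or stale replies are ignored."""
        task = self.pending.pop(task_id, None)
        if task is not None:
            task[1](reply)

    def handle_timeout(self, task_id):
        """Resend while retries remain, then call the error continuation."""
        task = self.pending.get(task_id)
        if task is None:
            return
        if task[3] > 0:
            task[3] -= 1
            self.send(task_id, task[0])
        else:
            del self.pending[task_id]
            task[2]("timeout")

sent, results, errors = [], [], []
daemon = Daemon(lambda task_id, request: sent.append((task_id, request)))
t1 = daemon.start_task("look", results.append, errors.append)
t2 = daemon.start_task("page bob", results.append, errors.append, retries=1)
daemon.handle_reply(t1, "a dusty room")
daemon.handle_timeout(t2)   # one retry is sent
daemon.handle_timeout(t2)   # retries exhausted: error continuation runs
```

Everything runs in the one daemon thread, so no locks are needed around the table of outstanding tasks.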
> *  Also at the softcode level, I support procedurally defined dbs:
>    Rooms and objects don't have to exist in the db when no players
>    are present, but instead can be created on demand based on
>    appropriate code.
>      This essentially requires identifying rooms by something more
>    like the path to them (or coordinates, if you prefer) than by
>    conventional pointers, and arranging that code driven by that
>    path/coord be given if necessary a chance to create the room on
>    demand any time someone tries to reference it.  Not too hard if
>    you design it in from the start, but very hard to retrofit.
>      In this context, by-hand user building in the middle of phantom
>    procedurally defined landscapes can be handled cleanly via an
>    exception hashtable which is checked before invoking the procedural
>    generation machinery:  The user can walk miles out into landscape
>    defined only by fractal-style procedures, and build a hut there,
>    without ever being aware of the difference between the procedurally
>    and discretely defined parts of the world.
>      (Designing the procedural generation code itself so that it is
>    cleanly modular and easy to incrementally modify is a fascinating
>    design problem which I don't get into here...)
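
The lookup order described above (check the exception hashtable first, fall back to procedural generation) can be sketched as below. The terrain list, the hash-based generator and the World class are illustrative assumptions; any deterministic function of the coordinates would serve in place of the hash.

```python
import hashlib

TERRAINS = ["forest", "plain", "hills", "marsh"]

def generate_room(coord):
    """Deterministically derive a room from its coordinates alone."""
    digest = hashlib.sha256(repr(coord).encode()).digest()
    terrain = TERRAINS[digest[0] % len(TERRAINS)]
    return {"coord": coord, "description": terrain, "built": False}

class World:
    def __init__(self):
        self.exceptions = {}   # coord -> hand-built room, checked first

    def build(self, coord, description):
        """Record a user-built room that overrides the procedural one."""
        self.exceptions[coord] = {"coord": coord, "description": description, "built": True}

    def room_at(self, coord):
        """Return the hand-built room if any, else create one on demand."""
        room = self.exceptions.get(coord)
        return room if room is not None else generate_room(coord)

world = World()
before = world.room_at((1000, -250))
world.build((1000, -250), "a small hut")
after = world.room_at((1000, -250))
neighbor = world.room_at((1001, -250))
```

Rooms are named by coordinates rather than pointers, so nothing is stored for the miles of landscape nobody has built on.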
> *  I do NOT migrate objects between servers.  I think this is a recipe
>    for disaster in a WAN-distributed multi-administrator environment.
>    Objects stay on their home servers and are transparently accessed
>    remotely, avoiding such issues as:
>    *  What happens when your object is on a remote server and it crashes?
>    *  How do you find, update and trust a wandering object?
>    *  How do you defend against malicious servers hoarding wandering objects?
> *  I do NOT garbage collect across servers:  If an object has no local
>    pointers to it, it gets recycled.  (Java is taking the opposite
>    approach, which I predict will be stillborn or a disaster.)  Two
>    reasons for this design decision:
>    *  It is administratively unacceptable that an admin and/or user
>       should be unable to delete data and recover storage place just
>       because some server in darkest mongolia claims to still have a
>       pointer to it.
>    *  Real WAN-distributed worlds are going to have servers going down
>       and coming back up and rolling back to old dbs all the time: It
>       will never be possible to be sure some rolled-back db won't show
>       up with a pointer to some object which looks garbage-collectable.
>    What I DO do is to add about 100 random guard bits to objects when
>    they are created:  Remote attempts to access them must include
>    these 100 bits in the reference.  If they don't match the guard
>    bits in the object, the pointer is stale and/or forged, and ignored
>    or errored.
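
A rough sketch of the guard-bit check described above follows. The ObjectStore class, its slot-reuse policy and the choice of LookupError are assumptions for illustration; the point is only that a reference is the pair (slot, guard), and a recycled slot gets fresh guard bits.

```python
import secrets

GUARD_BITS = 100

class ObjectStore:
    def __init__(self):
        self.objects = {}   # slot id -> (guard, value)
        self.free = []      # recycled slot ids available for reuse

    def create(self, value):
        """Store a value and return the remote reference (slot, guard)."""
        oid = self.free.pop() if self.free else len(self.objects) + 1
        guard = secrets.randbits(GUARD_BITS)
        self.objects[oid] = (guard, value)
        return (oid, guard)

    def recycle(self, oid):
        """Delete locally, regardless of what remote servers still point at."""
        if self.objects.pop(oid, None) is not None:
            self.free.append(oid)

    def fetch(self, ref):
        """Resolve a remote reference, rejecting stale or forged ones."""
        oid, guard = ref
        entry = self.objects.get(oid)
        if entry is None or entry[0] != guard:
            raise LookupError("stale or forged reference")
        return entry[1]

    def valid(self, ref):
        try:
            self.fetch(ref)
            return True
        except LookupError:
            return False

store = ObjectStore()
old_ref = store.create("sword")
store.recycle(old_ref[0])
new_ref = store.create("lamp")   # reuses the slot with new guard bits
```

A rolled-back server that still holds `old_ref` gets an error instead of silently reaching the lamp that now occupies the slot.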
> *  I support the CommonLisp exception system, which is considerably
>    more sophisticated than anything else I've seen.  People tend to
>    treat exception systems as a design and implementation afterthought.
>    I think this is a mistake, especially in the mud context:
>      As we move to more complex worlds with per-user custom-programmed
>    daemons, distributed operation over the Internet, sophisticated
>    procedurally defined worlds &tc, "failures" will become more common,
>    and flexible, intelligent responses to changing conditions will become
>    ever more important.
>      The CommonLisp (and hence Muq) exception systems go far beyond a
>    simple report-error-and-unwind-stack hack:  Instead they establish
>    in essence a blackboard system in which strategies for handling
>    various situations can be registered, situations described, and
>    fallback responses to difficulties (including contacting the user)
>    selected.
>      One result is that ongoing computations become much more
>    inspectable:  One can peer into the exception system and see the
>    current goals being pursued, the strategy being used, and the
>    fallback strategies available if that strategy should fail.
>      This sort of facility, in my opinion, arguably represents the
>    next major programming advance beyond OOP, or an equally important
>    paradigm in its own right, which will give us the ability to write
>    robust applications which, (say) if they don't find the "page" recipient
>    online on the mud, can automatically check IRC and ICQ and last-
>    email-read and such and either automatically reroute to a different
>    medium or else present a cogent list of alternative tries to the
>    user instead of just "no response from host". 
>      Of course, teaching people to think beyond OOP will take
>    decades. :)
> *  I've configured Muq so a single script will strip out all traces
>    of crypto, and simply unpacking a small tarfile over the source
>    tree will re-install it.  Important if you want to protect a WAN-
>    distributed world with proper crypto while still being legally
>    exportable.  (The problem is primarily legal, but it does have
>    design implications for the software.)
> Re: (2) "What are the key abstractions of the problem domain, in your opinion?"
>  	(hmm, fuzzy question)
> I'm not sure I can address that just as posed, but some crucial design
> issues in my mind are
> *  In general, the interaction between conventional computer language
>    design issues, on the one hand, and the issues of
>    * multiple users
>    * persistence
>    * distribution
>    on the other.
>      In my experience, the features which keep distinguishing the mud
>    environment from the typical programming environment are that
>    -> Muds are multi-user and must be robust in the face of a minority
>       of disaffected users devoted to abusing the system.  Designs like
>       C++ and Smalltalk and such don't address this problem.
>    -> Muds are persistent, and must address maintainability of large
>       persistent heaps over time.
>    -> (I believe) muds are increasingly going distributed, rather than
>       being single-machine systems.
>    Example:  What happens when you redefine a class with existing instances?
>    Example:  Is class A on server Sa the same as class B on server Sb?
>    Example:  Who should be allowed to redefine a message, or define a new
>              method for it?
>    One reason I like CommonLisp as a semantic reference is that it
>    -does- address, for example, what should happen when you redefine
>    a class with thousands of existing instances.  This is critical to
>    practical maintenance of a large programmable virtual world, but
>    completely unaddressed by most OO systems.  (Another reason I like
>    CommonLisp as a semantic reference is that Lisp has evolved in an
>    incremental, upward-compatible fashion for 40 years, consistently
>    staying at the cutting edge of software technology.  Algolic
>    languages tend to get thrown away and replaced rather than
>    incrementally improved, which is bad for long-lived virtual
>    communities.)
> *  More specifically, issues of privacy, responsibility, authority &tc.
>    In my opinion, at least some of these need to be systematically
>    addressed at the design level.
>    *  I consider user privacy important.
>         One design response to this is to establish a completely
>       transparent authentication and encryption layer that
>       automatically protects all user-user interactions.
>         Another response to this is user-replaceable shells a la Unix,
>       which make systematic bugging of the user shell more difficult.
>         Yet another is to avoid routing user-user communications
>       through the host room/world when the users are on other servers,
>       instead sending them direct user-to-user.  (This also reduces
>       needless load on the host world server.)
>    *  There are a variety of policy issues which need to be extensively
>       configurable, ranging from "who is allowed to pick up this object?"
>       to "who is allowed to shut down this server?".  I believe the
>       software architecture has to address these systematically rather
>       than just saying, "well, you've got the source, hack it how you
>       like".  Source code plus an editor isn't the ideal administrative
>       interface!  We need to have a systematic approach to introducing
>       delegation of policy decisions to objects specifically designated
>       for the purpose all through the architecture, coupled with a UI
>       design for easily locating and editing these objects.
>         (Java's security managers are vaguely groping in this direction,
>       but only half-sense the problem and half-address it, at best.)
> | Re: "3 What to watch out for?
> |      4 What did you do wrong?
> |      5 What did you do right?"
> I'll interpret these questions fairly freely.   :)
> *  In an exponential age, quick turn-around is important, so as to
>    have the product out the door and people building on it before
>    it becomes irrelevant.
>      Muq was planned as a one-year project, and has stretched to
>    six.  It would have been better to turn it around faster if at
>    all possible, although I'm not sure how I could have done that,
>    short of omniscience.
>      Running late has had some good aspects, of course -- I just
>    had a chance to fold in twofish encryption, for example, which
> *  I've wound up re-designing and re-implementing major sections
>    several times.  Short of godlike omniscience, there's no way
>    to avoid learning new things as one goes along, and sometimes
>    needing to take advantage of them, but one still wants to do
>    one's best to avoid this.
>      In my case, I started with fuzzball tinyMuck as a standard,
>    then switched to Scheme, then switched to CommonLisp.  The
>    motivation in each case was that I wound up needing a design
>    reference that covered more ground than the preceding one.
>    (For example, Scheme doesn't have arrays.)
>      I believe that, as much as possible, basing one's design on
>    an existing standard where that standard is relevant is a
>    Good Thing.  It lets one leverage existing practice and
>    experience (and perhaps source and tools and documentation)
>    and it avoids lots of agonizing over whether design option
>    A is slightly better than B:  Unless the advantage is REALLY
>    clear, you stick with your reference.
> *  I've done AMAZING amounts of systematic testing as I've gone
>    along, relative to current programming practice.  I started
>    a test framework at the same time as I started the implementation,
>    and no module is complete until I have integrated into the test suite
>    code exercising most of the major cases.  I'm up to about 3500 tests
>    at this point, I typically run them several times a day during
>    development, and I'm just tremendously convinced this is the way
>    to go.  (I've done this on my last three major projects:  An
>    optimizing C compiler suite comparable to gcc/bison/etc, a large
>    scientific visualization application, and Muq.  I've been extremely
>    satisfied with the results every time.)
>      A test suite like this is amazingly good at turning up stupid
>    mistakes promptly while you still remember exactly what you
>    modified:  It means most debugging consists of going directly to
>    the offending code, backing out the changes, and re-applying them
>    one by one, rather than thrashing for hours in gdb or whatever.
>      It also gives your computer something to do during coffeebreaks. :)
>    (BTW, the bugs detected are almost never caught by a test which
>    is looking for them.  But checking thousands of computations
>    exercising large parts of the system for proper results just
>    statistically detects broken code with high probability.) 
>     I strongly recommend such a parallel test suite effort as an
>    integral part of any > 10,000 line project, starting from day
>    one.
> *  For whatever reason, I've had very poor luck with attempting to
>    cooperate with others on Muq:  I've several times spent a week
>    or two putting in the hooks so someone else could contribute to
>    the coding effort, and wound up writing it off as a loss.
>      One apparently either needs someone with leadership karma
>    involved, or to write off group participation.
>      The historic record of volunteers cooperating on mudservers
>    seems spotty at best.  (Especially prior to the server attracting
>    a significant user population:  Once it's a dominant paradigm,
>    lots of people will contribute at least minor fixes, of course.)
> *  I've focussed on providing generic functionality rather than
>    code dedicated to a specific game or whatever:  I think this
>    has helped keep Muq arguably relevant despite the five-year
>    development overrun.
>      I've avoided, for example, writing low-level graphics software
>    rendering code, figuring that cheap hardware would be along by
>    and by and render it mostly irrelevant.  Cards like VooDoo seem
>    to me to have vindicated that approach.
>      I've also avoided writing cool, short-turnaround, instant-
>    gratification sorts of code, figuring that there will be lots
>    of people cranking that stuff out, and concentrated on trying
>    to produce the larger-scale, gritty, less sexy infrastructure.
>    I think that has also paid off, by and large.  E.g., the first
>    module I wrote was my persistent heap, and there still seems to
>    be no real competitor to it absent perhaps the Texas persistent
>    store.
> |  6 Any random wisdom you can share with us?
> *  If I were doing a new system starting now, I think I'd look very hard
>    at finding a way to leverage the just-in-time technology being developed
>    by groups like the Kaffe team for Java.  The Java Virtual Machine
>    is by no means limited to running Java:  It's a fairly generic bytecode
>    engine receiving a lot of implementation attention which may be worth
>    leveraging.
>      On the other hand, I don't see how to mate persistent dbs with it
>    cleanly and I'm not comfortable with it mandating silent arithmetic
>    overflow, so I'm happy going my own way for Muq.
> *  Every large Java program I've seen has wound up compensating for the
>    lack of multiple inheritance by doing cut-and-paste duplication of code.
>    Not good for maintainability of the system.  The bigger your software
>    system gets, the harder it will be to maintain the pretense that no
>    two sets in the database overlap.  (Which is mathematically what must
>    be true for multiple inheritance not to be needed.)  We're already seeing
>    various hacks for doing multiple inheritance in Java without really
>    doing it properly.  If you want to build large software systems, I
>    advise doing it right the first time, and supporting multiple inheritance.
> *  Be wary of programming religions.  Software is about complexity, and
>    there -are- no Silver Bullets for NP-complete problems, and never will
>    be.  Anyone who claims to have a way of making it all easy is leading
>    you down the garden path.
>      E.g., top-down design sounds cool, but results in overly specialized
>    low-level code which can't be re-used in other designs, and which can't
>    gracefully adapt to new requirements.
>      Ultimately, there's no solution but iterative approximation, which
>    (imho) is one reason quick turn-around on software projects is important.
> *  Every order-of-magnitude increase in the size of a software system
>    produces qualitatively new problems which call for qualitatively
>    new tools and solutions:
>      Variables aren't needed in 1-line programs
>      Subroutines aren't needed in 10-line programs.
>      OOP isn't needed in 100-line programs. 
>    Multi-user distributed virtual worlds are entering a new scale of
>    software system:  I think it is almost dead certain that our
>    existing tools and solutions are going to break, and we're going
>    to have to reach beyond OOP and build ourselves new concepts and
>    tools to make it all work.
>      Just what they might be is extremely hard to guess, (my 'task'
>    mechanism is one try, and my above pointer to leading-edge
>    "exception handling" systems is another possibility), but expect
>    to see routine approaches bogging down on you, watch for emerging
>    patterns in the work-arounds you develop, and be prepared to name
>    and systematize them, and then provide appropriate software tools
>    to support them.
> And remember, we have a duty to enjoy it! :)
>  Cynbe
