[MUD-Dev] Re: TECH: reliablity (was: Distributed Muds)

Bruce bruce at puremagic.com
Thu Apr 26 22:48:17 New Zealand Standard Time 2001

Derek Snider wrote:

> I still don't think periodic traversal is a bad thing... at least
> this way you know that the memory has not become corrupt.
> Also, in the case of a memory leak, leaked memory would be paged
> out.


> If something went crazy and started allocating memory like mad, your
> program would coredump quickly instead of freezing for a minute or
> two until all virtual memory is exhausted.
> As I said... better to have enough RAM than to rely on virtual
> memory.  Also... your program should check the return status of
> malloc().  If malloc fails, then something bad is going on, and it
> should fail gracefully... not coredump.

Or, you could follow sound, basic software engineering principles and
work hard to ensure that these types of things don't happen.  Memory
and resource management, as well as correct execution, aren't things
that can or should be entrusted to luck, especially within a
long-running server process.  Crashes or system-crippling bugs simply
aren't acceptable. (And there's nothing like that sinking feeling in
your gut while you watch the server crash and write out a big core
file. :()
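
On the malloc() point in the quoted text: checking each allocation is
cheap, and it lets the caller decide how to degrade.  A minimal sketch,
assuming a hypothetical helper named dup_string (an illustration, not
code from any server discussed here):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy a string, reporting allocation failure
 * to the caller instead of crashing on a NULL pointer later. */
char *dup_string(const char *src) {
    assert(src != NULL);              /* programming error, not a runtime condition */
    size_t n = strlen(src) + 1;       /* include the terminating NUL */
    char *copy = malloc(n);
    if (copy == NULL) {
        fprintf(stderr, "dup_string: out of memory (%zu bytes)\n", n);
        return NULL;                  /* caller decides how to degrade */
    }
    memcpy(copy, src, n);
    return copy;
}
```

The caller owns the returned buffer and releases it with free(); a NULL
return means the request should be refused rather than the whole
process taken down.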

For memory management, there are several approaches that you can take
to handling memory at runtime (refcounting, GC, manual management,
pool-based management for short tasks, etc.), as well as many
techniques for finding bugs in your implementation: Purify, Boehm's GC
in leak detection mode, various malloc debugging libraries, and custom
code that ties into your refcount routines to help you detect and fix
refcounting bugs.  The list could go on.  This is a relatively
unexplored topic within the mud community as far as I can tell, and
there are a lot of interesting techniques and tools that I've seen and
used elsewhere that I've been trying to apply to my work on mud-type
stuff.
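
As an illustration of custom code tied into refcount routines, here is
a rough sketch.  The Obj type and the obj_* names are hypothetical and
are not Cold's actual API; the idea is just that every create and
destroy passes through one place, so a debug counter can report
objects that were never released and assertions can catch over-release.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object; names are illustrative only. */
typedef struct Obj {
    int refs;      /* number of outstanding references */
    int payload;   /* stand-in for real object data */
} Obj;

/* Debug counter: objects allocated and not yet freed. */
static long live_objects = 0;

/* Create an object holding one reference, or NULL if out of memory. */
Obj *obj_new(int payload) {
    Obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refs = 1;
    o->payload = payload;
    live_objects++;
    return o;
}

/* Take an additional reference. */
Obj *obj_dup(Obj *o) {
    assert(o != NULL && o->refs > 0);   /* catches use after final discard */
    o->refs++;
    return o;
}

/* Drop one reference; free the object when the last one goes. */
void obj_discard(Obj *o) {
    assert(o != NULL && o->refs > 0);   /* catches double discard */
    if (--o->refs == 0) {
        live_objects--;
        free(o);
    }
}

/* Nonzero at shutdown (or between tasks) suggests a refcount leak. */
long obj_live_count(void) {
    return live_objects;
}
```

Checking obj_live_count() at a point where no objects should remain,
such as the end of a test case, turns a silent leak into a visible
failure.  The assertions only help if the freed memory still looks
plausible, so this complements rather than replaces a malloc debugging
library.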

For correct execution, you really need to ensure that you know your
problem space, that you've done due diligence in researching it and
doing some sample studies and prototypes, written a solid
specification, and written testing code that tests both functionality
and regressions in bugfixes.  Doing those types of tests for Cold
turned up many, many bugs and has made the addition of new features
much simpler.  I no longer have to worry as much, as long as I know
that our current test suite covers a particular area well.  Code
audits or frequent code reviews are also a blessing here.  It is rare
that Brad or I commit anything to Cold's CVS without having had the
other review it.
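
To make the functionality-versus-regression distinction concrete, here
is a small sketch.  The rtrim function and the bug it describes are
invented for illustration and have nothing to do with Cold's real test
suite.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical function under test: strip trailing spaces in place. */
void rtrim(char *s) {
    size_t n = strlen(s);
    /* An imagined earlier version read s[n - 1] without checking
     * n > 0, which misbehaved on empty input. */
    while (n > 0 && s[n - 1] == ' ')
        s[--n] = '\0';
}

/* Functionality test: the ordinary, expected case. Returns 1 on pass. */
int test_rtrim_basic(void) {
    char buf[] = "look north   ";
    rtrim(buf);
    return strcmp(buf, "look north") == 0;
}

/* Regression test: pins the fix for empty and all-space input so the
 * old bug cannot quietly come back. Returns 1 on pass. */
int test_rtrim_empty(void) {
    char empty[] = "";
    char spaces[] = "   ";
    rtrim(empty);
    rtrim(spaces);
    return empty[0] == '\0' && spaces[0] == '\0';
}
```

Each fixed bug gets a test like the second one, named after what broke,
so the suite keeps growing in exactly the places where the code has
already proven fragile.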

There's also the whole argument in favor of components and
component-based systems, which were heavily promoted by John Buehler
some months ago, and with good reason.

  - Bruce

MUD-Dev mailing list
MUD-Dev at kanga.nu
