[MUD-Dev] Re: MUD Development Digest

J C Lawrence claw at under.engr.sgi.com
Wed Apr 8 19:22:46 New Zealand Standard Time 1998


On Mon, 6 Apr 1998 05:43:59 PST8PDT 
Justin McKinnerney<xymox at toon.org> wrote:

> It seems to me that dealing with memory management at user level
> rather than allowing the system to do it at kernel level would be
> far less efficient.  Especially on the more proven operating systems
> (aka Solaris or IRIX).

Not true -- factually a very very very long way from true.  Dig up the
original texts by Marcus Ranum on the area (and tell me if you find
them, I'm still looking).  This comes up yet again every year in
r.g.m.* when a new freshman class hits memory management in their CS
classes.

At an application level, especially for applications with large and
active working sets, most OS'es perform abysmally.  The problem is
that they have no choice *BUT* to perform abysmally.  The problem is
out of the OS'es control: heap fragmentation.

OS'es manage memory in pages.  They don't bother with anything smaller 
or larger.  They only bother with pages.  User-space bumph keeps track 
of the bigger and smaller stuff, not the OS kernel.  

A page is typically 4K, but can be significantly larger (rarely
smaller).
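
As a rough sketch of what "pages" look like from user space, the C
fragment below asks the system for its page size through POSIX
sysconf() and works out which page a given address falls in.  The
helper names page_size_bytes and page_index_of are made up for this
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: page size in bytes as reported by the OS (POSIX). */
static long page_size_bytes(void)
{
    long size = sysconf(_SC_PAGESIZE);
    assert(size > 0);            /* sysconf returns -1 on error */
    return size;
}

/* Hypothetical helper: index of the page that holds a given address. */
static uintptr_t page_index_of(const void *addr)
{
    return (uintptr_t)addr / (uintptr_t)page_size_bytes();
}
```

Two objects whose addresses give the same page_index_of value share a
page; anything finer grained than that is invisible to the kernel.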

Take an application that has, say, 5 million objects it manages.
Those objects are stored in memory, allocated off the heap.  In the
typical case most of the objects won't be accessed most of the time,
and a much smaller subset will be accessed (comparatively)
frequently.  Of course the selection of those more active objects is
seemingly random (what objects do players like to play with, what
rooms, what areas?).  The result is that the spread of those objects
thru memory is also random.  

That collection of memory pages storing those active objects is known
as the "working set" for the application (actually just the collection 
of objects is also known as the "working set", but we're not
interested in that definition here).

Sooner or later (sooner) the working set of the application is going
to exceed the total number of physically backed (in RAM) memory pages
allocated to that process.  When that happens, the system will start
to page fault.  That is, memory pages will be written out to swap
(disk), and new pages will be read into RAM to replace them.  This is
what memory management at an OS level means.

Consider a trivial case:

  The application's working set comprises N memory pages.

  The OS has RAM allocated to the process for N-1 pages.

  The application does an utterly trivial loop, updating one byte in
each memory page of the working set:

    // base: char pointer to the start of the working set (assumed)
    while (1) {
      for (page = 0; page < N; page++) {
        base[page * PAGE_SIZE] += 1;  // one byte in each page
      }
    }

The system will page thrash itself to death.  Given the (typical) LRU
cache used by most OS'es, every single access will cause a page fault,
as the next page that loop will touch is always the one that was just
evicted as least recently used (ignoring the impact of other processes
and their own page fault generation).  Voila!  Your machine just
became unusable, and there's not a damn thing you can do about it, and
there's not a damn thing that the OS can do about it either.  Sorry.
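
The trivial case above can be checked with a small simulation.  This is
only a sketch under stated assumptions: a strict LRU policy, a single
process, and a made-up function name (lru_faults_cyclic); real kernels
use approximations of LRU rather than exact timestamps.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Count page faults for a cyclic sweep over `pages` pages when only
 * `frames` of them fit in RAM, using a strict LRU replacement policy.
 * last_used[p] == 0 means page p is not resident.
 */
static long lru_faults_cyclic(int pages, int frames, long accesses)
{
    long *last_used = calloc((size_t)pages, sizeof *last_used);
    long faults = 0;
    int resident = 0;

    assert(last_used != NULL && pages > 0 && frames > 0);
    for (long t = 1; t <= accesses; t++) {
        int p = (int)((t - 1) % pages);      /* the loop's next page */
        if (last_used[p] == 0) {             /* not in RAM: page fault */
            faults++;
            if (resident == frames) {        /* RAM full: evict LRU page */
                int victim = -1;
                for (int q = 0; q < pages; q++)
                    if (last_used[q] != 0 &&
                        (victim < 0 || last_used[q] < last_used[victim]))
                        victim = q;
                last_used[victim] = 0;
            } else {
                resident++;
            }
        }
        last_used[p] = t;                    /* mark as most recently used */
    }
    free(last_used);
    return faults;
}
```

With 8 pages and 7 frames, lru_faults_cyclic(8, 7, 8000) counts 8000
faults, one per access.  Give it the eighth frame and the count drops
to the 8 initial loads.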

Yes, it's an artificial and contrived case, but it's not *that*
artificial or contrived.  Consider the case where N is the number of
physically backed pages for the application, and where the working set 
for the application is 2N.  Now run a loop ala:

    while (1) {
      // random (n) returns a random integer from 0 to n
      page[random (2 * N)] = page[random (255)];  
    }

Your system will almost beat itself to death.  The more interesting
thing about this is that it's actually very close to the real case of a
MUD server.  The pattern of object accesses by a server (and its
players) is pretty damned close to random when looked at from the
memory page level...
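
For the random case, a similar sketch gives a feel for the numbers.
With uniformly random accesses over 2N pages and only N of them
resident, roughly half of all accesses should fault.  The xorshift
generator, its seed, and the function names are assumptions of this
example, chosen only to make the run repeatable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Small xorshift generator so the run is repeatable (an assumption of
 * this sketch; any uniform source of page numbers would do). */
static uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Fraction of uniformly random page accesses that fault under LRU. */
static double lru_fault_rate_random(int pages, int frames, long accesses)
{
    long *last_used = calloc((size_t)pages, sizeof *last_used);
    uint32_t state = 2463534242u;            /* arbitrary nonzero seed */
    long faults = 0;
    int resident = 0;

    assert(last_used != NULL && pages > 0 && frames > 0 && accesses > 0);
    for (long t = 1; t <= accesses; t++) {
        int p = (int)((next_random(&state) >> 8) % (uint32_t)pages);
        if (last_used[p] == 0) {             /* not resident: page fault */
            faults++;
            if (resident == frames) {        /* evict least recently used */
                int victim = -1;
                for (int q = 0; q < pages; q++)
                    if (last_used[q] != 0 &&
                        (victim < 0 || last_used[q] < last_used[victim]))
                        victim = q;
                last_used[victim] = 0;
            } else {
                resident++;
            }
        }
        last_used[p] = t;
    }
    free(last_used);
    return (double)faults / (double)accesses;
}
```

Calling lru_fault_rate_random(200, 100, 200000) models N = 100 frames
against a 2N = 200 page working set.  A fault on about every second
access is far more than any disk can keep up with.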

Now take the case of a disk-based DB with an intelligent cache.  The
cache has an interesting effect: It (largely) concentrates the entire
working set of the application in a minimal number of memory pages.
Instead of the various component objects for the working set being
(pessimally) scattered randomly thru a large number of memory pages
(worst case, one page per object), the working set objects are
concentrated in the cache with little or no wasted space.  Bingo.  What
was a working set of say, 500 memory pages now suddenly fits in 30
memory pages, and guess what: Your system is no longer page faulting
(as badly) as all the objects it is accessing are already in RAM...
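
The arithmetic behind that shrinkage is simple enough to sketch.  The
function names are hypothetical, and the 240-byte object size used in
the note after the code is an assumption picked so the result lines up
with the 500 and 30 page figures above; it is not a measurement from
any real server.

```c
#include <assert.h>
#include <stddef.h>

/* Worst case from the text: every active object sits on its own page. */
static size_t pages_if_scattered(size_t objects)
{
    return objects;
}

/* Best case: active objects packed back to back in a cache. */
static size_t pages_if_packed(size_t objects, size_t object_bytes,
                              size_t page_bytes)
{
    assert(page_bytes > 0);
    size_t total = objects * object_bytes;
    return (total + page_bytes - 1) / page_bytes;   /* round up */
}
```

For 500 active objects of 240 bytes each on 4K pages,
pages_if_scattered(500) is 500 while pages_if_packed(500, 240, 4096)
is 30.  The sketch ignores allocator overhead and objects that
straddle a page boundary.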

This is the lesson that Marcus Ranum demonstrated so well for MUDs
with UberMUD, and which was learnt and followed by much of the Tiny-*
clan, MOO, and most recently Cold.

One thing you can do is dig up the performance figures and backing
data Brandon Gillespie and Miro drop every so often in r.g.m.* for
Cold...

> This is beside the fact that unless you know for certain that the
> total size of all running processes (and whatever tables the kernel
> is handling) is definitely less than the size of total physical
> memory (meaning you should give yourself 8-16 megs slack in most
> operating systems for the file system itself). It seems to me that
> it would be a pipe dream to try to make sure everything stays
> running in memory only. And even if you do make sure it's smaller,
> many UN*X implementations are smart about paging "dead" memory to keep
> it free for running processes. The only exception that I can think
> of would be Linux, where I don't think they do any smart paging to
> keep memory clear for any running or potential new processes that
> may need it (they only page when forced, unless smart paging is
> something in the 2.1 kernel?).

Linux's paging algorithm is fairly decent.  Free RAM is divided in
two sections: that allocated to processes and their heaps, and file
system cache.  By default everything not needed by processes and the
kernel goes to FS cache.  When process heap starts to compete with the
FS for RAM pages, an LRU cache kicks in on the process side, with
older/inactive pages being swapped out and the new free spaces being
allocated to heap or FS depending on demand (process always wins over
FS).

> Threads quite often also make debugging something of an
> adventure. This is actually something I'm currently dealing with as
> I am doing some work on Flight Unlimited 2. (Getting threadlock
> under one compiler, getting a complete bailout on the other)

I'll note here that my server, idling, uses just under 30 threads.
Activity raises the thread count.  Threads are not excessively complex
or tiresome to work with; they merely require care and attention to
detail.  

--
J C Lawrence                               Internet: claw at null.net
(Contractor)                               Internet: coder at ibm.net
---------(*)                     Internet: claw at under.engr.sgi.com
...Honourary Member of Clan McFud -- Teamer's Avenging Monolith...


