[MUD-Dev] Persistant storage.... My current idea.

J C Lawrence claw at under.engr.sgi.com
Fri Apr 3 09:57:35 New Zealand Daylight Time 1998


On Fri, 3 Apr 1998 01:18:08 PST8PDT 
Ben Greear<greear at cyberhighway.net> wrote:

> On Wed, 1 Apr 1998, J C Lawrence wrote:
>> A very simple cheat which performs very nicely is to do something
>> as follows:
>> 
>> Every record in the DB is identified by a unique record # of a
>> signed integer type.
>> 
>> The DB consists of two files, the database itself, and an index.
>> 
>> The index file is an array of the following structures:
>> 
>>   struct {
>>     off_t record_pos; // Offset of the record in the DB
>>     size_t record_len; // Length of the record in the DB
>>   };
>>
> I still see no reason to store the index on disk.  

Learn the lessons of Marcus Ranum and UnterMUD.  Disk-based servers
can and do significantly out-perform in-RAM servers due to the lower
rate of page faults -- despite the file IO overhead.  Of the current
server architectures Cold is a perfect case in point.  cf Brandon's
and Miro's oft-referenced performance figures for Cold.

> It just means
> another write every time I modify the DB significantly.  My hash
> table will be exactly that however, just stored in RAM instead of
> disk.

Sure.  You can update the index file on every record write, you can
lazy write it, you can do other forms of caching, you can delay all
writes until full-scale DB commit time... There are many possible
approaches.  The value of the separate index is that the cost of
accessing any record, no matter where it is in the DB, is now
constant.  Other forms of indexing have other expense patterns.

>> The N'th structure in the index file, located by seeking to an
>> offset of (N *sizeof (struct)) and then reading sizeof (struct)
>> bytes, holds the data for the record in the DB with a record number
>> of N.
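
A minimal sketch of the lookup quoted above, assuming POSIX stdio
(fseeko) and that both files are already open.  The names index_entry,
index_write, index_read and record_read are mine, purely for
illustration, and writing raw structs to disk like this is not
portable across architectures:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>

/* One slot per record number, as in the index layout quoted above. */
struct index_entry {
    off_t  record_pos;  /* offset of the record in the DB file */
    size_t record_len;  /* length of the record in the DB file */
};

/* Store the index entry for record n.  Returns 0 on success, -1 on error. */
int index_write(FILE *idx, long n, const struct index_entry *e)
{
    if (n < 0 || fseeko(idx, (off_t)n * (off_t)sizeof *e, SEEK_SET) != 0)
        return -1;
    return fwrite(e, sizeof *e, 1, idx) == 1 ? 0 : -1;
}

/* Load the index entry for record n: seek to n * sizeof(struct), read one. */
int index_read(FILE *idx, long n, struct index_entry *e)
{
    if (n < 0 || fseeko(idx, (off_t)n * (off_t)sizeof *e, SEEK_SET) != 0)
        return -1;
    return fread(e, sizeof *e, 1, idx) == 1 ? 0 : -1;
}

/* Fetch record n into buf: one index read, one seek, one data read,
 * regardless of where the record sits in the DB file. */
int record_read(FILE *idx, FILE *db, long n,
                void *buf, size_t buflen, size_t *len_out)
{
    struct index_entry e;

    if (index_read(idx, n, &e) != 0 || e.record_len > buflen)
        return -1;
    if (fseeko(db, e.record_pos, SEEK_SET) != 0)
        return -1;
    if (fread(buf, 1, e.record_len, db) != e.record_len)
        return -1;
    *len_out = e.record_len;
    return 0;
}
```

Note that a slot which was never written (a hole in the index file)
reads back as all zeros here, so a real implementation would want some
explicit "no such record" marker.  That part is left out of the sketch.
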

> The object id numbers will not be synchronous, indeed, they will vary
> wildly as the high bits will be the server id, and the low bits will
> be the object id on that server.  I'll probably use longs or two
> integers, eight bytes at any rate.

Really?  Does this mean that objects migrate between servers?  Think
this thru:

  Object A is defined on Server X.

  A inherits from objects Q, R, S, and T, also defined on X.

  You now move A to server Y.

  Do Q, R, S and T, __and__ all their other children follow?

  What about all the objects that remain on X that inherit from those
objects too?

  Do you merely make copies of A, Q, R, S, and T, on Y, and leave the
originals on X?

  What do you do when the state of A updates on X to also update Y?

  What do you do when the state of A updates on Y to also update X?

  What about if someone reprograms Q, causing all children to also
alter?

  What happens when server Z comes along and wants to take/copy A etc
from Y?

Please, have a good long look at COOL, and realise its strengths and
weaknesses.  COOL's model is expensive on net traffic, but objects stay
where they are defined, allowing their contexts to be known and well
defined.  All object calls are then RPCs across the net.

This is not to say that moving the objects is automatically a bad
idea.  It has significant benefits over RPC'ing everything -- it just
also requires significantly more engineering effort to maintain logical
consistency.

>> New records are added into either the first free space in the DB
>> file that is large enough to hold them, or some "best fit"
>> equivalent.  As old records are deleted this opens "spaces" that
>> new records (with potentially very different record numbers) will
>> be inserted into.  Order has no importance in the DB file -- that
>> is what the index file is for.

> I'm just going to put them on the end.  I'd rather waste disk space
> than take the time to be clever.  Besides, as the objects' images on
> the disk will change, I'll need some room to grow and shrink w/out
> affecting the neighbors.

Why?  If you use the suggested model there is very little overhead to
finding a free location for an object (when I did this I merely
maintained a free block list for a total expense of perhaps 20 LOC).
There is also no concern with object size changes:

  Object A exists.
  Object A changes into A'.
  A new space in the DB is found to hold A'.
  A' is written there.
  The entry for A in the DB is marked as deleted.
  The index for A is updated to point at the new A'.

Neighbors are never affected.
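
A rough sketch of the sort of free block list and relocation I mean.
This is not the code I actually wrote: the names (extent, free_list,
space_alloc, space_free, record_relocate), the fixed MAX_FREE limit,
and the first-fit policy in list order are all assumptions made for
illustration, and it only does the bookkeeping, not the file IO:

```c
#include <stddef.h>
#include <sys/types.h>

#define MAX_FREE 64                     /* arbitrary cap for the sketch */

struct extent { off_t pos; size_t len; };   /* a hole in the DB file */

struct free_list {
    struct extent ext[MAX_FREE];
    size_t count;                       /* holes currently tracked */
    off_t  db_end;                      /* current end of the DB file */
};

struct index_entry { off_t record_pos; size_t record_len; };

/* First fit: carve space from the first listed hole that is large
 * enough, else extend the file.  Returns the offset to write at. */
off_t space_alloc(struct free_list *fl, size_t len)
{
    for (size_t i = 0; i < fl->count; i++) {
        if (fl->ext[i].len >= len) {
            off_t pos = fl->ext[i].pos;
            fl->ext[i].pos += (off_t)len;
            fl->ext[i].len -= len;
            if (fl->ext[i].len == 0)    /* hole used up: drop it */
                fl->ext[i] = fl->ext[--fl->count];
            return pos;
        }
    }
    off_t pos = fl->db_end;
    fl->db_end += (off_t)len;
    return pos;
}

/* Mark a region as deleted.  No coalescing of adjacent holes, and if
 * the list is full the space is simply leaked. */
void space_free(struct free_list *fl, off_t pos, size_t len)
{
    if (len > 0 && fl->count < MAX_FREE) {
        fl->ext[fl->count].pos = pos;
        fl->ext[fl->count].len = len;
        fl->count++;
    }
}

/* A changes into A': find new space, (write A' there), mark the old
 * space deleted, repoint A's index entry.  No other entry is touched. */
void record_relocate(struct free_list *fl, struct index_entry *e,
                     size_t new_len)
{
    off_t new_pos = space_alloc(fl, new_len);
    /* ... write the new image of the record at new_pos here ... */
    space_free(fl, e->record_pos, e->record_len);
    e->record_pos = new_pos;
    e->record_len = new_len;
}
```

Allocating the new space before freeing the old means A' never
overlaps A, so a crash between the two steps still leaves the old
copy intact.
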

>> I now have a dedicated 'net connection at home.

> Damn, I'm jealous...

56K dialup, static IP, $60/mo (www.rahul.net (Yes, Rahul Dhesi of
DECUS fame)).  

>...gotta give Cox and US west a swift kick in the
> arse..saw 2 MBS connection (wireless T1) for $360 a month..but it
> would have to make money to be attractive to me....  Just wait for
> DSL or the cable modems..whichever get here first :)

ADSL has not quite reached my part of the SF Bay area (it's close), but
that's $180 for 384Kbps symmetrical (it varies slightly depending on
who you pick for your CLEC: PacBell, Covad, or NorthPoint).  IDSL
(which I'd be more likely to go for), has yet to establish a firm
pricing model here alas.

--
J C Lawrence                               Internet: claw at null.net
(Contractor)                               Internet: coder at ibm.net
---------(*)                     Internet: claw at under.engr.sgi.com
...Honourary Member of Clan McFud -- Teamer's Avenging Monolith...


