[MUD-Dev] Persistant storage.... My current idea.

J C Lawrence claw at under.engr.sgi.com
Wed Apr 1 14:58:09 New Zealand Daylight Time 1998


On Sat, 28 Mar 1998 21:53:09 PST8PDT 
Ben Greear<greear at cyberhighway.net> wrote:

> I'm currently planning on using one file, with certain blocks
> allocated within it for objects.  It will basically be a linked list
> represented as a file.

> I'll have headers that point the the seek position of the next and
> previous element in the file.  It will also keep track of the used
> and total space allocated for that element.

> If I need to update, and it's too big, it will be moved to the end
> of the file, and the previous element will get it's space.  (I'll
> deal with boundary cases, sufice it to say.)

> Now, this reeks of inneficiency when trying to look up an object
> based on some key (object number in my case, probably 8 bytes ).

> However, I'm going to create a hash table at boot time to link an
> object id with the seek position of that object in the file.  I
> believe this should give me as quick access as I can expect.

A very simple cheat which performs very nicely is to do something as
follows:

  Every record in the DB is identified by a unique record # of a
signed integer type.

  The DB consists of two files, the database itself, and an index.

  The index file is an array of the following structures:

    struct {
      off_t record_pos;   // Offset of the record in the DB
      size_t record_len;  // Length of the record in the DB
    }

  The N'th structure in the index file, located by seeking to an
offset of (N *sizeof (struct)) and then reading sizeof (struct) bytes, 
holds the data for the record in the DB with a record number of N.

  Seeking to offset record_pos in the DB file and reading record_len
bytes will get you the wrapped contents of that record.

  The DB file is an amorphous binary blob of no particular pattern.
It contains records of variable length which follow a particular
pattern:

    struct {
      unsigned char beg_signature[16];
      signed long record_num;
      size_t record_len;
      unsigned char data[1]; // the actual contents of the record,
                             // record_len bytes long.
      off_t next_record;     // Offset of next record in DB
      unsigned char end_signature[16];
    }

  This pattern is the "wrapped" pattern.  The first record_len bytes
of the data[] array are the actual data contents of the record.

  There is reason behind the signatures and the like.  Look at tdbm
and YOODA for working models.  The base reason here is so that the
index file can be reconstructed from the DB file, and so that gross
format corruption can be detected.

  New records are added into either the first free space in the DB
file that is large enough to hold them, or some "best fit" equivalent.
As old records are deleted this opens "spaces" that new records (with
potentially very different record numbers) will be inserted into.
Order has no importance in the DB file -- that is what the index file
is for.  

  Of course eventually your DB will be peppered with such small spaces
which are too small for new records -- then you pack the DB.
  

> Some other things: I'll have objects unique across the universe,
> even different servers.  (Each server will be assigned an ID, and
> each object within that server will have a unique ID.  So, given
> both ID's, I can guarantee uniqueness.)

You need to look at Cool -- it did exactly this with some success.

> PS.  I hope the fact that the last message I saw was about a change
> in the mailing list software isn't a bad sign!!

I now have a dedicated 'net connection at home.  In celebration I'm
moving the list to Petidomo (see www.petidomo.com), and adding
searchable web archives of past list traffic (yes, email addresses
will be munged), and a backing FTP site.  This transfer was supposed
to be happening today.  It didn't and it won't, not today.  Sorry.
RSN.

--
J C Lawrence                               Internet: claw at null.net
(Contractor)                               Internet: coder at ibm.net
---------(*)                     Internet: claw at under.engr.sgi.com
...Honourary Member of Clan McFud -- Teamer's Avenging Monolith...



More information about the MUD-Dev mailing list