[MUD-Dev] Player run reputation system

J Todd Coleman warden at wolfpackstudios.com
Tue Jul 3 10:57:55 New Zealand Standard Time 2001

----- Original Message -----
From: "Trump" <trump at vividvideo.com>
To: <mud-dev at kanga.nu>
Cc: <azeraab at dies-irae.org>
Sent: Friday, June 29, 2001 1:21 PM
Subject: Re: [MUD-Dev] Player run reputation system

> Lets test a live database running PostgreSQL.

>   2 tables, about 2 million entries total....


> Avoiding joins at all costs is a bad idea.

I used to work for a large data warehousing company, and
optimization efforts (which we did all the time) almost always
resulted in a "denormalization" effort -- yes, even at the expense
of data redundancy.

The example join you give above would be more typical to a
transaction processing system (i.e. where "userid=1" aka just give
me "bob's" records) then a reporting system ("do a full join of
these two tables.")  This should make a pretty sizeable difference
in the result (again, I'm no database programmer, but I would
assuming that joining something like, I dunno, 60 records by 1.3m
would take substantially less time than joining 600k * 1.3m.)

To reiterate my original point, "The downside, of course, is that
you waste a lot of space with repeated data, but thats often a much
more trivial concern than performance when you have a huge number of
concurrent users."

If you building a system for fast reporting (as opposed to
transaction processing) redundancy often isn't as big of a rub and
speed is _absolutely vital_.  Another way to read your conclusion
("..the join takes about twice as much CPU time as the individual
selects") would be "I can _double_ my read efficiency for reporting
if I'm willing to give on the redundancy."  Hey, maybe your app
isn't so critical that you can handle it.  Not every app can.  Hard
drives are cheap, and scale easier than processors.  Some people are
willing to take the hit.

I'd also be curious to see what happens when you add more than two
tables to your test case. What happens to the cost hit when you
include a third table, or a fourth?  Is it still 2x the time, or
does the multiplier continue to increase as well?

MUD-Dev mailing list
MUD-Dev at kanga.nu

More information about the MUD-Dev mailing list