[MUD-Dev] Re: processors

J C Lawrence claw at kanga.nu
Sat Jan 30 18:17:57 New Zealand Daylight Time 1999


On Thu, 28 Jan 1999 16:54:03 -0800 (PST) 
diablo <diablo at best.com> wrote:

> I hope this isn't the wrong forum to ask this question. If it is, I
> apologize.

It's fine.

> Achaea has recently run into processor-overload. We run a PII300 and
> now we are going to have to upgrade at least to a PII450. The other
> option we are considering is a Xeon-based machine. I am not a
> techie, but from what I understand, the Xeon chip is likely to be
> faster for a mud, as it doesn't fool around with all the MMX and
> other graphical bs on the Pentium line. However, I cannot find any
> benchmarks comparing the two in terms of pure processing power. The
> Pentiums are always rated using graphical applications and the Xeons
> are not.

You are barking up the wrong tree in the wrong forest, but are on the
right continent.  Sorry to be so blunt, but I used to do this sort of
stuff for a living.

You need to find out where your bottlenecks are, what system
performance characteristics are constraining your _perceived_
performance, and what the relative sources and values of those two
things are.  Once you know the parameters of your problem, you can
then selectively handle *that* and know that you will actually resolve
the problem.

I've seen too many performance problems have faster systems thrown at
them, resulting in ___zero___ performance increase because the real
performance bottleneck was in disk IO or some such, and that aspect
didn't change at all in the new system despite the fact that the new
CPU was 20 times faster.  A really bad example of this BTW was where
the real problem was the striping size used on the RAID arrays.  Once
we bumped the stripe size up to around 8Meg, performance went thru the
roof.

Interesting values to gather early on:

  What is the average CPU load for the system and for your task?

  What is your average IO queue depth for the system and your task?

  What is your average disk load for the system and your task?

  What is your average disk queue depth for the system and your task?

  Kernel time vs user space time for your task?

  What is your basic IO profile (statistics on data in and out per
data channel)?

  How saturated is your net connection (send and receive separately --
I do hope you are running duplex ethernet and not simplex)?

  How saturated is your local net segment (nothing to do with local
performance as such, but it can still have an impact)?  Remember that
ethernet saturation
starts becoming a concern at around 60% and you're in trouble by the
time you get into the 70's.  Also remember that the curve is not
linear.

  What does a profiler say about your code in normal execution?

  Are you deadlocking or depending on API time-outs in your normal
execution?

  Are there any API timeouts?

  How expensive is your error handling and how often is it invoked?
syslog() can be *REALLY* slow.
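
For the kernel time vs user space time question above, here is a
minimal sketch of how one might measure the split from inside the
process.  It is written in Python purely for illustration (the server
itself is more likely C), and `cpu_split` and `busy_work` are made-up
names standing in for your own code.  `os.times()` is the standard
library call that reports user and system CPU seconds for the current
process.

```python
import os


def cpu_split(user_seconds, system_seconds):
    """Return (user_fraction, kernel_fraction) of the CPU time consumed."""
    total = user_seconds + system_seconds
    if total <= 0:
        return 0.0, 0.0
    return user_seconds / total, system_seconds / total


def busy_work(n=200_000):
    # Hypothetical stand-in for one pass of the server's main loop.
    return sum(i * i for i in range(n))


before = os.times()
busy_work()
after = os.times()

# Differences between two samples give the CPU time used in between.
user_frac, kernel_frac = cpu_split(after.user - before.user,
                                   after.system - before.system)
print(f"user {user_frac:.0%}  kernel {kernel_frac:.0%}")
```

A task that spends most of its CPU time in the kernel is usually
making a lot of system calls (small reads and writes, file opens and
the like), and a faster CPU on its own tends not to help much there.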

Unfortunately this whole area is a bit of an art.  Very very tiny
changes made to systems can have massive performance returns (or
penalties).

I'm going to generically bet that your system is suffering from almost
all of the following as they are what I usually see:

  1) IO bandwidth (likely disk, unlikely RAM, possibly network, very
possibly IPC).  NB Excessive heap fragmentation can massively
exacerbate this problem.  'sar' and all the other system analysis and
report tools, especially the live-system ones are your friends.

  2) Busywork -- just bad algorithm implementation with pathological
performance characteristics.  Yeah, it happens, even when we try not
to.  Profilers are your friends.

  3) API timeouts/lock contention, possibly at the kernel level but
always as a result of user space mis-handling.  Profilers are your
friends.
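
As a rough illustration of what "profilers are your friends" looks
like in practice, the sketch below runs a deliberately slow lookup
under Python's standard `cProfile` module and prints where the time
went.  Python is used only as an example language, and
`find_player_slow`, `tick` and `profile_report` are hypothetical names,
not anything from a real MUD code base.

```python
import cProfile
import io
import pstats


def find_player_slow(players, name):
    # Linear scan over the whole list for every lookup: busywork.
    for p in players:
        if p == name:
            return True
    return False


def tick(players, lookups):
    # One hypothetical game tick that performs many lookups.
    return sum(find_player_slow(players, n) for n in lookups)


def profile_report(func, *args, top=10):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return result, out.getvalue()


players = [f"player{i}" for i in range(2000)]
lookups = players[-200:]  # worst case: names near the end of the list
hits, report = profile_report(tick, players, lookups)
print(report)
```

The report lists call counts and cumulative time per function, which
is normally enough to see which loop is eating the CPU before deciding
whether new hardware would change anything.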

#1 The basic PC architecture is terrible in this regard.  PCs hardly
have enough IO bandwidth to get out of their own way.  OTOH they
have more than enough bandwidth for most MUDs or other lightweight
text processing tasks.

#2 DIKU loop and linked list spinning, busy loops, polling loops,
repetitive file opens, spurious IO generation, simple optimisation
failures, etc.  The list is endlessly varied.

#3 is least common with well experienced programmers.  I've seen a
*lot* of hobbyist code that bites here tho.
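
One cheap way to check for #3 is to instrument the locks themselves.
The following is a minimal sketch, assuming a threaded server and using
Python's standard `threading` module for illustration only; `TimedLock`
and `worker` are invented names, and the `time.sleep()` call merely
stands in for real work done while holding the lock.

```python
import threading
import time


class TimedLock:
    """Wraps threading.Lock and records time spent waiting to acquire it."""

    def __init__(self):
        self._lock = threading.Lock()
        self.wait_seconds = 0.0
        self.acquisitions = 0

    def __enter__(self):
        start = time.perf_counter()
        self._lock.acquire()
        waited = time.perf_counter() - start
        # The lock is held here, so updating the counters is safe.
        self.wait_seconds += waited
        self.acquisitions += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


def worker(lock, hold_seconds, rounds):
    for _ in range(rounds):
        with lock:
            time.sleep(hold_seconds)  # stand-in for work under the lock


lock = TimedLock()
threads = [threading.Thread(target=worker, args=(lock, 0.005, 5))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"{lock.acquisitions} acquisitions, "
      f"{lock.wait_seconds:.3f}s spent waiting")
```

If the time spent waiting is large compared to the time spent
working, the threads are mostly queueing behind each other, and that
is something a faster processor is unlikely to fix.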

--
J C Lawrence                              Internet: claw at kanga.nu
----------(*)                            Internet: coder at kanga.nu
...Honorary Member of Clan McFud -- Teamer's Avenging Monolith...



