[MUD-Dev] Storing tokens with flex & bison

cg at ami-cg.GraySage.Edmonton.AB.CA
Sun Jan 2 23:16:55 New Zealand Daylight Time 2000

[Jon A. Lambert:]

[Oh dear, I'm running on again. Sorry about that!]

> I've noticed that you can obtain lots of little optimization tweaks by just using 
> "lazy evaluation".  That is, instead of emitting an opcode as soon as it is 
> identified, pass that opcode along as an integer to the next routine in the 
> parse tree as you begin to evaluate the next token.   Based on the next
> token you may be able to output code that handles that special case
> rather than the general case.  
> For instance in the general case, you might evaluate and generate:
> x+y --> pushvar(x) pushvar(y) addop()
> x+1 --> pushvar(x) pushint(1) addop()
> The second case can be optimized as a special case if instead of
> immediately emitting pushvar(x), you wait until the next expression is 
> identified.  Thus it could become:
> x+1 --> pushvar(x) increment()
> or 
> x+1 --> pushvar(x+1) 
> I imagine something similar goes on in C compilers which recognize
> that forms like 'x++' and 'x=x+1' are equivalent.

Normally, you just do the special-case check at the place where you have
all of the needed information. In this case, it would be in the processing
for the '+' operation. I don't have an 'add1' opcode, so this particular
example doesn't work for me, but here is a more generic one, which is
a minor kind of "compile-time expression evaluation":

    case ex_negate:
    case ex_fNegate:
	ex = ex->ex_v.ex_expressionPtr->ex_right;
	if (ex->ex_kind == ex_int) {
	    bcGenConst(- ex->ex_v.ex_integer);
	} else {
	    /* general case: generate the operand, then a 'neg' opcode */
	}

If the operand for the unary '-' operator is a constant, then just generate
the negative of that constant, rather than generating the positive
value and then a 'neg' byte-code instruction.

Some compilers do a lot of special cases like this. Others have a lot
of "tree rewriting rules". They work through the built-up parse trees,
applying various rewriting rules wherever their patterns match. The result
is still a valid parse tree, but it has been optimized with a bunch of
special cases.
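As a concrete toy illustration, one such rewriting rule might look like this in C. The node layout and names here are invented for the sketch, not taken from any particular server:

```c
/* A minimal sketch of one tree-rewriting rule: constant-folding a
   binary '+'.  Expr, EX_INT, EX_ADD and fold() are all invented
   names for illustration. */

typedef enum { EX_INT, EX_ADD } ExKind;

typedef struct Expr {
    ExKind kind;
    int value;                 /* valid when kind == EX_INT */
    struct Expr *left, *right; /* valid when kind == EX_ADD */
} Expr;

/* Walk the tree bottom-up; wherever both operands of '+' are
   integer constants, rewrite the node into a single constant. */
static void fold(Expr *e) {
    if (e->kind != EX_ADD)
        return;
    fold(e->left);
    fold(e->right);
    if (e->left->kind == EX_INT && e->right->kind == EX_INT) {
        e->value = e->left->value + e->right->value;
        e->kind = EX_INT;      /* node is now a constant leaf */
    }
}
```

The result is still a valid parse tree, as described above; later passes (or the byte-code walk) never know the fold happened.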

> Another possible optimization is for the parser to generate code for
> a virtual machine that uses a combination of register and stack instructions.
> The parser can be made smart enough to know to use registers.

I thought about that a bit for my system. If you step back a bit and look
at what a modern CPU is actually doing, you find that the register set is
very much like a stack - they will both end up in the processor's data
cache. The key thing to look at is how many native machine instructions
it takes to execute the sequence you want to look at. With a register
set, you are going to have to extract a register number from the byte-code
stream, and use that to index the array of registers in the virtual machine.
That takes native machine instructions, perhaps more than are needed to
do stack operations, given that many real CPUs have post-increment and
pre-decrement addressing modes that the native compiler can generate.
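To make that cost difference concrete, here is a toy sketch of the inner loop in each design; the opcode names and code-stream layout are assumptions made up for this example:

```c
/* A toy comparison: the same 'add' step in a register-style and a
   stack-style byte-code engine.  Opcode names and encodings are
   invented for illustration, not any real server's. */

enum { OP_ADD_RRR, OP_IADD, OP_HALT };

/* Register form: 'add dst, a, b' carries three operand bytes that
   must each be fetched from the code stream and used to index the
   register array. */
static int run_register(const unsigned char *code, int *regs) {
    int pc = 0;
    for (;;) {
        switch (code[pc++]) {
        case OP_ADD_RRR: {
            int dst = code[pc++], a = code[pc++], b = code[pc++];
            regs[dst] = regs[a] + regs[b];
            break;
        }
        case OP_HALT:
            return regs[0];
        }
    }
}

/* Stack form: 'iadd' has no operand bytes at all; the pop and the
   in-place add map naturally onto post-increment addressing. */
static int run_stack(const unsigned char *code, int *sp) {
    int pc = 0;
    for (;;) {
        switch (code[pc++]) {
        case OP_IADD: {
            int temp = *sp++;
            *sp += temp;
            break;
        }
        case OP_HALT:
            return *sp;
        }
    }
}
```

The register form is not wrong, but each operand fetch is extra native work in the dispatch loop, which is the point being made above.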

> And finally, there are optimizations that you just can't do with a RD parser. 
> You can pass the generated byte-code into a post processor that rescans
> it for patterns that can be simplified.  Call it a byte-code optimizer, or
> peephole optimizer.  The only problem with this last bit is that pretty printing
> or reverse disassembly to source becomes rather difficult.  I've noticed a
> lot of mud servers do not do this and seek to support reverse code generation
> instead of storing the source code somewhere.  Perhaps they are missing
> out on some major optimization possibilities.

Those can readily be done in conjunction with recursive descent parsing.
Let's make sure we understand the context here. Recursive descent parsing
normally only means the parsing of the input token sequence into complete
statements and functions understood by the language parser as a whole.
That parser can be tied into direct code generation which emits byte-codes,
or can produce a "parse-tree" data structure.

In the latter case, a recursive walk over that tree will produce a byte-code
stream. That recursive walk is *not* known as recursive descent parsing.
I don't think I've seen the term "recursive descent" used for anything
other than parsing, probably to avoid just this kind of confusion.
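For concreteness, that recursive code-generating walk might be sketched like this in C; every name here is invented for illustration:

```c
/* A minimal sketch of a recursive walk over a parse tree that emits
   a byte-code stream.  Operands are generated in post-order, so
   they are on the stack before the add executes. */

enum { PSHI, IADD };   /* pshi: push immediate integer (one operand byte) */

typedef enum { EX_INT, EX_ADD } ExKind;
typedef struct Expr {
    ExKind kind;
    int value;                 /* valid when kind == EX_INT */
    struct Expr *left, *right; /* valid when kind == EX_ADD */
} Expr;

/* Append byte-codes for 'e' to 'code', returning the new length. */
static int gen(const Expr *e, unsigned char *code, int n) {
    if (e->kind == EX_INT) {
        code[n++] = PSHI;
        code[n++] = (unsigned char)e->value;
    } else {
        n = gen(e->left, code, n);   /* post-order: operands first */
        n = gen(e->right, code, n);
        code[n++] = IADD;
    }
    return n;
}
```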

In any case, optimizations can be done regardless of what parsing or
code-generation techniques are used. In my MUD, I build a parse tree,
because that is the form that non-byte-code-compiled code runs from,
directly. In my Draco compiler for 8080 and later for MC68000, the recursive
descent parser directly emitted native machine code, which was filtered
through a pretty good peephole optimizer. Peephole optimizers, and special
cases like we've been discussing, can make for pretty reasonable generated
code. Nothing like gcc and commercial compilers which do things like
register colouring and full program restructuring, of course!
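As a toy illustration of a single peephole rule (the opcode names and encoding are invented here, not from Draco or any real server):

```c
/* A toy peephole pass: scan the byte-code stream and delete the
   useless pair 'pshl n' immediately followed by 'popl n' (storing a
   value back into the variable it came from).  Returns the new
   length.  Opcodes and encoding are assumptions for this sketch. */

enum { PSHL, POPL, IADD };   /* pshl/popl take one operand byte */

static int peephole(unsigned char *code, int len) {
    int in = 0, out = 0;
    while (in < len) {
        if (code[in] == PSHL && in + 3 < len &&
            code[in + 2] == POPL && code[in + 3] == code[in + 1]) {
            in += 4;           /* drop the whole no-op pair */
            continue;
        }
        /* copy the instruction through, with its operand byte if any */
        code[out++] = code[in++];
        if (code[out - 1] != IADD)
            code[out++] = code[in++];
    }
    return out;
}
```

A real peephole optimizer has a table of many such patterns and re-scans until nothing more matches, but each rule is about this simple.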

> >Byte-code execution really only makes sense, I believe, for a strongly
> >typed language. If run-time checks and conversions are needed, they will
> >likely cost far more time than the basic interpretation of the code, and
> >so going to byte-code instead of staying with something more direct, could
> >be a waste of time, in that byte-code won't speed execution up much.

> I don't know about whether it makes sense or not.   You are paying for
> something in the subjective area of "ease of user programming" that is 
> difficult to quantify.  I agree that evaluation at runtime is definitely going to be 
> slower than compile-time evaluation.  I don't agree that it will be slower than 
> interpreted execution though.  Certainly some of the optimizations I talked about 
> above are operative with strong type checking.

Either I'm not understanding what you are getting at, or you are
misinterpreting what I was getting at. I was not referring to evaluation
at compile time (e.g. my example above of the compile-time versus
run-time evaluation of the unary negation operator). I was referring to
the work needed to do type-checking in a non-strongly-typed language.
I understand that lots of people prefer dynamic typing, but my point is
that, having made that choice, you have to put up with the run-time cost
of it, regardless of what kind of run-time you have: direct interpretation
of program source, tree interpretation, byte-code execution, native code
execution. For example, let's say you have a language which does not use
strong typing (variables have no types at parse/compile time). You then
write something like this:

    a = "hello";
    a = a + func();

Associated with variable 'a', at run-time, must be information about what
type of value is currently stored in the variable. That type information
must be examined in the second line, in order to know what to do with the
'+' operation. If the language uses '+' for string concatenation, and
'func' happens to return a string, then that statement does string
concatenation. If there had been some other assignment to 'a', the
statement might do integer (or perhaps floating point) addition. The
point is that the check for what to do must be made at run-time. The
only way around this that I'm familiar with is that of having a very
smart compiler, which attempts to deduce the correct type, so that it
can get away without the run-time checks. That would be *very* fancy
stuff for a MUD system!
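To make that run-time cost concrete, here is a sketch in C of the tag check a dynamically typed '+' has to do on every execution; the Value layout and names are invented for illustration:

```c
#include <string.h>
#include <stdlib.h>

/* A minimal sketch of a dynamically typed run-time value and its
   '+' operation.  The tag branch below is exactly the check that a
   strongly typed language moves to compile time. */

typedef enum { T_INT, T_STR } Tag;

typedef struct {
    Tag tag;
    union { int i; char *s; } u;
} Value;

static Value plus(Value a, Value b) {
    Value r;
    if (a.tag == T_INT && b.tag == T_INT) {
        r.tag = T_INT;
        r.u.i = a.u.i + b.u.i;
    } else if (a.tag == T_STR && b.tag == T_STR) {
        /* '+' as string concatenation (leaks in this toy sketch) */
        r.tag = T_STR;
        r.u.s = malloc(strlen(a.u.s) + strlen(b.u.s) + 1);
        strcpy(r.u.s, a.u.s);
        strcat(r.u.s, b.u.s);
    } else {
        abort();   /* a real server would raise a run-time type error */
    }
    return r;
}
```

Note that the branches run on every '+', and each variable must carry its tag alongside its value, which is the space cost mentioned further below.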

In a strongly typed language, 'a' would have a declared type. If the
parser allowed the second assignment, it would know, at compile time,
exactly what has to happen at run-time, and so would emit byte-code
that does just that. Also, the byte-code interpretation does not have
to do any checking - it just directly does what needs to be done.

Let's follow a simple example in more detail:

    int a, b;
    a = a + b;

generated bytecode:

    pshl a
    pshl b
    iadd
    popl a

If the byte-code engine has the stack pointer in a local variable called
'sp', then the C code for the 'iadd' byte-code can be just:

    int *sp;
    int temp;
    temp = *sp++;
    *sp += temp;

(Ignoring any code for fetching opcodes and dispatching to the proper
C code for the byte-codes.)
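Putting those pieces together, a complete toy engine for that four-instruction sequence might look like this; the opcode names follow the example above, while the code-stream layout and variable slots are assumptions:

```c
/* A toy byte-code engine for the strongly typed example above.
   pshl and popl are each followed by one operand byte naming a
   variable slot; iadd has no operands.  Because the language is
   strongly typed, no opcode ever inspects a type tag. */

enum { PSHL, IADD, POPL, HALT };

static void run(const unsigned char *code, int *vars) {
    int stack[16];
    int *sp = stack + 16;   /* stack grows downward */
    int pc = 0;
    for (;;) {
        switch (code[pc++]) {
        case PSHL: *--sp = vars[code[pc++]]; break;
        case IADD: { int temp = *sp++; *sp += temp; break; }
        case POPL: vars[code[pc++]] = *sp++; break;
        case HALT: return;
        }
    }
}
```

Each case is a handful of native instructions with no conditional branches beyond the dispatch itself, which is the whole argument above.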

With a non-strongly-typed language (assuming no smart compiler that can
deduce the types), either the byte-code emitted is considerably more
complex, or the C code for the byte-code engine is considerably more
complex. Either way, it runs slower, perhaps significantly slower.
Remember that in today's processors, conditional branches ('if' statements)
can slow things down a *lot*. You also need to store the current type of
each variable, as well as the current value of it. That adds statements
to your byte-code machine source, and uses more registers/memory.

Don't design inefficiency in - it'll happen in the implementation.

Chris Gray     cg at ami-cg.GraySage.Edmonton.AB.CA

MUD-Dev maillist  -  MUD-Dev at kanga.nu
