[MUD-Dev] Storing tokens with flex & bison

Jon A. Lambert jlsysinc at ix.netcom.com
Mon Jan 3 00:11:14 New Zealand Daylight Time 2000

Chris Gray wrote:
>JCL: why do you say recursive descent makes it hard to put a VM under a
>language? Lexing/parsing is pretty independent of execution. My AmigaMUD
>system uses recursive descent for its programming language, and that had
>no effect on being able to put a byte-code machine into the system. It
>had never occurred to me to even think about any effect. Hmm. Perhaps
>I'm misparsing the conjunction in your sentence?

I've noticed that you can obtain lots of little optimization tweaks by just using 
"lazy evaluation".  That is instead of emitting an opcode as soon as it is 
identified, pass that opcode along as an integer to the next routine in the 
parse tree as you begin to evaluate the next token.   Based on the next
token you may be able to output code that handles that special case
rather than the general case.  

For instance in the general case, you might evaluate and generate:
x+y --> pushvar(x) pushvar(y) addop()
x+1 --> pushvar(x) pushint(1) addop()

The second case can be optimized as a special case if instead of
immediately emiting pushvar(x), you wait until the next expression is 
identified.  Thus it could become:

x+1 --> pushvar(x) increment()
x+1 --> pushvar(x+1) 

I imagine something similar goes on in C compilers which recognize
that forms like 'x++' and 'x=x+1' as equivalent.

Another possible optimization is for the parser to generate code for
a virtual machine that uses a combination of register and stack instructions.
The parser can be made smart enough to know to use registers.

For instance, instead of: 
x+y --> pushvar(x) pushvar(y) addop()
would become...
x+y--> lda(x) adda(y) pusha()
and the special case...
x+1--> lda(x) inca() pusha()

The first example does 3 stack push operations and 2 stack pops, 
while the latter does only 1 stack push.  (Assuming addop() pops the
top 2 arguments on the stack and pushes the result.) 
A VM that uses an array of possible registers to make assignments 
could execute bytecode faster, no?

Class VM 
    int           register[8];
    Stack*   stack;
    int*         code;

And finally, there are optimizations that you just can't do with a RD parser. 
You can pass the generated byte-code into a post processor that rescans
it for patterns that can be simplified.  Call it a byte-code optimizer, or
peephole optimizer.  The only problem with this last bit is that pretty printing
or reverse disassembly to source becomes rather difficult.  I've noticed a
lot of mud servers do not do this and seek to support reverse code generation
instead of storing the source code somewhere.  Perhaps they are missing
out on some major optimization possibilities.  

>Byte-code execution really only makes sense, I believe, for a strongly
>typed language. If run-time checks and conversions are needed, they will
>likely cost far more time than the basic interpretation of the code, and
>so going to byte-code instead of staying with something more direct, could
>be a waste of time, in that byte-code won't speed execution up much.

I don't know about whether it makes sense or not.   You are paying for
something in the subjective area of "ease of user programming" that is 
difficult to quantify.  I agree that evaluations at runtime is definitely going to be 
slower, than compile time evaluation.   I don't agree that it will be slower than 
interpreted execution though.  Certainly some of the optimizations I talked about 
above are operative with strong type checking.

--*     Jon A. Lambert - TychoMUD Email: jlsysinc at nospam.ix.netcom.com     *--
--*     Mud Server Developer's Page <http://jlsysinc.home.netcom.com>      *--
--* "No Free man shall ever be debarred the use of arms." Thomas Jefferson *--

MUD-Dev maillist  -  MUD-Dev at kanga.nu

More information about the MUD-Dev mailing list