[MUD-Dev] Storing tokens with flex & bison

cg at ami-cg.GraySage.Edmonton.AB.CA
Wed Jan 19 19:59:33 New Zealand Daylight Time 2000

[Jon A. Lambert:]

> >Unfortunately, there was no noticeable change in run speed with or without
> >them. However, with this version of gcc/egcs and RedHat Linux, the tests
> >all run slower than on the previous version (same machine). Grrrrrr!

> Hmm.. on its face, you would think it would be an operation that would occur
> frequently in loop blocks.  A few instructions saved in each iteration should
> in theory add up fast.

That was my hope. I'm sure there is *some* speedup, but it wasn't visible
in 30 second runs, with granularity 1 second. I think the main jump-table
overhead overwhelms the difference between 'add1' and 'psh1; add'.
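
For anyone who wants to see why the dispatch cost could dominate, here is a minimal sketch of a switch-based interpreter loop. It is not my actual interpreter: only the names 'add1' and 'psh1; add' come from the discussion above, while the opcode set, the run() function and the dispatch counter are hypothetical and exist purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcode set; only ADD1 and PSH1/ADD echo names from the post.
enum Op : std::uint8_t { PSH1, ADD, ADD1, HALT };

// Runs `code` on a stack seeded with `start` and returns the top of stack.
// `dispatches` counts trips through the switch, which is the per-instruction
// overhead that every opcode pays regardless of how little work it does.
long run(const std::vector<std::uint8_t>& code, long start,
         std::size_t& dispatches) {
    std::vector<long> stack{start};
    dispatches = 0;
    for (std::size_t pc = 0; pc < code.size(); ++pc) {
        ++dispatches;
        switch (code[pc]) {      // compilers commonly turn this into a jump table
        case PSH1:
            stack.push_back(1);
            break;
        case ADD: {
            assert(stack.size() >= 2);
            long rhs = stack.back();
            stack.pop_back();
            stack.back() += rhs;
            break;
        }
        case ADD1:               // fused form of "PSH1; ADD"
            stack.back() += 1;
            break;
        case HALT:
            return stack.back();
        }
    }
    return stack.back();
}
```

Running {PSH1, ADD, HALT} and {ADD1, HALT} from the same starting value gives the same result, with the fused form saving exactly one dispatch. If the dispatch itself is the expensive part, saving one in a loop body of many instructions is easy to lose in 1 second timing granularity.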

> class TMVar {
>    Type type;
>    union {
>       int       i;
>       float     f;
>       TMError   e;
>       TMString  s;
>       TMBuffer  b;
>       TMList    l;
>       TMQueue   q;
>       TMArray   a;
>       TMObject  o;
>       TMKA     k;
>       TMMethod m;
>    } value;
> }

> It's bytecode that has been compiled by a PRS compiler.

Er, PRS?

> The big union is reminiscent of Bison/Yacc tokens (and CoolMUD too).

Just about anything that tokenizes. My 'QValue_t' (don't ask about the
'Q' - it's historic) is just a union like that. I cheat and store the
type as the top 6 bits of the values. All values are DB references, other
than 'int' and 'fixed', which I cheat on. That causes my limit of 64 million
database entities.
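
The arithmetic behind that limit can be sketched as follows, assuming a 32-bit value word (the post only says "top 6 bits", so the word size is an assumption, though 2^26 does match the 64 million figure). The names pack, tag_of and ref_of are hypothetical, and the special handling of 'int' and 'fixed' is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: a 32-bit word with a 6-bit type tag above a 26-bit DB reference.
constexpr unsigned kTagBits = 6;
constexpr unsigned kRefBits = 32 - kTagBits;                          // 26
constexpr std::uint32_t kRefMask = (std::uint32_t{1} << kRefBits) - 1;
constexpr std::uint32_t kMaxEntities = std::uint32_t{1} << kRefBits;  // 67,108,864

// Combine a type tag and a database reference into one word.
std::uint32_t pack(std::uint32_t tag, std::uint32_t ref) {
    assert(tag < (1u << kTagBits));   // tag must fit in 6 bits
    assert(ref <= kRefMask);          // reference must fit in 26 bits
    return (tag << kRefBits) | ref;
}

// Recover the two fields from a packed word.
std::uint32_t tag_of(std::uint32_t v) { return v >> kRefBits; }
std::uint32_t ref_of(std::uint32_t v) { return v & kRefMask; }
```

With this layout there is room for 64 distinct type tags, and every bit given to the tag halves the number of addressable entities.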

> Anyways I came across this notion of split-phase construction and
> split-phase destruction as documented by James Coplien I believe.


Definitely interesting. It had never occurred to me that C++ won't let
you have a union of classes, but it sort of makes sense to me.

> So the new operator is overloaded (short circuited) to return a 
> pointer to "value".  It does nothing, ergo no allocation is done
> at all, yet the constructors are called.  :-)

But with inlining and heavy optimization, even that might go away.

> And here is the TMVar default constructor:

> inline TMVar::TMVar(Type t = NUM) : type(t) {
>    switch(t) {
>       case NUM:
>          *(int*)value = 0;
>       case STRING:
>          new(value) TMString;
>       case LIST:
>          new(value) TMList;
>       ... etc...

Er, doncha need some 'break's in there? That's my type of bug!
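
A small sketch of what the missing 'break's would do. The Type names NUM, STRING and LIST are taken from the quoted constructor; the two functions and the log string are hypothetical stand-ins that record which initialisers would run.

```cpp
#include <cassert>
#include <string>

enum Type { NUM, STRING, LIST };

// Without 'break', control falls through into every later case,
// so a NUM would also get "constructed" as a string and then a list.
std::string init_without_breaks(Type t) {
    std::string log;
    switch (t) {
    case NUM:    log += "int ";      // falls through
    case STRING: log += "string ";   // falls through
    case LIST:   log += "list ";
    }
    return log;
}

// With 'break', exactly one initialiser runs per type.
std::string init_with_breaks(Type t) {
    std::string log;
    switch (t) {
    case NUM:    log += "int ";    break;
    case STRING: log += "string "; break;
    case LIST:   log += "list ";   break;
    }
    return log;
}
```

In the real constructor each fall-through would construct a new object on top of the previous one in the same storage, so the last case reached would win.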

Plus, you are answering my question with another one. What's the syntax:

    new(<field-name>) <typename>

do!!? Treat me as completely C++ illiterate. This is likely too basic for
the list, so feel free to educate me privately.
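
For the record, the form new(<place>) <typename> is C++ "placement new": it runs the constructor of <typename> at the address <place> instead of allocating fresh memory. The sketch below uses the standard placement form from <new> rather than the custom overload described in the quoted message, which isn't shown here. Slot, construct_in and destroy_in are hypothetical names, and std::string stands in for TMString.

```cpp
#include <cassert>
#include <new>      // declares the standard placement operator new
#include <string>

// Raw, suitably aligned storage standing in for the union's "value" field.
struct Slot {
    alignas(std::string) unsigned char bytes[sizeof(std::string)];
};

// Construction without allocation of the object: the constructor runs
// at slot.bytes, and the returned pointer is that same address.
std::string* construct_in(Slot& slot, const char* text) {
    return new (slot.bytes) std::string(text);
}

// The matching teardown: call the destructor by hand and leave the
// storage alone, since nobody allocated it with ordinary new.
void destroy_in(std::string* s) {
    using String = std::string;
    s->~String();
}
```

Anything built this way must be torn down with an explicit destructor call, never with delete, because delete would try to free memory that was never handed out by the allocator.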

> So there's my explanation for all the fugly casting that's going on below 
> and the unusual use of the new() operator.

The rest I followed - a fascinating look at higher level ways of doing

> Another danger to earlier is in returning references to variables 
> created on the stack.  Pretty much the same as the C error of returning
> pointers to locals.  I would _never_ever do that.  <grin>

Me neither! For instance, I absolutely do not know that that can stomp the
return address, resulting in the program branching off into the middle of
some other unsuspecting routine.

Don't design inefficiency in - it'll happen in the implementation.

Chris Gray     cg at ami-cg.GraySage.Edmonton.AB.CA

MUD-Dev maillist  -  MUD-Dev at kanga.nu
