VMR Control representation (Design of CMU Common Lisp)

Next: Stack analysis, Up: VMR conversion [Contents]

17.1 VMR Control representation ¶

We represent all control transfer explicitly. In particular, :Conditional VOPs take a single Target continuation and a Not-P flag indicating whether the sense of the test is negated. Then an unconditional Branch VOP will be emitted afterward if the other path isn’t a drop-through.

So we linearize the code before VMR-conversion. This isn’t a problem, since there isn’t much change in control flow after VMR conversion (none until loop optimization requires introduction of header blocks.) It does make cost-based branch prediction a bit ucky, though, since we don’t have any cost information in ICR. Actually, I guess we do have pretty good cost information after LTN even before VMR conversion, since the most important thing to know is which functions are open-coded.

VMR preserves the block structure of ICR, but replaces the nodes with a target dependent virtual machine (VM) representation. Different implementations may use different VMs without making major changes in the back end. The two main components of VMR are Temporary Names (TNs) and Virtual OPerations (VOPs). TNs represent the locations that hold values, and VOPs represent the operations performed on the values.

A “primitive type” is a type meaningful at the VM level. Examples are Fixnum, String-Char, Short-Float. During VMR conversion we use the primitive type of an expression to determine both where we can store the result of the expression and which type-specific implementations of an operation can be applied to the value. [Ptype is a set of SCs == representation choices and representation specific operations]

The VM specific definitions provide functions that do stuff like find the primitive type corresponding to a type and test for primitive type subtypep. Usually primitive types will be disjoint except for T, which represents all types.

The primitive type T is special-cased. Not only does it overlap with all the other types, but it implies a descriptor (“boxed” or “pointer”) representation. For efficiency reasons, we sometimes want to use alternate representations for some objects such as numbers. The majority of operations cannot exploit alternate representations, and would only be complicated if they had to be able to convert alternate representations into descriptors. A template can require an operand to be a descriptor by constraining the operand to be of type T.

A TN can only represent a single value, so we bare the implementation of MVs at this point. When we know the number of multiple values being handled, we use multiple TNs to hold them. When the number of values is actually unknown, we use a convention that is compatible with full function call.

Everything that is done is done by a VOP in VMR. Calls to simple primitive functions such as + and CAR are translated to VOP equivalents by a table-driven mechanism. This translation is specified by the particular VM definition; VMR conversion makes no assumptions about which operations are primitive or what operand types are worth special-casing. The default calling mechanisms and other miscellaneous builtin features are implemented using standard VOPs that must be implemented by each VM.

Type information can be forgotten after VMR conversion, since all type-specific operation selections have been made.

Simple type checking is explicitly done using CHECK-xxx VOPs. They act like innocuous effectless/unaffected VOPs which return the checked thing as a result. This allows loop-invariant optimization and common subexpression elimination to remove redundant checks. All type checking is done at the time the continuation is used.

Note that we need only check asserted types, since if type inference works, the derived types will also be satisfied. We can check whichever is more convenient, since both should be true.

Constants are turned into special Constant TNs, which are wired down in a SC that is determined by their type. The VM definition provides a function that returns a constant TN to represent a Constant Leaf.

Each component has a constant pool. There is a register dedicated to holding the constant pool for the current component. The back end allocates non-immediate constants in the constant pool when it discovers them during translation from ICR.

[### Check that we are describing what is actually implemented. But this really isn’t very good in the presence of interesting unboxed representations...] Since LTN only deals with values from the viewpoint of the receiver, we must be prepared during the translation pass to do stuff to the continuation at the time it is used. – If a VOP yields more values than are desired, then we must create TNs to hold the discarded results. An important special-case is continuations whose value is discarded. These continuations won’t be annotated at all. In the case of a Ref, we can simply skip evaluation of the reference when the continuation hasn’t been annotated. Although this will eliminate bogus references that for some reason weren’t optimized away, the real purpose is to handle deferred references. – If a VOP yields fewer values than desired, then we must default the extra values to NIL. – If a continuation has its type-check flag set, then we must check the type of the value before moving it into the result location. In general, this requires computing the result in a temporary, and having the type-check operation deliver it in the actual result location. – If the template’s result type is T, then we must generate a boxed temporary to compute the result in when the continuation’s type isn’t T.

We may also need to do stuff to the arguments when we generate code for a template. If an argument continuation isn’t annotated, then it must be a deferred reference. We use the leaf’s TN instead. We may have to do any of the above use-time actions also. Alternatively, we could avoid hair by not deferring references that must be type-checked or may need to be boxed.