Local TN assignment (Design of CMU Common Lisp)

Next: Control optimization, Previous: Global TN assignment, Up: Design of CMU Common Lisp [Contents]

15 Local TN assignment ¶

[Want a different name for this so as not to be confused with the different local/global TN representations. The really interesting stuff in this phase is operation selection, values representation selection, return strategy, etc. Maybe this phase should be conceptually lumped with GTN as “implementation selection”, since GTN determines call strategies and locations.]

[### I guess I believe that it is OK for VMR conversion to dick the ICR flow graph. An alternative would be to give VMR its very own flow graph, but that seems like overkill.

In particular, it would be very nice if a TR local call looked exactly like a jump in VMR. This would allow loop optimizations to be done on loops written as recursions. In addition to making the call block transfer to the head of the function rather than to the return, we would also have to do something about skipping the part of the function prolog that moves arguments from the passing locations, since in a TR call they are already in the right frame.

In addition to directly indicating whether a call should be coded with a TR variant, the Tail-P annotation flags non-call nodes that can directly return the value (an “advanced return”), rather than moving the value to the result continuation and jumping to the return code. Then (according to policy), we can decide to advance all possible returns. If all uses of the result are Tail-P, then LTN can annotate the result continuation as :Unused, inhibiting emission of the default return code.

[### But not really. Now there is a single list of templates, and a given template has only one policy.]

In LTN, we use the :Safe template as a last resort even when the policy is unsafe. Note that we don’t try :Fast-Safe; if this is also a good unsafe template, then it should have the unsafe policies explicitly specified.

With a :Fast-Safe template, the result type must be proven to satisfy the output type assertion. This means that a fast-safe template with a fixnum output type doesn’t need to do fixnum overflow checking. [### Not right to just check against the Node-Derived-Type, since type-check intersects with this.]

It seems that it would be useful to have a kind of template where the args must be checked to be fixnum, but the template checks for overflow and signals an error. In the case where an output assertion is present, this would generate better code than conditionally branching off to make a bignum, and then doing a type check on the result.

How do we deal with deciding whether to do a fixnum overflow check? This is perhaps a more general problem with the interpretation of result type restrictions in templates. It would be useful to be able to discriminate between the case where the result has been proven to be a fixnum and where it has simply been asserted to be so.

The semantics of result type restriction is that the result must be proven to be of that type *except* for safe generators, which are assumed to verify the assertion. That way “is-fixnum” case can be a fast-safe generator and the “should-be-fixnum” case is a safe generator. We could choose not to have a safe “should-be-fixnum” generator, and let the unrestricted safe generator handle it. We would then have to do an explicit type check on the result.

In other words, for all template except Safe, a type restriction on either an argument or result means “this must be true; if it is not the system may break.” In contrast, in a Safe template, the restriction means “If this is not true, I will signal an error.”

Since the node-derived-type only takes into consideration stuff that can be proved from the arguments, we can use the node-derived-type to select fast-safe templates. With unsafe policies, we don’t care, since the code is supposed to be unsafe.

Local TN assignment (LTN) assigns all the TNs needed to represent the values of continuations. This pass scans over the code for the component, examining each continuation and its destination. A number of somewhat unrelated things are also done at the same time so that multiple passes aren’t necessary. – Determine the Primitive-Type for each continuation value and assigns TNs to hold the values. – Use policy information to determine the implementation strategy for each call to a known function. – Clear the type-check flags in continuations whose destinations have safe implementations. – Determine the value-passing strategy for each continuation: known or unknown. – Note usage of unknown-values continuations so that stack analysis can tell when stack values must be discarded.

If safety is more important than speed and space, then we consider generating type checks on the values of nodes whose CONT has the Type-Check flag set. If the destination for the continuation value is safe, then we don’t need to do a check. We assume that all full calls are safe, and use the template information to determine whether inline operations are safe.

This phase is where compiler policy switches have most of their effect. The speed/space/safety tradeoff can determine which of a number of coding strategies are used. It is important to make the policy choice in VMR conversion rather than in code generation because the cost and storage requirement information which drives TNBIND will depend strongly on what actual VOP is chosen. In the case of +/FIXNUM, there might be three or more implementations, some optimized for speed, some for space, etc. Some of these VOPS might be open-coded and some not.

We represent the implementation strategy for a call by either marking it as a full call or annotating it with a “template” representing the open-coding strategy. Templates are selected using a two-way dispatch off of operand primitive-types and policy. The general case of LTN is handled by the LTN-Annotate function in the function-info, but most functions are handled by a table-driven mechanism. There are four different translation policies that a template may have:

Safe: The safest implementation; must do argument type checking.
Small: The (unsafe) smallest implementation.
Fast: The (unsafe) fastest implementation.
Fast-Safe: An implementation optimized for speed, but which does any necessary checks exclusive of argument type checking. Examples are array bounds checks and fixnum overflow checks.

Usually a function will have only one or two distinct templates. Either or both of the safe and fast-safe templates may be omitted; if both are specified, then they should be distinct. If there is no safe template and our policy is safe, then we do a full call.

We use four different coding strategies, depending on the policy:

Safe:: safety $>$ space $>$ speed, or we want to use the fast-safe template, but there isn’t one.
Small:: space $>$ (max speed safety)
Fast:: speed $>$ (max space safety)
Fast-Safe (and type check):: safety $>$ speed $>$ space, or we want to use the safe template, but there isn’t one.

“Space” above is actually the maximum of space and cspeed, under the theory that less code will take less time to generate and assemble. [### This could lose if the smallest case is out-of-line, and must allocate many linkage registers.]