Stack analysis (Design of CMU Common Lisp)

17.2 Stack analysis

Think of this as a lifetime problem: a values generator is a write and a values receiver is a read. We want to annotate each VMR-Block with the unknown-values continuations that are live at that point. If we do a control transfer to a place where fewer continuations are live, then we must deallocate the newly dead continuations.

We want to convince ourselves that values deallocation based on lifetime analysis actually works. In particular, we need to be sure that it doesn’t violate the required stack discipline. It is clear that it is impossible to deallocate the values before they become dead, since later code may decide to use them. So the only thing we need to ensure is that the “right” time isn’t later than the time that the continuation becomes dead.

The only reason why we couldn’t deallocate continuation A as soon as it becomes dead would be that there is another continuation B on top of it that isn’t dead (since we can only deallocate the topmost continuation).

The key to understanding why this can’t happen is that each continuation has only one read (receiver). If B is on top of A, then it must be the case that A is live at the receiver for B. This means that it is impossible for B to be live without A being live.

The reason that we don’t solve this problem using a normal iterative flow analysis is that we also need to know the ordering of the continuations on the stack so that we can do deallocation. When it comes time to discard values, we want to know which discarded continuation is on the bottom so that we can reset SP to its start.

[I suppose we could also decrement SP by the aggregate size of the discarded continuations.] Another advantage of knowing the order in which we expect continuations to be on the stack is that it allows us to do some consistency checking. Also doing a localized graph walk around the values-receiver is likely to be much more efficient than doing an iterative flow analysis problem over all the code in the component (not that big a consideration.)

#| Actually, what we do is a backward graph walk from each unknown-values receiver. As we go, we mark each walked block with the ordered list of continuations we believe are on the stack. Starting with an empty stack, we: – When we encounter another unknown-values receiver, we push that continuation on our simulated stack. – When we encounter a receiver (which had better be for the topmost continuation), we pop that continuation. – When we pop all continuations, we terminate our walk.

[### not quite right... It seems we may run into “dead values” during the graph walk too. It seems that we have to check if the pushed continuation is on stack top, and if not, add it to the ending stack so that the post-pass will discard it.]

[### Also, we can’t terminate our walk just because we hit a block previously walked. We have to compare the End-Stack with the values received along the current path: if we have more values on our current walk than on the walk that last touched the block, then we need to re-walk the subgraph reachable from that block, using our larger set of continuations. It seems that our actual termination condition is reaching a block whose End-Stack is already EQ to our current stack.]

If at the start, the block containing the values receiver has already been walked, we skip the walk for that continuation, since it has already been handled by an enclosing values receiver. Once a walk has started, we ignore any signs of a previous walk, clobbering the old result with our own, since we enclose that continuation, and the previous walk doesn’t take into consideration the fact that our values block underlies its own.

When we are done, we have annotated each block with the stack current both at the beginning and at the end of that block. Blocks that aren’t walked don’t have anything on the stack either place (although they may hack MVs internally).

We then scan all the blocks in the component, looking for blocks that have predecessors with a different ending stack than that block’s starting stack. (The starting stack had better be a tail of the predecessor’s ending stack.) We insert a block intervening between all of these predecessors that sets SP to the end of the values for the continuation that should be on stack top. Of course, this pass needn’t be done if there aren’t any global unknown MVs.

Also, if we find any block that wasn’t reached during the walk, but that USEs an outside unknown-values continuation, then we know that the DEST can’t be reached from this point, so the values are unused. We either insert code to pop the values, or somehow mark the code to prevent the values from ever being pushed. (We could cause the popping to be done by the normal pass if we iterated over the pushes beforehand, assigning a correct END-STACK.)

[### But I think that we have to be a bit clever within blocks, given the possibility of blocks being joined. We could collect some unknown MVs in a block, then do a control transfer out of the receiver, and this control transfer could be squeezed out by merging blocks. How about:

    (tagbody
      (return
       (multiple-value-prog1 (foo)
	 (when bar
	   (go UNWIND))))

     UNWIND
      (return
       (multiple-value-prog1 (baz)
	 bletch)))

But the problem doesn’t happen here (can’t happen in general?) since a node buried within a block can’t use a continuation outside of the block. In fact, no block can have more then one PUSH continuation, and this must always be the last continuation. So it is trivially (structurally) true that all pops come before any push.

[### But not really: the DEST of an embedded continuation may be outside the block. There can be multiple pushes, and we must find them by iterating over the uses of MV receivers in LTN. But it would be hard to get the order right this way. We could easily get the order right if we added the generators as we saw the uses, except that we can’t guarantee that the continuations will be annotated at that point. (Actually, I think we only need the order for consistency checks, but that is probably worthwhile). I guess the thing to do is when we process the receiver, add the generator blocks to the Values-Generators, then do a post-pass that re-scans the blocks adding the pushes.]

I believe that above concern with a dead use getting mashed inside a block can’t happen, since the use inside the block must be the only use, and if the use isn’t reachable from the push, then the use is totally unreachable, and should have been deleted, which would prevent it from ever being annotated. ] ] |#

We find the partial ordering of the values globs for unknown values continuations in each environment. We don’t have to scan the code looking for unknown values continuations since LTN annotates each block with the continuations that were popped and not pushed or pushed and not popped. This is all we need to do the inter-block analysis.

After we have found out what stuff is on the stack at each block boundary, we look for blocks with predecessors that have junk on the stack. For each such block, we introduce a new block containing code to restore the stack pointer. Since unknown-values continuations are represented as <start, count>, we can easily pop a continuation using the Start TN.

Note that there is only doubt about how much stuff is on the control stack, since only it is used for unknown values. Any special stacks such as number stacks will always have a fixed allocation.