Inline Expansion (CMUCL User’s Manual)

Next: Byte Coded Compilation, Previous: Block Compilation, Up: Advanced Compiler Use and Efficiency Hints [Contents][Index]

5.8 Inline Expansion ¶

Python can expand almost any function inline, including functions with keyword arguments. The only restrictions are that keyword argument keywords in the call must be constant, and that global function definitions (defun) must be done in a null lexical environment (not nested in a let or other binding form.) Local functions (flet) can be inline expanded in any environment. Combined with Python’s source-level optimization, inline expansion can be used for things that formerly required macros for efficient implementation. In Python, macros don’t have any efficiency advantage, so they need only be used where a macro’s syntactic flexibility is required.

Inline expansion is a compiler optimization technique that reduces the overhead of a function call by simply not doing the call: instead, the compiler effectively rewrites the program to appear as though the definition of the called function was inserted at each call site. In Common Lisp, this is straightforwardly expressed by inserting the lambda corresponding to the original definition:

(proclaim '(inline my-1+))
(defun my-1+ (x) (+ x 1))

(my-1+ someval) ⇒ ((lambda (x) (+ x 1)) someval)

When the function expanded inline is large, the program after inline expansion may be substantially larger than the original program. If the program becomes too large, inline expansion hurts speed rather than helping it, since hardware resources such as physical memory and cache will be exhausted. Inline expansion is called for:

When profiling has shown that a relatively simple function is called so often that a large amount of time is being wasted in the calling of that function (as opposed to running in that function.) If a function is complex, it will take a long time to run relative the time spent in call, so the speed advantage of inline expansion is diminished at the same time the space cost of inline expansion is increased. Of course, if a function is rarely called, then the overhead of calling it is also insignificant.
With functions so simple that they take less space to inline expand than would be taken to call the function (such as my-1+ above.) It would require intimate knowledge of the compiler to be certain when inline expansion would reduce space, but it is generally safe to inline expand functions whose definition is a single function call, or a few calls to simple Common Lisp functions.

In addition to this speed/space tradeoff from inline expansion’s avoidance of the call, inline expansion can also reveal opportunities for optimization. Python’s extensive source-level optimization can make use of context information from the caller to tremendously simplify the code resulting from the inline expansion of a function.

The main form of caller context is local information about the actual argument values: what the argument types are and whether the arguments are constant. Knowledge about argument types can eliminate run-time type tests (e.g., for generic arithmetic.) Constant arguments in a call provide opportunities for constant folding optimization after inline expansion.

A hidden way that constant arguments are often supplied to functions is through the defaulting of unsupplied optional or keyword arguments. There can be a huge efficiency advantage to inline expanding functions that have complex keyword-based interfaces, such as this definition of the member function:

(proclaim '(inline member))
(defun member (item list &key
                    (key #'identity)
                    (test #'eql testp)
                    (test-not nil notp))
  (do ((list list (cdr list)))
      ((null list) nil)
    (let ((car (car list)))
      (if (cond (testp
                 (funcall test item (funcall key car)))
                (notp
                 (not (funcall test-not item (funcall key car))))
                (t
                 (funcall test item (funcall key car))))
          (return list)))))

After inline expansion, this call is simplified to the obvious code:

(member a l :key #'foo-a :test #'char=) ⇒

(do ((list list (cdr list)))
    ((null list) nil)
  (let ((car (car list)))
    (if (char= item (foo-a car))
        (return list))))

In this example, there could easily be more than an order of magnitude improvement in speed. In addition to eliminating the original call to member, inline expansion also allows the calls to char= and foo-a to be open-coded. We go from a loop with three tests and two calls to a loop with one test and no calls.

See source-optimization for more discussion of source level optimization.