8.7.6 Accessing Lisp Arrays

Due to the way CMUCL manages memory, the amount of memory that can be dynamically allocated by malloc or make-alien is limited (15).

To overcome this limitation, it is possible to access the contents of Lisp arrays, which are limited only by the amount of physical memory and swap space available. However, this technique is only useful if the foreign function takes pointers to memory instead of allocating memory for itself. In the latter case, you will have to modify the foreign function.
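To illustrate the distinction, here is a minimal C sketch. The names scaled_copy and scaled_copy_into are hypothetical and do not come from any library. The first version allocates its own result on the C heap, so it remains subject to the limit described above. The second shows the kind of modification meant here: the caller supplies the output buffer, which could then be the data area of a Lisp array.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical example: allocates its own result with malloc, so the
   result lives on the C heap.  Returns NULL if the allocation fails. */
double* scaled_copy(const double* x, int n, double a)
{
  double* out = malloc((size_t)n * sizeof *out);
  int k;

  if (out == NULL) {
    return NULL;
  }
  for (k = 0; k < n; ++k) {
    out[k] = a * x[k];
  }
  return out;
}

/* Hypothetical modified version: the caller passes in the memory for
   the result, so this function never allocates anything itself. */
void scaled_copy_into(const double* x, double* out, int n, double a)
{
  int k;

  assert(x != NULL && out != NULL);
  for (k = 0; k < n; ++k) {
    out[k] = a * x[k];
  }
}
```

Only a function written like scaled_copy_into can work directly on memory owned by Lisp.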

This technique takes advantage of the fact that CMUCL has specialized array types (see specialized-array-types) that match a typical C array. For example, a (simple-array double-float (100)) is stored in memory in essentially the same way as the C array double x[100] would be. The following function allows us to get the physical address of such a Lisp array:

(defun array-data-address (array)
  "Return the physical address of where the actual data of an array is
stored.

ARRAY must be a specialized array type in CMUCL.  This means ARRAY
must be an array of one of the following types:

                  double-float
                  single-float
                  (unsigned-byte 32)
                  (unsigned-byte 16)
                  (unsigned-byte  8)
                  (signed-byte 32)
                  (signed-byte 16)
                  (signed-byte  8)
"
  (declare (type (or (simple-array (signed-byte 8))
                     (simple-array (signed-byte 16))
                     (simple-array (signed-byte 32))
                     (simple-array (unsigned-byte 8))
                     (simple-array (unsigned-byte 16))
                     (simple-array (unsigned-byte 32))
                     (simple-array single-float)
                     (simple-array double-float)
                     (simple-array (complex single-float))
                     (simple-array (complex double-float)))
                 array)
           (optimize (speed 3) (safety 0))
           (ext:optimize-interface (safety 3)))
  ;; with-array-data will get us to the actual data.  However, because
  ;; the array could have been displaced, we need to know where the
  ;; data starts.
  (lisp::with-array-data ((data array)
                          (start)
                          (end))
    (declare (ignore end))
    ;; DATA is a specialized simple-array.  Memory is laid out like this:
    ;;
    ;;   byte offset    Value
    ;;        0         type code (should be 70 for double-float vector)
    ;;        4         4 * number of elements in vector
    ;;        8         1st element of vector
    ;;      ...         ...
    ;;
    (let ((addr (+ 8 (logandc1 7 (kernel:get-lisp-obj-address data))))
          (type-size
           (let ((type (array-element-type data)))
             (cond ((or (equal type '(signed-byte 8))
                        (equal type '(unsigned-byte 8)))
                    1)
                   ((or (equal type '(signed-byte 16))
                        (equal type '(unsigned-byte 16)))
                    2)
                   ((or (equal type '(signed-byte 32))
                        (equal type '(unsigned-byte 32)))
                    4)
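                   ((equal type 'single-float)
                    4)
                   ((equal type 'double-float)
                    8)
                   ;; The complex types are accepted by the type
                   ;; declaration above, so they need a size as well:
                   ;; two floats per element.
                   ((equal type '(complex single-float))
                    8)
                   ((equal type '(complex double-float))
                    16)
                   (t
                    (error "Unknown specialized array element type"))))))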
      (declare (type (unsigned-byte 32) addr)
               (optimize (speed 3) (safety 0) (ext:inhibit-warnings 3)))
      (system:int-sap (the (unsigned-byte 32)
                        (+ addr (* type-size start)))))))
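The address arithmetic in array-data-address can be restated as a small C sketch. The helper name data_address and the sample addresses used with it are assumptions made purely for illustration; only the arithmetic follows the Lisp code: clear the low three bits of the object address, skip the 8-byte header, then step over start elements.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of the vector header shown in the comment above:
   a 4-byte type code followed by a 4-byte length word. */
#define VECTOR_HEADER_BYTES 8u

/* Hypothetical helper mirroring the arithmetic of array-data-address.
   obj_address plays the role of (kernel:get-lisp-obj-address data). */
uint32_t data_address(uint32_t obj_address, uint32_t type_size,
                      uint32_t start)
{
  /* Same effect as (logandc1 7 obj-address): clear the low three bits. */
  uint32_t base = obj_address & ~(uint32_t)7;

  assert(type_size > 0);
  return base + VECTOR_HEADER_BYTES + type_size * start;
}
```

For instance, with a made-up object address of 0x48000007, a double-float vector (type_size 8) and start 3, the result is 0x48000000 + 8 + 24 = 0x48000020.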

Note, however, that the system function system:vector-sap does the same thing as the function above.

Assume we have the C function below that we wish to use:

  double dotprod(double* x, double* y, int n)
  {
    int k;
    double sum = 0;

    for (k = 0; k < n; ++k) {
      sum += x[k] * y[k];
    }
    return sum;
  }
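Before calling dotprod from Lisp, it can help to check it from C alone. The sketch below is one way to do that; the helper name dotprod_selftest is hypothetical, and dotprod is repeated only so that the block compiles on its own. It fills x with 2.0 and y with 10.0, so the result should be 20 * n.

```c
#include <assert.h>
#include <stdlib.h>

/* Same routine as in the text, repeated so this sketch is self-contained. */
double dotprod(double* x, double* y, int n)
{
  int k;
  double sum = 0;

  for (k = 0; k < n; ++k) {
    sum += x[k] * y[k];
  }
  return sum;
}

/* Hypothetical helper: allocate two n-element arrays, fill them with
   2.0 and 10.0, and return their dot product.  Returns -1.0 if either
   allocation fails. */
double dotprod_selftest(int n)
{
  double* x;
  double* y;
  double result;
  int k;

  assert(n > 0);
  x = malloc((size_t)n * sizeof *x);
  y = malloc((size_t)n * sizeof *y);
  if (x == NULL || y == NULL) {
    free(x);
    free(y);
    return -1.0;
  }
  for (k = 0; k < n; ++k) {
    x[k] = 2.0;
    y[k] = 10.0;
  }
  result = dotprod(x, y, n);
  free(x);
  free(y);
  return result;
}
```

For n = 10000 the helper returns 200000.0, which is also the value to expect from a Lisp call that passes arrays filled with 2d0 and 10d0.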

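The following example generates two arrays in Lisp and calls the C function to do the desired computation. With 10000 double-float elements each, the two arrays need only about 160 KB, but the same code works for much larger arrays: at 1000000 elements each, they would need about 16 MB, which can exceed what malloc or make-alien is able to allocate.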

  (alien:def-alien-routine "dotprod" c-call:double
    (x (* double-float) :in)
    (y (* double-float) :in)
    (n c-call:int :in))
    
  (defun test-dotprod ()
    (let ((x (make-array 10000 :element-type 'double-float
                         :initial-element 2d0))
          (y (make-array 10000 :element-type 'double-float
                         :initial-element 10d0)))
      (sys:without-gcing
        (let ((x-addr (sys:vector-sap x))
              (y-addr (sys:vector-sap y)))
          (dotprod x-addr y-addr 10000)))))

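In this example, we have used sys:vector-sap instead of array-data-address, but we could have used (array-data-address x) as well. As defined above, array-data-address already returns a system area pointer, so no further call to sys:int-sap is needed.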

Also, we have wrapped the inner let expression in sys:without-gcing, which disables garbage collection for the duration of the body. This prevents the garbage collector from moving the x and y arrays after we have obtained their addresses. If the arrays were moved before or during the call to dotprod, the addresses would be erroneous.


Footnotes

(15)

CMUCL mmaps a large piece of memory for its own use and this memory is typically about 256 MB above the start of the C heap. Thus, only about 256 MB of memory can be dynamically allocated. In earlier versions, this limit was closer to 8 MB.