13.4.1 External Formats

Users may write their own external formats. It is probably easiest to look at existing external formats to see how to do this.

An external format basically needs two functions: octets-to-code to convert octets to Unicode codepoints and code-to-octets to convert Unicode codepoints to octets. The external format is defined using the macro stream::define-external-format.

Macro: stream:define-external-format name (&key :base :min :max :size :documentation) (&rest slots) &optional octets-to-code code-to-octets flush-state copy-state

If :base is not given, this defines a new external format named name. :min and :max are the minimum and maximum number of octets that make up a character. (:size n is shorthand for :min n :max n.) The description of the external format can be given using :documentation. The arguments octets-to-code and code-to-octets are not optional in this case. They specify how to convert octets to codepoints and vice versa, respectively. Each should be a backquoted form for the body of a function that does the conversion. See the descriptions of these macros below. Good examples are the external formats for :utf-8 and :utf-16. The slots argument is a list of read-only slots, similar to defstruct. The slot names are available as local variables inside the code-to-octets and octets-to-code bodies.

If :base is given, then an external format named name is defined that is based on the previously defined format given by :base. The slots are inherited from the :base format by default, although the definition may alter their values and add new slots. See, for example, the :mac-greek external format.
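To make the idea of a "backquoted form for the body of a function" more concrete, here is a minimal sketch in portable Common Lisp. It does not use define-external-format; the names identity-decoder-body and *decode-one* are hypothetical and exist only for this illustration. It shows how a body template with a spliced-in input form can be turned into a function.

```lisp
;; Hypothetical helper, not part of CMUCL: returns the body form for a
;; one-octet-per-character decoder, with the caller's INPUT form spliced in.
(defun identity-decoder-body (input)
  `(let ((octet ,input))
     ;; In an ISO8859-1 style format the octet is the code point.
     octet))

;; Splice in a concrete input form and compile the result into a function.
(defparameter *decode-one*
  (compile nil `(lambda (next-octet)
                  ,(identity-decoder-body '(funcall next-octet)))))

;; Usage: decode the single octet #xE9.
(funcall *decode-one* (lambda () #xE9))
```

The point of the template approach is that the form supplied for input is inserted directly into the generated function body rather than being called through an extra layer of indirection.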

Macro: octets-to-code state input unput error &rest args

This defines a form to be used by an external format to convert octets to a code point. state is a form that the body can use to access the state variable of a stream; it may hold anything that octets-to-code needs. input is a form that returns one octet from the input stream. unput is a form that puts N octets back onto the stream. args is a list of variables that need to be defined for any symbols in the body of the macro.

error controls how errors are handled. If nil, errors are silently ignored and the invalid input is replaced by some suitable replacement character. If non-nil, error is a symbol or function that is called to handle the error. This function takes three arguments: a message string, the invalid octet (or nil), and a count of the number of octets that have been read so far. If the function returns, its return value should be the codepoint of the desired replacement character.
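The roles of input, unput, and error can be sketched in portable Common Lisp as follows. This is an illustration only, not CMUCL code: here input and unput are closures rather than forms, the names decode-utf8-short and make-octet-source are hypothetical, only 1- and 2-octet UTF-8 sequences are handled, end of input is not handled, and U+FFFD is assumed as the default replacement.

```lisp
;; Portable sketch, not CMUCL code. INPUT returns the next octet, UNPUT
;; pushes back N octets, ON-ERROR plays the role of the error argument.
(defun decode-utf8-short (input unput on-error)
  "Decode one 1- or 2-octet UTF-8 sequence and return its code point."
  (let ((b0 (funcall input)))
    (cond ((< b0 #x80) b0)                     ; single-octet ASCII
          ((<= #xC2 b0 #xDF)                   ; lead octet of a 2-octet sequence
           (let ((b1 (funcall input)))
             (cond ((= (logand b1 #xC0) #x80)  ; valid continuation octet
                    (logior (ash (logand b0 #x1F) 6) (logand b1 #x3F)))
                   (t
                    (funcall unput 1)          ; let the next call re-read b1
                    (if on-error
                        (funcall on-error "Invalid continuation octet" b1 2)
                        #xFFFD)))))
          ;; Any other octet, including lead octets of longer sequences,
          ;; which this sketch does not support.
          (t (if on-error
                 (funcall on-error "Invalid lead octet" b0 1)
                 #xFFFD)))))

;; Hypothetical driver: serve octets from a list, with pushback.
(defun make-octet-source (octets)
  (let ((pos 0))
    (values (lambda () (prog1 (nth pos octets) (incf pos)))
            (lambda (n) (decf pos n)))))

;; Usage: decode the octets C3 A9 41 as two characters.
(multiple-value-bind (input unput) (make-octet-source '(#xC3 #xA9 #x41))
  (list (decode-utf8-short input unput nil)
        (decode-utf8-short input unput nil)))   ; => (233 65)
```

Putting the offending octet back with unput means a malformed sequence costs only one replacement character and the following octet is still decoded normally.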

Macro: code-to-octets code state output error &rest args

Defines a form to be used by the external format to convert a code point to octets for output. code is the code point to be converted. state is a form to access the current value of the stream’s state variable. output is a form that writes one octet to the output stream.

Similar to octets-to-code, error indicates how errors should be handled. If nil, some default replacement character is substituted. If non-nil, error should be a symbol or function. This function takes two arguments: a message string and the invalid codepoint. If the function returns, its return value should be the codepoint that will be substituted for the invalid codepoint.
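The output direction can be sketched the same way. The following portable Common Lisp function is a hypothetical illustration, not CMUCL code: encode-latin1 is an invented name, output is a closure rather than a form, and a question mark (#x3F) is assumed as the default replacement. The handler's return value is assumed to be a code point that the format can encode.

```lisp
;; Portable sketch, not CMUCL code. OUTPUT writes one octet; ON-ERROR
;; plays the role of the error argument described above.
(defun encode-latin1 (code output on-error)
  "Write CODE as a single ISO8859-1 octet through OUTPUT."
  (funcall output
           (cond ((<= code #xFF) code)          ; directly representable
                 (on-error                      ; ask the handler for a substitute
                  (funcall on-error "Cannot encode code point #x~X" code))
                 (t #x3F))))                    ; default: question mark

;; Usage: the euro sign (U+20AC) cannot be encoded and is replaced.
(let ((octets '()))
  (dolist (code '(#x41 #xE9 #x20AC))
    (encode-latin1 code (lambda (o) (push o octets)) nil))
  (nreverse octets))   ; => (65 233 63)
```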

Macro: flush-state state output error &rest args

Defines a form to be used by the external format to flush out any state when an output stream is closed. Similar to code-to-octets, but there is no code point to be output. The error argument indicates how to handle errors. If nil, some default replacement character is used. Otherwise, error is a symbol or function that will be called with a message string and the codepoint of the offending state. If the function returns, its return value should be the codepoint of a suitable replacement.

If flush-state is nil, then nothing special is needed to flush the state to the output.

This is called only when an output character stream is being closed.
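As an illustration of why flushing may be needed, consider a hypothetical encoder that holds a lead surrogate in its state while waiting for the trail surrogate. If the stream is closed first, the pending value must be dealt with. The sketch below is portable Common Lisp and not CMUCL code: the state layout (a cons whose car is the pending value), the names write-utf8-bmp and flush-pending, and the U+FFFD default are all assumptions made for the example.

```lisp
;; Hypothetical helper: write a BMP code point as UTF-8 through OUTPUT.
(defun write-utf8-bmp (code output)
  (cond ((< code #x80)
         (funcall output code))
        ((< code #x800)
         (funcall output (logior #xC0 (ash code -6)))
         (funcall output (logior #x80 (logand code #x3F))))
        (t
         (funcall output (logior #xE0 (ash code -12)))
         (funcall output (logior #x80 (logand (ash code -6) #x3F)))
         (funcall output (logior #x80 (logand code #x3F))))))

;; Sketch of a flush step. STATE is assumed to be a cons whose car holds
;; a pending lead surrogate, or NIL when nothing is pending.
(defun flush-pending (state output on-error)
  (let ((pending (car state)))
    (when pending
      (setf (car state) nil)                   ; the state is now clean
      (write-utf8-bmp
       (if on-error
           (funcall on-error "Unpaired lead surrogate #x~X at close" pending)
           #xFFFD)                             ; assumed default replacement
       output))))

;; Usage: a stream closed while #xD83D is still pending.
(let ((octets '()))
  (flush-pending (list #xD83D) (lambda (o) (push o octets)) nil)
  (nreverse octets))   ; => (239 191 189), the UTF-8 encoding of U+FFFD
```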

Macro: copy-state state &rest args

Defines a form to copy any state needed by the external format. This should probably be a deep copy so that if the original state is modified, the copy is not.

If not given, then nothing special is needed to copy the state, either because the external format has no state or because no special copier is needed.
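The difference between a shallow and a deep copy of the state can be seen in a small portable Common Lisp sketch. The state layout used here (a list whose first element is itself a list) and the name copy-decoder-state are hypothetical; a real external format may keep its state in any form.

```lisp
;; Hypothetical copier: COPY-TREE copies nested conses, so the copy
;; shares no list structure with the original state.
(defun copy-decoder-state (state)
  (copy-tree state))

;; Usage: modify the original after copying and compare the copies.
(let* ((original (list (list #xD83D) 0))
       (shallow  (copy-list original))          ; copies only the top level
       (deep     (copy-decoder-state original)))
  (setf (first (first original)) nil)           ; mutate the nested part
  ;; The shallow copy sees the change; the deep copy does not.
  (list (first shallow) (first deep)))   ; => ((NIL) (55357))
```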