Tuesday, December 4, 2007

extra/delegation: an annotated vocabulary

In order to allow for better compiler optimizations change-slot operations and other improvements, Slava plans to add tuple inheritance to the Factor object system. The exact form of this isn't completely finalized, but it looks like delegation in its current form isn't going to stick around much longer. I proposed a new kind of delegation mechanism, along with other changes to the Factor object system. In the end, a more conservative change was chosen in the form of mixins.

I described the Factor object system before, but I'll review delegation here. Basically, when a generic word dispatches on the tuple, one of two things can happen: either a method is found, and then called, or no method is found. If no method is found, the generic word checks the delegate slot. If that slot is filled (with something that's not f), the method will be re-called on that delegate in place of the original object. For more information on this, see my previous blog post

There are a few problems with delegates. One problem is the slicing problem: when relying on delegation to extend behavior and a method is delegated, the method call knows only about the delegate, not the original object. Sometimes, this is the desired behavior, but other times it can lead to subtle bugs. Another problem is performance. When looking up the value of a direct tuple slot, the required operation is a simple method dispatch and pointer arithmetic. If the compiler knows the tuple class of an object before runtime, it can even eliminate the dispatch step. However, when looking at a delegate slot, there is an additional pointer indirection. Crucially, with the current system, there is no way to avoid the dispatch. Tuple inheritance would require fewer levels of indirection and give more opportunity for optimization. With delegation, method dispatch takes linear time with respect to the length of the delegate chain, and this is bad. A very subtle problem with delegation is that everything is delegated at once, when often, we only need to delegate a couple things.

Yet delegation is still sometimes useful, and there must be a way to fix the slicing problem without breaking everything*. I've written a vocabulary (vocab) called delegate which attempts to fix this. The premise is that this will constitute delegation after it is removed from the core. Central to the vocab is the concept of a group, or a word which signifies a number of methods. There are three builtin types of groups: tuple classes, generic words and protocols. There is a word called group-words which extracts the constituent words from a group:

GENERIC: group-words ( group -- words )

For a generic word, the group consists just of itself

M: generic group-words
1array ;

For a tuple class, the group consists of all of the slot readers and writers, except for the delegate slot. (delegate and set-delegate aren't generic words, anyway)

M: tuple-class group-words
"slots" word-prop 1 tail ! The first slot is the delegate
! 1 tail should be removed when the delegate slot is removed
dup [ slot-spec-reader ] map
swap [ slot-spec-writer ] map append ;

A protocol is an explicitly-defined list of generic words which form a group. The syntax PROTOCOL: protocol-name words... ; is used to define a protocol, and the words it contains are stored in the "protocol-words" word property.

: define-protocol ( wordlist protocol -- )
swap { } like "protocol-words" set-word-prop ;

: PROTOCOL:
CREATE dup reset-generic dup define-symbol
parse-definition swap define-protocol ; parsing

PREDICATE: word protocol "protocol-words" word-prop ;

M: protocol group-words
"protocol-words" word-prop ;

Here's an example of the defintion of the sequence protocol:

PROTOCOL: sequence-protocol
clone clone-like like new new-resizable nth nth-unsafe
set-nth set-nth-unsafe length immutable set-length lengthen ;

The advantage of defining delegation over these groups is that, usually, we don't need to delegate for everything. With this, multiple delegation over disjoint groups is possible. This is in use in the xml.data vocab, to let tags act like their names, their attributes assoc and their children sequence at the same time. It could also be used for multiple tuple inheritance.

One mechanism which replaces delegation is called consultation. Billy Tanksley has suggested that this is what Factor's delegation should be termed; in essence, consultation is the same as delegation, except that it's limited to a single group. Say you have some class foo which implements all generic words in the group bar. There is another class baz, which has a word baz>foo itself into into a foo, or getting a corresponding foo to which to delegate the methods. Then, we could define CONSULT: bar baz baz>foo ; in order to define the delegation. If bar contains the words qux and quux, then the following method definitions will be generated:

M: baz qux baz>foo qux ;
M: baz quux baz>foo quux ;

Here's the implementation:

: spin ( x y z -- z y x ) ! This stack shuffler makes things easier to think about
swap rot ;

: define-consult-method ( word class quot -- )
pick add <method> spin define-method ;

: define-consult ( class group quot -- )
>r group-words r>
swapd [ define-consult-method ] 2curry each ;

: CONSULT:
scan-word scan-word parse-definition swapd define-consult ; parsing

So, this fixes one problem with delegation: that it happens for everything. The second problem, performance, can be solved with increased static typing, for example static typing of tuple slots. But this won't be an available feature of Factor for some time. Until then, if we're sure that a tuple slot is a specific class, we can use declare to let the compiler know. For example,

CONSULT: bar baz
baz>foo { foo } declare ;

There's still one last problem, the slicing problem. A complete solution would involve heavy hackery into the Factor method dispatch system, allowing one object to mimic another. But a simpler solution is to just copy all of the method implementations from one class to another, for the generic words in a specific group. This can be unsafe in some situations, such as when the group is a tuple class, but it can also be useful. I called the system mimicry.

: define-mimic ( group mimicker mimicked -- )
>r >r group-words r> r> [
pick "methods" word-prop at [ 2drop ]
[ method-def <method> spin define-method ] if*
] 2curry each ;

: MIMIC:
scan-word scan-word scan-word define-mimic ; parsing

So there you have it: in fewer than 40 lines, the basis for a replacement of the delegation system, at least partially fixing all three main problems. The code is in my git repository. There are still problems, of course: metadata about what mimics/consults what is not stored, and when you see a generic word or class, you'll see all of the mechanically-defined methods. It's amazing, though, how flexible Factor is to this kind of manipulation and metaprogramming.



* Actually, it's possible to re-implement delegation in Factor, completely eliminating the slicing problem, but at a slight performance cost for tuple slot lookups, and at the cost of breaking a lot of code. The thing is, if instances of tuple class a all delegate to tuple class b, then a instances will return f to b?. To get around this, semi-hacks like the is? word are used.

No comments: