Monday, January 1, 2007

Designing combinators in Factor

Factor's a great programming lanugage, but one of the more annoying points of Factor is that it can be difficult to write complicated combinators, the Factor/Joy term for higher order functions. Here's one that I wrote recently. See if you can tell what it does:

: map-replace ( seq pred map -- )
pick dup length [ ! ( seq pred map elt i )
>r swap >r rot >r ! ( pred elt | seq map i )
[ swap call ] 2keep rot ! ( pred elt ? | seq map i )
[
r> r> swap >r rot >r ! ( elt map | pred seq i )
[ call ] keep ! ( new-elt map | pred seq i )
r> r> r> rot >r rot >r ! ( new-elt seq i | map pred )
swap [ set-nth ] keep ! ( seq | map pred )
r> r> swap ! ( seq pred map )
] [
drop r> swap r> r> drop ! ( seq map pred )
] if
] 2each 3drop ; inline

I showed this code to Slava Pestov, the creator of Factor, and he didn't understand it. This is ridiculously illegible, and the funny part is, it only has five words that actually *do* anything, and the rest just move stuff around the stack. I put these words in bold and italics.

This code looks complicated but it actually does a pretty simple thing. Given a mutable sequence, this code takes two quotations, a predicate and a map. On each element, the predicate is applied to the element. If the result is not f, the map quotation is run with the element, and that value is put in the place where the element used to be. An example of this is

{ 1 2 3 } dup [ 2 = ] [ 3 + ] map-replace .

which outputs { 1 5 3 }. I wrote this up as part of some code to change XML code in place, as a supplement to xml-map, which makes a new tree entirely.

To simplify this, I wrote a new combinator which takes only one quotation. This quotation is run on the element. If the result is f, nothing happens to the element. But if it is something else, the item is set to the resulting value. This can be written as:

: map-replace2 ( seq quot -- )
over dup length [ ! ( seq quot elt i )
>r rot >r swap [ call ] keep ! ( result quot | seq i )
swap [ ! ( quot result | seq i )
r> r> swap [ set-nth ] keep ! ( quot seq )
] [ ! ( quot | seq i )
r> r> drop ! ( quot seq )
] if* swap
] 2each 2drop ; inline


In the end, it turned out all of this was almost completely pointless. Apparently, the Factor standard library already has a word called inject, which is like map, except it works in place. This is basically what I wanted all along, I was just making the mistake of premature optimization, shying away from setting each element individually because I thought it'd be too costly. inject is defined as

: inject ( seq quot -- )
over length [
[ -rot change-nth ] 3keep
] repeat 2drop ; inline

Using inject, Slava implemented my first map-replace as

: (map-replace) ( pred quot elt -- newelt )
[ -rot slip ] keep rot [ swap call ] [ nip ] if ;

: map-replace ( seq pred quot -- )
rot
[ pick pick >r >r (map-replace) r> r> rot ] inject
2drop ;

but at this point, it looks like like inject will be enough by itself. Now, going back to the original purpose, it is really simple to lift this to operate on XML trees rather than just sequences. xml-inject now joins the existing combinators xml-each xml-map and xml-find, all of which were very simple to implement given the existing parallel combinators on sequences:

GENERIC: (xml-inject) ( quot tag -- ) inline
M: tag (xml-inject)
tag-children [
swap [ call ] keep
swap [ (xml-inject) ] keep
] inject ;
M: object (xml-inject) 2drop ;
M: xml-doc (xml-inject) delegate (xml-inject) ;
: xml-inject ( tag quot -- ) ! quot: tag -- tag
swap (xml-inject) ; inline


Update: Note that in the current version of factor, inject is called change-each. Also, when writing new combinators now, it's very often useful to include a new word called curry, which I described here.

2 comments:

Unknown said...

Doesn't the following word implement map-replace as you want?

: map-replace ( seq pred map -- )
1array append [ when ] append [ dup ] swap append inject ;

Unknown said...

Or even the somewhat cleaner and shorter:

: map-replace ( seq pred map -- )
add \ when add \ dup add* inject ;