1 #LyX 1.3 created this file. For more info see http://www.lyx.org/
15 \use_numerical_citations 0
16 \paperorientation portrait
19 \paragraph_separation skip
21 \quotes_language english
25 \paperpagestyle headings
29 Factor Developer's Guide
36 \begin_inset LatexCommand \tableofcontents{}
46 Factor is an imperitive programming language with functional and object-oriented
48 Its primary goal is to be used for web-based server-side applications.
49 Factor is interpreted by a virtual machine that provides garbage collection
50 and prohibits pointer arithmetic.
56 Two releases of Factor are available -- a virtual machine written in C,
57 and an interpreter written in Java that runs on the Java virtual machine.
58 This guide targets the C version of Factor.
64 Factor borrows heavily from Forth, Joy and Lisp.
65 From Forth it inherits a flexible syntax defined in terms of
66 \begin_inset Quotes eld
70 \begin_inset Quotes erd
73 and an execution model based on a data stack and call stack.
74 From Joy and Lisp it inherits a virtual machine prohibiting direct pointer
75 arithmetic, and the use of
76 \begin_inset Quotes eld
80 \begin_inset Quotes erd
83 to represent code and data struture.
89 A "word" is the main unit of program organization in Factor -- it corresponds
90 to a "function", "procedure" or "method" in other languages.
93 When code examples are given, the input is in a roman font, and any output
94 from the interpreter is in italics:
98 \begin_inset Quotes eld
102 \begin_inset Quotes erd
116 The stack is used to exchange data between words.
117 When a number is executed, it is pushed on the stack.
118 When a word is executed, it receives input parameters by removing successive
119 elements from the top of the stack.
120 Results are then pushed back to the top of the stack.
128 prints the contents of the stack, leaving the contents of the stack unaffected.
129 The top of the stack is the rightmost element in the printout:
145 removes the object at the top of the stack, and prints it:
168 The usual arithmetic operators
172 all take two parameters from the stack, and push one result back.
173 Where the order of operands matters (
181 ), the operands are taken from the stack in the natural order.
209 This type of arithmetic is called
213 , because the operator follows the operands.
218 notation used in many other languages, so-called because the operator is
219 in-between the two operands.
222 More complicated infix expressions can be translated into postfix by translating
223 the inner-most parts first.
224 Grouping parantheses are never necessary:
227 ! Postfix equivalent of (2 + 3) * 6
238 ! Postfix equivalent of 2 + (3 * 6)
252 New words can be defined in terms of existing words using the
287 When the new word is executed, each one of its factors gets executed, in
289 The comment delimited by
297 is called a stack effect comment and is described later.
298 The stack effect comment, as well as the documentation comment starting
303 are both optional, and can be placed anywhere in the source code, not just
304 in colon definitions.
307 Note that in a source file, a word definition can span multiple lines.
308 However, the interactive interpreter expects each line of input to be
309 \begin_inset Quotes eld
313 \begin_inset Quotes erd
316 , so interactively, colon definitions must be entered all on one line.
319 For example, lets assume we are designing some software for an aircraft
321 Lets assume that internally, all lengths are stored in meters, and all
322 times are stored in seconds.
323 We can define words for converting from kilometers to meters, and hours
324 and minutes to seconds:
327 : kilometers 1000 * ;
360 Now, suppose we need a word that takes the flight time, the aircraft velocity,
361 and the tailwind velocity, and returns the distance travelled.
362 If the parameters are given on the stack in that order, all we do is add
363 the top two elements (aircraft velocity, tailwind velocity) and multiply
364 it by the element underneath (flight time).
365 So the definition looks like this, this time with a stack effect comment
366 since its slightly less obvious what the operands are:
369 : distance ( time aircraft tailwind -- distance ) + * ;
380 Note that we are not using any units here.
381 We could, if we defined some words for velocity units first.
382 The only non-trivial thing here is the implementation of
386 -- we have to divide the
390 velocity by the number of seconds in one hour to get the desired result:
393 : km/hour kilometers 1 hours / ;
396 2 hours 900 km/hour 36 km/hour distance .
407 A stack effect comment contains a description of inputs to the left of
411 , and a description of outputs to the right.
412 As always, the top of the stack is on the right side.
413 Lets try writing a word to compute the cube of a number.
419 I'd use the somewhat simpler example of a word that squares a number, but
420 such a word already exists in the standard library.
435 Three numbers on the stack can be multiplied together using
450 However, the stack effect of
459 We would like to write word that takes
464 To achive this, we need to be able to duplicate the top stack element twice.
465 As it happends, there is a word
469 for precisely this purpose.
470 Now, we are able to define the
496 It is quite often the case that we want to compose two factors in a colon
497 definition, but their stack effects don't
498 \begin_inset Quotes eld
502 \begin_inset Quotes erd
512 for solving precisely this problem.
513 These words are so-called because they simply rearrange stack elements
514 in some fashion, without modifying them in any way.
515 Lets take a look at the most frequently-used shuffle words:
522 Discard the top stack element.
523 Used when a return value is not needed.
530 Duplicate the top stack element.
531 Used when a value is needed more than once.
538 Swap top two stack elements.
539 Used when a word expects parameters in a different order.
544 rot ( x y z -- y z x )
546 Rotate top three stack elements to the left.
551 -rot ( x y z -- z x y )
553 Rotate top three stack elements to the right.
558 over ( x y -- x y x )
560 Bring the second stack element
561 \begin_inset Quotes eld
565 \begin_inset Quotes erd
575 Remove the second stack element.
580 tuck ( x y -- y x y )
582 Tuck the top stack element under the second stack element.
585 You can try all these words out -- push some numbers on the stack, execute
586 a word, and look at how the stack contents was changed using
591 Compare the stack contents with the stack effects above.
594 Note the order of the shuffle word descriptions above.
595 The ones at the top are used most often because they are easy to understand.
596 The more complex ones such as rot should be avoided as possible, because
597 they make the flow of data in a word definition harder to understand.
600 If you find yourself using too many shuffle words, or you're writing a stack
601 effect comment in the middle of a colon definition, it is a good sign that
602 the word should probably be factored into two or more words.
603 Effective factoring is like riding a bicycle -- it is hard at first, but
605 \begin_inset Quotes eld
609 \begin_inset Quotes erd
612 , and writing small, clear and reusable word definitions becomes second-nature.
618 A quotation a list of objects that can be executed.
619 Words that operate on quotations are called
624 Quotations are input using the following syntax:
631 When input, a quotation is not executed immediately -- rather, it becomes
632 one object on the stack.
633 Try evaluating the following:
660 executes the quotation at the top of the stack.
665 with a literal quotation is useless; writing out the elements of the quotation
671 combinator is a building block of more powerful combinators, since quotations
672 can be passed around arbitrarily and even modified before being called.
681 ( cond true false -- )
691 quotations, depending on the boolean value of
696 In Factor, there is no real boolean data type -- instead, a special object
701 is the only object with a
702 \begin_inset Quotes eld
706 \begin_inset Quotes erd
710 Every other object is a boolean
711 \begin_inset Quotes eld
715 \begin_inset Quotes erd
724 \begin_inset Quotes eld
728 \begin_inset Quotes erd
734 Here is an example of
742 \begin_inset Quotes eld
746 \begin_inset Quotes erd
750 \begin_inset Quotes eld
754 \begin_inset Quotes erd
760 Compare the order of operands here, and the order of arguments in the stack
768 That the stack effects of the two
772 branches should be the same.
773 If they differ, the word becomes harder to document and debug.
778 times ( num quot -- )
780 executes a quotation a number of times.
781 It is good style to have the quotation always consume as many values from
782 the stack as it produces.
783 This ensures the stack effect of the entire
787 expression stays constant regardless of the number of iterations.
790 More combinators will be introduced later.
796 The dictionary of words is not a flat list -- rather, it is separated into
802 Each vocabulary is a named list of words that have something in common
804 \begin_inset Quotes eld
808 \begin_inset Quotes erd
811 vocabulary contains words for working with linked lists.
814 When a word is read by the parser, the
816 vocabulary search path
818 determines which vocabularies to search.
819 In the interactive interpreter, the default search path contains a large
820 number of vocabularies.
821 Contrast this to the situation when a file is being parsed -- the search
822 path has a minimal set of vocabularies containing basic parsing words.
828 The rationale here is that the interactive interpreter should have a large
829 number of words available by default, for convinience, whereas source files
830 should specify their external dependencies explicitly.
836 New vocabularies are added to the search path using the
845 \begin_inset Quotes eld
848 /home/slava/.factor-rc
849 \begin_inset Quotes erd
857 ERROR: <interactive>:1: Undefined: exists?
864 \begin_inset Quotes eld
867 /home/slava/.factor-rc
868 \begin_inset Quotes erd
879 How do you know which vocabulary contains a word? Vocabularies can either
881 \begin_inset Quotes eld
885 \begin_inset Quotes erd
888 search can be performed:
896 [ ?run-file boot cli-arg cli-param init-environment
901 init-gc init-interpreter init-scratchpad init-search-path
906 init-stdio init-toplevel parse-command-line parse-switches
911 run-files run-user-init stdin stdout ]
959 New words are defined in the
964 The input vocabulary can be changed at the interactive prompt, or in a
965 source file, using the
976 : random-playlist ...
980 It is a convention (although it is not enforced by the parser) that the
985 directive is the first statement in a source file, and all
989 follow, before any other definitions.
992 PRACTICAL: Numbers game
995 In this section, basic input/output and flow control is introduced.
996 We construct a program that repeatedly prompts the user to guess a number
997 -- they are informed if their guess is correct, too low, or too high.
998 The game ends on a correct guess.
1006 I'm thinking of a number between 0 and 100.
1045 Development methodology
1048 A typical Factor development session involves a text editor
1054 Try jEdit, which has Factor syntax highlighting out of the box.
1057 and Factor interpreter running side by side.
1058 Instead of the edit/compile/run cycle, the development process becomes
1060 \begin_inset Quotes eld
1064 \begin_inset Quotes erd
1067 -- you make some changes to the source file and reload it in the interpreter
1068 using a command like this:
1072 \begin_inset Quotes eld
1076 \begin_inset Quotes erd
1082 Then the changes can be tested, either by hand, or using a test harness.
1083 There is no need to compile anything, or to lose interpreter state by restartin
1085 Additionally, words with
1086 \begin_inset Quotes eld
1090 \begin_inset Quotes erd
1093 definitions that you do not intend to keep can also be entered directly
1094 at this interpreter prompt.
1097 Each word should do one useful task.
1098 New words can be defined in terms of existing, already-tested words.
1099 You design a set of reusable words that model the problem domain.
1100 Then, the problem is solved in terms of a
1102 domain-specific vocabulary
1113 Start a text editor and create a file named
1120 At the top of the file, write a comment.
1121 Comments are a feature that can be found in almost any programming language;
1122 in Factor, they are implemented as parsing words.
1123 An example of commenting follows:
1126 ! The word ! discards input until the end of the line
1129 ( The word ( discards input until the next )
1132 It is always a good idea to comment your code.
1133 Try to write simple code that does not need detailed comments to describe;
1134 similarly, avoid redundant comments.
1135 These two principles are hard to quantify in a concrete way, and will become
1136 more clear as your skills with Factor increase.
1139 We will be defining new words in the numbers-game vocabulary; add an
1143 statement at the top of the source file:
1149 Also in order to be able to test the words, issue a
1153 statement in the interactive interpreter:
1159 This section will develop the numbers game in an incremental fashion.
1160 After each addition, issue a command like the following to load the source
1161 file into the Factor interpreter:
1165 \begin_inset Quotes eld
1169 \begin_inset Quotes erd
1175 Reading a number from the keyboard
1178 A fundamental operation required for the numbers game is to be able to read
1179 a number from the keyboard.
1188 reads a line of input and pushes it on the stack as a string.
1197 takes a string from the stack, and parses it, pushing an integer.
1198 These two words can be combined into a single colon definition:
1201 : read-number ( -- n ) read parse-number ;
1204 You should add this definition to the source file, and try loading the file
1205 into the interpreter.
1206 As you will soon see, this raises an error! The problem is that the two
1215 are not part of the default, minimal, vocabulary search path used when
1217 The solution is to use
1222 to find out which vocabularies contain those words, and add the appropriate
1223 USE: statements to the source file:
1232 After adding the above two statements, the file should now parse, and testing
1233 should confirm that the read-number word works correctly.
1239 There is the possibility of an invalid number being entered at the keyboard.
1248 , the boolean false value.
1249 For the sake of simplicity, we ignore this case in the numbers game example.
1250 However, proper error handling is an essential part of any large program
1251 and is covered later.
1257 Printing some messages
1260 Now we need to make some words for printing various messages.
1261 They are given here without further ado:
1268 \begin_inset Quotes eld
1271 I'm thinking of a number between 0 and 100.
1272 \begin_inset Quotes erd
1279 \begin_inset Quotes eld
1283 \begin_inset Quotes erd
1290 \begin_inset Quotes eld
1294 \begin_inset Quotes erd
1301 \begin_inset Quotes eld
1305 \begin_inset Quotes erd
1312 \begin_inset Quotes eld
1316 \begin_inset Quotes erd
1322 Note that in the above, stack effect comments are omitted, since they are
1323 obvious from context.
1324 You should ensure the words work correctly after loading the source file
1325 into the interpreter.
1328 Taking action based on a guess
1331 The next logical step is to write a word
1335 that takes the user's guess along with the actual number to be guessed,
1336 and prints one of the messages
1349 This word will also push a boolean flag, indicating if the game should
1350 continue or not -- in the case of a correct guess, the game does not continue.
1353 This description of judge-guess is a mouthful -- and it suggests that it
1354 may be best to split it into two words.
1355 So the first word we write handles the more specific case of an
1359 guess -- so it prints either
1370 : inexact-guess ( guess actual -- )
1373 > [ too-high ] [ too-low ] ifte ;
1376 Note that the word gives incorrect output if the two parameters are equal.
1377 However, it will never be called this way.
1380 With this out of the way, the implementation of judge-guess is an easy task
1397 : judge-guess ( actual guess -- ? )
1417 2dup ( x y -- x y x y )
1424 consumes both its parameters, we must make copies of them to pass to
1433 Try the following at the interpreter to see what's going on:
1498 Generating random numbers
1509 pushes a random number in a specified range.
1510 The range is inclusive, so both the minimum and maximum indices are candidate
1517 to determine that this word is in the
1522 For the purposes of this game, random numbers will be in the range of 0
1523 to 100, so we can define a word that generates a random number in the range
1527 : number-to-guess ( -- n ) 0 100 random-int ;
1530 Add the word definition to the source file, along with the appropriate
1535 Load the source file in the interpreter, and confirm that the word functions
1536 correctly, and that its stack effect comment is accurate.
1542 The game loop consists of repeated calls to
1563 , the loop stops, otherwise it continues.
1564 This is realized with a recursive implementation:
1567 : numbers-game-loop ( actual -- )
1570 dup guess-prompt read-number judge-guess [
1585 In Factor, tail-recursive words consume a bounded amount of call stack space.
1586 This means you are free to pick recursion or iteration based on their own
1587 merits when solving a problem.
1588 In many other languages, the usefulness of recursion is severely limited
1589 by the lack of tail-recursive call optimization.
1595 The last task is to combine everything into the main
1600 This is easier than it seems:
1603 : numbers-game number-to-guess numbers-game-loop ;
1606 Try it out! Simply invoke the numbers-game word in the interpreter.
1607 It should work flawlessly, assuming you tested each component of this design
1611 The complete program
1614 ! Numbers game example
1629 : read-number ( -- n ) read parse-number ;
1637 \begin_inset Quotes eld
1640 I'm thinking of a number between 0 and 100.
1641 \begin_inset Quotes erd
1648 \begin_inset Quotes eld
1652 \begin_inset Quotes erd
1659 \begin_inset Quotes eld
1663 \begin_inset Quotes erd
1670 \begin_inset Quotes eld
1674 \begin_inset Quotes erd
1681 \begin_inset Quotes eld
1685 \begin_inset Quotes erd
1692 : inexact-guess ( guess actual -- )
1695 > [ too-high ] [ too-low ] ifte ;
1699 : judge-guess ( actual guess -- ? )
1718 : number-to-guess ( -- n ) 0 100 random-int ;
1722 : numbers-game-loop ( actual -- )
1725 dup guess-prompt read-number judge-guess [
1741 : numbers-game number-to-guess numbers-game-loop ;
1749 A list is composed of a set of pairs; each pair holds a list element, and
1750 a reference to the next pair.
1751 Lists have the following literal syntax:
1755 \begin_inset Quotes eld
1759 \begin_inset Quotes erd
1763 \begin_inset Quotes eld
1767 \begin_inset Quotes erd
1773 Before we continue, it is important to understand the role of data types
1775 Lets make a distinction between two categories of data types:
1778 Representational type -- this refers to the form of the data in the interpreter.
1779 Representational types include integers, strings, and vectors.
1780 Representational types are checked at run time -- attempting to multiply
1781 two strings, for example, will yield an error.
1784 Intentional type -- this refers to the meaning of the data within the problem
1786 This could be a length measured in inches, or a string naming a file, or
1787 a list of objects in a room in a game.
1788 It is up to the programmer to check intentional types -- Factor won't prevent
1789 you from adding two integers representing a distance and a time, even though
1790 the result is meaningless.
1796 It may surprise you that in Factor,
1798 lists are intentional types
1801 This means that they are not an inherent feature of the interpreter; rather,
1802 they are built from a simpler data type, the
1809 A cons cell is an object that holds a reference to two other objects.
1810 The order of the two objects matters -- the first is called the
1814 , the second is called the
1821 All words relating to cons cells and lists are found in the
1844 These infamous names originate from the Lisp language.
1846 \begin_inset Quotes eld
1850 \begin_inset Quotes erd
1854 \begin_inset Quotes eld
1858 \begin_inset Quotes erd
1864 construct and deconstruct cons cells:
1891 The output of the first expression suggests a literal syntax for cons cells:
1903 \begin_inset Quotes eld
1907 \begin_inset Quotes erd
1911 \begin_inset Quotes eld
1915 \begin_inset Quotes erd
1924 \begin_inset Quotes eld
1928 \begin_inset Quotes erd
1935 \begin_inset Quotes eld
1939 \begin_inset Quotes erd
1943 \begin_inset Quotes eld
1947 \begin_inset Quotes erd
1956 \begin_inset Quotes eld
1960 \begin_inset Quotes erd
1966 The last two examples make it clear how nested cons cells represent a list.
1968 \begin_inset Quotes eld
1972 \begin_inset Quotes erd
1975 syntax is extremely cumbersome, the parser provides an easier way:
1978 [ 1 2 3 4 ] cdr cdr car .
1990 is a set of cons cells linked by their cdr.
1995 , or just list, is a generalized list with a cdr equal to f, the list is
2001 is a proper list, and in fact it is equivalent to the empty list
2010 is a generalized list that is not a proper list.
2017 word tests if the object at the top of the stack is a proper list:
2021 \begin_inset Quotes eld
2025 \begin_inset Quotes erd
2037 \begin_inset Quotes eld
2041 \begin_inset Quotes erd
2045 \begin_inset Quotes eld
2049 \begin_inset Quotes erd
2053 \begin_inset Quotes eld
2057 \begin_inset Quotes erd
2069 \begin_inset Quotes eld
2073 \begin_inset Quotes erd
2077 \begin_inset Quotes eld
2081 \begin_inset Quotes erd
2085 \begin_inset Quotes eld
2089 \begin_inset Quotes erd
2103 Unless otherwise documented, list manipulation words expect proper lists
2105 Given an improper list, they will either raise an error, or disregard the
2106 hanging cdr at the end of the list.
2109 Also unless otherwise documented, list manipulation words return newly-created
2111 The original parameters are not modified.
2112 This may seem inefficient, however the absence of side effects makes code
2113 much easier to test and debug.
2119 Side effect-free code is the fundamental idea underlying functional programming
2121 While Factor allows side effects and is not a functional programming language,
2122 for a lot of problems, coding in a functional style gives the most maintanable
2123 and readable results.
2126 Where performance is important, a set of
2127 \begin_inset Quotes eld
2131 \begin_inset Quotes erd
2135 They are documented in the next section.
2140 nth ( index list -- obj )
2142 Look up an element specified by a zero-based index, by successively iterating
2143 down the cdr of the list:
2147 \begin_inset Quotes eld
2151 \begin_inset Quotes erd
2155 \begin_inset Quotes eld
2159 \begin_inset Quotes erd
2163 \begin_inset Quotes eld
2167 \begin_inset Quotes erd
2176 \begin_inset Quotes eld
2180 \begin_inset Quotes erd
2186 This word takes linear time proportional to the list index.
2187 If you need constant time lookups, use a vector instead.
2192 length ( list -- n )
2194 Iterate down the cdr of the list until it reaches
2198 , counting the number of elements in the list:
2201 [ [ 1 2 ] [ 3 4 ] 5 ] length .
2210 \begin_inset Quotes eld
2214 \begin_inset Quotes erd
2227 unit ( obj -- list )
2229 Make a list of one element:
2233 \begin_inset Quotes eld
2237 \begin_inset Quotes erd
2246 \begin_inset Quotes eld
2250 \begin_inset Quotes erd
2258 append ( list list -- list )
2260 Append the two lists at the top of the stack:
2263 [ 1 2 3 ] [ 4 5 6 ] append .
2271 [ 1 2 3 ] dup [ 4 5 6 ] append .s
2276 { [ 1 2 3 ] [ 1 2 3 4 5 6 ] }
2279 The first list is copied, and the cdr of its last cons cell is set to the
2281 The second example above shows that the original parameter was not modified.
2282 Interestingly, if the second parameter is not a proper list,
2286 returns an improper list:
2289 [ 1 2 3 ] 4 append .
2299 add ( list obj -- list )
2301 Create a new list consisting of the original list, and a new element added
2329 appear to have similar effects, they are quite different --
2333 is a very cheap operation, while
2337 has to copy the entire list first! If you need adds to the end to take
2338 a constant time, use a vector.
2343 reverse ( list -- list )
2345 Push a new list which has the same elements as the original one, but in
2349 [ 4 3 2 1 ] reverse .
2359 contains ( obj list -- list )
2365 for an occurrence of an object in a list.
2366 The remainder of the list starting from the first occurrence
2369 If the object does not occur in the list, f is returned:
2372 : lived-in? ( country -- ? )
2376 \begin_inset Quotes eld
2380 \begin_inset Quotes erd
2384 \begin_inset Quotes eld
2388 \begin_inset Quotes erd
2392 \begin_inset Quotes eld
2396 \begin_inset Quotes erd
2400 \begin_inset Quotes eld
2404 \begin_inset Quotes erd
2411 \begin_inset Quotes eld
2415 \begin_inset Quotes erd
2424 \begin_inset Quotes eld
2428 \begin_inset Quotes erd
2432 \begin_inset Quotes eld
2436 \begin_inset Quotes erd
2443 \begin_inset Quotes eld
2447 \begin_inset Quotes erd
2459 \begin_inset Quotes eld
2463 \begin_inset Quotes erd
2467 \begin_inset Quotes eld
2470 contains an object that looks like
2471 \begin_inset Quotes erd
2475 The issue of object equality is covered in the next chapter.
2480 remove ( obj list -- list )
2482 Push a new list, with all occurrences of the object removed.
2483 All other elements are in the same order:
2487 \begin_inset Quotes eld
2491 \begin_inset Quotes erd
2497 [ "Canada" "New Zealand" "Australia" "Russia" ] australia- .
2502 [ "Canada" "New Zealand" "Russia" ]
2507 remove-nth ( index list -- list )
2509 Push a new list, with an index removed:
2513 \begin_inset Quotes eld
2517 \begin_inset Quotes erd
2523 [ "Canada" "New Zealand" "Australia" "Russia" ] australia- .
2528 [ "Canada" "New Zealand" "Russia" ]
2531 XXX: unique, set-nth -- talk about lists as stacks
2541 is one where every element is a cons.
2542 The car of each cons is a name, the cdr is a value.
2543 The literal notation is suggestive:
2550 \begin_inset Quotes eld
2554 \begin_inset Quotes erd
2558 \begin_inset Quotes eld
2562 \begin_inset Quotes erd
2569 \begin_inset Quotes eld
2573 \begin_inset Quotes erd
2577 \begin_inset Quotes eld
2581 \begin_inset Quotes erd
2588 \begin_inset Quotes eld
2592 \begin_inset Quotes eld
2596 \begin_inset Quotes erd
2613 if the object is a list whose every element is a cons; otherwise it returns
2623 assoc ( name alist -- value )
2625 looks for a pair with this name in the list, and pushes the cdr of the
2627 Pushes f if no name with this pair is present.
2628 Note that assoc cannot differentiate between a name that is not present
2629 at all, or a name with a value of
2638 assoc* ( name alist -- [ name | value ] )
2640 looks for a pair with this name, and pushes the pair itself.
2649 returns different values in the cases of a value set to
2653 , or an undefined value.
2658 set-assoc ( value name alist -- alist )
2660 removes any existing occurrence of a name from the list, and adds a new
2662 This creates a new list, the original is unaffected.
2667 acons ( value name alist -- alist )
2669 is slightly faster than
2673 since it simply conses a new pair onto the list.
2674 However, if used repeatedly, the list will grow to contain a lot of
2675 \begin_inset Quotes eld
2679 \begin_inset Quotes erd
2685 Searching an association list incurs a linear time cost, so they should
2686 only be used for small mappings -- a typical use is a mapping of half a
2687 dozen entries or so, specified literally in source.
2688 Hashtables can achieve better performance with larger mappings.
2694 In a traditional language such as C, every iteration or collection must
2695 be written out as a loop, with setting up and updating of idices, etc.
2696 Factor on the other hand relies on combinators and quotations to avoid
2697 duplicating these loop
2698 \begin_inset Quotes eld
2702 \begin_inset Quotes erd
2705 throughout the code.
2708 The simplest case is iterating through each element of a list, and printing
2709 it or otherwise consuming it from the stack.
2714 each ( list quot -- )
2716 pushes each element of the list in turn, and executes the quotation.
2717 The list and quotation are not on the stack when the quotation is executed.
2718 This allows a powerful idiom where the quotation makes a copy of a value
2719 on the stack, and consumes it along with the list element.
2720 In fact, this idiom works with all well-designed combinators.
2726 Later, you will learn how to apply it when designing your own combinators.
2732 The previously-mentioned
2736 word is implemented using
2743 : reverse [ ] swap [ swons ] each ;
2746 To understand how it works, consider that each element of the original list
2747 is consed onto the beginning of a new list, in turn.
2748 So the last element of the original list ends up at the beginning of the
2754 inject ( list quot -- list )
2760 , except the return values of the quotation are collected into the new list.
2761 The quotation must leave one more element on the stack than was present
2762 before the quotation was called, otherwise the combinator will not function
2763 properly; so the quotation must have stack effect
2770 For example, suppose we have a list where each element stores the quantity
2771 of a some nutrient in 100 grams of food; we would like to find out the
2772 total nutrients contained in 300 grams:
2775 : multiply-each ( n list -- list )
2778 [ dupd * ] inject nip ;
2781 3 [ 50 450 101 ] multiply-each .
2793 to discard the original parameter
2800 In case there is no appropriate combinator, recursion can be used.
2801 Factor performs tail call optimization, so a word where the recursive call
2802 is the last thing done will not use an arbitrary amount of stack space.
2807 subset ( list quot -- list )
2809 produces a new list containing some of the elements of the original list.
2810 Which elements to collect is determined by the quotation -- the quotation
2811 is called with each list element on the stack in turn, and those elements
2812 for which the quotation does not return
2816 are added to the new list.
2817 The quotation must have stack effect
2824 For example, lets construct a list of all numbers between 0 and 99 such
2825 that the sum of their digits is less than 10:
2828 : sum-of-digits ( n -- n ) 10 /mod + ;
2831 100 count [ sum-of-digits 10 < ] subset .
2836 [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 20 21
2841 22 23 24 25 26 27 30 31 32 33 34 35 36 40 41 42 43 44
2846 45 50 51 52 53 54 60 61 62 63 70 71 72 80 81 90 ]
2851 all? ( list quot -- ? )
2857 if the quotation returns
2861 for all elements of the list, otherwise it returns
2878 applied to the same list and quotation would return the entire list.
2884 Barring any side effects which modify the execution of the quotation.
2885 It is best to avoid side effects when using list combinators.
2891 For example, the implementation of
2902 : assoc? ( list -- ? )
2905 dup list? [ [ cons? ] all? ] [ drop f ] ifte ;
2911 The list construction words minimize stack noise with a clever trick.
2912 They store a partial list in a variable, thus reducing the number of stack
2913 elements that have to be juggled.
2920 begins list construction.
2927 appends an object to the partial list.
2934 pushes the complete list.
2937 While variables haven't been described yet, keep in mind that a new scope
2947 This means that list constructions can be nested, as long as in the end,
2957 There is no requirement that
2965 appear in the same word, however, debugging becomes prohibitively difficult
2966 when a list construction begins in one word and ends with another.
2969 Here is an example of list construction using this technique:
2972 [, 1 10 [ 2 * dup , ] times drop ,] .
2977 [ 2 4 8 16 32 64 128 256 512 1024 ]
2982 Destructively modifying lists
2985 All previously discussed list modification functions always returned newly-alloc
2987 Destructive list manipulation functions on the other hand reuse the cons
2988 cells of their input lists, and hence avoid memory allocation.
2991 Only ever destructively change lists you do not intend to reuse again.
2992 You should not rely on the side effects -- they are unpredictable.
2993 It is wrong to think that destructive words
2994 \begin_inset Quotes eld
2998 \begin_inset Quotes erd
3001 the original list -- rather, think of them as returning a new list, just
3002 like the normal versions of the words, with the added caveat that the original
3003 list must not be used again.
3008 nreverse ( list -- list )
3010 reverses a list without consing.
3011 In the following example, the return value reuses the cons cells of the
3012 original list, and the original list has been ruined by unpredictable side
3016 [ 1 2 3 4 ] dup nreverse .s
3021 { [ 4 ] [ 4 3 2 1 ] }
3024 Compare the second stack element (which is what remains of the original
3025 list) and the top stack element (the list returned by
3036 word is the most frequently used destructive list manipulator.
3037 The usual idiom is a loop where values are consed onto the beginning of
3038 a list in each iteration of a loop, then the list is reversed at the end.
3039 Since the original list is never used again,
3043 can safely be used here.
3049 nappend ( list list -- list )
3051 sets the cdr of the last cons cell in the first list to the second list,
3052 unless the first list is
3056 , in which case it simply returns the second list.
3057 Again, the side effects on the first list are unpredictable -- if it is
3062 , it is unchanged, otherwise, it is equal to the return value:
3065 [ 1 2 ] [ 3 4 ] nappend .
3073 Note in the above examples, we use literal list parameters to nreverse and
3075 This is actually a very bad idea, since the same literal list may be used
3076 more than once! For example, lets make a colon definition:
3079 : very-bad-idea [ 1 2 3 4 ] nreverse ;
3099 \begin_inset Quotes eld
3103 \begin_inset Quotes erd
3119 As you can see, the word definition itself was ruined!
3122 Sometimes it is desirable make a copy of a list, so that the copy may be
3123 safely side-effected later.
3128 clone-list ( list -- list )
3130 pushes a new list containing the exact same elements as the original.
3131 The elements themselves are not copied.
3134 If you want to write your own destructive list manipulation words, you can
3137 set-car ( value cons -- )
3141 set-cdr ( value cons -- )
3143 to modify individual cons cells.
3144 Some words that are not destructive on their inputs nonetheless create
3145 intermediate lists which are operated on using these words.
3156 A vector is a contiguous chunk of cells which hold references to arbitrary
3158 Vectors have the following literal syntax:
3161 { f f f t t f t t -6
3162 \begin_inset Quotes eld
3166 \begin_inset Quotes erd
3172 Use of vector literals in source code is discouraged, since vector manipulation
3173 relies on side effects rather than return values, and it is very easy to
3174 mess up a literal embedded in a word definition.
3177 Vectors versus lists
3180 Vectors are applicable for a different class of problems than lists.
3181 Compare the relative performance of common operations on vectors and lists:
3185 \begin_inset Tabular
3186 <lyxtabular version="3" rows="4" columns="3">
3188 <column alignment="center" valignment="top" leftline="true" width="0">
3189 <column alignment="center" valignment="top" leftline="true" width="0">
3190 <column alignment="center" valignment="top" leftline="true" rightline="true" width="0">
3191 <row topline="true" bottomline="true">
3192 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3199 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3207 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3216 <row topline="true">
3217 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3222 Random access of an index
3225 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3233 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3242 <row topline="true">
3243 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3248 Add new element at start
3251 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3259 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3268 <row topline="true" bottomline="true">
3269 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3274 Add new element at end
3277 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3285 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3301 When using vectors, you need to pass around a vector and an index -- when
3302 working with lists, often only a list head is passed around.
3303 For this reason, if you need a sequence for iteration only, a list is a
3304 better choice because the list vocabulary contains a rich collection of
3308 On the other hand, when you need to maintain your own
3309 \begin_inset Quotes eld
3313 \begin_inset Quotes erd
3316 -like collection, a vector is the obvious choice, since most pushes and
3317 pops can then avoid allocating memory.
3320 Vectors and lists can be converted back and forth using the
3344 <vector> ( capacity -- vector )
3346 pushes a zero-length vector.
3347 Storing more elements than the initial capacity grows the vector.
3352 vector-nth ( index vector -- obj )
3354 pushes the object stored at a zero-based index of a vector:
3358 \begin_inset Quotes eld
3362 \begin_inset Quotes erd
3366 \begin_inset Quotes eld
3370 \begin_inset Quotes erd
3379 \begin_inset Quotes eld
3383 \begin_inset Quotes erd
3389 2 { 1 2 } vector-nth .
3394 ERROR: Out of bounds
3399 set-vector-nth ( obj index vector -- )
3401 stores a value into a vector:
3415 used in this example will be formally introduced later.
3422 \begin_inset Quotes eld
3426 \begin_inset Quotes erd
3430 \begin_inset Quotes eld
3434 \begin_inset Quotes erd
3438 \begin_inset Quotes eld
3442 \begin_inset Quotes erd
3449 \begin_inset Quotes eld
3453 \begin_inset Quotes erd
3457 \begin_inset Quotes eld
3461 \begin_inset Quotes erd
3468 \begin_inset Quotes eld
3472 \begin_inset Quotes erd
3481 \begin_inset Quotes eld
3485 \begin_inset Quotes erd
3489 \begin_inset Quotes eld
3493 \begin_inset Quotes erd
3500 \begin_inset Quotes eld
3504 \begin_inset Quotes erd
3508 \begin_inset Quotes eld
3512 \begin_inset Quotes erd
3519 \begin_inset Quotes eld
3523 \begin_inset Quotes erd
3532 \begin_inset Quotes eld
3536 \begin_inset Quotes erd
3540 \begin_inset Quotes eld
3544 \begin_inset Quotes erd
3548 \begin_inset Quotes eld
3552 \begin_inset Quotes erd
3560 vector-length ( vector -- length )
3562 pushes the number of elements in a vector.
3563 As the previous two examples demonstrate, attempting to fetch beyond the
3564 end of the vector will raise an error, while storing beyond the end will
3565 grow the vector as necessary.
3570 set-vector-length ( length vector -- )
3573 If the new length is larger than the current length, the vector grows if
3574 necessary, and the new cells are filled with
3583 vector-push ( obj vector -- )
3585 adds an object at the end of the vector.
3586 This increments the vector's length by one.
3591 vector-pop ( vector -- obj )
3593 removes the object at the end of the vector and pushes it.
3594 This decrements the vector's length by one.
3600 vector-each, vector-map
3606 Strings are character vectors
3609 str-nth, str-length, substring, ...
3612 String buffers are mutable
3615 <sbuf>, sbuf-append, sbuf>str
3618 Otherwise like a vector:
3621 sbuf-nth, set-sbuf-nth, sbuf-length, set-sbuf-length
3630 Printing and reading strings
3633 print, write, read, turning a string into a number
3636 PRACTICAL: Contractor timesheet
3642 - print a time difference as hours:minutes
3648 - end work & annotate
3651 - print an invoice, takes hourly rate as a parameter.
3652 do simple formatted output, using 'spaces' and 'pad-string'.
3655 use a vector to store [ annotation | time ] pairs, pass the vector in
3673 PRACTICAL: Music player
3679 Text -> objects - parser, objects -> text - unparser for atoms, prettyprinter
3683 What really is a word -- primitive, parameter, property list.
3686 Call stack how it works and >r/r>
3692 Lets take a closer look at Factor syntax.
3693 Consider a simple expression, and the result of evaluating it in the interactiv
3705 The interactive interpreter is basically an infinite loop.
3706 It reads a line of input from the terminal, parses this line to produce
3711 , and executes the quotation.
3714 In the parse step, the input text is tokenized into a sequence of white
3715 space-separated tokens.
3716 First, the interpreter checks if there is an existing word named by the
3718 If there is no such word, the interpreter instead treats the token as a
3725 Of course, Factor supports a full range of data types, including strings,
3727 Their source representations are still built from numbers and words, however.
3733 Once the expression has been entirely parsed, the interactive interpreter
3737 This parse time/run time distinction is important, because words fall into
3739 \begin_inset Quotes eld
3743 \begin_inset Quotes erd
3747 \begin_inset Quotes eld
3751 \begin_inset Quotes erd
3757 The parser constructs a parse tree from the input text.
3758 When the parser encounters a token representing a number or an ordinary
3759 word, the token is simply appended to the current parse tree node.
3760 A parsing word on the other hand is executed
3764 immediately after being tokenized.
3765 Since it executes in the context of the parser, it has access to the raw
3766 input text, the entire parse tree, and other parser structures.
3769 Parsing words are also defined using colon definitions, except we add
3773 after the terminating
3778 Here are two examples of definitions for words
3786 , both are identical except in the second example,
3790 is defined as a parsing word:
3793 ! Lets define 'foo' as a running word.
3797 \begin_inset Quotes eld
3801 \begin_inset Quotes erd
3808 \begin_inset Quotes eld
3812 \begin_inset Quotes erd
3846 ! Now lets define 'foo' as a parsing word.
3850 \begin_inset Quotes eld
3854 \begin_inset Quotes erd
3861 \begin_inset Quotes eld
3865 \begin_inset Quotes erd
3895 \begin_inset Quotes eld
3900 that denotes a string literal is a parsing word -- it reads characters from
3901 the input text until the next occurrence of
3904 \begin_inset Quotes eld
3909 , and appends this string to the current node of the parse tree.
3910 Note that strings and words are different types of objects.
3911 Strings are covered in great detail later.
3914 PRACTICAL: Infix syntax
3920 Generators, co-routines, multitasking, exception handling
3926 PRACTICAL: Some web app