USING: help kernel math parser words ;
ARTICLE: "syntax" "Syntax"
"In Factor, an " { $emphasis "object" } " is a piece of data that can be identified. Code is data, so Factor syntax is actually a syntax for describing objects, of which code is a special case. Factor syntax is read by the parser. The parser performs two kinds of tasks: it creates objects from their " { $emphasis "printed representations" } ", and it adds " { $emphasis "word definitions" } " to the dictionary (see " { $link "words" } "). The parser can be extended (see " { $link "parser" } ")."
{ $subsection "parser-algorithm" }
{ $subsection "vocabulary-search" }
{ $subsection "syntax-comments" }
{ $subsection "syntax-literals" } ;
ARTICLE: "parser-algorithm" "Parser algorithm"
"At the most abstract level, Factor syntax consists of whitespace-separated tokens. The parser tokenizes the input on whitespace boundaries. The parser is case-sensitive and whitespace between tokens is significant, so the following three expressions tokenize differently:"
{ $code "2X+\n2 X +\n2 x +" }
"As the parser reads tokens it makes a distinction between numbers, ordinary words, and parsing words. Tokens are appended to the parse tree, the top level of which is a list returned by the original parser invocation. Nested levels of the parse tree are created by parsing words."
"The parser iterates through the input text, checking each character in turn. Here is the parser algorithm in more detail; some of the concepts therein will be defined shortly:"
{ $list
    { "If the current character is a double-quote (\"), the " { $link POSTPONE: " } " parsing word is executed, causing a string to be read." }
    {
        "Otherwise, the next token is taken from the input. The parser searches for a word named by the token in the currently used set of vocabularies. If the word is found, one of the following two actions is taken:"
        { $list
            "If the word is an ordinary word, it is appended to the parse tree."
            "If the word is a parsing word, it is executed."
        }
    }
    "Otherwise, if the token does not represent a known word, the parser attempts to parse it as a number. If the token is a number, the number object is added to the parse tree. Otherwise, an error is raised and parsing halts."
}
"There is one exception to the above process: the parser might be placed in " { $emphasis "string mode" } ", in which case it simply reads tokens and appends them to the parse tree as strings. String mode is activated and deactivated by certain parsing words wishing to read input in an unstructured but tokenized manner (see " { $link "string-mode" } ")."
"Parsing words play a key role in parsing; while ordinary words and numbers are simply added to the parse tree, parsing words execute in the context of the parser, and can do their own parsing and create nested data structures in the parse tree. Parsing words are also able to define new words."
"While parsing words supporting arbitrary syntax can be defined, the default set is found in the " { $vocab-link "syntax" } " vocabulary and provides the basis for all further syntactic interaction with Factor." ;
ARTICLE: "vocabulary-search" "Vocabulary search"
"A " { $emphasis "word" } " is a code definition identified by a name. Words are organized into " { $emphasis "vocabularies" } ". Words are discussed in depth in " { $link "words" } "."
"When the parser reads a token, it attempts to look up a word named by that token. The lookup is performed by searching each vocabulary in the search path, in order."
"Due to the way the parser works, words cannot be referenced before they are defined; that is, source files must order definitions in a strictly bottom-up fashion. Use the " { $link POSTPONE: DEFER: } " parsing word to get around this limitation, for example when defining mutually-recursive words."
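"A minimal sketch of such a forward reference follows. The words " { $snippet "my-even?" } " and " { $snippet "my-odd?" } " are hypothetical names chosen for illustration, and the example assumes that the " { $vocab-link "kernel" } " and " { $vocab-link "math" } " vocabularies supply the words used in the definitions:"
{ $code
    "USING: kernel math ;"
    ""
    "DEFER: my-odd?"
    ""
    ": my-even? ( n -- ? )"
    "    #! my-odd? is only deferred at this point, not yet defined."
    "    dup 0 = [ drop t ] [ 1 - my-odd? ] if ;"
    ""
    ": my-odd? ( n -- ? )"
    "    #! my-even? is already defined, so no DEFER: is needed for it."
    "    dup 0 = [ drop f ] [ 1 - my-even? ] if ;"
}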
"For a source file, the vocabulary search path starts off with two vocabularies:"
{ $code "syntax\nscratchpad" }
"The " { $vocab-link "syntax" } " vocabulary consists of a set of parsing words for reading Factor data and defining new words. The " { $vocab-link "scratchpad" } " vocabulary is the default vocabulary for new word definitions."
"At the interactive listener, the default search path contains many more vocabularies. Details on the default search path and parser invocation are found in " { $link "parser" } "."
"Three parsing words deal with the vocabulary search path:"
{ $subsection POSTPONE: USE: }
{ $subsection POSTPONE: USING: }
{ $subsection POSTPONE: IN: }
"Here is an example demonstrating the vocabulary search path. If you can understand this example, then you have grasped vocabularies."
{ $code
    "IN: foe"
    "USING: sequences io ;"
    ""
    ": append"
    "    #! Prints a message, then calls sequences::append."
    "    \"foe::append calls sequences::append\" print append ;"
    ""
    "IN: fee"
    ""
    ": append"
    "    #! Loops, calling fee::append."
    "    \"fee::append calls fee::append\" print append ;"
    ""
    "USE: foe"
    ""
    ": append"
    "    #! Redefining fee::append to call foe::append."
    "    \"fee::append calls foe::append\" print append ;"
    ""
    "\"1234\" \"5678\" append print"
}
"When placed in a source file and run, the above code produces the following output:"
{ $code
    "fee::append calls foe::append"
    "foe::append calls sequences::append"
    "12345678"
} ;
ARTICLE: "syntax-comments" "Comments"
{ $subsection POSTPONE: ! }
{ $subsection POSTPONE: #! } ;
ARTICLE: "syntax-literals" "Literals"
"Many different types of objects can be constructed at parse time via literal syntax. Numbers are a special case since support for reading them is built into the parser. All other literals are constructed via parsing words."
"If a quotation contains a literal object, the same literal object instance is used each time the quotation executes; that is, literals are ``live''."
"Using mutable object literals in word definitions requires care: if those objects are mutated, the actual word definition changes, which in most cases is not what you would expect."
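"A small sketch of this pitfall follows. The word " { $snippet "my-numbers" } " is a hypothetical name, and the example assumes that " { $snippet "push" } " from the " { $vocab-link "sequences" } " vocabulary takes an element and a sequence, in that order:"
{ $code
    "USING: sequences ;"
    ""
    ": my-numbers ( -- vector )"
    "    #! Returns the same vector instance on every call."
    "    V{ } ;"
    ""
    "1 my-numbers push"
    "2 my-numbers push"
    ""
    "#! my-numbers now returns a vector holding 1 and 2,"
    "#! because the literal inside the definition was mutated."
}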
{ $subsection "syntax-numbers" }
{ $subsection "syntax-words" }
{ $subsection "syntax-booleans" }
{ $subsection "syntax-quots" }
{ $subsection "syntax-arrays" }
{ $subsection "syntax-vectors" }
{ $subsection "syntax-strings" }
{ $subsection "syntax-sbufs" }
{ $subsection "syntax-hashtables" }
{ $subsection "syntax-tuples" }
{ $subsection "syntax-aliens" } ;
ARTICLE: "syntax-numbers" "Number syntax"
"If a vocabulary lookup of a token fails, the parser attempts to parse it as a number."
{ $subsection "syntax-integers" }
{ $subsection "syntax-ratios" }
{ $subsection "syntax-floats" }
{ $subsection "syntax-complex-numbers" } ;
ARTICLE: "syntax-integers" "Integer syntax"
"The printed representation of an integer consists of a sequence of digits, optionally prefixed by a sign."
{ $code "2432902008176640000" }
"Integers are entered in base 10 unless prefixed with a base change parsing word."
{ $subsection POSTPONE: BIN: }
{ $subsection POSTPONE: OCT: }
{ $subsection POSTPONE: HEX: }
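"As an illustration, assuming each base change word reads the token that follows it in the named base, the following three lines all denote the integer 255:"
{ $code
    "BIN: 11111111"
    "OCT: 377"
    "HEX: ff"
}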
"More information on integers can be found in " { $link "integers" } "." ;
ARTICLE: "syntax-ratios" "Ratio syntax"
"The printed representation of a ratio is a pair of integers separated by a slash (/). No intermediate whitespace is permitted. Either integer may be signed; however, the ratio is normalized into a form where the denominator is positive and the greatest common divisor of the two terms is 1."
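"For example, under the normalization rule just described, the first two lines below would read as the ratios 1/2 and 5/6, while the third is already in normalized form:"
{ $code
    "2/4"
    "-5/-6"
    "1/10"
}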
"More information on ratios can be found in " { $link "rationals" } "." ;
ARTICLE: "syntax-floats" "Float syntax"
"The printed representation of a floating point number consists of a mantissa with an optional decimal part, followed by an optional exponent. Both the mantissa and the exponent may be prefixed by a sign."
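"A few illustrative literals follow, assuming the conventional " { $snippet "e" } " notation introduces the exponent:"
{ $code
    "10.5"
    "-3.1456"
    "7e13"
    "1e-5"
}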
"More information on floats can be found in " { $link "floats" } "." ;
ARTICLE: "syntax-complex-numbers" "Complex number syntax"
"A complex number is given by two components, a ``real'' part and an ``imaginary'' part. Each component must be an integer, a ratio, or a float."
{ $code
    "C{ 1/2 1/3 }   ! the complex number 1/2+1/3i"
    "C{ 0 1 }       ! the imaginary unit"
}
"More information on complex numbers can be found in " { $link "complex-numbers" } "." ;
ARTICLE: "syntax-words" "Word syntax"
"A word occurring inside a quotation is executed when the quotation is called. Sometimes a word needs to be pushed on the data stack instead. The canonical use cases for this are passing the word to the " { $link execute } " combinator and reflectively accessing word properties (" { $link "word-props" } ")."
{ $subsection POSTPONE: \ }
{ $subsection POSTPONE: POSTPONE: }
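"As a small sketch, entered at the interactive listener, the following pushes the word " { $snippet "+" } " on the data stack rather than calling it, and then runs it with " { $link execute } ", which should leave 5 on the stack:"
{ $code "2 3 \\ + execute" }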
"The implementation of the " { $link POSTPONE: \ } " word is discussed in detail in " { $link "reading-ahead" } ". Words are documented in " { $link "words" } "." ;
ARTICLE: "syntax-booleans" "Boolean syntax"
"Any Factor object may be used as a truth value in a conditional expression. The " { $link f } " object is false and anything else is true. The " { $link f } " object is also used to represent the empty list, as well as the concept of a missing value. The canonical truth value is the " { $link t } " object."
{ $subsection POSTPONE: f }
{ $subsection POSTPONE: t } ;
ARTICLE: "syntax-strings" "Character and string syntax"
"Factor has no distinct character type; however, the integer code of a Unicode character can be read by specifying a literal character, or an escaped representation thereof."
{ $subsection POSTPONE: CHAR: }
{ $subsection POSTPONE: " }
{ $subsection "escape" }
"Strings are documented in " { $link "strings" } "." ;
ARTICLE: "escape" "Character escape codes"
{ $table
    { "Escape code" "Meaning" }
    { { $snippet "\\\\" } { $snippet "\\" } }
    { { $snippet "\\s" } "a space" }
    { { $snippet "\\t" } "a tab" }
    { { $snippet "\\n" } "a newline" }
    { { $snippet "\\r" } "a carriage return" }
    { { $snippet "\\0" } "a null byte (ASCII 0)" }
    { { $snippet "\\e" } "escape (ASCII 27)" }
    { { $snippet "\\\"" } { $snippet "\"" } }
}
"A Unicode character can be specified by its code number by writing " { $snippet "\\u" } " followed by a four-digit hexadecimal number. That is, the following two expressions are equivalent:"
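"For instance, taking the capital letter A as an illustrative choice, whose code number is 65 (hexadecimal 0041), the escaped character literal and the plain integer would be written as:"
{ $code
    "CHAR: \\u0041"
    "65"
}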
"While not useful for single characters, this syntax is also permitted inside strings." ;
ARTICLE: "syntax-sbufs" "String buffer syntax"
{ $subsection POSTPONE: SBUF" }
"String buffers are documented in " { $link "sbufs" } "." ;
ARTICLE: "syntax-arrays" "Array syntax"
{ $subsection POSTPONE: { }
{ $subsection POSTPONE: } }
"Arrays are documented in " { $link "arrays" } "." ;
ARTICLE: "syntax-vectors" "Vector syntax"
{ $subsection POSTPONE: V{ }
{ $subsection POSTPONE: } }
"Vectors are documented in " { $link "vectors" } "." ;
ARTICLE: "syntax-hashtables" "Hashtable syntax"
{ $subsection POSTPONE: H{ }
{ $subsection POSTPONE: } }
"Hashtables are documented in " { $link "hashtables" } "." ;
ARTICLE: "syntax-tuples" "Tuple syntax"
{ $subsection POSTPONE: T{ }
{ $subsection POSTPONE: } }
"Tuples are documented in " { $link "tuples" } "." ;
ARTICLE: "syntax-quots" "Quotation syntax"
{ $subsection POSTPONE: [ }
{ $subsection POSTPONE: ] }
"Quotations are documented in " { $link "quotations" } "." ;
ARTICLE: "syntax-aliens" "Alien object syntax"
"These literal forms mainly exist for print-outs, and should not be input unless you know what you are doing."
{ $subsection POSTPONE: DLL" }
{ $subsection POSTPONE: ALIEN: }
"The alien interface is documented in " { $link "alien" } "." ;