2 USING: arrays help kernel kernel-internals sequences-internals
3 strings vectors words ;
5 ARTICLE: "sequence-implementations" "Sequence implementations"
6 "There are two basic types of sequences. Instances of the following two types have fixed length:"
7 { $subsection "arrays" }
8 { $subsection "strings" }
9 "Instances of the following are resizable:"
10 { $subsection "vectors" }
11 { $subsection "sbufs" }
12 "Quotations are special sequences which hold code:"
13 { $subsection "quotations" }
14 "Integers support the sequence protocol:"
15 { $subsection "sequences-integers" }
16 "Virtual sequences wrap an underlying sequence, and changes to the underlying sequence are reflected in the virtual sequence:"
17 { $subsection <reversed> }
18 { $subsection <slice> }
19 { $subsection <column> }
20 "The " { $link f } " object also supports the sequence protocol. It responds with a length of zero, and instead of throwing an out of bounds error, outputs " { $link f } " when an element is accessed. This can simplify code that would like a dummy sequence behaving as if it has arbitrary length." ;
22 ARTICLE: "sequences-integers" "Integer sequences and counted loops"
23 "Integers support the sequence protocol in a trivial fashion; a non-negative integer presents its non-negative predecessors as elements. For example, the integer 3, when viewed as a sequence, contains the elements 0, 1, and 2. This is very useful for performing counted loops."
25 "For example, the " { $link each } " combinator, given an integer, simply calls a quotation that number of times, pushing a counter on each iteration that ranges from 0 up to that integer:"
26 { $example "3 [ . ] each" "0\n1\n2" }
27 "A common idiom is to iterate over a sequence, while also maintaining a loop counter. This can be done using " { $link 2each } ":"
28 { $example "{ \"a\" \"b\" \"c\" } dup length [\n \"Index: \" write . \"Element: \" write .\n] 2each" "Index: 0\nElement: \"a\"\nIndex: 1\nElement: \"b\"\nIndex: 2\nElement: \"c\"" }
29 "Combinators that produce new sequences, such as " { $link map } ", will output an array if the input is an integer." ;
31 ARTICLE: "sequences-access" "Accessing sequence elements"
35 { $subsection second }
37 { $subsection fourth }
38 { $subsection first2 }
39 { $subsection first3 }
40 { $subsection first4 }
42 { $subsection last/first } ;
44 ARTICLE: "sequences-combinators" "Sequence combinators"
47 { $subsection each-with }
48 { $subsection reduce }
49 { $subsection interleave }
51 { $subsection 2reduce }
54 { $subsection map-with }
55 { $subsection accumulate }
58 { $subsection subset }
59 { $subsection subset-with } ;
61 ARTICLE: "sequences-tests" "Testing sequences"
62 "Testing for an empty sequence:"
63 { $subsection empty? }
65 { $subsection bounds-check? }
66 "Testing if a sequence contains an object:"
67 { $subsection member? }
69 "Testing if a sequence contains a subsequence:"
72 { $subsection subseq? }
73 "Testing if a sequence contains elements satisfying a predicate:"
74 { $subsection contains? }
75 { $subsection contains-with? }
77 { $subsection all-with? }
78 "Testing how elements are related:"
79 { $subsection monotonic? }
80 { $subsection all-eq? }
81 { $subsection all-equal? } ;
83 ARTICLE: "sequences-search" "Searching sequences"
84 "Finding the index of an element:"
86 { $subsection index* }
87 { $subsection last-index }
88 { $subsection last-index* }
89 "Finding the start of a subsequence:"
91 { $subsection start* }
92 "Finding the index of an element satisfying a predicate:"
95 { $subsection find-with }
96 { $subsection find-with* }
97 { $subsection find-last }
98 { $subsection find-last-with }
99 { $subsection find-last* }
100 { $subsection find-last-with* } ;
102 ARTICLE: "sequences-join-split" "Joining and splitting sequences"
103 "Forming new sequences from existing sequences:"
104 { $subsection append }
105 { $subsection 3append }
106 { $subsection concat }
108 { $subsection replace-slice }
110 { $subsection subseq }
113 { $subsection head* }
114 { $subsection tail* }
115 "Variants of the above which output a virtual sequence sharing storage with the input sequence:"
116 { $subsection <slice> }
117 { $subsection head-slice }
118 { $subsection tail-slice }
119 { $subsection head-slice* }
120 { $subsection tail-slice* }
121 "Splitting sequences at specific indices:"
124 { $subsection unclip }
125 "Splitting sequences at occurrences of subsequences:"
126 { $subsection ?head }
127 { $subsection ?tail }
128 { $subsection split1 }
129 { $subsection split }
130 { $subsection drop-prefix } ;
132 ARTICLE: "sequences-add-remove" "Adding and removing sequence elements"
137 { $subsection remove }
138 { $subsection remove-nth }
140 { $subsection prune } ;
142 ARTICLE: "sequences-reshape" "Reshaping sequences"
143 { $subsection reverse }
144 { $subsection <reversed> }
145 { $subsection group }
146 { $subsection flatten }
148 { $subsection <column> }
149 { $subsection subst } ;
151 ARTICLE: "sequences-destructive" "Destructive operations"
152 "These words modify their input, instead of creating a new sequence."
154 "In-place variant of " { $link add } ":"
156 "In-place variant of " { $link append } ":"
157 { $subsection nappend }
158 "In-place variant of " { $link remove } ":"
159 { $subsection delete }
160 "In-place variant of " { $link map } ":"
161 { $subsection inject }
162 { $subsection inject-with }
164 { $subsection set-nth }
165 { $subsection change-nth }
166 { $subsection cache-nth }
167 "Other destructive words:"
168 { $subsection exchange }
169 { $subsection push-new }
172 { $subsection delete-all }
173 { $subsection copy-into } ;
175 ARTICLE: "sequences-stacks" "Stack operations"
176 "The classical stack operations, modifying a sequence in place:"
177 { $subsection empty? }
182 "Lazy instantiation of stacks:"
183 { $subsection ?push } ;
185 ARTICLE: "sequences-comparing" "Comparing sequences"
187 { $subsection sequence= }
188 { $subsection mismatch }
189 "Lexicographic order:"
190 { $subsection <=> } ;
192 ARTICLE: "sequences-assoc" "Association lists"
193 "An " { $emphasis "association list" } " is a sequence of pairs. Association lists come up from time to time; for example, the " { $link cond } " combinator takes an association list of quotations as input. You can perform lookups and take association lists apart:"
194 { $subsection assoc }
195 { $subsection rassoc }
196 { $subsection unpair }
197 "An association list is slower to search than a hashtable. The main advantage of an association list is that the elements are ordered; also sometimes it is more convenient to construct an association list with sequence words than to construct a hashtable with hastable words. Most of the time, hashtables are more appropriate. See " { $link "hashtables" } "." ;
199 ARTICLE: "sequence-protocol" "Sequence protocol"
200 "All sequences must know their length, and provide a way to access elements:"
201 { $subsection length }
204 { $subsection set-nth }
205 "Resizable sequences:"
206 { $subsection set-length }
207 "An optional generic word for creating sequences of the same class as a given sequence:"
209 "Another optional generic word for optimization purposes:"
210 { $subsection new } ;
212 ARTICLE: "arrays" "Arrays"
213 "An array is a fixed-size mutable sequence whose elements are stored in a contiguous range of memory. The literal syntax is covered in " { $link "syntax-arrays" } ". Sometimes you need a resizable array -- this is called a vector, and vectors are documented in " { $link "vectors" } "."
215 "Array words are in the " { $vocab-link "arrays" } " vocabulary. Unsafe implementation words are in the " { $vocab-link "kernel-internals" } " vocabulary."
217 "Arrays form a class of objects."
218 { $subsection array }
219 { $subsection array? }
220 "There are several ways to construct arrays."
221 { $subsection >array }
222 { $subsection <array> }
223 { $subsection 1array }
224 { $subsection 2array }
225 { $subsection 3array }
226 { $subsection 4array }
227 "Arrays can be accessed without bounds checks in a pointer unsafe way."
228 { $subsection array-nth }
229 { $subsection set-array-nth } ;
231 ARTICLE: "strings" "Strings"
232 "A string is a fixed-size mutable sequence of characters."
234 "String words are found in the " { $vocab-link "strings" } " vocabulary."
235 { $subsection string? }
236 { $subsection >string }
237 { $subsection <string> }
238 "A pair of words are used to nicely format columns of text."
239 { $subsection pad-left }
240 { $subsection pad-right }
241 "Characters are not a first-class type; they are simply represented as integers between 0 and 65535. A few words operate on characters:"
242 { $subsection blank? }
243 { $subsection letter? }
244 { $subsection LETTER? }
245 { $subsection digit? }
246 { $subsection printable? }
247 { $subsection control? }
248 { $subsection quotable? }
249 { $subsection ch>lower }
250 { $subsection ch>upper } ;
252 ARTICLE: "sbufs" "String buffers"
253 "A string buffer is a resizable mutable sequence of characters. String buffers can be used to construct new strings by accumilating substrings and characters, however usually they are only used indirectly, since the sequence construction words are more convenient to use in most cases (see " { $link "namespaces-make" } ")."
255 "String buffer words are found in the " { $vocab-link "strings" } " vocabulary."
256 { $subsection sbuf? }
257 "Words for creating string buffers:"
258 { $subsection >sbuf }
259 { $subsection <sbuf> }
260 "If you don't care about initial capacity, a more elegant way to create a new string buffer is to write:"
261 { $code "SBUF\" \" clone" } ;
263 ARTICLE: "vectors" "Vectors"
264 "A vector is a resizable mutable sequence of objects. Vector words are found in the " { $vocab-link "vectors" } " vocabulary."
265 { $subsection vector? }
266 "Words for creating vectors:"
267 { $subsection >vector }
268 { $subsection <vector> }
269 "If you don't care about initial capacity, a more elegant way to create a new vector is to write:"
270 { $code "V{ } clone" } ;
272 ARTICLE: "sequences-sorting" "Sorting and binary search"
273 "Sorting and binary search combinators all take comparator quotations with stack effect " { $snippet "( elt1 elt2 -- n )" } " that order the two given elements and output a value whose sign denotes the result:"
275 { "positive - indicates that " { $snippet "elt1" } " follows " { $snippet "elt2" } }
276 { "zero - indicates that " { $snippet "elt1" } " is ordered equivalently to " { $snippet "elt2" } }
277 { "negative - indicates that " { $snippet "elt1" } " precedes " { $snippet "elt2" } }
280 { $subsection nsort }
281 "Sorting elements in a new sequence:"
283 "Common comparators:"
284 { $subsection natural-sort }
285 { $subsection sort-keys }
286 { $subsection sort-values }
288 { $subsection binsearch }
289 { $subsection binsearch* } ;
291 ARTICLE: "sequences-unsafe" "Unsafe sequence operations"
292 "The unsafe sequence protocol bypasses bounds checks for increased performance:"
293 { $subsection nth-unsafe }
294 { $subsection set-nth-unsafe }
295 "These words assume the sequence index given is within bounds; if it is not, memory corruption can occur. Please think twice before using them; first, make sure the code in question is actually a bottleneck; next, try improving the algorithm first. If all else fails, then use these words and test your code very carefully."
297 "There is a very important invariant these word must preserve: if at some point in time, the length of a sequence was " { $snippet "n" } ", then any future lookups of elements with indices below " { $snippet "n" } " must not crash the runtime, even if the sequence length is now less than " { $snippet "n" } ". For example, vectors preserve this invariant by never shrinking the underlying storage, only growing it as necessary."
299 "The justification for this is that the runtime should not crash if a resizable sequence is resized during the execution of an iteration combinator."
301 "Indeed, iteration combinators are the primary use-case for these words; if the iteration index is already guarded by a loop test which ensures it is within bounds, then additional bounds checks are redundant. For example, see the implementation of " { $link each } "." ;
303 ARTICLE: "sequences-resizable" "Resizable sequence implementation"
304 "Resizable sequences are implementing by having a wrapper object hold a reference to an underlying sequence, together with a fill pointer indicating how many elements of the underlying sequence are occupied. When the fill pointer exceeds the underlying sequence capacity, the underlying sequence grows."
306 "There is a resizable sequence protocol:"
307 { $subsection underlying }
308 { $subsection set-underlying }
309 { $subsection set-fill }
310 "Any instance of a class implementing the above generics can make use of several utility words:"
311 { $subsection capacity }
312 { $subsection ensure }
313 { $subsection grow-length }
314 { $subsection clone-resizable }
315 "This protocol and the above words are unsafe; they do not perform bounds checks for performance reasons, and thus a mistake can lead to memory corruption due to an underlying sequence being shorter than the fill pointer."
317 "Vectors and string buffers are implemented using the resizable sequence facility (and they perform full bounds-checks and thus are safe)." ;
319 ARTICLE: "sequences" "Sequences"
320 "A sequence is a linearly-ordered finite collection of elements."
322 "Sequence utility words can operate on any object whose class implements the sequence protocol."
323 { $subsection "sequence-protocol" }
324 "There are a number of implementations of sequences in the core library, and you can write new implementations yourself."
325 { $subsection "sequence-implementations" }
326 "Much of the power of sequences lies in the polymorphic utility words that allow computations to be expressed as bulk operations without loops, recursion or micro-management of elements."
327 { $subsection "sequences-access" }
328 { $subsection "sequences-combinators" }
329 { $subsection "sequences-tests" }
330 { $subsection "sequences-search" }
331 { $subsection "sequences-add-remove" }
332 { $subsection "sequences-join-split" }
333 { $subsection "sequences-reshape" }
334 { $subsection "sequences-comparing" }
335 { $subsection "sequences-sorting" }
336 { $subsection "sequences-destructive" }
337 "Sequences are also used to implement other data structures:"
338 { $subsection "sequences-stacks" }
339 { $subsection "sequences-assoc" }
340 "Some implementation details which your code should probably not care about:"
341 { $subsection "sequences-unsafe" }
342 { $subsection "sequences-resizable" }
343 { $see-also "namespaces-make" } ;