! Copyright (C) 2020 Doug Coleman.
! See http://factorcode.org/license.txt for BSD license.
USING: help.markup help.syntax kernel math multiline vectors ;

HELP: <reservoir-sampler>
{ $values k: integer sampler: object }
{ $description Creates an object that holds k samples, chosen with equal probability from everything it is shown. To show a reservoir-sampler an object, call \ reservoir-sample . } ;

HELP: reservoir-sample
{ $values obj: object sampler: object }
{ $description Feeds an object to a \ reservoir-sampler , which maintains a vector of samples chosen with equal probability. This word is especially useful when you do not know in advance how many objects will appear, as in a stream of unknown length. }
USING: prettyprint io strings math reservoir-sampling
kernel accessors io.streams.string ;

"Nothing will fundamentally change." [
    10 <reservoir-sampler>
    [ [ read1 dup ] swap '[ dup 1string . _ reservoir-sample ] while ] keep nip sampled>> >string .

HELP: reservoir-sample-iteration
{ $values iteration: integer k: integer obj: object sampled: vector sampled': vector }
{ $description Performs one sampling step with equal probability without using a \ reservoir-sampler object; the caller supplies the iteration count, k, and the vector of samples directly. } ;

HELP: reservoir-sampler
{ $class-description The class of a reservoir sampler object. Create one with \ <reservoir-sampler> . } ;

ARTICLE: "reservoir-sampling" "Reservoir Sampling"
The { $vocab-link "reservoir-sampling" } vocabulary provides a way to take k samples with equal probability from all the objects shown to a sampler. You do not have to know in advance how many objects the sampler will eventually see; every object still has the same probability of ending up in the sample.
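
A minimal usage sketch of the idea above. It assumes the stack effect ( obj sampler -- ) for reservoir-sample, as implied by the example in that word's documentation, and the sampled>> accessor used there. The value k = 3 and the integers 0 through 99 are purely illustrative:

```factor
USING: accessors kernel math prettyprint reservoir-sampling ;

! Illustrative values: keep k = 3 samples out of 100 integers.
3 <reservoir-sampler>

! Show each integer 0..99 to the sampler. The sampler stays on
! the stack because reservoir-sample is assumed to consume both
! the object and the copy made by over.
100 [ over reservoir-sample ] each-integer

! Print how many samples were kept, then the samples themselves.
sampled>> [ length . ] [ . ] bi
```

Since the sampler only ever retains k objects, the same loop shape should work for input whose length is not known ahead of time, such as a stream read one element at a time.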

Reservoir sampling without an object:
reservoir-sample-iteration

ABOUT: "reservoir-sampling"