factor: trim using lists

[factor.git] / basis / math / vectors / simd / simd-docs.factor
diff --git a/basis/math/vectors/simd/simd-docs.factor b/basis/math/vectors/simd/simd-docs.factor

index d35f8589b61d0924f21cab366587810e90a56c5f..68592cb06b13a6f4324fca2641d63cc1e57790d4 100644 (file)
--- a/basis/math/vectors/simd/simd-docs.factor
+++ b/basis/math/vectors/simd/simd-docs.factor
@@ -1,6 +1,6 @@
-USING: help.markup help.syntax sequences math math.vectors
-kernel.private classes.tuple.private
-math.vectors.simd.intrinsics cpu.architecture ;
+USING: classes.tuple.private cpu.architecture help.markup
+help.syntax kernel.private math.vectors
+math.vectors.simd.intrinsics sequences ;
  IN: math.vectors.simd
  
  ARTICLE: "math.vectors.simd.intro" "Introduction to SIMD support"
@@ -8,7 +8,7 @@ ARTICLE: "math.vectors.simd.intro" "Introduction to SIMD support"
  $nl
  "SIMD support in the processor takes the form of instruction sets which operate on vector registers. By operating on multiple scalar values at the same time, code which operates on points, colors, and other vector data can be sped up."
  $nl
-"In Factor, SIMD support is exposed in the form of special-purpose SIMD " { $link "sequence-protocol" } " implementations. These are fixed-length, homogeneous sequences. They are referred to as vectors, but should not be confused with Factor's " { $link "vectors" } ", which can hold any type of object and can be resized.)."
+"In Factor, SIMD support is exposed in the form of special-purpose SIMD " { $link "sequence-protocol" } " implementations. These are fixed-length, homogeneous sequences. They are referred to as vectors, but should not be confused with Factor's " { $link "vectors" } ", which can hold any type of object and can be resized."
  $nl
  "The words in the " { $vocab-link "math.vectors" } " vocabulary, which can be used with any sequence of numbers, are special-cased by the compiler. If the compiler can prove that only SIMD vectors are used, it expands " { $link "math-vectors" } " into " { $link "math.vectors.simd.intrinsics" } ". While in the general case, SIMD intrinsics operate on heap-allocated SIMD vectors, that too can be optimized since in many cases the compiler unbox SIMD vectors, storing them directly in registers."
  $nl
@@ -17,48 +17,62 @@ $nl
  "There should never be any reason to use " { $link "math.vectors.simd.intrinsics" } " directly, but they too have a straightforward, but lower-level, interface." ;
  
  ARTICLE: "math.vectors.simd.support" "Supported SIMD instruction sets and operations"
-"At present, the SIMD support makes use of SSE2 and a few SSE3 instructions on x86 CPUs."
+"At present, the SIMD support makes use of a subset of SSE up to SSE4.1. The subset used depends on the current CPU type."
  $nl
-"SSE3 introduces horizontal adds (summing all components of a single vector register), which is useful for computing dot products. Where available, SSE3 operations are used to speed up " { $link sum } ", " { $link v. } ", " { $link norm-sq } ", " { $link norm } ", and " { $link distance } ". If SSE3 is not available, software fallbacks are used for " { $link sum } " and related words, decreasing performance."
+"SSE1 only supports single-precision SIMD (" { $snippet "float-4" } ")."
  $nl
-"On PowerPC, or older x86 chips without SSE2, software fallbacks are used for all high-level vector operations. SIMD code can run with no loss in functionality, just decreased performance."
+"SSE2 introduces double-precision SIMD (" { $snippet "double-2" } ") and integer SIMD (all types). Integer SIMD is missing a few features; in particular, the " { $link vmin } " and " { $link vmax } " operations only work on " { $snippet "uchar-16" } " and " { $snippet "short-8" } "."
  $nl
-"The primities in the " { $vocab-link "math.vectors.simd.intrinsics" } " vocabulary do not have software fallbacks, but they should not be called directly in any case." ;
+"SSE3 introduces horizontal adds (summing all components of a single vector register), which are useful for computing dot products. Where available, SSE3 operations are used to speed up " { $link sum } ", " { $link vdot } ", " { $link norm-sq } ", " { $link norm } ", and " { $link distance } "."
+$nl
+"SSSE3 introduces " { $link vabs } " for " { $snippet "char-16" } ", " { $snippet "short-8" } " and " { $snippet "int-4" } "."
+$nl
+"SSE4.1 introduces " { $link vmin } " and " { $link vmax } " for all remaining integer types, a faster instruction for " { $link vdot } ", and a few other things."
+$nl
+"On PowerPC, or older x86 chips without SSE, software fallbacks are used for all high-level vector operations. SIMD code can run with no loss in functionality, just decreased performance."
+$nl
+"The primitives in the " { $vocab-link "math.vectors.simd.intrinsics" } " vocabulary do not have software fallbacks, but they should not be called directly in any case." ;
  
  ARTICLE: "math.vectors.simd.types" "SIMD vector types"
-"Each SIMD vector type is named " { $snippet "scalar-count" } ", where " { $snippet "scalar" } " is a scalar C type such as " { $snippet "float" } " or " { $snippet "double" } ", and " { $snippet "count" } " is a vector dimension, such as 2, 4, or 8."
-$nl
-"The following vector types are defined:"
-{ $subsection float-4 }
-{ $subsection double-2 }
-{ $subsection float-8 }
-{ $subsection double-4 }
-"For each vector type, several words are defined:"
+"Each SIMD vector type is named " { $snippet "scalar-count" } ", where " { $snippet "scalar" } " is a scalar C type and " { $snippet "count" } " is a vector dimension."
+$nl
+"The following 128-bit vector types are defined in the " { $vocab-link "math.vectors.simd" } " vocabulary:"
+{ $code
+    "char-16"
+    "uchar-16"
+    "short-8"
+    "ushort-8"
+    "int-4"
+    "uint-4"
+    "longlong-2"
+    "ulonglong-2"
+    "float-4"
+    "double-2"
+}
+"Double-width 256-bit vector types are defined in the " { $vocab-link "math.vectors.simd.cords" } " vocabulary:"
+{ $code
+    "char-32"
+    "uchar-32"
+    "short-16"
+    "ushort-16"
+    "int-8"
+    "uint-8"
+    "longlong-4"
+    "ulonglong-4"
+    "float-8"
+    "double-4"
+} ;
+
+ARTICLE: "math.vectors.simd.words" "SIMD vector words"
+"For each SIMD vector type, several words are defined, where " { $snippet "type" } " is the type in question:"
  { $table
-    { "Word" "Stack effect" "Description" }
+    { { $strong "Word" } { $strong "Stack effect" } { $strong "Description" } }
      { { $snippet "type-with" } { $snippet "( x -- simd-array )" } "creates a new instance where all components are set to a single scalar" }
      { { $snippet "type-boa" } { $snippet "( ... -- simd-array )" } "creates a new instance where components are read from the stack" }
+    { { $snippet "type-cast" } { $snippet "( simd-array -- simd-array' )" } "creates a new SIMD array where the underlying data is taken from another SIMD array, with no format conversion" }
      { { $snippet ">type" } { $snippet "( seq -- simd-array )" } "creates a new instance initialized with the elements of an existing sequence, which must have the correct length" }
      { { $snippet "type{" } { $snippet "type{ elements... }" } "parsing word defining literal syntax for an SIMD vector; the correct number of elements must be given" }
  }
-"The " { $link float-4 } " and " { $link double-2 } " types correspond to 128-bit vector registers. The " { $link float-8 } " and " { $link double-4 } " types are not directly supported in hardware, and instead unbox to a pair of 128-bit vector registers."
-$nl
-"Operations on " { $link float-4 } " instances:"
-{ $subsection float-4-with }
-{ $subsection float-4-boa }
-{ $subsection POSTPONE: float-4{ }
-"Operations on " { $link double-2 } " instances:"
-{ $subsection double-2-with }
-{ $subsection double-2-boa }
-{ $subsection POSTPONE: double-2{ }
-"Operations on " { $link float-8 } " instances:"
-{ $subsection float-8-with }
-{ $subsection float-8-boa }
-{ $subsection POSTPONE: float-8{ }
-"Operations on " { $link double-4 } " instances:"
-{ $subsection double-4-with }
-{ $subsection double-4-boa }
-{ $subsection POSTPONE: double-4{ }
  "To actually perform vector arithmetic on SIMD vectors, use " { $link "math-vectors" } " words."
  { $see-also "c-types-specs" } ;
  
@@ -71,45 +85,47 @@ $nl
  $nl
  "For example, in the following, no SIMD operations are used at all, because the compiler's propagation pass does not consider dynamic variable usage:"
  { $code
-"""USING: compiler.tree.debugger math.vectors
+"USING: compiler.tree.debugger math.vectors
  math.vectors.simd ;
  SYMBOLS: x y ;
  
  [
-    double-4{ 1.5 2.0 3.7 0.4 } x set
-    double-4{ 1.5 2.0 3.7 0.4 } y set
+    float-4{ 1.5 2.0 3.7 0.4 } x set
+    float-4{ 1.5 2.0 3.7 0.4 } y set
      x get y get v+
-] optimizer-report.""" }
+] optimizer-report." }
  "The following word benefits from SIMD optimization, because it begins with an unsafe declaration:"
  { $code
-"""USING: compiler.tree.debugger kernel.private
+"USING: compiler.tree.debugger kernel.private
  math.vectors math.vectors.simd ;
+IN: simd-demo
  
  : interpolate ( v a b -- w )
      { float-4 float-4 float-4 } declare
      [ v* ] [ [ 1.0 ] dip n-v v* ] bi-curry* bi v+ ;
  
-\ interpolate optimizer-report.""" }
+\\ interpolate optimizer-report." }
  "Note that using " { $link declare } " is not recommended. Safer ways of getting type information for the input parameters to a word include defining methods on a generic word (the value being dispatched upon has a statically known type in the method body), as well as using " { $link "hints" } " and " { $link POSTPONE: inline } " declarations."
  $nl
  "Here is a better version of the " { $snippet "interpolate" } " words above that uses hints:"
  { $code
-"""USING: compiler.tree.debugger hints
+"USING: compiler.tree.debugger hints
  math.vectors math.vectors.simd ;
+IN: simd-demo
  
  : interpolate ( v a b -- w )
      [ v* ] [ [ 1.0 ] dip n-v v* ] bi-curry* bi v+ ;
  
  HINTS: interpolate float-4 float-4 float-4 ;
  
-\ interpolate optimizer-report. """ }
+\\ interpolate optimizer-report. " }
  "This time, the optimizer report lists calls to both SIMD primitives and high-level vector words, because hints cause two code paths to be generated. The " { $snippet "optimized." } " word can be used to make sure that the fast code path consists entirely of calls to primitives."
  $nl
  "If the " { $snippet "interpolate" } " word was to be used in several places with different types of vectors, it would be best to declare it " { $link POSTPONE: inline } "."
  $nl
  "In the " { $snippet "interpolate" } " word, there is still a call to the " { $link <tuple-boa> } " primitive, because the return value at the end is being boxed on the heap. In the next example, no memory allocation occurs at all because the SIMD vectors are stored inside a struct class (see " { $link "classes.struct" } "); also note the use of inlining:"
  { $code
-"""USING: compiler.tree.debugger math.vectors math.vectors.simd ;
+"USING: compiler.tree.debugger math.vectors math.vectors.simd ;
  IN: simd-demo
  
  STRUCT: actor
@@ -122,134 +138,68 @@ GENERIC: advance ( dt object -- )
  
  : update-velocity ( dt actor -- )
      [ acceleration>> n*v ] [ velocity>> v+ ] [ ] tri
-    (>>velocity) ; inline
+    velocity<< ; inline
  
  : update-position ( dt actor -- )
      [ velocity>> n*v ] [ position>> v+ ] [ ] tri
-    (>>position) ; inline
+    position<< ; inline
  
  M: actor advance ( dt actor -- )
      [ >float ] dip
      [ update-velocity ] [ update-position ] 2bi ;
  
-M\ actor advance optimized."""
+M\\ actor advance optimized."
  }
-"The " { $vocab-link "compiler.cfg.debugger" } " vocabulary can give a lower-level picture of the generated code, that includes register assignments and other low-level details. To look at low-level optimizer output, call " { $snippet "test-mr mr." } " on a word or quotation:"
+"The " { $vocab-link "compiler.cfg.debugger" } " vocabulary can give a lower-level picture of the generated code, that includes register assignments and other low-level details. To look at low-level optimizer output, call " { $snippet "regs." } " on a word or quotation:"
  { $code
-"""USE: compiler.tree.debugger
+"USE: compiler.tree.debugger
  
-M\ actor advance test-mr mr.""" }
-"An example of a high-performance algorithm that uses SIMD primitives can be found in the " { $vocab-link "benchmark.nbody-simd" } " vocabulary." ;
+M\\ actor advance regs." }
+"Example of a high-performance algorithms that use SIMD primitives can be found in the following vocabularies:"
+{ $list
+    { $vocab-link "benchmark.nbody-simd" }
+    { $vocab-link "benchmark.raytracer-simd" }
+    { $vocab-link "random.sfmt" }
+} ;
  
  ARTICLE: "math.vectors.simd.intrinsics" "Low-level SIMD primitives"
  "The words in the " { $vocab-link "math.vectors.simd.intrinsics" } " vocabulary are used to implement SIMD support. These words have three disadvantages compared to the higher-level " { $link "math-vectors" } " words:"
  { $list
      "They operate on raw byte arrays, with a separate “representation” parameter passed in to determine the type of the operands and result."
      "They are unsafe; passing values which are not byte arrays, or byte arrays with the wrong size, will dereference invalid memory and possibly crash Factor."
-    { "They do not have software fallbacks; if the current CPU does not have SIMD support, a " { $link bad-simd-call } " error will be thrown." }
  }
  "The compiler converts " { $link "math-vectors" } " into SIMD primitives automatically in cases where it is safe; this means that the input types are known to be SIMD vectors, and the CPU supports SIMD."
  $nl
-"It is best to avoid calling these primitives directly. To write efficient high-level code that compiles down to primitives and avoids memory allocation, see " { $link "math.vectors.simd.efficiency" } "."
-{ $subsection (simd-v+) }
-{ $subsection (simd-v-) }
-{ $subsection (simd-v/) }
-{ $subsection (simd-vmin) }
-{ $subsection (simd-vmax) }
-{ $subsection (simd-vsqrt) }
-{ $subsection (simd-sum) }
-{ $subsection (simd-broadcast) }
-{ $subsection (simd-gather-2) }
-{ $subsection (simd-gather-4) }
+"It is best to avoid calling SIMD primitives directly. To write efficient high-level code that compiles down to primitives and avoids memory allocation, see " { $link "math.vectors.simd.efficiency" } "."
+$nl
  "There are two primitives which are used to implement accessing SIMD vector fields of " { $link "classes.struct" } ":"
-{ $subsection alien-vector }
-{ $subsection set-alien-vector }
+{ $subsections
+    alien-vector
+    set-alien-vector
+}
  "For the most part, the above primitives correspond directly to vector arithmetic words. They take a representation parameter, which is one of the singleton members of the " { $link vector-rep } " union in the " { $vocab-link "cpu.architecture" } " vocabulary." ;
  
  ARTICLE: "math.vectors.simd.alien" "SIMD data in struct classes"
-"Struct classes may contain fields which store SIMD data; use one of the following C type names:"
-{ $code
-"""float-4
-double-2
-float-8
-double-4""" }
-"Passing SIMD data as function parameters is not yet supported." ;
+"Struct classes may contain fields which store SIMD data; for each SIMD vector type listed in " { $snippet "math.vectors.simd.types" } " there is a C type with the same name."
+$nl
+"Only SIMD struct fields are allowed at the moment; passing SIMD data as function parameters is not yet supported." ;
+
+ARTICLE: "math.vectors.simd.accuracy" "Numerical accuracy of SIMD primitives"
+"No guarantees are made that " { $vocab-link "math.vectors.simd" } " words will give identical results on different SSE versions, or between the hardware intrinsics and the software fallbacks."
+$nl
+"In particular, horizontal operations on " { $snippet "float-4" } " vectors are affected by this. They are computed with lower precision in intrinsics than the software fallback. Horizontal operations include anything involving adding together the components of a vector, such as " { $link sum } " or " { $link normalize } "." ;
  
  ARTICLE: "math.vectors.simd" "Hardware vector arithmetic (SIMD)"
  "The " { $vocab-link "math.vectors.simd" } " vocabulary extends the " { $vocab-link "math.vectors" } " vocabulary to support efficient vector arithmetic on small, fixed-size vectors."
-{ $subsection "math.vectors.simd.intro" }
-{ $subsection "math.vectors.simd.types" }
-{ $subsection "math.vectors.simd.support" }
-{ $subsection "math.vectors.simd.efficiency" }
-{ $subsection "math.vectors.simd.alien" }
-{ $subsection "math.vectors.simd.intrinsics" } ;
-
-! ! ! float-4
-
-HELP: float-4
-{ $class-description "A sequence of four single-precision floating point values. New instances can be created with " { $link float-4-with } " or " { $link float-4-boa } "." } ;
-
-HELP: float-4-with
-{ $values { "x" float } { "simd-array" float-4 } }
-{ $description "Creates a new vector with all four components equal to a scalar." } ;
-
-HELP: float-4-boa
-{ $values { "a" float } { "b" float } { "c" float } { "d" float } { "simd-array" float-4 } }
-{ $description "Creates a new vector from four scalar components." } ;
-
-HELP: float-4{
-{ $syntax "float-4{ a b c d }" }
-{ $description "Literal syntax for a " { $link float-4 } "." } ;
-
-! ! ! double-2
-
-HELP: double-2
-{ $class-description "A sequence of two double-precision floating point values. New instances can be created with " { $link double-2-with } " or " { $link double-2-boa } "." } ;
-
-HELP: double-2-with
-{ $values { "x" float } { "simd-array" double-2 } }
-{ $description "Creates a new vector with both components equal to a scalar." } ;
-
-HELP: double-2-boa
-{ $values { "a" float } { "b" float } { "simd-array" double-2 } }
-{ $description "Creates a new vector from two scalar components." } ;
-
-HELP: double-2{
-{ $syntax "double-2{ a b }" }
-{ $description "Literal syntax for a " { $link double-2 } "." } ;
-
-! ! ! float-8
-
-HELP: float-8
-{ $class-description "A sequence of eight single-precision floating point values. New instances can be created with " { $link float-8-with } " or " { $link float-8-boa } "." } ;
-
-HELP: float-8-with
-{ $values { "x" float } { "simd-array" float-8 } }
-{ $description "Creates a new vector with all eight components equal to a scalar." } ;
-
-HELP: float-8-boa
-{ $values { "a" float } { "b" float } { "c" float } { "d" float } { "e" float } { "f" float } { "g" float } { "h" float } { "simd-array" float-8 } }
-{ $description "Creates a new vector from eight scalar components." } ;
-
-HELP: float-8{
-{ $syntax "float-8{ a b c d e f g h }" }
-{ $description "Literal syntax for a " { $link float-8 } "." } ;
-
-! ! ! double-4
-
-HELP: double-4
-{ $class-description "A sequence of four double-precision floating point values. New instances can be created with " { $link double-4-with } " or " { $link double-4-boa } "." } ;
-
-HELP: double-4-with
-{ $values { "x" float } { "simd-array" double-4 } }
-{ $description "Creates a new vector with all four components equal to a scalar." } ;
-
-HELP: double-4-boa
-{ $values { "a" float } { "b" float } { "c" float } { "d" float } { "simd-array" double-4 } }
-{ $description "Creates a new vector from four scalar components." } ;
-
-HELP: double-4{
-{ $syntax "double-4{ a b c d }" }
-{ $description "Literal syntax for a " { $link double-4 } "." } ;
+{ $subsections
+    "math.vectors.simd.intro"
+    "math.vectors.simd.types"
+    "math.vectors.simd.words"
+    "math.vectors.simd.support"
+    "math.vectors.simd.accuracy"
+    "math.vectors.simd.efficiency"
+    "math.vectors.simd.alien"
+    "math.vectors.simd.intrinsics"
+} ;
  
  ABOUT: "math.vectors.simd"