Specifically, instead of returning values in the range [0,1], cosine-similarity now returns values in [-1,1]:
* −1 meaning exactly opposite
* 1 meaning exactly the same
* 0 indicating orthogonality (decorrelation)
* in-between values indicating intermediate similarity or dissimilarity.
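
As a rough illustration of the new range, the following listener sketch evaluates the three reference cases. The vectors are made up for the example and are not the `a` and `b` used in the test suite. Results are floating-point, so compare them with a tolerance rather than for exact equality. The last line assumes the old result was `0.5 * c + 0.5`, as in the removed definition, and inverts it to recover the raw cosine from a value stored under the old scaling.

```factor
USING: kernel math math.similarity math.vectors prettyprint ;

! Illustrative vectors only (assumed for this sketch).
{ 1 2 3 } { 1 2 3 } cosine-similarity .    ! same direction: approximately 1
{ 1 2 3 } dup vneg cosine-similarity .     ! opposite direction: approximately -1
{ 1 0 } { 0 1 } cosine-similarity .        ! orthogonal: zero

! Converting a value produced by the old [0,1] scaling: new = old * 2 - 1
0.5472455591261534 2 * 1 - .               ! approximately the new a b test value
```

Callers that relied on the old [0,1] scaling can recover it by applying `0.5 * 0.5 +` to the new result.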
! Copyright (C) 2012 John Benediktsson
! See http://factorcode.org/license.txt for BSD license
-USING: math.functions math.similarity tools.test ;
+USING: math.functions math.similarity math.vectors tools.test ;
IN: math.similarity.tests
{ t } [ a b pearson-similarity 0.2376861940759582 1e-10 ~ ] unit-test
{ t } [ a a cosine-similarity 1.0 1e-10 ~ ] unit-test
-{ t } [ a b cosine-similarity 0.5472455591261534 1e-10 ~ ] unit-test
+{ t } [ a a vneg cosine-similarity -1.0 1e-10 ~ ] unit-test
+{ t } [ a b cosine-similarity 0.0944911182523068 1e-10 ~ ] unit-test
+
over length 3 < [ 2drop 1.0 ] [ population-corr 0.5 * 0.5 + ] if ;
: cosine-similarity ( a b -- n )
- [ v* sum ] [ [ norm ] bi@ * ] 2bi / 0.5 * 0.5 + ;
+ [ v* sum ] [ [ norm ] bi@ * ] 2bi / ;