Module Classify


module Classify: sig .. end
Classification using a k-nearest-neighbors (KNN) tree structure.

val lg : float -> float
Logarithm base 2.
val float_sum : float list -> float
The sum of a list of floats
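
The two helpers above can be sketched as follows. The module's actual definitions are not shown in this document, so the bodies are assumptions that merely match the documented signatures and descriptions.

```ocaml
(* Sketch only: assumed bodies, not the module's real source. *)

(* Logarithm base 2, via the change-of-base identity. *)
let lg (x : float) : float = log x /. log 2.0

(* Sum of a list of floats; the sum of the empty list is 0. *)
let float_sum (xs : float list) : float = List.fold_left ( +. ) 0.0 xs
```

For example, `lg 8.0` is approximately 3 (up to floating-point rounding), and `float_sum [1.0; 2.0; 3.5]` is 6.5.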
val distances_to_distribution : 'a list -> float -> (float * 'b * 'a) list -> ('a * float) list
Given the list of (distance, x, y) tuples found by the k-nearest-neighbors tree, normalizes it to a probability distribution over the values of y. Each y is weighted by 1 / distance for now; this is a provisional choice, and I don't know what's generally used.

It's inefficient to list all the keys, but simpler to implement this way.

alphabet - the list of all letters (the keys of the resulting distribution)

a - the amount of probability mass to give to all letters (in other words, we're linearly interpolating between a uniform distribution and whatever the nearest-neighbor tree gives)

neighbors - the (distance, x, y) tuples of the nearest neighbors
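
A minimal sketch of the interpolation described above, written to match the documented signature. It rests on several assumptions that are not confirmed by the source: `a` is read as the total mass spread uniformly over the alphabet, distances are strictly positive (a zero distance would give an infinite weight), and an empty neighbor list falls back to the uniform distribution.

```ocaml
(* Sketch only: an assumed body for distances_to_distribution. *)
let distances_to_distribution alphabet a neighbors =
  (* Weight each neighbor's label y by 1 / distance. *)
  let weights = List.map (fun (d, _, y) -> (y, 1.0 /. d)) neighbors in
  let total = List.fold_left (fun acc (_, w) -> acc +. w) 0.0 weights in
  let uniform = 1.0 /. float_of_int (List.length alphabet) in
  (* List every key of the alphabet: inefficient, but simple. *)
  List.map
    (fun letter ->
      (* Total weight of the neighbors labelled with this letter. *)
      let w =
        List.fold_left
          (fun acc (y, wy) -> if y = letter then acc +. wy else acc)
          0.0 weights
      in
      (* Assumption: with no neighbors, fall back to the uniform distribution. *)
      let knn = if total > 0.0 then w /. total else uniform in
      (* Linear interpolation between the uniform and KNN distributions. *)
      (letter, (a *. uniform) +. ((1.0 -. a) *. knn)))
    alphabet
```

With alphabet `['a'; 'b'; 'c']`, `a = 0.3`, and neighbors at distances 1 and 2 labelled `'a'` plus one at distance 2 labelled `'b'`, the inverse-distance weights are 1.5 for `'a'` and 0.5 for `'b'`, so the result is approximately 0.625, 0.275, and 0.1 respectively.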

val sample_from_dist : (float * 'a) list -> 'a
val one_letter_cross_entropy : ('a * float) list -> 'a -> float
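
The last two values carry no description, so the following is only a guess at their behavior from the signatures. It assumes that `sample_from_dist` draws an item with probability proportional to its float weight, and that `one_letter_cross_entropy` returns the negative base-2 log of the probability the distribution assigns to the given letter.

```ocaml
(* Sketch only: assumed bodies inferred from the type signatures. *)

(* Draw an item with probability proportional to its weight. *)
let sample_from_dist dist =
  let total = List.fold_left (fun acc (p, _) -> acc +. p) 0.0 dist in
  let r = Random.float total in
  let rec pick acc = function
    | [] -> invalid_arg "sample_from_dist: empty distribution"
    | [ (_, x) ] -> x (* last item absorbs any rounding error *)
    | (p, x) :: rest ->
        let acc = acc +. p in
        if r < acc then x else pick acc rest
  in
  pick 0.0 dist

(* Cost in bits of one letter under the distribution: -log2 p(letter).
   A letter missing from the list is treated as probability 0,
   which yields infinity. *)
let one_letter_cross_entropy dist letter =
  let p = try List.assoc letter dist with Not_found -> 0.0 in
  -.(log p /. log 2.0)
```

Note that the two signatures order their pairs differently: `sample_from_dist` takes (weight, item) pairs, while `one_letter_cross_entropy` takes (item, probability) pairs like those returned by `distances_to_distribution`.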