Text Classifiers in Java
Originally, this was going to be a comparison of various text
classifier algorithms for Fernando Pereira's Machine Learning for
Language Processing class. I only ended up implementing a
nearest-neighbor classifier, using a tree structure reminiscent of
Clarkson's RPO trees.
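For readers unfamiliar with the idea, here is a minimal sketch of nearest-neighbor text classification. It is not the code from the tarball: it assumes bag-of-words count vectors and Euclidean distance, and it replaces the tree with a plain linear scan over the training documents. The class name NearestNeighborSketch and its methods are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Brute-force 1-nearest-neighbor text classifier (illustrative sketch only). */
class NearestNeighborSketch {
    // Training documents as word-count vectors, with labels at matching indices.
    private final List<Map<String, Integer>> vectors = new ArrayList<>();
    private final List<String> labels = new ArrayList<>();

    /** Lowercase the text, split on runs of non-letters, and count each word. */
    static Map<String, Integer> bagOfWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Euclidean distance between two sparse word-count vectors. */
    static double distance(Map<String, Integer> a, Map<String, Integer> b) {
        double sum = 0.0;
        // Words in a (whether or not they also occur in b).
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            int diff = e.getValue() - b.getOrDefault(e.getKey(), 0);
            sum += (double) diff * diff;
        }
        // Words that occur only in b.
        for (Map.Entry<String, Integer> e : b.entrySet()) {
            if (!a.containsKey(e.getKey())) {
                sum += (double) e.getValue() * e.getValue();
            }
        }
        return Math.sqrt(sum);
    }

    /** Store one labeled training document. */
    void train(String text, String label) {
        vectors.add(bagOfWords(text));
        labels.add(label);
    }

    /** Return the label of the closest training document, or null if untrained. */
    String classify(String text) {
        Map<String, Integer> query = bagOfWords(text);
        String best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < vectors.size(); i++) {
            double d = distance(query, vectors.get(i));
            if (d < bestDistance) {
                bestDistance = d;
                best = labels.get(i);
            }
        }
        return best;
    }
}
```

The linear scan costs one distance computation per training document for every query. A tree over the training set is a way to skip most of those computations, possibly at the cost of returning only an approximate nearest neighbor.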
[edit on 20070906]
This is very similar to the structure in Kenneth Clarkson's
"Nearest-neighbor queries in metric spaces."
I didn't realize at the time how similar it was, but a reference
to it should have been included in the writeup.
The source tarball is distributed under the GNU Library (a.k.a.
"Lesser") General Public License. A somewhat rough description of it
is available. The code is documented, but also somewhat rough; I'm
hoping to revise it Real Soon Now (tm).
20030104
I've rewritten this in Objective Caml, adding a tree-balancing
heuristic which seems to improve the approximation.
[.tar.gz archive]
20061030
Fixed some comments, and reformatted the comments so that ocamldoc
would pick them up.
[browse source][.tar.gz archive]
Josh Burdick /
last updated on October 30, 2006