trying to improve my knn algorithm
hunter.hammond.dev at gmail.com wrote:
> This is a knn algorithm for articles that I have gotten. Then determines
> which category it belongs to. I am not getting very good results :/
[snip too much code;)]
- Shouldn't the word frequency vectors be normalized? I don't see that in
your code. Without that, the length of a text may overshadow its contents.
- There are probably words that are completely irrelevant to the category
(stop words such as "the" or "and"). Getting rid of these should improve
the signal-to-noise ratio.
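
Here is a rough sketch of both suggestions, since I can't see how your vectors are built. It assumes plain whitespace tokenization and a tiny made-up stop-word list; STOP_WORDS, word_vector and cosine_similarity are names I picked for illustration, not anything from your code.

```python
import math
from collections import Counter

# Hypothetical stop-word list for illustration; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}

def word_vector(text, stop_words=STOP_WORDS):
    """Return a word-frequency vector (dict) scaled to unit Euclidean length."""
    # Drop irrelevant words before counting.
    words = [w for w in text.lower().split() if w not in stop_words]
    counts = Counter(words)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}  # empty text, or nothing left after filtering
    # After dividing by the norm, text length no longer affects the vector.
    return {w: c / norm for w, c in counts.items()}

def cosine_similarity(u, v):
    """Dot product of two unit-length vectors, i.e. their cosine similarity."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the smaller dict
    return sum(x * v.get(w, 0.0) for w, x in u.items())

short = word_vector("python is fun")
long = word_vector("python is fun " * 10)
print(cosine_similarity(short, long))
```

With normalization the short and the ten-times-longer text come out as (nearly) identical vectors, so the neighbours are picked by word mix rather than by article length.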