Re: Bayesian Filtering: The Spam Fights Back

Charles Miller had an interesting post about
spam he is receiving that is designed to get through Bayesian filtering.

I think Bayesian filtering will eventually be only one of a range of filters that people will need to
deploy against spam. I'm optimistic that combinations of text filtering algorithms (including, but not limited to,
Bayesian algorithms) can continue to be effective for some time.

I think other filters are needed, though. For instance, in the spam that Charles received, many words (designed to
fool Bayesian filtering) were styled to be invisible. This is an old search-engine spamming technique, but now
Google detects it, and actually uses the stylistic structure of the web page (i.e., its appearance) during its analysis.
I can't think of any reason why mail filters can't do the same thing.
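
As a rough illustration of what that might look like in a mail filter, here is a minimal sketch in Python that separates the words a reader would actually see from the words hidden by inline styles. Everything in it is an assumption made for illustration: the names (HIDDEN_STYLE, VisibleTextExtractor, split_visible) are hypothetical, and the handful of style patterns it checks are nowhere near a complete list. In particular, it does not catch text coloured to match the background, and it ignores stylesheets entirely.

```python
import re
from html.parser import HTMLParser

# Inline styles that commonly hide text. Illustrative only, not exhaustive.
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden|font-size\s*:\s*0(?![.\d])",
    re.IGNORECASE,
)

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class VisibleTextExtractor(HTMLParser):
    """Collects words from an HTML body, split into visible and hidden."""

    def __init__(self):
        super().__init__()
        self.stack = []    # one flag per open element: is its content hidden?
        self.visible = []  # words a reader would see
        self.hidden = []   # words styled to be invisible

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = bool(self.stack and self.stack[-1])
        # An element is hidden if any ancestor is hidden or its own style hides it.
        self.stack.append(parent_hidden or bool(HIDDEN_STYLE.search(style)))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        in_hidden = bool(self.stack and self.stack[-1])
        (self.hidden if in_hidden else self.visible).extend(data.split())


def split_visible(html):
    """Return (visible_words, hidden_words) for an HTML message body."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.visible, parser.hidden


if __name__ == "__main__":
    body = ('<p>Buy cheap pills now '
            '<span style="display:none">meeting agenda python</span></p>')
    visible, hidden = split_visible(body)
    print("visible:", visible)
    print("hidden:", hidden)
```

Only the visible words would then be passed to the Bayesian classifier, so the padding never reaches it. The amount of hidden text could also be treated as a signal in its own right, since legitimate mail rarely has much reason to hide words from the reader.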

(Disclaimer: I've written an open source package for Bayesian filtering in Java)
