Re: RSS Aggregators are the killer app

Ted talks about how he expects RSS aggregators to start chewing CPU time.

I've done some experiments in this area, and Bayesian classification on 4000 items a day would currently be an interesting performance tuning exercise. In my experience it isn't CPU bound, though – it's I/O bound.

I have a few ideas about things that might perform better than Bayesian classification anyway, but these techniques (as well as things like Latent Semantic Indexing) will be more CPU hungry.

Every time I think about trying to do LSI (or even Vector Space Search) on a couple of million items I start looking at the vector processing units on modern video cards and start drooling. Forget the CPU – offload that processing to the GPU. There will still be problems with disk and memory I/O, but the processing power is there.
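
To give a rough sense of why this kind of work suits a vector processor, here is a minimal CPU-only sketch of the core of a vector space search: each item becomes a term-frequency vector, and similarity is the cosine of the angle between two vectors. The class and method names (VectorSpaceSketch, termFrequencies, cosine) are made up for illustration and are not part of Classifier4J or Lucene; the tokenizer is deliberately naive, and there is no term weighting beyond raw counts.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only: not the Classifier4J or Lucene API.
class VectorSpaceSketch {

    // Build a sparse term-frequency vector: lowercase the text and
    // split on anything that is not a letter or digit.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int count : v.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: dot product divided by the product of the lengths.
    // Returns 0.0 if either vector is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        if (na == 0.0 || nb == 0.0) {
            return 0.0;
        }
        return dot / (na * nb);
    }

    public static void main(String[] args) {
        Map<String, Integer> query = termFrequencies("rss aggregator");
        Map<String, Integer> item = termFrequencies("A new RSS aggregator was released");
        System.out.println(cosine(query, item));
    }
}
```

Scoring one query against a couple of million items means repeating that dot product a couple of million times, with each comparison independent of the others. That independence is what makes the workload look like a candidate for a GPU, assuming the vectors can be fed to it fast enough.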

(A couple of times I've actually begun investigating this. It would be an excellent project to add GPU co-processing to Classifier4J and/or Lucene. JOGL may be the best way to do it.)

GPGPU.org is a decent site for more stuff about this.
