I'm off to the UK for two weeks for my friend's wedding (which is now a “commitment ceremony” due to various immigration issues).
I'll be checking my email, but don't expect well-thought-out responses…
For the interested, here's some stuff to read while I'm away:
This paper describes mass personalization, a framework for combining mass media with a highly personalized Web-based experience. We introduce four applications for mass personalization: personalized content layers, ad hoc social communities, real-time popularity ratings and virtual media library services. Using the ambient audio originating from the television, the four applications are available with no more effort than simple television channel surfing. Our audio identification system does not use dedicated interactive TV hardware and does not compromise the user’s privacy. Feasibility tests of the proposed applications are provided both with controlled conversational interference and with “living-room”
Detecting Spam Web Pages through Content Analysis
In this paper, we continue our investigations of “web spam”: the injection of artificially-created pages into the web in order to influence the results from search engines, to drive traffic to certain pages for fun or profit. This paper considers some previously-undescribed techniques for automatically detecting spam pages, examines the effectiveness of these techniques in isolation and when aggregated using classification algorithms. When combined, our heuristics correctly identify 2,037 (86.2%) of the 2,364 spam pages (13.8%) in our judged collection of 17,168 pages, while misidentifying 526 spam and non-spam pages (3.1%).
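If you want to see how the percentages in that abstract fit together, here's a quick sketch in Python that recomputes them from the raw counts quoted above. The variable names and the `pct` helper are my own labels, not anything from the paper, and I'm assuming each percentage is taken against the denominator that makes the numbers work out (spam pages for the detection rate, the whole collection for the other two).

```python
# Counts quoted in the abstract; the names are my own labels, not the paper's.
total_pages = 17_168    # judged collection
spam_pages = 2_364      # pages judged to be spam
spam_caught = 2_037     # spam pages correctly identified
misidentified = 526     # spam and non-spam pages classified wrongly


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumed denominators: see the note above the code.
spam_share = pct(spam_pages, total_pages)     # share of the collection that is spam
detection_rate = pct(spam_caught, spam_pages) # share of spam that was caught
error_rate = pct(misidentified, total_pages)  # share of all pages misidentified

print(spam_share, detection_rate, error_rate)
```

Under those assumptions the three values come out to match the 13.8%, 86.2% and 3.1% in the abstract.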
I like this paper because I used some very similar techniques in my de-spammed version of the Google Blog Search.