WebDosBeta

My reading recommendation engine[1] found me an interesting post by Yannick Laclau: WebDosBeta, the summary. WebDosBeta was a one-day conference on internet startups in Spain.

This was a grassroots initiative started by Albert Armengol's post on the lack of innovation in Spain. Journalist Enrique Dans and SixApart's man in Spain, Victor Ruiz, picked up on this meme and the three of them kicked off, via the blogosphere naturally, the idea to hold a conference.

Sounds a bit like the TechCrunch BBQ the other day.

So when/where is the Australian version?

[1] My reading recommendation engine: Intrigued? Good.

VMware Player

From Wubble:

Today, as part of our VMworld 2005 festivities, we announced our VMware Player. This is a freely downloadable tool that, as you might guess, plays virtual machines.

It's a free download, and there are a number of preconfigured VMs available for download.

This is a really good idea for software demos, as can be seen from some of the VMs that are available (e.g. IBM, Oracle, etc.). No longer do you need to sacrifice a machine (or a preconfigured VM) and spend a long time getting some demo software working. Now vendors can just redistribute the player and a preconfigured VM. Brilliant!

The Browser Application VM looks like a good idea, too.

Anatomy of a Cross Site Scripting Attack

If you create websites that require any kind of security, hopefully you are familiar with the dangers of cross-site scripting attacks. (If not, please let me know so I can stay clear…)

The other day MySpace got taken down by an XSS attack. The interesting thing about it was that (a) it used XMLHttpRequest to get around a multi-phase hash verification test and (b) the author has written about how they did it.

The attack itself is quite smart:

9) Finally we can do a POST! However, when we send the post it never actually adds a friend. Why not? Myspace generates a random hash on a pre-POST page (for example, the “Are you sure you want to add this user as a friend” page). If this hash is not passed along with the POST, the POST is not successful. To get around this, we mimic a browser and send a GET to the page right before adding the user, parse the source for the hash, then perform the POST while passing the hash.
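
To make that step concrete, here is a toy, in-memory sketch (no real site or network involved) of why a per-page hash alone doesn't stop a script that is already running on the same site. `ToySite`, `extract_hash` and the field name `hash` are made up for illustration; they are not MySpace's actual markup or the author's code.

```python
import re
import secrets


class ToySite:
    """Hypothetical stand-in for a site that guards a POST with a per-page hash."""

    def __init__(self):
        self._issued = set()
        self.friends = []

    def get_confirm_page(self):
        # The pre-POST page embeds a fresh random hash in a hidden field.
        token = secrets.token_hex(8)
        self._issued.add(token)
        return f'<form><input type="hidden" name="hash" value="{token}"></form>'

    def post_add_friend(self, friend, token=None):
        # The POST only succeeds when it carries a hash the site issued.
        if token not in self._issued:
            return False
        self._issued.discard(token)
        self.friends.append(friend)
        return True


def extract_hash(html):
    """Pull the hidden hash value out of the confirmation page source."""
    match = re.search(r'name="hash" value="([0-9a-f]+)"', html)
    return match.group(1) if match else None


site = ToySite()
blind = site.post_add_friend("friend_a")            # no hash: rejected
token = extract_hash(site.get_confirm_page())       # GET the page, parse the hash
informed = site.post_add_friend("friend_a", token)  # POST with the hash: accepted
```

The hash stops a blind POST, but anything that can read the page can also read the hash, so the real fix has to be at the point where the script gets injected.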

It is a worry, though, that he spent so long working out how to get it to work and yet, after he deployed it:

7 hours later, 8:35 am: You have 74 friends and 221 friend requests.
Woah. I did not expect this much. I'm surprised it even worked.. 200 people have been infected in 8 hours. That means I'll have 600 new friends added every day. Woah.

1 hour later, 9:30 am: You have 74 friends and 480 friend requests.
Oh wait, it's exponential, isn't it. Shit.

I'll never get caught. I'm Popular., 10/04/05
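
The "it's exponential" moment can be checked with a little arithmetic. This is a rough fit from just the two readings in the quoted log, assuming the friend-request count tracks the number of infected profiles and that growth follows n(t) = n0 * exp(r * t); both are assumptions for illustration, not something the author states.

```python
import math

# Readings taken from the quoted log (friend requests only).
n0 = 221               # 8:35 am
n1 = 480               # 9:30 am
elapsed_minutes = 55   # 8:35 -> 9:30, the "1 hour later" in the log

# Solve n1 = n0 * exp(rate * elapsed) for the growth rate per minute.
rate = math.log(n1 / n0) / elapsed_minutes

# Time for the count to double under this model.
doubling_minutes = math.log(2) / rate


def projected(minutes_after_first_reading):
    """Projected request count under the assumed exponential model."""
    return n0 * math.exp(rate * minutes_after_first_reading)
```

Under these assumptions the count doubles in a bit under an hour, which is a very different picture from the linear "600 new friends every day" guess made at the first reading. The model obviously breaks down once it runs out of users to infect.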

Spam Blog Crisis

Tim Bray says there is a spam blog emergency occurring right now. I tend to agree. I'd like to see the search terms he is using to get that many splogs, though.

Removing spam blog results from time-sorted results is difficult because you can't rely on PageRank-like algorithms. Email spam filters are probably a better model, although the auto-generated splogs that I suspect Tim is suffering from are hard to detect using Bayesian-type algorithms. OTOH, my de-spammed version of Google's blog search just uses heuristics based on the URL of the item, and it does okay for many searches. Compare my version of a search for “cancer” with the raw version. At the time of writing my version removes 26 spammy results to get the first 10 non-spammy ones.
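
For a feel of what URL-only filtering looks like, here is a minimal sketch. The specific rules (hyphen count, digit count, hostname length) and the names `spam_score` and `despam` are hypothetical examples of this kind of heuristic, not the actual rules used in the de-spammed search, and the URLs are placeholders.

```python
from urllib.parse import urlparse


def spam_score(url):
    """Score a result URL using hostname-only heuristics (illustrative rules)."""
    host = urlparse(url).hostname or ""
    score = 0
    if host.count("-") >= 3:                    # keyword-stuffed hostnames
        score += 1
    if sum(ch.isdigit() for ch in host) >= 4:   # machine-generated names
        score += 1
    if len(host) > 40:                          # unusually long hostnames
        score += 1
    return score


def despam(urls, wanted=10, threshold=1):
    """Walk time-sorted results, dropping suspicious ones until `wanted` remain."""
    kept, removed = [], 0
    for url in urls:
        if spam_score(url) >= threshold:
            removed += 1
        else:
            kept.append(url)
            if len(kept) == wanted:
                break
    return kept, removed


results = [
    "http://buy-cheap-pills-online.example.com/a",
    "http://example.org/2005/10/notes",
    "http://free-ringtones-4u-2005-10-19.example.net/b",
    "http://example.com/blog/cancer-research",
]
kept, removed = despam(results, wanted=2)
```

Because it never looks at links or content, a filter like this works on brand-new items, which is exactly where PageRank-style signals are missing. It will also misfire on legitimate blogs with odd hostnames.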

Re: private feeds

Crazy Bob:

Many aggregators don't handle password-protected feeds well: some don't support it at all, and some do support it (either fully or with the user ID and password in the URL) but aren't very secure. What if you used hard to guess feed URLs? For example:

http://myhost/feeds/[big cryptographically unique ID]

It works with any reader. If it leaks out, others won't be able to access your account (they don't have your real password).

On the down side, if you subscribed to this feed in something like Bloglines, wouldn't Bloglines index it so other users could search it? Of course Bloglines supports embedding the user ID and password in the URL. Does Bloglines index these feeds?
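
The hard-to-guess feed URL idea is easy to sketch. The following assumes a simple in-memory mapping from feed ID to account; `new_feed_url`, `account_for` and `revoke` are hypothetical names, and `http://myhost/feeds/` is just the placeholder from the quote.

```python
import secrets

BASE = "http://myhost/feeds/"   # placeholder host from the quoted example
_feeds = {}                     # feed ID -> account name (in-memory, for illustration)


def new_feed_url(account):
    """Create an unguessable feed URL for an account."""
    feed_id = secrets.token_urlsafe(32)   # 32 random bytes, URL-safe encoding
    _feeds[feed_id] = account
    return BASE + feed_id


def account_for(url):
    """Return the account a feed URL belongs to, or None if unknown or revoked."""
    if not url.startswith(BASE):
        return None
    return _feeds.get(url[len(BASE):])


def revoke(url):
    """Invalidate a leaked feed URL without touching the account password."""
    _feeds.pop(url[len(BASE):], None)


url = new_feed_url("bob")
```

The ID is a capability for the feed only: if it leaks, it can be revoked and reissued, and the real password was never exposed.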

I started replying in a comment, but it got too long and interesting:

There was an interesting discussion on using this technique on the P2P Hackers & REST Discuss mailing lists (although it was more for conventional webpages rather than just feeds).

I think it has some promise and I've been thinking of using it in one of my projects, but there are some things to be aware of:

1) Referrers. If your feed includes resources from or links to other sites, you need to make sure links go through a redirector to strip the referrer headers.

2) Use https (if possible). This will partially solve the referrer problem (although not when reading via an aggregator), and could be used as a sign for the aggregator not to index it.

I don't think Bloglines indexes password-protected feeds. That creates an interesting possibility: create a feed that requires HTTP basic authentication, but accepts any combination of usernames and passwords. That will signal to aggregators not to index that feed, but doesn't have the security risks associated with sharing a real username/password.
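
A minimal sketch of the "accept any credentials" idea, written as a pure function so it can be wired into whatever server you use. The function name `feed_response`, the realm string and the content type are assumptions for illustration, and whether a given aggregator really treats the challenge as a do-not-index signal is still only a guess.

```python
import base64


def feed_response(authorization):
    """Return (status, headers) for a feed request, given its Authorization header."""
    if authorization and authorization.startswith("Basic "):
        try:
            # Only check that the credentials are well-formed base64;
            # the username and password themselves are deliberately ignored.
            base64.b64decode(authorization[len("Basic "):], validate=True)
        except ValueError:
            return 400, {}
        return 200, {"Content-Type": "application/xml"}
    # No credentials yet: challenge, so the feed looks password-protected.
    return 401, {"WWW-Authenticate": 'Basic realm="feed"'}


anyone = "Basic " + base64.b64encode(b"anyone:anything").decode("ascii")
status, headers = feed_response(anyone)
```

The access control still comes from the hard-to-guess URL; the authentication challenge is only there as a hint to aggregators.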

Google Reader feedback

As part of my continuing quest for the ultimate aggregator I've been using Google Reader a bit – although I haven't replaced Bloglines yet.

Firstly, the good things: I generally like the feel of the “lens” part of the UI. The scrolly headline box thing is nice (although there should be a delay when you stop on an item before it gets marked as read). It's good to see that Google hasn't gone down the whole “sharing labels as tags” route – tags are useful when you primarily tag something for yourself. (The whole “tagging for other people” thing leads to spam [1].) It isn't clear exactly what use the labels have yet, though.

Despite those things, I can't use it as my main aggregator. The one feature I need is a “view new items by author” view. The “Your Subscriptions” page almost has the elements on it – it needs to bold feeds with new items and put a count of the new items next to them.

[1]: I've been meaning to write on that – del.icio.us tags are useful because they benefit from people's selfishness; i.e., people want to find something in their own bookmarks. Technorati tags aren't as useful because the only reason to use them is to benefit other people. See the use of HTML meta tags circa 1997 and what happened to them.

Fixing Google Blog Search

I've previously complained about how much spam is included in Google's Blog Search. Generally, though, I think Google does a good job with most of the things they do, and I think that most of the criticism they get is unfair. That made me feel a little uneasy about adding to the criticism and increasing the perception of Google as an evil company.

So what should someone do when they believe they have uncovered a problem? I decided I'd do what I like people to do when they find a problem with some of my software: fix it.

Here's my imperfect attempt: