Archive for March, 2005

Google Desktop Search 1.0

Google Desktop Search 1.0 is out. It adds search of your Firefox history (as well as other new formats) and also comes with a COM-based API (insert evil comment about how much I hate COM here).


Stupid Google Operating System Meme

Why do people insist on thinking Google is writing an operating system? Surely people realise
they already have one, which allows them to build innovative but
useful things on top of it.

It seems to me that Google's strategy – as far as they have one – is to commoditize the entire software stack
using web applications, and then do the best possible implementation. Then they make money by knowing more about
their users than anyone else. Bearing this in mind I just don't see where shipping an operating system to consumers
makes sense. They do need to make sure that users have a decent browser
on as many systems as possible, though – hence the money they are putting into Firefox.

So why would Google hire people like Mark Lucovsky? Well, he's a smart guy, so Google would want him, and they do have a fairly big cluster of computers that need operating systems, so he'd want to go there. Plus – and everyone seems to be ignoring this – he architected Hailstorm:

“I had these ideas that the way to really bootstrap Web services was to
come up with a model where data was the central pivot point, and we came
up with an architecture for connecting people and applications with information”

Perhaps Google might be just a little bit interested in doing something like that?

It's not like he's the first operating system person they've hired, anyway.


Using JDK 1.5 to optimize for modern CPUs

Modern CPUs have many features that increase multi-threaded performance
(e.g. hyperthreading on Pentium 4s and similar features on recent PowerPC chips).
Over the next year the trend towards multi-core CPUs from AMD, Intel and Sun will accelerate multi-threaded performance, while single-threaded performance will begin to level off.

Java has always had excellent threading support, but JDK 1.5 introduces a whole new set of concurrency libraries which make multi-threaded programming easier. In theory these libraries should mesh well with modern CPUs, since (on a hyperthreading CPU) each hardware thread appears to the operating system as an extra CPU.

I've written a program to investigate multithreaded performance under Java. Does hyperthreading help multithreaded Java programs, or is the VM unable to use it properly?

My program is pretty simple – it generates random numbers and converts them to strings inside a loop that runs a set number of times. This loop is executed four times – twice by a single thread, and then twice by two threads simultaneously. We would hope to see the multithreaded version run quicker on a hyperthreaded CPU.
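As a rough sketch, a test along those lines might look something like the following. This is not the source of threadtest.jar: the class name, the iteration count and the fixed seeds are all assumptions made for illustration, and a serious benchmark would also want JIT warm-up runs.

```java
import java.util.Random;

final class ThreadTestSketch {
    // Hypothetical iteration count; the original program's value is not stated.
    static final int ITERATIONS = 2000000;

    // Written by every run so the loop result is actually used.
    static volatile long sink;

    /** Generates random numbers, converts each to a String, and returns the total character count. */
    static long work(int iterations, long seed) {
        Random random = new Random(seed);
        long chars = 0;
        for (int i = 0; i < iterations; i++) {
            chars += Integer.toString(random.nextInt()).length();
        }
        return chars;
    }

    /** Runs the loop twice, back to back, on the calling thread. Returns elapsed milliseconds. */
    static long singleThreaded(int iterations) {
        long start = System.nanoTime();
        sink = work(iterations, 1L);
        sink = work(iterations, 2L);
        return (System.nanoTime() - start) / 1000000L;
    }

    /** Runs the loop once on each of two threads at the same time. Returns elapsed milliseconds. */
    static long dualThreaded(int iterations) {
        final int n = iterations;
        Thread a = new Thread(new Runnable() { public void run() { sink = work(n, 1L); } });
        Thread b = new Thread(new Runnable() { public void run() { sink = work(n, 2L); } });
        long start = System.nanoTime();
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return (System.nanoTime() - start) / 1000000L;
    }

    public static void main(String[] args) {
        System.out.println("Single Thread Time = " + singleThreaded(ITERATIONS) + " ms.");
        System.out.println("Dual Thread Time = " + dualThreaded(ITERATIONS) + " ms.");
    }
}
```

Both tests do the same total amount of work (two passes of the loop), so any difference in elapsed time comes from running the passes side by side rather than one after the other.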

Results:

2984.2 MHz Intel Pentium 4 3 GHz (Hyperthreading On):

Java Environment
    Sun Microsystems Inc. Java HotSpot(TM) Client VM 1.5.0_01-b08
Native Environment
    Windows XP 5.1 on x86
    2 CPU(s) detected
Please wait. Running Tests..
Single Threaded Test completed.
Dual Threaded Test completed.

Results
----------------------------------
Single Thread Time = 49241 ms.
Dual Thread Time = 35776 ms.

2984.2 MHz Intel Pentium 4 3 GHz (Hyperthreading Off):

Results
----------------------------------
Single Thread Time = 46101 ms.
Dual Thread Time = 50646 ms.

1668.8 MHz AMD Athlon(tm) XP 2000+:

Java Environment
    Sun Microsystems Inc. Java HotSpot(TM) Client VM 1.5.0_01-b08
Native Environment
    Windows 2000 5.0 on x86
    1 CPU(s) detected
Please wait. Running Tests..
Single Threaded Test completed.
Dual Threaded Test completed.

Results
----------------------------------
Single Thread Time = 54844 ms.
Dual Thread Time = 66937 ms.

As you can see, hyperthreading really does work: with it on, the dual-threaded run was 27% quicker than the single-threaded run. However, we shouldn't just fire off threads everywhere possible, because multithreaded code runs significantly slower than single-threaded code on conventional CPUs (9.9% slower on the P4 with hyperthreading off, and 22% slower on the Athlon).
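For anyone who wants to check the percentages, they follow directly from the timings listed above: the change is the dual-thread time minus the single-thread time, divided by the single-thread time. A small sketch (the class and method names are made up for illustration):

```java
final class SpeedupArithmetic {
    /** Percentage change of the dual-thread time relative to the single-thread time (negative means quicker). */
    static double percentChange(long singleMs, long dualMs) {
        return 100.0 * (dualMs - singleMs) / singleMs;
    }

    public static void main(String[] args) {
        // Timings in milliseconds, copied from the results above.
        System.out.printf("P4, hyperthreading on:  %+.1f%%%n", percentChange(49241, 35776));
        System.out.printf("P4, hyperthreading off: %+.1f%%%n", percentChange(46101, 50646));
        System.out.printf("Athlon XP 2000+:        %+.1f%%%n", percentChange(54844, 66937));
    }
}
```

That works out to roughly 27% quicker with hyperthreading on, just under 10% slower with it off, and about 22% slower on the Athlon.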

Code similar to the following may be a suitable strategy:


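import java.lang.management.*;
import java.util.concurrent.*;

// One worker thread per logical CPU that the JVM reports.
OperatingSystemMXBean osMBean = ManagementFactory.getOperatingSystemMXBean();
int numThreads = osMBean.getAvailableProcessors();
ExecutorService executor = Executors.newFixedThreadPool(numThreads);
while (tasksNotExecuting()) {                // placeholder: true while tasks remain to be queued
   executor.submit(someLongRunningTask());   // placeholder: returns a Runnable or Callable
}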
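A fuller, self-contained version of the fragment above might look like this sketch. The class name and the trivial squaring task are stand-ins for real long-running work, and it uses Runtime.availableProcessors(), which reports the same processor count as the MXBean call, to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PoolSizedToCpus {
    /** Runs taskCount placeholder tasks on a pool with one thread per available processor. */
    static long runTasks(int taskCount) {
        int numThreads = Runtime.getRuntime().availableProcessors();
        ExecutorService executor = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<Long>> futures = new ArrayList<Future<Long>>();
            for (int i = 0; i < taskCount; i++) {
                final long n = i;
                futures.add(executor.submit(new Callable<Long>() {
                    public Long call() {
                        return n * n; // stand-in for a long-running task
                    }
                }));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get(); // blocks until that task has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown(); // let the worker threads exit once the queue is drained
        }
    }

    public static void main(String[] args) {
        System.out.println("Sum of squares 0..9 = " + runTasks(10));
    }
}
```

On a single-CPU machine without hyperthreading the pool ends up with one thread, so the tasks simply run one after another and avoid the slowdown measured above.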

I'd be interested in other people's results with the same tests. The more the better, but I'm especially interested in some more exotic environments – PowerPC5-based servers, multi-CPU machines and G5 Macs.

My program is available as a jar (run using java -jar threadtest.jar under Java 5). I use CPUChk to get the CPU id. Please record your results in the comments, along with the CPU of the machine used to generate them and (if possible) the hyperthreading status (it can usually be enabled and disabled in the BIOS).


Please send 400 Bad Request and don't drop connection

Christopher Baus suggests that HTTP Servers should not send 400 Bad Request but should drop connections instead.

As an HTTP client developer, let me beg people not to do this. Bad requests are fairly rare, but they do still occur occasionally, and each one is a nightmare to debug.

For instance, Dave Johnson ran into a fairly typical problem last week. A particular server company had configured their software so that it blocked any requests from software that had the word “Java” in the user agent. This took some effort to debug, but would have been just about impossible if the connections were being dropped.

Unfortunately, software that intercepts, processes and sometimes modifies requests and responses like this is becoming increasingly common. While it seems like a good idea, and appears to work okay when you browse the website with a common web browser, it often breaks things in non-obvious ways.

The deeper I get into the internet software stack the more amazed I am that anything actually works at all. You'd think that TCP/IP->HTTP->XML/HTML is so common that all the bugs would be ironed out by now – but that isn't true. It is full of edge cases and unexplored scenarios where things just break – or at least no one knows the correct way to do things.

Anyway – please don't go and create ad hoc modifications to the HTTP spec like this suggestion (although it is fine to modify the error message so it doesn't give too much information away).
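To make that concrete, here is a minimal sketch of what a server could write back before closing the connection. The class and method names are hypothetical and the body text is only an example; the point is that a status line plus a deliberately terse message is enough for a client developer to work out what happened.

```java
import java.nio.charset.StandardCharsets;

final class BadRequestResponse {
    /** Builds a minimal HTTP/1.1 400 response whose body is a short, non-revealing message. */
    static String build(String message) {
        String body = message + "\r\n";
        int length = body.getBytes(StandardCharsets.US_ASCII).length;
        return "HTTP/1.1 400 Bad Request\r\n"
             + "Content-Type: text/plain; charset=us-ascii\r\n"
             + "Content-Length: " + length + "\r\n"
             + "Connection: close\r\n" // the server still closes the connection, but only after replying
             + "\r\n"
             + body;
    }

    public static void main(String[] args) {
        System.out.print(build("Bad Request"));
    }
}
```

The client then sees a 400 status it can log and report, rather than a dropped connection that looks like a network fault.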
