Beware the fast follower

Regarding Google To “Out Open” Facebook On November 5 – beware the fast follower.

Facebook has a pretty nice API, but depending on exactly what Google shares it could be possible to build some pretty impressive applications. Imagine knowing how frequently each Gmail contact was emailed… that would make facebook.friends.areFriends look kind of primitive.

Oh, BTW – I was wrong (or more charitably – misguided) about this stuff. Brad was right – making public data portable is the only safe way to go.

Quick & Dirty Server Monitoring

Sometimes it’s difficult to set up Nagios for server monitoring. This is what I do instead.

Firstly, for load monitoring:


#!/bin/bash

FILENAME=< absolute path >/monitoring/logs/load-$(date +%Y%m%d).txt

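# Append a timestamp plus the 1, 5 and 15 minute load averages.
# strftime() and systime() are awk extensions (available in gawk).
awk '{print strftime("%Y/%m/%d %H:%M:%S", systime()), $1, $2, $3}' /proc/loadavg >> "$FILENAME"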

I run it from cron, and then I use another cron script and gnuplot to graph the output.

genloadgraph.sh:



DATE=$1
if [ -z "$DATE" ]; then DATE="$(date +%Y%m%d)"; fi
FILENAME=load-$DATE.txt
cp < absolute path >/monitoring/logs/$FILENAME < absolute path >/monitoring/load.txt
gnuplot < absolute path >/monitoring/loadplot.p
rm < absolute path >/monitoring/load.txt

loadplot.p:


set terminal png large size 800,600
set xdata time
set timefmt "%Y/%m/%d %H:%M:%S"
set title "Load"
set format x "%H:%M:%S"
set out '< absolute path >/monitoring/load.png'
plot "< absolute path >/monitoring/load.txt" using 1:3 title '1 min average' with lines, "< absolute path >/monitoring/load.txt" using 1:4 title '5 min average' with lines, "< absolute path >/monitoring/load.txt" using 1:5 title '15 min average' with lines
set output

Gives a graph like this:

Load Graph

It’s possible to do a similar thing for website monitoring:



#!/bin/bash

FILENAME=< absolute path >/monitoring/logs/nicklothian-$(date +%Y%m%d).txt
(time wget -q --delete-after http://nicklothian.com/blog/) 2>&1 | awk '/real/ {print strftime("%Y/%m/%d %H:%M:%S", systime()), $2}' >> $FILENAME

Preserving privacy while promoting social network portability

Brad Fitzpatrick and David Recordon recently wrote an interesting paper, Thoughts on the Social Graph, which gathered quite a lot of attention. They addressed some themes which I’ve been thinking about for quite a while now, and certainly moved the issue on a lot more than the recent Wired article did.

There’s no doubt that Brad & David know what they are talking about, either. Indeed, if Tim O’Reilly invented Web 2.0, then I think it’s not much of an exaggeration to say that Brad wrote the software which powers it.

However, I think their approach to the social network problem is surprising. In particular, I think it’s odd that the people who invented OpenID are proposing a centralized repository for all social networking data.

I believe there are better approaches. I’ve proposed and built a demonstrator for a system using what must be one of the most underappreciated data structures of all time: the Bloom filter. In short, a Bloom filter is a compact data structure which will remember if it has seen a piece of data previously (with a small chance of false positives), without remembering the data itself. Obviously, this is useful in the social networking context because you can do things like load up all of a user’s contacts and then make the Bloom filter public. That allows other systems to query the filter to see if the user knows another user, without exposing the contact list to privacy leaks.
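To make the idea concrete, here is a minimal sketch of a Bloom filter in bash. It is not the code from the demonstrator: the filter size, the number of hashes, the use of cksum with a salt as a stand-in for independent hash functions, and all of the names (BLOOM_BITS, BLOOM_HASHES, bloom_add, bloom_check) are assumptions made purely for illustration.

```bash
#!/bin/bash
# Minimal Bloom filter sketch (illustrative only, not the demonstrator's code).
# All names and sizes here are hypothetical.

BLOOM_BITS=1024   # number of bit positions in the filter
BLOOM_HASHES=3    # number of hash functions applied to each item
BLOOM=()          # sparse array: BLOOM[pos]=1 means bit pos is set

# Print the bit positions for an item, one per line. Each "hash function"
# is cksum run over the item prefixed with a different salt.
bloom_positions() {
    local item=$1 i sum
    for ((i = 0; i < BLOOM_HASHES; i++)); do
        sum=$(printf '%s:%s' "$i" "$item" | cksum | cut -d' ' -f1)
        echo $((sum % BLOOM_BITS))
    done
}

# Record an item by setting all of its bit positions.
bloom_add() {
    local pos
    for pos in $(bloom_positions "$1"); do
        BLOOM[$pos]=1
    done
}

# Succeeds if the item is possibly in the set (all of its bits are set),
# fails if it is definitely not in the set.
bloom_check() {
    local pos
    for pos in $(bloom_positions "$1"); do
        [[ -n ${BLOOM[$pos]:-} ]] || return 1
    done
    return 0
}

# Example:
#   bloom_add "alice@example.com"
#   bloom_check "alice@example.com" && echo "possibly a contact"
#   bloom_check "carol@example.com" || echo "definitely not a contact"
```

Only the set bits would need to be published, not the contacts themselves. A "yes" from bloom_check can be a false positive, but a "no" is always correct.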

Incidentally, that demonstrator is my first Facebook app. Writing Facebook apps turns out to be pretty nice, although in this case I wrote it in PHP – which is less enjoyable. Have I ever mentioned that I’m not a huge PHP fan? Perhaps that’s partially because I don’t know PHP at all, but it’s just such a goopy language. Mucking around with Ruby (which I don’t know either) makes you go hmmm.. that’s nice. Even in Javascript I find myself going hmm… okay.. not quite what I expected, but it kind of makes sense. Doing the same in PHP just makes you go hmmm… – not in a good way, either.

Open social networks?

At work I’m building a custom vertical social network. It’s interesting work, and so I’ve been following some of the stuff about how social networks need to become “open”.

I had a half-written post about how an “open” social network means such different things to different people that it is pretty much meaningless. Dare says it much better than I could.

I still think someone needs to point out what a crap article the Wired piece “Slap in the Facebook: It’s Time for Social Networks to Open Up” is.

They spend a long time listing different web tools you can use to build some kind of nice looking website, and then miss the “social network” bit of building a social network.

A social network isn’t about a stupid frigging BLOG (yes, I’m quite aware of the irony of saying that on a blog). It’s about the personal interactions and relationships the software enables. Go and listen to some of the Danah Boyd podcasts recorded at the education.au seminars – you’ll note she talks about the social pressures of how to order your friends lists, how bands on MySpace are identity markers and how the “wall” is useful as a publicly witnessed space. There’s nothing in there about blogging or social bookmarking or group calendaring – as useful as those things may be.

So anyway – the social network I’m building is going to be as open as I can make it – but it’s MY definition of open. Specifically, it’s going to make it as easy as possible to use external applications like blogs, and yet still tie them into your identity on the system. That sounds pretty obvious of course, but that doesn’t mean it is wrong.

Hmm.. I seem to be discussing work projects on here a lot more than I used to.. not sure what that means.

Sleep.. glorious sleep

Our boy Alex is 21 months old now. During the first 20 months of his life he slept through the night 10 times, and we were often up for a couple of hours during the night and/or had to get up well before 6:00am. That was pretty tough, but then he learnt how to climb out of his cot.. We had to buy him a bed and suddenly it was taking 2 hours to get him to sleep, and he was still waking up a couple of hours later.

After a week or two of that I gave in and agreed to see the sleep doctor. To my absolute and utter astonishment Alex is now going to sleep without crying and sleeps through the night at least 2 out of every 3 nights. Even better – when he does wake up he goes back to bed himself.

So.. if there are others of you suffering through this.. there is hope!

Guessing is much quicker than debugging

My previous post, I’ve already tried the ‘waving a dead chicken over our servers’ trick, attracted a bit of attention and quite a number of suggestions – thanks to all who contributed. The suggestions seemed to fall into four main categories:

  1. Database tuning.
    • This is a good suggestion, and is something we’ve done a fair bit of. In this case it doesn’t really help because the problem wasn’t performance but stability.
  2. Introduce a caching layer
    • We’d already done this, twice. We initially used an ehcache caching filter to fix some pretty serious performance problems. We later added some OSCache JSP cache tags in some critical areas in some templates (it was the addition of OSCache which caused the performance boost seen in my post on monitoring performance using the Google Webmaster Tools). As it turned out this combination may have been what caused our problem.
  3. Rewrite everything
    • Thanks. Let me know when you get a job in the real world.
  4. Debug the problem
    • This is what I figured we’d have to do. It’s something I was attempting to avoid because the issue seemed to be threading related, and we couldn’t reproduce it anywhere except our production environment.

We did have one stroke of good luck. We were able to predict when the site would stop working by monitoring the number of threads Apache was using, and we could use this information to preemptively restart the site. We were able to modify the restart script to generate stack traces for all the JVM’s threads (kill -SIGQUIT <jvm pid>).
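A rough sketch of that kind of check might look like the following. It is not our actual restart script: the threshold, the pid file, the restart command and the process name httpd are all hypothetical placeholders, and it assumes a Linux ps which can list one line per thread.

```bash
#!/bin/bash
# Sketch of a "restart before it falls over" check. Every value and path
# below is a hypothetical placeholder, not taken from the real setup.

MAX_THREADS=200                         # assumed danger level for Apache threads
JVM_PID_FILE=/path/to/jvm.pid           # hypothetical file holding the JVM's pid
RESTART_CMD=/path/to/restart-site.sh    # hypothetical restart script

# Count Apache threads: ps -eLf prints one line per thread on Linux.
# The process name (httpd here) may be apache2 on some systems.
count_apache_threads() {
    ps -eLf | grep -c '[h]ttpd'
}

# Succeeds when the thread count (first argument) has reached the limit (second).
over_threshold() {
    [ "$1" -ge "$2" ]
}

# Ask the JVM for a thread dump, then restart the site.
dump_and_restart() {
    local jvm_pid=$1
    kill -QUIT "$jvm_pid"   # the JVM prints all thread stacks to its stdout and keeps running
    sleep 5                 # give it a moment to finish writing the dump
    "$RESTART_CMD"
}

# Intended to be called from cron every minute or so.
check_once() {
    local threads
    threads=$(count_apache_threads)
    if over_threshold "$threads" "$MAX_THREADS"; then
        dump_and_restart "$(cat "$JVM_PID_FILE")"
    fi
}
```

The stack traces end up wherever the JVM's standard output is redirected, so that log needs to be kept across restarts for the dump to be any use.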

Since it looked like I’d actually have to start debugging this problem I started looking through the stack traces and I noticed that lots of the threads were in the ehcache filter. Now this wasn’t necessarily a bad thing, since all HTTP requests would be passed through it. However, it did make debugging harder, was easy to remove (just comment it out in the web.xml) and did have some potential to be a source of problems – in particular the cache-invalidation part.

So we took a punt and removed the filter and… it fixed the problem. Yay! I’m a genius and all that.

Except…. now the CMS is crashing with a NullPointerException deep in the data persistence layer. There’s also the small problem that I don’t have a clue why that change fixed it. Using the ehcache filter on its own worked fine, and there is no programmatic interaction between the ehcache and oscache code.

There is an alleged fix for the NullPointerException – but we have to take a point release of the CMS, and then patch it with a service pack to get it. Our previous experiences with upgrades have been less than confidence inspiring.

In the meantime we have a script watching the site and restarting it when it crashes. It’s kind of like failover, without the over bit.
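For the curious, a watchdog like that can be sketched in a few lines of bash. This is an illustration rather than the script we actually run: the URL, log file and restart command are hypothetical placeholders, and it treats any failed wget (timeout, connection refused or HTTP error) as a crash.

```bash
#!/bin/bash
# Watchdog sketch: probe the site and restart it if the probe fails.
# All values below are hypothetical placeholders.

SITE_URL="http://localhost/"            # page to probe
LOG_FILE=/path/to/watchdog.log          # where restarts get recorded
RESTART_CMD=/path/to/restart-site.sh    # hypothetical restart script

# Succeeds if the page can be fetched within 30 seconds.
site_is_up() {
    wget -q --timeout=30 --tries=1 --delete-after "$1"
}

restart_site() {
    "$RESTART_CMD"
}

# Run one check; intended to be called from cron every minute or so.
check_site() {
    if ! site_is_up "$SITE_URL"; then
        echo "$(date '+%Y/%m/%d %H:%M:%S') site down, restarting" >> "$LOG_FILE"
        restart_site
    fi
}
```

Logging each restart with a timestamp at least shows how often the site is falling over.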

“You should just buy Google”

Me to Unix Admin @ work: Hey – so I’m doing estimates for a proposal which needs between 30 and 250 TB of storage – what do you know about mass storage?

Unix Admin: Hmm.. you should just buy Google….

Okay then! But seriously – Amazon S3 seems the obvious solution, and I’m also looking at OmniDrive. Any other suggestions are welcome. No hardware suggestions, though – I don’t have enough confidence in our operations group to do it in house (mainly because we don’t have an operations group…)