Kindergarten, The Digg Effect, Shared Hosting and Traffic Rankings

Apr 8, 2007

How's that for a complicated title?

I posted All I Need To Know To Be A Better Programmer I Learned In Kindergarten last Thursday morning, and watched it do absolutely nothing that day (600 readers or so, way below an average day).

The next day it was added to Reddit and during the day had about 6,000 readers (good, but no record). That night I went to Good Friday services and shared a meal with friends. Before going to bed (late) I checked the stats and noticed 2,000 people had read the article in the last hour. I saw that it had 69 diggs, so I visited Digg to see where it was. It turned out it was #2 on Digg. This was the first time one of my articles went anywhere on Digg (usually Reddit is my main traffic site). In the next two days I had nearly 40,000 readers and over 1,000 diggs.

This site runs on a shared Linux host (some kind of mid-level single CPU Athlon box) at kattare.com, with Apache on the front end passing all traffic to my Jetty server, which runs my Java blog application (BlogFiche) on top of my own web application framework (Fiche) using H2 as the database. This instance of Jetty is also running some other (lower traffic) sites. I wasn't remotely afraid of the "Digg Effect" since I knew exactly how much traffic this platform was designed for. I wouldn't get worried until traffic hit 5 per second or so sustained (that's 18K+ per hour). The max I saw was probably around 3K per hour, so the system was still mostly idle.
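The arithmetic behind those thresholds is easy to check. The snippet below is only an illustrative sketch: the class and method names (TrafficMath, perHour, percentOfCapacity) are made up for this example and are not part of BlogFiche or Fiche.

```java
// Hypothetical helper for illustration only; not part of BlogFiche or Fiche.
final class TrafficMath {
    static final int SECONDS_PER_HOUR = 3_600;

    // Convert a sustained per-second rate into requests per hour.
    static long perHour(double perSecond) {
        return Math.round(perSecond * SECONDS_PER_HOUR);
    }

    // How much of the "start worrying" capacity an observed hourly load uses, in percent.
    static long percentOfCapacity(long observedPerHour, long capacityPerHour) {
        return Math.round(100.0 * observedPerHour / capacityPerHour);
    }

    public static void main(String[] args) {
        long worryThreshold = perHour(5.0);   // 5 per second sustained
        System.out.println(worryThreshold);   // 18000
        // Roughly 3K per hour observed against an 18K per hour threshold.
        System.out.println(percentOfCapacity(3_000, worryThreshold)); // 17
    }
}
```

At the observed peak the box was running at about a sixth of the rate at which worrying would begin, which matches the "mostly idle" description.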

My site caches all generated page content and only regenerates it if something changes (I edit the site or a comment is posted). All static content is served via the Jetty default servlet, which also does caching. Jetty itself uses Java NIO (non-blocking IO), so it's extremely efficient at serving web content. So in effect the only thing happening on 99.9% of all connections is IO. Digged? No problem.
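The "render once, regenerate only on change" idea can be sketched in a few lines. This is a minimal illustration in modern Java, not the actual BlogFiche code: the PageCache class, its renderer function, and the "/kindergarten" path are all assumptions invented for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of a generated-page cache; not the real BlogFiche implementation.
final class PageCache {
    private final ConcurrentMap<String, String> pages = new ConcurrentHashMap<>();
    private final Function<String, String> renderer; // the expensive page generator

    PageCache(Function<String, String> renderer) {
        this.renderer = renderer;
    }

    // Serve the cached page, rendering it only if no cached copy exists.
    String get(String path) {
        return pages.computeIfAbsent(path, renderer);
    }

    // Call when an article is edited or a comment is posted on that page.
    void invalidate(String path) {
        pages.remove(path);
    }

    public static void main(String[] args) {
        AtomicInteger renders = new AtomicInteger();
        PageCache cache = new PageCache(
                path -> "<html>" + path + " v" + renders.incrementAndGet() + "</html>");

        cache.get("/kindergarten");
        cache.get("/kindergarten");          // served from cache, no second render
        System.out.println(renders.get());   // 1

        cache.invalidate("/kindergarten");   // e.g. a new comment arrived
        cache.get("/kindergarten");
        System.out.println(renders.get());   // 2
    }
}
```

With a cache like this, a traffic spike of repeat reads costs one render per change rather than one per visitor, which is why almost every connection ends up being plain IO.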

Of course I wouldn't advertise myself as knowing and solving performance issues if my own site performance sucked. Seeing blogs get digged and then die is a little irritating to people; I can understand a graphics or video site getting swamped, but a simple blog shouldn't have any trouble with 1 connection per second (3,600 per hour).

The funny thing about the traffic is the difference between Google and Alexa. Even though I had 40,000 readers for this one article, my Alexa ranking dropped a lot over the same timeframe. Google's index is filled with references to (and outright copies of!) the article, but Alexa shows nothing happened at all. Maybe fewer people are using the Alexa toolbar these days.

I tend to look at my adgridwork account for quick updates on my traffic. Although I am collecting some stats, I have been too lazy to update the admin portion of this blog so I can see (and show) actual counts on the articles.

It was pretty cool (and humbling) that so many people read the article from all over the world. The web is an amazing place.