The SimCity Debacle From a Different Direction

March 11, 2013

The SimCity release debacle, in which people who bought the game are generally unable to play it, has already been hashed to death on the subject of the always-on DRM scheme. But the mystery I find most interesting, and which has been mostly ignored in the discussions (or more exactly, the yelling), is this: why has EA been unable to scale its servers up?

I've been involved in enough large systems to know that architecting for scale isn't easy, but it isn't rocket science either. It obviously depends on the size of the userbase, what the users are going to do, how efficient your systems are, and what you are willing to pay ahead of time. A company like EA has years of experience releasing games with enormous early demand, which should have made this fairly easy to predict. EA has gobs of money to spend on hardware, and an enormous staff that can devote time to understanding gameplay usage and the bandwidth requirements of the servers, systems, databases and the network itself. Surely tons of testing should have happened long before hardware and licenses were ordered.

Yet EA totally screwed the pooch. How is this even possible?

I can understand a small startup failing to deal with scale; the internet is rife with stories of how desperate people were when success happened way too fast. But most of those stories are from people who failed to realize what a good thing they were offering, and who simply didn't have enough money to plan ahead. EA should have had none of these issues.

A company with the size and experience of an EA should have had instrumentation on everything. They should have had a massive "big data" warehouse full of every bit of data that testing produced. Nothing should have happened during the long testing phase (both internal and external) that didn't add to the data. Each player of the game does only a few things at a time and the simulation engine turns over the same kind of data at a similar rate continuously. All of this can easily be determined over time.
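To make "instrumentation on everything" a little more concrete, here is a minimal sketch in Python of counting and timing named operations during testing. It is purely illustrative: the Metrics class, the save_city function and the operation name are all hypothetical, and I have no idea what EA's servers actually look like.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Metrics:
    """Hypothetical in-memory collector: call counts and total time per operation."""

    def __init__(self):
        self.counts = defaultdict(int)
        self.total_seconds = defaultdict(float)

    @contextmanager
    def timed(self, name):
        # Record one call and its wall-clock duration, even if the body raises.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.counts[name] += 1
            self.total_seconds[name] += time.perf_counter() - start

    def mean_seconds(self, name):
        # Average duration per call; 0.0 if the operation was never seen.
        calls = self.counts.get(name, 0)
        return self.total_seconds[name] / calls if calls else 0.0


metrics = Metrics()


def save_city(city_blob):
    # Stand-in for a real server operation (assumption for illustration only).
    with metrics.timed("save_city"):
        return len(city_blob)


for _ in range(3):
    save_city(b"pretend this is a serialized city")
```

After a long test phase, numbers like metrics.counts["save_city"] and metrics.mean_seconds("save_city") are exactly the kind of per-operation data you would want sitting in the warehouse before anyone orders hardware.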

Similarly, the number of customers who would buy the game in the first week should not have been a surprise. People love this game, and all the heavy marketing leading up to release was clearly going to stir up huge demand. I'm no marketing expert, but given how many big-time games have shipped in the last decade and what their numbers were, the demand shouldn't have been much of a mystery.

So if you can guess the demand, you know your systems' performance, you understand every bit that the game generates, and you have enormous money to invest to ensure everything works correctly, how do you screw this up so badly?
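The arithmetic involved is not exotic. As a back-of-the-envelope sketch in Python, assuming you have a sales forecast, a measured peak-concurrency ratio and a measured per-server player capacity, it looks something like this. Every number below is invented for illustration and none of them are EA's; the function and its parameters are hypothetical too.

```python
import math


def servers_needed(first_week_buyers, peak_concurrency_ratio,
                   players_per_server, headroom=1.5):
    """Rough launch-capacity estimate from forecast demand and measured capacity."""
    if players_per_server <= 0:
        raise ValueError("players_per_server must be positive")
    # Forecast of how many players are online at the same moment at peak.
    peak_players = first_week_buyers * peak_concurrency_ratio
    # Pad the forecast so that being wrong means idle servers, not locked-out players.
    return math.ceil(peak_players * headroom / players_per_server)


# Made-up inputs: 1,000,000 buyers, 25% online at peak,
# 2,000 players per server (from load testing), 50% headroom.
estimate = servers_needed(1_000_000, 0.25, 2_000, headroom=1.5)
print(estimate)  # prints 188
```

The hard part is not the formula but having trustworthy inputs, which is why the testing data matters so much.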

I know nothing of EA other than what I read, but the only answer I can think of is executive idiocy. Maybe someone somewhere rejected the architects' and engineers' plans and decided, based on something other than data, to spend less money. I suppose the architects and engineers might have been totally clueless, or there was no attempt at tracking performance and data usage. That would also constitute executive cluelessness, in hiring people with a similar lack of clue. But given the amazing game that was developed, I can't imagine the engineering was so utterly inept.

In my several decades of experience I've seen a lot of leadership that negated whatever brilliance the actual engineering team had, and in most cases where systems had terrible problems the leadership was to blame. Engineers can figure out almost anything, but they can be constrained so much that there is nothing they can do to make things work right. Having incompetent engineers can happen too, but it's far less likely.

During my brief stint at Apple, I saw both poor leadership and terrible engineering at work during the Copland disaster in the mid-'90s. The idea was bad, the leaders didn't have a clue, and a horde of unmanaged contractors wrote terrible code, leaving the good engineers that Apple had even then without any way to fix things. But Apple itself was a disaster and nearing death at the time; EA is a big, successful game company doing just another game.

I have no idea what will happen now, other than that there will be lots of layoffs; that is common practice in the big game industry anyway. Will anyone learn from this? Probably not, since this isn't the first time such stupidity has happened. Of course most games don't require you to be online to even play by yourself. If SimCity allowed offline play, the outcry would at least have been muted.

Basically, if you are going to require always-online play (like the MMO I worked on), you had better make damned sure you know exactly how much server capacity you need and make it available on day one. Better to plan for too much than to have too little and hope to ramp up over time.

If nothing else, if you plan on releasing a big complicated system to a desperate public, spend the time and money up front to instrument everything from day one. The more you know, the less you get surprised.