Anatomy Of A Successful Project #1, Fuzzy Vehicle Search Engine

February 18, 2007

I write a lot of posts on software project disasters and stupidity, which can be both instructive and entertaining. It's time to begin a series on projects from the other side of the coin: ones that actually succeeded.

The first post in this series covers the development of the Consumer's Digest Online website in the late 1990's. Despite the success of the project itself, don't bother looking for the site: CD Online failed to pay their bills for the development and hosting of the site in 2001, and it was killed. The magazine still exists but has never built a replacement website; as far as I know, the owner of the debt may still have a valid claim. So in a way the project both succeeded and failed.

The company I worked for, Tensor Information Systems (a consulting firm which lasted quite a while but ultimately died in 2001), received a contract from CD Online to build an extremely challenging website and product search engine. The core idea was to give customers on the web the ability to search for products based on a wide-ranging set of criteria, with the major feature being the ability to find vehicles that could be configured to meet those criteria. The key to the vehicle search would be to allow the user to specify features, and then have the search engine attempt to find cars that could be configured to match, based on the auto makers' own rules for packages and features. This had never been done (even by the auto makers themselves) and, as far as I know, hasn't been repeated since. This feature had scared off every firm that CD had approached (one had actually tried it earlier and gave up after a year of effort), but our firm was aggressive (and possibly foolhardy as well). The magazine had announced in print that the site would be available in November around the time that we accepted the project, which gave us a six-month development window. Unfortunately contract negotiations whittled that down to an almost impossible three months before coding could actually start.

Since our company had no actual experience with search engines we hired a contractor who supposedly had some experience while the rest of the team worked on the remaining site features. The technology we used at the time was Apple/NEXT's WebObjects written in Objective-C. We would be hosting the site ourselves on HP/UX and using Oracle as a database. CD Online would be responsible for entering all the data from their product databases in addition to all the data the vehicle manufacturers provided (model information, feature/package rules, pricing, etc) once we had built a data entry tool.

I wasn't a part of the team to start with.

The contractor we hired wasn't up to designing the core search database, so one of our senior people was dumped into the project and had to design a database on the fly so that development could continue. The core problem was that the rules determining which features and packages were combinable for a vehicle were complex and not easily coded into a relational database. Basically, features (like leather steering wheels, heavy-duty suspensions, engines, etc.) are combined into packages (collections of features) that can be combined according to rules (package X requires packages Y and Z and optionally G or E, but disallows packages Q and T). The rules were inheritable, so if you had package X, which required package Z, you would also require whatever package Z required. Some vehicles had as many as 10,000,000 different potential package combinations. The rules existed for both the obvious (you can't have two engines) and the sublime (you could only have leather seats with sport mirrors).
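To make the rule model concrete, here is a minimal sketch in C of the requires/disallows logic with inherited requirements. All names and the data layout are my own illustration (the real schema isn't documented here), and it skips the "optionally G or E" groups entirely:

```c
#include <stdbool.h>

#define MAX_PKGS 8

/* Hypothetical rule record: each package lists the packages it requires
 * and the packages it disallows, by index into one array. */
typedef struct {
    const char *name;
    int reqs[MAX_PKGS]; int n_req;   /* required package indices */
    int excl[MAX_PKGS]; int n_excl;  /* disallowed package indices */
} Package;

/* Mark pkg and, transitively, everything it requires (the inheritance
 * rule: requiring Z also requires whatever Z requires). */
static void mark_required(const Package *pkgs, int pkg, bool *need) {
    if (need[pkg]) return;
    need[pkg] = true;
    for (int i = 0; i < pkgs[pkg].n_req; i++)
        mark_required(pkgs, pkgs[pkg].reqs[i], need);
}

/* A configuration is valid if the transitive closure of the chosen
 * packages contains nothing that any package in it disallows. */
bool config_valid(const Package *pkgs, int n, const bool *chosen) {
    bool need[MAX_PKGS] = {false};
    for (int p = 0; p < n; p++)
        if (chosen[p]) mark_required(pkgs, p, need);
    for (int p = 0; p < n; p++) {
        if (!need[p]) continue;
        for (int i = 0; i < pkgs[p].n_excl; i++)
            if (need[pkgs[p].excl[i]]) return false;
    }
    return true;
}

/* Toy data: X requires Y, Y requires Z, X disallows Q. Picking X
 * therefore pulls in Y and Z, and picking X with Q is invalid. */
Package demo[4] = {
    {"X", {1}, 1, {3}, 1},
    {"Y", {2}, 1, {0}, 0},
    {"Z", {0}, 0, {0}, 0},
    {"Q", {0}, 0, {0}, 0},
};
```

Even in this toy form you can see why a relational encoding hurts: checking one candidate configuration is a recursive closure, not a join, and a real vehicle multiplies that across thousands of candidates.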

The database would ultimately hold tens of thousands of features, products, packages and their rules.

Building a search engine for such a fuzzy set of rules coded in a relational database seemed impossible (to all the other firms who had turned the project down or failed to build it). The contractor we had hired to build this core technology delivered some working code halfway through the short schedule. It did a preliminary search of the database and then sorted through the results in memory, which worked well in his development database, which had only a few items in it. Once it was run on the growing production database (data was still being entered) it became unusable. Running on production hardware it took an average of 70 seconds to return a single set of results, and the results were often completely wrong. Now the project was six weeks away from going public and the core (and star) feature was unusable. Bad karma. He was fired.

Our resident DBA (not really a designer) and the architect (who had designed the database in a hurry) now tried to optimize the SQL query at the core of the problem, which when printed out was five pages long and appeared to be the main culprit. No matter what they tried, nothing really changed. It was clear that a redesign of the database might help, but there was no time since so much of the code (and the data entry) was already complete. The hardware was already purchased and set up, and adding 10x the hardware was not affordable for CD Online (and in any case would not have dropped the response time to an acceptable level anyway).

I shared an office with the architect and overheard the wailing and gnashing every day (I was in charge of a project for the post office), and offered to help around four weeks before the deadline. I needed a way to solve the problem without changing the code, the database design, or the hardware, and to do it in only four weeks. It took some mighty convincing arguments before I was even allowed to investigate the idea (they continued to try SQL optimization). My guess was that even though the number of combinations appeared almost infinite, there might be a way to "predigest" the data so that it could be searched entirely in memory, bypassing Oracle completely.

My first plan was to read the requisite data from the database and then run some statistics on it. Basically the rules data resulted in a tree of possibilities (and/or/not) for each vehicle model (non-vehicles were also in the database but rarely had configuration rules). With hundreds of models and about a dozen manufacturers the trees appeared infinite, but I found that if I could winnow them down to sets of recurring subtrees, the tree of trees could be represented fully in memory in around 20MB. The data was sorted by price (so as you went down a tree path you knew the price of the vehicle based on the current configuration) and only sufficient data was kept to run the search; at display time Oracle was consulted to fetch the actual displayable detail.
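The winnowing idea can be sketched as hash-consing: when building a node, check whether a structurally identical node already exists and reuse it, so every recurring subtree is stored exactly once. This is only my guess at the flavor of the technique (the real predigested format isn't described), with illustrative names and a naive linear lookup:

```c
#include <stddef.h>

#define MAX_NODES 1024

/* Hypothetical and/or/not tree node; 'L' marks a leaf feature. */
typedef struct Node {
    char op;                    /* '&', '|', '!' or 'L' */
    int leaf;                   /* feature id when op == 'L' */
    const struct Node *l, *r;
} Node;

static Node pool[MAX_NODES];
int n_nodes = 0;

/* Return a canonical node: if an identical (op, leaf, l, r) node already
 * exists in the pool, reuse it, so equal subtrees are one object. */
const Node *intern(char op, int leaf, const Node *l, const Node *r) {
    for (int i = 0; i < n_nodes; i++)
        if (pool[i].op == op && pool[i].leaf == leaf &&
            pool[i].l == l && pool[i].r == r)
            return &pool[i];
    if (n_nodes >= MAX_NODES) return NULL;  /* pool exhausted */
    pool[n_nodes] = (Node){op, leaf, l, r};
    return &pool[n_nodes++];
}
```

Building the same "feature 7 AND feature 9" subtree twice yields the same pointer both times and only three pooled nodes, which is how hundreds of near-identical model trees could collapse into something like 20MB.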

The key here was to extract the data from Oracle weekly into a set of static files which were digested into a form that could be read into memory when the application launched (actually into each instance). Then I wrote the actual search engine in C, traversing the trees based on the existing code that collected the user's requests. Another complicating factor was that the HP/UX memory allocator was horrible at deleting objects (I had written a commercial memory allocator library and knew all the tricks), so I created a suballocator for my own use and pre-allocated enough space at launch for both the data and the searches to avoid the HP/UX overhead. After the search ran I would grab the display data from Oracle and pass the result back to the existing code, which displayed it to the user.
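A suballocator of that kind is often a bump (arena) allocator: grab one big region from the system at launch, hand out pieces by advancing a pointer, and "free" an entire search with a single reset so the slow system-level delete path is never hit per object. The original suballocator's interface isn't documented, so this is just a minimal sketch of the pattern with made-up names:

```c
#include <stdlib.h>
#include <stddef.h>

/* One region obtained up front; per-search allocations bump a pointer. */
typedef struct {
    unsigned char *base;
    size_t cap, used;
} Arena;

/* Reserve the whole region once at launch. Returns 0 on failure. */
int arena_init(Arena *a, size_t cap) {
    a->base = malloc(cap);
    a->cap = cap;
    a->used = 0;
    return a->base != NULL;
}

/* Hand out n bytes (8-byte aligned) or NULL if the region is full. */
void *arena_alloc(Arena *a, size_t n) {
    n = (n + 7) & ~(size_t)7;
    if (a->used + n > a->cap) return NULL;
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

/* Free every object from the last search in one O(1) step. */
void arena_reset(Arena *a) { a->used = 0; }
```

The design trade-off is that nothing can be freed individually, which is fine here: a search allocates scratch objects, returns its results, and the whole batch dies together.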

All of this happened in four long weeks of work. It worked, returning searches in less than one second on any set of criteria (which were extensive), and no code changes were necessary in the rest of the application, so we were able to launch CD Online around the original magazine promise date. The website was an immediate hit and would serve around one to two thousand simultaneous users on most days.

The most amazing thing to me (other than that this worked at all!) was that no bugs were ever found in the runtime search engine during its lifetime. Later on, to increase the number of instances available to serve the site, someone else built a shared-memory version of the data so only one copy would load (it was static); this allowed us to double the instances without adding more RAM.

Sadly CD Online ran up a bill of around $800,000 before Tensor pulled the plug. This killed CD Online and ultimately contributed to Tensor's downfall as well. Many vehicle search engines have appeared since, but no one has attempted anything like ours again (I could be wrong).

The lesson to be learned from this project is that sometimes an impossible problem can be solved if you only look at it differently. Another lesson I learned is that fuzzy data and relational databases are a nasty combination, and that often RAM is your friend. Today I would probably look at a distributed solution (think Google), but in 1999, facing the limits of unchangeable code, database, and hardware, this turned out to be the only successful solution.

Next time: Sabre saves a bundle.