
Amazon's Commerce API (Not For Me)
Mar 27, 2007 09:35

I wanted to build a simpler storefront for certain Amazon stores using their REST api, but it turned out to be way too limiting.

Between the limit of 1 call per second per IP, an API designed only for limited and predictable uses, frustrating documentation, and really odd results, it was too much of a mess for what I wanted to do. As long as your needs are modest and you can live with the limitations it's OK, but they really don't want you to go too far.

One of the biggest problems from my perspective is that while all the information is available, you can only access it via the API. This is a serious limitation if what you want is a list of products. Even a count of products in a particular catalog node is difficult to get. Their product lists are stored in separate search indexes, which you have to name explicitly in the request parameters. Often it's not obvious which index goes with which node (and sometimes more than one applies). For example, the base node in Books combined with the Books index returns nothing. Just for fun I called each root catalog node (the main storefront page for a category, like DVDs or Books) with each search index and got all sorts of random results, many of which made no sense. With so many clearly wrong numbers (like groceries in tools), it was sometimes impossible to tell which index went with which category. I never did figure out where the books were hidden.
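The node-by-index sweep described above can be sketched roughly as follows. This is only an illustration, not the code I ran: `count_items` is a hypothetical stand-in for whatever function makes the REST call and returns the total reported for a node and index, and the node identifiers are made-up placeholders.

```python
from itertools import product
from typing import Callable, Dict, Iterable, Tuple


def sweep_indexes(
    root_nodes: Iterable[str],
    search_indexes: Iterable[str],
    count_items: Callable[[str, str], int],
) -> Dict[Tuple[str, str], int]:
    """Try every (node, index) pair and keep the pairs that return items."""
    hits: Dict[Tuple[str, str], int] = {}
    for node, index in product(root_nodes, search_indexes):
        total = count_items(node, index)  # one API call per pair
        if total > 0:
            hits[(node, index)] = total
    return hits


# Hypothetical lookup used only so the sketch runs without the network.
# The node ids and counts are invented; they are not real Amazon data.
def fake_count(node: str, index: str) -> int:
    sample = {("node-dvd", "DVD"): 1200, ("node-tools", "Grocery"): 7}
    return sample.get((node, index), 0)


result = sweep_indexes(
    ["node-books", "node-dvd", "node-tools"],
    ["Books", "DVD", "Grocery"],
    fake_count,
)
for (node, index), total in sorted(result.items()):
    print(node, index, total)
```

In the made-up data the Books node matches nothing and the tools node only matches the grocery index, which mirrors the kind of result that made the real sweep so hard to interpret.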

I used the REST API, which was at least sane to call. The Java-based SDK is built on Apache Axis and calls the SOAP API. After about 5 minutes of looking at its complexity I tossed it out completely.

Yesterday in talking with the iTunes folks I discovered that Linkshare has about 180 merchants which provide product data on a daily/weekly basis, all available via FTP in two formats (XML and delimited). This is way easier to work with (Amazon of course is not one of them) so I think my original idea will now be based on these (particularly iTunes but others are interesting as well).

Amazon's commerce api makes sense if you have a few things you want to sell, and you can grab the data on the fly (they don't allow you much caching ability at all). If you want to build a site around a whole list of products (combined with other information) or even build a price comparison site (depending on the licenses) it makes far more sense to work with a list and build an api around your own database.
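As a rough sketch of the "work with a list and build on your own database" approach, the snippet below loads a delimited product feed into SQLite and queries it locally. The pipe delimiter, the column names (`sku`, `name`, `price`) and the sample rows are all assumptions made for illustration; a real merchant feed will have its own layout.

```python
import csv
import io
import sqlite3

# Invented sample feed; real feeds will use different columns and values.
SAMPLE_FEED = (
    "sku|name|price\n"
    "A100|Example Album|9.99\n"
    "A200|Example Single|0.99\n"
)


def load_feed(conn, feed_file, delimiter="|"):
    """Load a delimited feed into a local products table; return row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products "
        "(sku TEXT PRIMARY KEY, name TEXT, price REAL)"
    )
    reader = csv.DictReader(feed_file, delimiter=delimiter)
    rows = [(r["sku"], r["name"], float(r["price"])) for r in reader]
    # INSERT OR REPLACE lets a daily feed refresh rows that already exist.
    conn.executemany("INSERT OR REPLACE INTO products VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def cheaper_than(conn, limit):
    """Query the local copy: no remote call, no rate limit."""
    cur = conn.execute(
        "SELECT sku, name, price FROM products WHERE price < ? ORDER BY price",
        (limit,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
loaded = load_feed(conn, io.StringIO(SAMPLE_FEED))
cheap = cheaper_than(conn, 5.0)
print(loaded, cheap)
```

Once the data is local, any search or comparison logic runs against your own tables, and the feed just needs reloading whenever the merchant publishes a new file.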


Building a Better Amazon.com Experience
Mar 15, 2007 10:10

I admit I love shopping at Amazon: they sell virtually everything, and you can order it online and get it quickly. I also admit I hate the amazon.com website. There is simply too much crap on every page, finding stuff can be tedious, and you can easily get overwhelmed with too much information. A sample page I clocked with Firebug had 43 images and took 330K of network bandwidth to display.

So I decided to start building an alternative interface. With Amazon's web interfaces you can pull down almost everything they know about everything they sell and then some. There is a big catch though: you can only call them once per second per IP (more or less), to keep from overwhelming their servers. Of course you can set up a server farm with dozens of IPs, each accessing them only once per second, so I'm not sure how meaningful the limit is. You could even use Amazon's EC2 service to set up the farm.
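Staying under a once-per-second limit from a single machine mostly comes down to pausing between calls. Here is a minimal sketch of such a throttle; the class name and the injectable `clock` and `sleep` parameters are my own choices for illustration (they make it testable without really waiting), not anything Amazon provides.

```python
import time


class Throttle:
    """Block so that successive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Typical use (fetch is a hypothetical function that makes one API call):
#
#     throttle = Throttle(min_interval=1.0)
#     for url in urls:
#         throttle.wait()
#         fetch(url)
```

The first call goes through immediately; every later call sleeps only for whatever is left of the one-second window, so time spent processing a response counts toward the wait.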

So if you have only one IP (like I have at home) you can make about 86,400 API calls (in my case via REST) per day. I am currently getting the entire catalog tree, and then will be getting the entire product list for each node in the tree. They only return 10 items per call, so the most I can get is 864,000 or so products per day. I imagine I will have to choose only certain stores to keep this reasonable and be able to keep it fresh. If this idea gets traction I can set up a farm and make this faster.
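The arithmetic behind those daily figures is simple enough to write down. The sketch below works out the exact numbers for one call per second and 10 items per call; the 10 million item catalog in the last line is purely a hypothetical size for illustration, since I don't know how many products there really are.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def calls_per_day(calls_per_second=1.0):
    """Maximum calls in a day at a steady rate from one IP."""
    return int(SECONDS_PER_DAY * calls_per_second)


def items_per_day(items_per_call=10, calls_per_second=1.0):
    """Maximum items fetched in a day if every call returns a full page."""
    return calls_per_day(calls_per_second) * items_per_call


def days_to_fetch(total_items, items_per_call=10, calls_per_second=1.0):
    """Whole days needed to page through total_items at that rate."""
    return math.ceil(total_items / items_per_day(items_per_call, calls_per_second))


print(calls_per_day())            # 86400
print(items_per_day())            # 864000
print(days_to_fetch(10_000_000))  # 12, for a hypothetical 10 million items
```

This is a best case: it assumes every call returns a full page of 10 and that nothing is spent on walking the catalog tree, retries, or refreshing stale data.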

I wish Amazon had a dump API like Wikipedia's.

You might wonder why I don't do what other Amazon web services users do, and only grab information via the APIs as needed. The main reason is to build a different kind of search technology than what Amazon natively does, and for that I need the whole data for each item. Their search is extensive, but like so many search APIs you get way too many useless results, and unless you ask a precise question you may miss many possible matches. I call these (highly untechnically) "useless hits and missing bits". More than 10 years ago I came up with a different way to find stuff in a large collection but never pursued it, and this seems like an interesting application and test case. After the work I did on the Consumer's Digest website (documented in a post) in 1998 I've always wanted to get back into search, if only part time.

Building a better way to browse and find items in Amazon is not easy, but the main issues are not technical (other than overcoming the API limitations) but user experience issues. I have been working on UI (to use the old terminology) designs for most of my career, so it's not a stretch. Can I build something people would use (or for that matter Amazon itself could use) as an alternative? Part of my desire is to blend (mashup) an alternative experience and other sources of information like Wikipedia and web searches into the mix, assuming I can add value without complicating the design and winding up back where amazon.com is today.

The beauty of the web today is that there are so many opportunities to explore given the wealth of public APIs, sources of information, and affiliate programs that provide raw materials for new ideas. Combine that with a lot of new languages, frameworks and technologies to build with and it's no wonder you read every day of something new being released. Of course in a Darwinian environment a lot of stuff winds up in the deadpool, but that's the beauty of experimentation. There was a time when eBay, Amazon and Google were startups with an uncertain future.

So while I work on starting up my consulting business (coming soon), I will be spending some time working on the "Amazon Experiment".

At the current rate of 1 call per second, I should be ready with data sometime next century so there's plenty of time.


Copyright © 2007 By Andrew Wulf