Sunday, November 27, 2005

CFMX and Amazon web services, pt 2

Now that we've done a cfdump of the xml results packet, you can see just how much data there is for every product!

Since we're doing an ItemSearch, the root of the data packet is called 'ItemSearchResponse'. The first row gives data about the request itself rather than the results. Its xmlattribute tells which version of CommerceService is being used. The HTTPRequest identifies what programming language is being used to access the web service (ASP.NET, ColdFusion, etc.). The next section, RequestID, is an ID that Amazon assigns to the request. The Arguments attributes display the user-specified search criteria and item page number. This information, while informative, is not of any immediate use and can be disregarded.

Now, to the heart of the matter.

The next data row, Items, contains all the search results data. Amazon has a built-in limit of 10 items per data page. However, the dump shows thirteen entries under Items, because three of them are not products.
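If you want to check the count yourself, here is a minimal sketch. The cfxml block is a made-up, heavily trimmed stand-in for the real packet (placeholder values only); in practice tmp would be the parsed response from the previous article.

```cfml
<!--- Hypothetical stand-in for the real response; values are placeholders --->
<cfxml variable="tmp">
<ItemSearchResponse>
  <Items>
    <Request/>
    <TotalResults>2</TotalResults>
    <TotalPages>1</TotalPages>
    <Item><ASIN>PLACEHOLDER1</ASIN></Item>
    <Item><ASIN>PLACEHOLDER2</ASIN></Item>
  </Items>
</ItemSearchResponse>
</cfxml>

<!--- Every child element of Items, whether it is a product or not --->
<cfset allChildren = ArrayLen(tmp.ItemSearchResponse.Items.XmlChildren)>

<!--- Only the children named Item, i.e. the actual products --->
<cfset productCount = ArrayLen(tmp.ItemSearchResponse.Items.Item)>

<cfoutput>#allChildren# children under Items, #productCount# of them products</cfoutput>
```

Against the real packet described above, the two numbers would be thirteen and ten.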

The first three entries under Items are Request, TotalResults and TotalPages, which describe the result set rather than a product. TotalResults and TotalPages are useful for displaying the total number of search results or setting up record paging.
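As a rough sketch of the paging idea, the two totals can be pulled out and used to decide whether a "Next" link is needed. The sample XML, the url.page parameter and the link format are my own assumptions for illustration, not anything Amazon requires.

```cfml
<!--- Hypothetical stand-in for the real response; the numbers are made up --->
<cfxml variable="tmp">
<ItemSearchResponse>
  <Items>
    <TotalResults>25</TotalResults>
    <TotalPages>3</TotalPages>
  </Items>
</ItemSearchResponse>
</cfxml>

<!--- XmlText is a string, so Val() turns it into a number --->
<cfset totalResults = Val(tmp.ItemSearchResponse.Items.TotalResults.XmlText)>
<cfset totalPages = Val(tmp.ItemSearchResponse.Items.TotalPages.XmlText)>

<!--- Assumed convention: the current page arrives as ?page=N --->
<cfparam name="url.page" default="1">
<cfset currentPage = Val(url.page)>
<cfset nextPage = currentPage + 1>

<cfoutput>
Page #currentPage# of #totalPages# (#totalResults# results)
<cfif currentPage LT totalPages>
  <a href="?page=#nextPage#">Next</a>
</cfif>
</cfoutput>
```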

After that, things get a little trickier. Each product has several levels of data relating to its ASIN, product images, product pricing, URLs to the corresponding Amazon.com pages and more. Extracting the data is fairly straightforward but cumbersome. For example, to find the product's ASIN, you would use the following:

<cfset ASIN = tmp.ItemSearchResponse.Items[1].Item[1].ASIN.xmltext>

To view the product price, you would use the following:

<cfset Price = tmp.ItemSearchResponse.Items[1].Item[1].ItemAttributes.ListPrice.FormattedPrice.xmlText>
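The same two lookups can be wrapped in a loop to walk every product on the page. This is only a sketch: the sample XML is a made-up stand-in, and the StructKeyExists check reflects my assumption that some products may come back without a ListPrice.

```cfml
<!--- Hypothetical stand-in for the real response; values are placeholders --->
<cfxml variable="tmp">
<ItemSearchResponse>
  <Items>
    <Item>
      <ASIN>PLACEHOLDER1</ASIN>
      <ItemAttributes>
        <ListPrice><FormattedPrice>$0.00</FormattedPrice></ListPrice>
      </ItemAttributes>
    </Item>
    <Item>
      <ASIN>PLACEHOLDER2</ASIN>
      <ItemAttributes/>
    </Item>
  </Items>
</ItemSearchResponse>
</cfxml>

<!--- Number of Item elements on this page of results --->
<cfset itemCount = ArrayLen(tmp.ItemSearchResponse.Items.Item)>

<cfoutput>
<cfloop from="1" to="#itemCount#" index="i">
  <cfset item = tmp.ItemSearchResponse.Items.Item[i]>
  <!--- Fall back to "n/a" when the product has no ListPrice element --->
  <cfset price = "n/a">
  <cfif StructKeyExists(item.ItemAttributes, "ListPrice")>
    <cfset price = item.ItemAttributes.ListPrice.FormattedPrice.XmlText>
  </cfif>
  #item.ASIN.XmlText#: #price#<br>
</cfloop>
</cfoutput>
```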


In the previous article's example, the search is for Books with a Keyword of ColdFusion. The amount of data for one book is staggering. There are over forty different xmlattributes and xmltext fields to sift through. Here's a sample of the fields returned for the ColdFusion MX 7 WACK:

ASIN
DetailPageURL
SalesRank
Small Image (has xmlattributes for image URL, height and width)
Medium Image (has xmlattributes for image URL, height and width)
Large Image (has xmlattributes for image URL, height and width)
Category
Category Small Image (has xmlattributes for image URL, height and width)
Category Medium Image (has xmlattributes for image URL, height and width)
Category Large Image (has xmlattributes for image URL, height and width)
Author (will appear multiple times if there is more than one author)
EAN
Edition
ISBN
Amount
CurrencyCode
FormattedPrice
NumberofItems
NumberofPages
PackageDimensions (has xml attributes for height, weight and width)
PublicationDate
Publisher
Title
OfferSummary
LowestNewPrice
LowestNewFormattedPrice
LowestUsedPrice
LowestUsedFormattedPrice
CurrencyCode
TotalNew
TotalUsed
TotalCollectible
TotalRefurbished
EditorialReview (xmlattributes for Source and Content)

Whew! And this is just for ONE book!
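The Author entry in the list above is worth a quick sketch, since it can repeat. One way to handle that (my own approach, with made-up sample XML and placeholder names) is to treat Author as an array and collect every occurrence:

```cfml
<!--- Hypothetical stand-in for one product; the names are placeholders --->
<cfxml variable="tmp">
<ItemSearchResponse>
  <Items>
    <Item>
      <ItemAttributes>
        <Author>First Author</Author>
        <Author>Second Author</Author>
        <Title>Placeholder Title</Title>
      </ItemAttributes>
    </Item>
  </Items>
</ItemSearchResponse>
</cfxml>

<cfset attrs = tmp.ItemSearchResponse.Items.Item[1].ItemAttributes>
<cfset authors = ArrayNew(1)>

<!--- Author may be missing altogether, so check before counting --->
<cfif StructKeyExists(attrs, "Author")>
  <cfloop from="1" to="#ArrayLen(attrs.Author)#" index="a">
    <cfset ArrayAppend(authors, attrs.Author[a].XmlText)>
  </cfloop>
</cfif>

<cfoutput>#attrs.Title.XmlText# by #ArrayToList(authors, ", ")#</cfoutput>
```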

As you might expect, the ItemAttributes vary from SearchIndex to SearchIndex. DVD results are different from book results, which are different from music results, et cetera.
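Because the fields differ, it can help to list whatever ItemAttributes a given product actually has before hard-coding any names. A minimal sketch, again using a made-up stand-in for the real packet:

```cfml
<!--- Hypothetical stand-in for one product; values are placeholders --->
<cfxml variable="tmp">
<ItemSearchResponse>
  <Items>
    <Item>
      <ItemAttributes>
        <Author>Placeholder Author</Author>
        <Publisher>Placeholder Publisher</Publisher>
        <Title>Placeholder Title</Title>
      </ItemAttributes>
    </Item>
  </Items>
</ItemSearchResponse>
</cfxml>

<cfset attrs = tmp.ItemSearchResponse.Items.Item[1].ItemAttributes>

<!--- Print the element name of each child of ItemAttributes --->
<cfoutput>
<cfloop from="1" to="#ArrayLen(attrs.XmlChildren)#" index="c">
  #attrs.XmlChildren[c].XmlName#<br>
</cfloop>
</cfoutput>
```

Running this against one result from each SearchIndex is a quick way to see which fields you can rely on.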

In the next part, I'll go over how to mine the data for key information.

Manana,

Chris

2 comments:

Anonymous said...

This looks really good Chris. Keep it up!

Chris said...

Thanks, it's been a fun trip so far. The next part will be posted later tonight.