Making the Web Faster With Computer Vision Technology

Peter Blum

Today we are introducing some crazy new technology called SmartVision that uses cutting-edge computer vision to deliver blazing-fast web performance, even with big, high-resolution images sent over unreliable wireless networks to mobile devices.

A year ago, when we launched out of stealth, we showed the world our fantastic new Image Streaming feature. This radically different approach lets us deliver faster performance plus full image fidelity by painting images on the screen in two passes, as opposed to the traditional single full pass, which requires sending all the image data up front. To do this, Image Streaming splits the image data into two fragments, one for each pass. Originally we used a single static split point for an entire web property. That is, we would configure the feature so that a fixed percentage of the image data was sent in the first pass.
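To make the static split concrete, here is a minimal Python sketch. It assumes the image is encoded progressively, so that any prefix of the data can be rendered as a lower-fidelity preview. The function name `split_image_data` and the plain byte-level cut are illustrative assumptions, not the actual Image Streaming implementation.

```python
def split_image_data(data: bytes, first_pass_fraction: float) -> tuple[bytes, bytes]:
    """Split encoded image data into a first-pass and a second-pass fragment.

    Hypothetical helper: assumes a progressive encoding in which a prefix
    of the data is enough to paint a lower-fidelity version of the image.
    """
    if not 0.0 <= first_pass_fraction <= 1.0:
        raise ValueError("first_pass_fraction must be between 0 and 1")
    cut = round(len(data) * first_pass_fraction)
    # First fragment paints the initial image; the second restores full fidelity.
    return data[:cut], data[cut:]


# A static split point applies the same fraction to every image on the site.
first, second = split_image_data(bytes(1000), 0.6)
print(len(first), len(second))  # prints "600 400"
```

A real encoder would more likely cut at a boundary that the image format defines rather than at an arbitrary byte offset, but the idea is the same: one number decides how much of every image goes out first.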

As we tried different split points, we noticed that the quality of experience could vary from image to image at the same split point. Set at 40%, some images looked great, while others could be a bit blurry. Eventually we figured out that with a split point of 60%, nearly every image looks good for the first paint of the screen (and of course the remaining 40% comes moments later in the second pass to provide full fidelity for all the images).

Here's an example with a stroller image:

First Paint Pass: 60% of image data

Second Paint Pass: adds the remaining 40% of image data

It worked well and boosted performance for our customers’ sites, but we always felt we were leaving some performance on the table by picking a conservative split point to get a uniform quality of experience.

We had this crazy idea that, if we could only understand each image’s unique characteristics, we could pick the right split point per image and give our customers’ end users even faster performance and a better quality of experience. So we looked to the world of computer science for an answer, and we learned it was time to hire some image scientists to help us on this journey.

Today we are really excited to introduce the fruits of that work with our SmartVision technology, which gives our Image Streaming technology another performance boost. It uses sophisticated computer vision algorithms and all sorts of crazy math to automatically analyze images, understand their unique characteristics, and make a smart decision, per image, on the best split point: how much data to send up front.

Now imagine this for an e-commerce or travel & hospitality site, with tons of images and zillions of product searches per day. Rather than choose some one-setting-fits-all split point, we now provide the optimal split point for every single image – automatically.

How SmartVision works

The first time an end user requests an image through the service, the image is also sent to the SmartVision processor in the cloud for advanced image analysis. SmartVision first takes each image through a primary analysis pass, which produces a large number of identification signatures for that image. The system then generates a number of versions of the image, each at a different quality level. The SmartVision processor compares each of these versions against the original image to record how each one degrades across a number of image signatures.
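The comparison step can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the image is a flat list of 8-bit grayscale values, lower-quality versions are simulated by coarse quantization, and a single well-known metric, PSNR, stands in for the many signatures SmartVision computes. The names `quantize`, `psnr`, and `degradation_profile` are hypothetical.

```python
import math


def quantize(pixels, levels):
    """Simulate a lower-quality version by snapping 0-255 values to `levels` steps."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 255 / (levels - 1)
    return [round(round(p / step) * step) for p in pixels]


def psnr(original, degraded):
    """Peak signal-to-noise ratio in dB; higher means closer to the original."""
    mse = sum((a - b) ** 2 for a, b in zip(original, degraded)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(255 ** 2 / mse)


def degradation_profile(pixels, level_counts=(4, 16, 64, 256)):
    """Score each simulated quality level against the original image."""
    return {levels: psnr(pixels, quantize(pixels, levels)) for levels in level_counts}


# Toy "image": a smooth gradient covering every 8-bit intensity once.
profile = degradation_profile(list(range(256)))
for levels, score in profile.items():
    print(levels, score)
```

The resulting mapping from quality level to score is one simple way to describe how gracefully a particular image degrades, which is the kind of information the later steps rely on.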

At this point the system uses an advanced affinity propagation algorithm, together with a large manually classified image corpus maintained by the service, to identify the characteristics of each image. Examples include zoomed-in close-ups of people or products, zoomed-out images of products with few solid colors, medium-zoomed product images with heavy patterns, or even zoomed-out architectural elements like a house or a broad city view.
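For readers who have not met affinity propagation before, here is a compact pure-Python sketch of the textbook algorithm. It is not SmartVision's implementation: the feature vectors, the negative squared distance used as similarity, the median preference, and the damping and iteration defaults are all assumptions chosen to keep the example small. Points pass "responsibility" and "availability" messages to each other until a set of exemplars emerges, so the number of clusters does not have to be fixed in advance.

```python
def affinity_propagation(points, damping=0.9, iterations=500):
    """Return, for each point, the index of the exemplar it is assigned to.

    Illustrative sketch; expects at least two feature vectors.
    """
    n = len(points)
    # Similarity = negative squared Euclidean distance between feature vectors.
    S = [[-sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    # Self-similarity ("preference") influences how many exemplars emerge;
    # the median off-diagonal similarity is a common default.
    off_diagonal = sorted(S[i][k] for i in range(n) for k in range(n) if i != k)
    preference = off_diagonal[len(off_diagonal) // 2]
    for i in range(n):
        S[i][i] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # Responsibility: how well k suits i as an exemplar, compared with rivals.
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_rival = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_rival)
        # Availability: evidence from the other points that k is a good exemplar.
        for k in range(n):
            positive = [max(0.0, R[i][k]) for i in range(n)]
            support = sum(positive) - positive[k]  # leave out k's self-responsibility
            for i in range(n):
                if i == k:
                    new = support
                else:
                    new = min(0.0, R[k][k] + support - positive[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # Each point picks the candidate with the highest combined evidence.
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]


# Made-up 2-D feature vectors standing in for image signatures.
features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(affinity_propagation(features))
```

Points that share an exemplar index belong to the same group. In a production system a tested library implementation would be the safer choice; the loop above only shows the message-passing idea.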

Below are real examples of the system finding related images with similar characteristics:

Based on this information, the system builds a visual quality-of-experience map for each image. That map is used to decide the best initial split point for the image, one that balances initial fidelity against performance.
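One simple way to turn such a map into a decision is sketched below in Python. The representation of the map (a dictionary from first-pass fraction to a quality score between 0 and 1), the threshold rule, and the name `choose_split_point` are assumptions for illustration, and the scores in the example are made up. The fallback of 60% mirrors the conservative static split point described earlier.

```python
def choose_split_point(quality_by_fraction, min_quality, fallback=0.6):
    """Pick the smallest first-pass fraction whose quality score is acceptable.

    quality_by_fraction: hypothetical map of {first-pass fraction: score in 0..1}.
    Falls back to the conservative static split if no candidate is good enough.
    """
    acceptable = [f for f, q in quality_by_fraction.items() if q >= min_quality]
    return min(acceptable) if acceptable else fallback


# Made-up maps: a simple product shot holds up early, a heavy pattern does not.
simple_product = {0.2: 0.91, 0.4: 0.97, 0.6: 0.99}
heavy_pattern = {0.2: 0.60, 0.4: 0.82, 0.6: 0.95}

print(choose_split_point(simple_product, min_quality=0.9))  # prints 0.2
print(choose_split_point(heavy_pattern, min_quality=0.9))   # prints 0.6
```

Sending the smallest acceptable fragment first is what recovers the performance that a uniform, conservative split point leaves on the table.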

And this is the result:

SmartVision will be available at the end of June and we can’t wait to deploy it for our customers.