Computer Vision Test: Amazon v. Google v. IBM v. Microsoft v. Pinterest

A comparison of major tech companies’ computer vision technologies based on the labels they affixed to 10 photos.

Major tech companies are developing artificial intelligence technology to teach computers how to see images the way people do, from detecting individual objects to recognizing entire scenes. Amazon, Google, IBM, Microsoft and Pinterest have been using their computer vision capabilities to help people find products, media companies automatically edit videos and marketers target ads.

But how well can computers actually see? To test for an answer, I took 10 photos — five professionally shot product images, five amateurishly shot by me — and ran them through the five aforementioned companies’ computer vision tools (Amazon’s, Google’s, IBM’s and Microsoft’s computer vision APIs and Pinterest’s in-app Lens feature) to see how the labels that each company affixed to each image compared. Watch the video below to see what they saw.

[youtube]https://youtu.be/-SYchGku4RE[/youtube]

So how well could the computers see?

For a computer, pretty well. All five companies recognized that shoes were shoes, a shirt was a shirt, a dress was a dress, a desk was a desk, a bag was a bag and a couch was a couch (or a sofa). And some were able to get even more specific. Pinterest used its metadata from people’s pins to identify that a sneaker was more specifically a Vans sneaker. And both IBM and Pinterest picked up on the fact that the couch was a sectional.

By a person’s standards, though, they might need to update their prescription lenses. Microsoft thought a tomato was an apple, and while Amazon and Pinterest were confident that the tomato was a tomato, they also thought it might be a persimmon or an apple, respectively. And when it came to the pictures I took of my shoes on the floor, Microsoft paid more attention to the floor than the shoes. IBM also seemed a bit overzealous with some of its results; I don’t know what “jodhpur breeches” are, but those Levi’s were not them, and my messenger bag is not a mailbag, as much as that could be a cool work bag. And as capable as Pinterest was at recognizing that my high-tops were Vans, it thought my running shoes were Nikes, even though they say Hoka across the heel.
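For readers who want to run a similar comparison themselves, one simple approach is to tally how many tools reported each label for a given photo. The snippet below is a minimal sketch, not the method used for this test: it assumes each tool's output has already been reduced to a set of lowercase labels, the helper name `label_agreement` is hypothetical, and the data covers only the tomato labels mentioned above.

```python
from collections import Counter

# Labels for the tomato photo, limited to the ones mentioned in the article.
# Reducing each tool's output to a set of lowercase strings is an assumption.
tomato_labels = {
    "Amazon": {"tomato", "persimmon"},
    "Microsoft": {"apple"},
    "Pinterest": {"tomato", "apple"},
}


def label_agreement(labels_by_vendor):
    """Count how many vendors reported each label (hypothetical helper)."""
    counts = Counter()
    for labels in labels_by_vendor.values():
        counts.update(labels)
    return dict(counts)


agreement = label_agreement(tomato_labels)

# Labels that every listed vendor reported; empty when no label is shared by all.
unanimous = {
    label for label, n in agreement.items() if n == len(tomato_labels)
}

print(agreement)
print(unanimous)
```

A tally like this makes disagreements easy to spot: any label with a count lower than the number of tools is one that at least one tool did not report.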



Do computers have perfect vision? No, not now, and maybe not ever. But do they have adequate vision? Yeah. It may be ideal for a computer to always be able to identify the brand behind a product — be it a t-shirt, a pair of shoes or a messenger bag — in order to recognize if a person has an affinity for that brand or to show similar products from that brand, through an ad or otherwise. But to know, at least, that a shirt is a shirt and a shoe is a shoe shows how capable computers have become at shedding light on what was once a completely black box.




About the author

Tim Peterson
Contributor
Tim Peterson, Third Door Media's Social Media Reporter, has been covering the digital marketing industry since 2011. He has reported for Advertising Age, Adweek and Direct Marketing News. A born-and-raised Angeleno who graduated from New York University, he currently lives in Los Angeles. He has broken stories on Snapchat's ad plans, Hulu founding CEO Jason Kilar's attempt to take on YouTube and the assemblage of Amazon's ad-tech stack; analyzed YouTube's programming strategy, Facebook's ad-tech ambitions and ad blocking's rise; and documented digital video's biggest annual event VidCon, BuzzFeed's branded video production process and Snapchat Discover's ad load six months after launch. He has also developed tools to monitor brands' early adoption of live-streaming apps, compare Yahoo's and Google's search designs and examine the NFL's YouTube and Facebook video strategies.
