The art of selling photos: how computer vision will shift the landscape of commerce.
What’s the difference between language and pictures? The old saying goes that a picture is worth a thousand words, but have you ever considered that a picture can be described in any language? A quick search on Google told me there are up to 6,500 languages spoken across the world. So is a picture really worth 6.5 million words? Maybe not, but pictures can be considered a universal medium of communication, one that bridges cultures that do not share a language. This is evident on platforms like Instagram, where niche brands with a unique perspective are able to build a global following thanks to the imagery they create.
Put yourself in the shoes of an online buyer: what shapes your purchase decisions? More often than not, products with accurate and representative photography are more appealing than ones with limited or subpar shots. Given two identical products, I would spend my money with the seller that has the more comprehensive listing.
Now consider the new landscape of ecommerce. Sellers are able to reach buyers regardless of location or language, and buyers are able to discover an almost unlimited number of items that could fit their personal style. User adoption of visual search technology will inevitably benefit both sellers and buyers, because AI can generate fine-grained descriptions of both the products on offer and the style of the buyers looking for them. As visual search technology becomes more widely available, the imagery that drives ecommerce becomes more valuable in the shifting world of new retail.
In this case, a fine-grained object description is a comprehensive index of features extracted from an input image. AI services company ScopeMedia uses over 40,000 categories to index fashion imagery for its Stylist platform, resulting in a searchable collection of attributes assigned to each item in any given inventory. Focused on attributes like form, style, colour and fit, the models help digital systems understand the more human aspects of fashion and can recommend items based on a customer’s personal style.
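To make the idea of a searchable collection of attributes more concrete, here is a minimal sketch in Python. It is not how the Stylist platform is implemented; the function names, the SKU identifiers and the attribute values are all made up for illustration, and the attributes are written by hand here where a real system would extract them from product photos with a vision model.

```python
from collections import defaultdict


def build_attribute_index(inventory):
    """Map each (attribute, value) pair to the set of item IDs that carry it."""
    index = defaultdict(set)
    for item_id, attributes in inventory.items():
        for name, value in attributes.items():
            index[(name, value)].add(item_id)
    return index


def search(index, **wanted):
    """Return the IDs of items that match every requested attribute value."""
    if not wanted:
        return set()
    matches = [index.get((name, value), set()) for name, value in wanted.items()]
    return set.intersection(*matches)


# Hypothetical inventory: in practice these attributes would be predicted
# from images rather than typed in by a seller.
inventory = {
    "sku-001": {"form": "dress", "style": "bohemian", "colour": "red", "fit": "loose"},
    "sku-002": {"form": "dress", "style": "minimal", "colour": "black", "fit": "slim"},
    "sku-003": {"form": "jacket", "style": "minimal", "colour": "black", "fit": "slim"},
}

index = build_attribute_index(inventory)
print(sorted(search(index, style="minimal", colour="black")))
# prints ['sku-002', 'sku-003']
```

Because the lookup runs on attributes rather than on the words in a listing, the same query works whatever language the seller used to write the product description.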
The visual search technologies that power the Stylist platform allow for personalized interactions and search results even when there is a major language barrier between buyer and seller. Imagery is so universally understood that language becomes irrelevant to understanding what it means to have style. Language is still useful when trying to describe style, but as we know, the same visual attributes can be described in different languages and carry the same meaning.
Just as images can be used to bridge language barriers, we have noticed that employing deep learning and computer vision for specific applications, like fashion, creates a similar bridge between culture and computer. Providing the tools and training that help computers understand the nature of culture removes the need for customers to find the words to describe it, which improves product discovery and the relevance of search results. Regardless of language, widespread adoption of visual search technologies will enable browsing and buying that’s accessible to anyone, anywhere.