In our previous blog, we discussed new types of search functionality for eCommerce stores, in particular the power of Visual Search to engage, connect, and deepen customer relationships. We also hinted at a powerful new way to harness AI for textual search – Natural Language Processing, or NLP for short. In this week’s blog, we will touch on some key features of NLP and what they mean for site search.
Almost all textual search engines for eCommerce sites rely on a very old technology referred to as TF-IDF retrieval functions. This is also the backbone of the major internet search engines. The basic operations are quite easy to grasp. First, the frequency of each term (word) in a document or on your website is calculated. This is called Term Frequency (or TF). Then the inverse of how prevalent that term is across your site or across a collection of documents is calculated (Inverse Document Frequency, or IDF). These are the only variables used to calculate a relevance score for each document-query pair, which is then used for ranking and displaying results.
Thus, if a relatively rare word like “sparkly” appeared in a product description, a search for it would push that product higher in the results. The word “black”, by contrast, would carry much less weight because it is quite a common word in fashion. There are, of course, clever ways to rearrange the words, create interesting weighting functions and add features such as auto-complete or whitelisting of unusual phrases. But, at its core, TF-IDF is what runs most of our searches.
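To make the “sparkly” versus “black” example concrete, here is a minimal sketch of TF-IDF scoring in Python. It is an illustration under simple assumptions, not how any particular search engine implements it: the product catalogue, the function names and the plain log(N / document frequency) weighting are all invented for this example, and production engines use more refined variants.

```python
import math
from collections import Counter

# Hypothetical product descriptions, invented for illustration.
PRODUCTS = {
    "dress-1": "black sparkly evening dress",
    "jeans-1": "black straight denim jeans",
    "tee-1": "black cotton t-shirt",
}


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def inverse_document_frequency(docs):
    """IDF per term: log(N / number of documents containing the term)."""
    n_docs = len(docs)
    doc_freq = Counter()
    for text in docs.values():
        doc_freq.update(set(tokenize(text)))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tfidf_score(query, text, idf):
    """Sum of TF * IDF over the query terms for one document."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    score = 0.0
    for term in tokenize(query):
        tf = counts[term] / len(tokens)    # term frequency in this document
        score += tf * idf.get(term, 0.0)   # terms unknown to the catalogue add nothing
    return score


def rank(query, docs):
    """Return (score, doc_id) pairs, highest score first."""
    idf = inverse_document_frequency(docs)
    scored = [(tfidf_score(query, text, idf), doc_id) for doc_id, text in docs.items()]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    for score, doc_id in rank("black sparkly", PRODUCTS):
        print(f"{doc_id}: {score:.3f}")
```

In this toy catalogue “black” appears in every description, so its IDF is zero and it contributes nothing to the ranking, while “sparkly” appears only once and decides the result.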
That’s not all. We have become so used to this simplistic way of searching documents we have forgotten that we do not go into a store and say “denim black straight”. We might say something along the lines of, “I am looking for a pair of jeans that is a bit loose around the waist, flattering but with no flare at the bottom. Preferably in black”. This is a much more ambiguous and interesting problem to solve.
So how bad is this gap between the way we are forced to search and the way we actually communicate? Some numbers from the UK are quite interesting to read. Kudos to Duncan MacRae of TechForge Media for these numbers.
Recent online research conducted by YouGov on behalf of Yext, an AI search firm, found that more than half (56%) of UK adults believe that site search provides them with unrelated search results online and 46% of users believe site search does not understand their questions. Another 35% say that site search provides them with out of date or inaccurate information. This has a direct impact on the customer experience! Apart from this, 60% of UK adults who use site search functions are more likely to go straight to a competitor’s website if they provided direct answers, with 28% stating they have frequently bought a product or service from a different business than they intended, because they couldn’t find the information they were looking for online!
What can a Chief Marketing Officer do?!
There is always talk of AI in the news, and it is so pervasive that it can seem as if it is the answer to all your worries. Let’s be honest. Mostly, it is mind-blowingly amazing, but it is also a computer program that needs to be put to work on very specific tasks. At the most basic level, we are taking unstructured text and mapping it to searchable and filterable fields in an index.
NLP functions create interesting solutions to many search problems. You can look for a specific entity (show me all searches that are for any type of beverage), you can detect the language a person is using to route them to the correct customer rep, you can ask for all the key phrases in a batch of searches (are people searching for stretchy trousers or more formal clothes?) and you can even put it to work looking at sentiment (are our customer reviews positive or negative?). These are very focused solutions, so it is important that their limitations are understood. We may talk about these in future posts, but for us at Delvify, one of the most exciting uses of NLP is in semantic search.
“Semantic Role Labeling is the task of determining the latent predicate argument structure of a sentence and providing representations that can answer basic questions about sentence meaning, including who did what to whom, etc.” Another way to put this: after we have pre-processed the textual data through parsing, analysis, tokenizing and scans over the inverted indexes, we can look for matching terms, sentences and paragraphs based on the calculated degree of linguistic similarity between the query terms and the terms in the index.
From this, you can proceed to use this matching in several ways. One way is to extract the key passages that best summarize a result, namely the passages with the highest similarity scores. If the search is a question and an answer must be returned, we can include the text passage that best matches the question.
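As a rough illustration of picking the passage with the highest similarity score, here is a small Python sketch. One big assumption to flag: a real semantic search system would compare learned language-model representations, whereas this stand-in uses simple word counts and cosine similarity so that the example runs on its own. The passages, the query and the function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical site passages, invented for illustration.
PASSAGES = [
    "Our jeans are cut loose around the waist with a straight leg.",
    "Free delivery on all orders over 50 pounds.",
    "Returns are accepted within 30 days of purchase.",
]


def vectorize(text):
    """Word-count vector; a stand-in for a learned sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_passage(query, passages):
    """Return (score, passage) for the existing passage closest to the query."""
    query_vec = vectorize(query)
    return max((cosine_similarity(query_vec, vectorize(p)), p) for p in passages)


if __name__ == "__main__":
    score, passage = best_passage("Are the jeans loose around the waist?", PASSAGES)
    print(f"{score:.2f}  {passage}")
```

Note that the function can only hand back a passage that already exists on the site; it never writes a new answer.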
Of course, it is not like a human being. For any answer, existing text must be used. The models do not make new sentences or phrases like a human does. Nor can they follow a chain of logic to arrive at conclusions.
How might you use this? You can match very long descriptions or sentences and find products that more closely match the intent of the user. The long natural query we highlighted above might bring back a pair of loose-fit, dark-colored jeans. The result may not exactly match the keywords, but it gets us closer to an intuitive sense of what a customer is looking for. And this is important to note: semantic search is a newer technology, so we should set our expectations about what it can and cannot do. We can definitely find matches that are semantically closer to the intent of the original query and can match answers to very carefully defined questions. Asking for the trendiest skirt will not return a result if there are no skirts that are described as trendy! Additionally, semantic search cannot correlate or infer information from different content. If you ask for a bracelet popular in Monte Carlo, nothing will be returned if no bracelets on your site are mentioned as being worn in Monte Carlo.
There are interesting additions that remove some of these limitations. Delvify has implemented a very clever addition called synonym matching. By creating a map of similar words and phrases, we can infer what a customer might want even if that word is not in the descriptions. This is one way Delvify is taking existing AI and moving it forward to the next level.
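The general idea of a synonym map can be sketched in a few lines of Python. To be clear, this is not Delvify's implementation: the synonym map, the function names and the rule that every query term must be covered are assumptions made purely for illustration.

```python
import re

# Hypothetical synonym map, invented for illustration.
SYNONYMS = {
    "trousers": {"pants", "slacks"},
    "stretchy": {"elastic", "stretch"},
}


def tokenize(text):
    """Lowercase and keep only runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_variants(term, synonyms):
    """The term itself plus any words the map treats as equivalent."""
    return {term} | synonyms.get(term, set())


def matches(query, description, synonyms):
    """True if every query term, or one of its synonyms, is in the description."""
    words = set(tokenize(description))
    return all(term_variants(term, synonyms) & words for term in tokenize(query))


if __name__ == "__main__":
    print(matches("stretchy trousers", "Elastic waist pants in navy", SYNONYMS))
    print(matches("stretchy trousers", "Wool scarf in grey", SYNONYMS))
```

Here a search for “stretchy trousers” can match a product described as “elastic waist pants” even though neither query word appears in the description.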