At Halla, we talk a lot about the bad recommendations that shoppers are served by online grocery recommendation algorithms poorly suited to food. Avoiding those pitfalls is a challenge that every online grocer must reckon with. But rather than dwelling on what’s wrong with most grocery recommendations today, let’s talk about doing it right. Instead of merely avoiding the embarrassment and wasted opportunity of making bad recommendations, how can online grocers succeed at making “precisely the right recommendation”?
It turns out that making the right recommendation is an order of magnitude harder than avoiding the wrong one. The reason has a lot to do with the state of the art in recommendation algorithms today.
There are two approaches to making recommendations
The obstacle to making good recommendations lies in the way data scientists build recommendation algorithms in the first place. There are two classic approaches: content-based systems and collaborative filtering-based systems.
1. Content-based systems rely upon knowledge that programmers can input: SKU data, broad food categories like “fruit” or “dairy,” and perhaps even rudimentary food pairings like “peanut butter and jelly.” The problem with this approach is that rudimentary data and broad heuristics cannot account for the near-infinite interactions between food ingredients and the people who eat them. Programmers can easily encode the notion that “ham goes with cheese,” but most shoppers would recoil at the proposal of paneer to go with the prosciutto in their shopping cart. Likewise, limited content-based heuristics struggle with special dietary preferences ranging from vegan to halal.
2. Collaborative filtering, on the other hand, is derived from actual user data. Instead of relying upon blind content assumptions that are often wrong, collaborative filtering goes strictly by past user behavior and expressed preferences to build a model of how shoppers really buy. Collaborative filtering improves as more data becomes available, and it is the foundation of personalized machine learning. It is no surprise, then, that collaborative filtering is the “weapon of choice” for most personalized recommendation engines today.
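To make the collaborative filtering idea concrete, here is a minimal sketch of one common variant, item-item similarity over a purchase matrix. The products, shoppers, and counts are all made up for illustration; a production system would use far richer signals.

```python
import numpy as np

# Toy user-item purchase matrix (rows: shoppers, columns: products).
# A 1 means the shopper has bought that product. All data is illustrative.
purchases = np.array([
    [1, 1, 0, 1],   # shopper A
    [1, 1, 0, 0],   # shopper B
    [0, 0, 1, 1],   # shopper C
    [1, 0, 0, 1],   # shopper D
], dtype=float)
products = ["bread", "ham", "tofu", "cheese"]

def item_similarity(matrix):
    """Cosine similarity between the purchase vectors of every pair of items."""
    norms = np.linalg.norm(matrix, axis=0, keepdims=True)
    normalized = matrix / np.where(norms == 0, 1, norms)
    return normalized.T @ normalized

sim = item_similarity(purchases)

def recommend(cart_indices, top_n=2):
    """Score every product by its summed similarity to the items in the cart."""
    scores = sim[cart_indices].sum(axis=0)
    scores[cart_indices] = -np.inf        # never re-recommend cart items
    ranked = np.argsort(scores)[::-1][:top_n]
    return [products[i] for i in ranked]

print(recommend([products.index("ham")]))   # → ['bread', 'cheese']
```

Note that the model knows nothing about food: “ham goes with bread” emerges purely from co-purchase patterns, which is both its strength and, as discussed next, its weakness.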
But collaborative filtering has its problems too. The patterns it picks up are often uninteresting or spurious. They offer little insight into the personal preferences of an individual shopper, especially one who is new to the platform. Moreover, they don’t get at the problem of understanding what a buyer is really looking for. Doing that would require a depth of understanding far beyond any collaborative filtering method available today.
Good recommendations come when you combine multiple techniques
To make even a decent suggestion, recommendation algorithms need to combine the two approaches, layering collaborative filtering over solid content knowledge to get the best of both worlds. Start with a strong database of heuristic and historical knowledge to create a framework of what “should” make for good recommendations. Then augment and correct that framework using real-life behavioral data captured “from the wild.” Together, they make a powerful combination of the expected and the actual.
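One simple way to sketch that combination is a weighted blend of a curated content score (what “should” pair well) and a behavioral score mined from observed co-purchases (what shoppers actually buy together). The pairings, counts, and blend weight below are all hypothetical.

```python
# Curated content knowledge: pairings a food expert would endorse.
curated_pairings = {
    frozenset({"ham", "cheese"}): 1.0,
    frozenset({"peanut butter", "jelly"}): 1.0,
}

# Behavioral knowledge: co-purchase counts captured "from the wild".
co_purchases = {
    frozenset({"ham", "cheese"}): 120,
    frozenset({"ham", "mustard"}): 95,
    frozenset({"ham", "paneer"}): 1,
}
max_count = max(co_purchases.values())

def hybrid_score(item_a, item_b, alpha=0.4):
    """Blend curated and behavioral evidence; alpha sets how much
    weight the hand-built content knowledge gets."""
    pair = frozenset({item_a, item_b})
    content = curated_pairings.get(pair, 0.0)
    behavior = co_purchases.get(pair, 0) / max_count
    return alpha * content + (1 - alpha) * behavior

print(hybrid_score("ham", "cheese"))   # endorsed AND frequently co-bought
print(hybrid_score("ham", "paneer"))   # no endorsement, data near zero
```

The blend weight lets real shopper behavior override or confirm the heuristics: a pairing no expert entered can still surface if the data supports it, while curated knowledge carries new items with no purchase history yet.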
For the best recommendation possible, however, you can go one step further. Combining logical assumptions and a strong supporting cast of recipes, menu items, and other sources of food knowledge with versatile, dynamic behavioral data from actual shopping sessions makes for a potent pairing. But to truly predict and inspire the tastes of consumers, recommendation algorithms have one more arrow in the quiver: Natural Language Processing (NLP).
3. Natural Language Processing (What’s in a word? A lot.)
NLP is one of the oldest and most enduring areas of Artificial Intelligence. Its roots go back to Alan Turing’s 1950 work proposing the famous “Turing Test,” which asks whether a machine’s use of language is indistinguishable from a human’s, and it remains a state-of-the-art area of machine learning today. NLP combines linguistics, computer science, and other disciplines to analyze and make use of human language for the benefit of AI.
What does that mean for online grocery recommendations?
Well, NLP makes it possible to leverage unstructured contextual food content, such as cookbooks, menus, and ingredient databases, to derive a very “human” understanding of food shopping choices. For example, it isn’t “four SKUs” in your cart; it is a fresh pear, a log of goat’s cheese, a packet of toasted walnuts, and a bundle of arugula. Recognizing these items by name allows the algorithm to cross-reference them with recipes and discover that this particular combination matches a salad recipe.
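A toy sketch of that recipe-matching idea, assuming a tiny hand-written recipe corpus: a real system would use a full NLP pipeline over cookbooks and menus, but even crude token overlap between cart item names and recipe ingredients shows the mechanism. The recipes, stopword list, and cart contents are all illustrative.

```python
# Tiny illustrative recipe corpus (all entries hypothetical).
recipes = {
    "pear and walnut salad": {"pear", "goat cheese", "walnut",
                              "arugula", "balsamic dressing"},
    "ham and cheese sandwich": {"bread", "ham", "cheese", "mustard"},
}

# Packaging words to strip so "a log of goat cheese" reduces to the food.
STOPWORDS = {"fresh", "toasted", "a", "of", "log", "packet", "bundle"}

def tokens(name):
    """Lowercase a product or ingredient name and drop packaging words."""
    return {t for t in name.lower().split() if t not in STOPWORDS}

def match_recipe(cart):
    """Return the recipe sharing the most ingredients with the cart,
    plus the ingredients still missing -- the recommendation candidates."""
    cart_tokens = set().union(*(tokens(item) for item in cart))
    def covered(ingredient):
        return tokens(ingredient) <= cart_tokens
    name = max(recipes, key=lambda r: sum(covered(i) for i in recipes[r]))
    missing = sorted(i for i in recipes[name] if not covered(i))
    return name, missing

cart = ["fresh pear", "log of goat cheese", "toasted walnut", "arugula"]
print(match_recipe(cart))   # → ('pear and walnut salad', ['balsamic dressing'])
```

The missing ingredient falls out of the match for free: once the cart is recognized as most of a salad, the dressing is the obvious next suggestion.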
Using natural language processing in the recommendation engine not only tells you that a recommendation for salad dressing is appropriate in this case; it tells you that a sweet balsamic dressing would be an amazing recommendation. It would even let you suggest beets when pears are out of season, because tapping into recipes with NLP tells the algorithm that beets go well with the same co-ingredients. That is something you could get from neither human heuristics nor collaborative filtering. It is this extra layer of analysis that gets recommendation engines dangerously close to “precisely the right recommendation.”
In a world where consumers fully expect personalized recommendations when shopping online, using content systems or collaborative filtering alone is better than making no recommendations at all, and it may drive some incremental revenue, but it will not get you near “precisely the right recommendation.” To maximize the potential of recommendations, you need an engine that leverages deep content systems, collaborative filtering, and NLP simultaneously.