Today we’ll discuss the leap Everli (formerly Supermercato24) took with its search engine. 🔎
The journey began with the search engine being based directly on the data we had in our MySQL databases.
Back in 2017, Everli (formerly Supermercato24) started growing, reaching 150k unique products in more than 1000 categories.
With this many products, we felt that MySQL full-text search had become a bottleneck for our customers’ search experience.
Typos weren’t tolerated, and neither were synonyms; we were too limited.
During my first days of onboarding at Everli (2017), @hex7c0 handed me a huge book called ‘Elasticsearch: The Definitive Guide’.
I was intrigued at first, wondering why they would ask me to implement a new search engine based on a technology that was new to me. Why wasn’t I asked to do some PHP magic?!
Back then, I had only heard about Elasticsearch and it was all new to me, so… challenge accepted! 💪
As the entire ecosystem is SOA-based (Service-Oriented Architecture), we decided that the best approach was to have a new service responsible for the autocomplete and search mechanisms.
After brainstorming with the team, we ended up naming the project Kiwi; the bird, not the fruit! Why?
'Kiwi have a highly developed sense of smell, unusual in a bird, and are the only birds with nostrils at the end of their long beaks. Kiwi eat small invertebrates, seeds, grubs, and many varieties of worms. They also may eat fruit, small crayfish, eels and amphibians. Because their nostrils are located at the end of their long beaks, kiwi can locate insects and worms underground using their keen sense of smell, without actually seeing or feeling them.' [Wikipedia]
Kiwi was built on Elasticsearch for data storage and for handling the search/autocomplete requests, with a Laravel layer on top for I/O.
Furthermore, we use Kibana for data analysis and RabbitMQ for data synchronization between MySQL and Elasticsearch.
Our dedicated catalogue team maps custom synonyms after analyzing the searches made by our customers.
This now adds up to thousands of synonym definitions, which ensure a better user experience.
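To give an idea of how synonym definitions can be plugged into Elasticsearch, here is a minimal sketch of index settings with a `synonym` token filter. The filter name `catalogue_synonyms`, the analyzer name `product_search` and the helper function are hypothetical names chosen for illustration; this is not Kiwi's actual configuration.

```python
import json


def build_synonym_settings(synonym_rules):
    """Return index settings that wire synonym rules into a custom analyzer.

    The names "catalogue_synonyms" and "product_search" are made up for
    this sketch; only the general shape follows Elasticsearch's
    analysis settings.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "catalogue_synonyms": {
                        "type": "synonym",
                        "synonyms": list(synonym_rules),
                    }
                },
                "analyzer": {
                    "product_search": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Lowercase first so the rules match any casing.
                        "filter": ["lowercase", "catalogue_synonyms"],
                    }
                },
            }
        }
    }


settings = build_synonym_settings(["toothbrush => spazzolino"])
print(json.dumps(settings, ensure_ascii=False, indent=2))
```

The resulting dictionary is the kind of body you would send when creating an index; each rule is one string, so thousands of definitions are just a longer list.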
Getting smarter – auto translate
Over time we noticed that some customers were searching with English keywords. For example, if a customer searched for ‘toothbrush’ or ‘toothpaste’, there were no results.
We knew we could help these customers so they wouldn’t neglect their oral hygiene 🦷.
So we implemented an automated solution that detects the language of the search query when there are no results.
If the language is English, we create a synonym definition at runtime with the translation, providing results to this customer as well as to anyone who runs the same search in the future.
So in our example there would be a synonym definition like
`toothbrush => spazzolino`.
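The flow described above can be sketched roughly as follows. Everything here is a toy stand-in: the in-memory `CATALOGUE`, the `TRANSLATIONS` table (playing the role of a translation service) and the naive `detect_language` are assumptions made for the example, not how Kiwi really does it.

```python
# Toy data: a tiny catalogue and a stand-in for a translation service.
CATALOGUE = ["spazzolino", "dentifricio", "latte"]
TRANSLATIONS = {"toothbrush": "spazzolino", "toothpaste": "dentifricio"}


def detect_language(query):
    """Naive placeholder: treat anything we can translate as English."""
    return "en" if query in TRANSLATIONS else "it"


def search(query, synonyms):
    """Look up the query, applying any synonym rule that matches it."""
    term = synonyms.get(query, query)
    return [product for product in CATALOGUE if term in product]


def search_with_fallback(query, synonyms):
    """Search; on zero results for an English query, learn a synonym and retry."""
    results = search(query, synonyms)
    if results or detect_language(query) != "en":
        return results
    translation = TRANSLATIONS.get(query)
    if translation is None:
        return results
    # Equivalent to storing the rule "toothbrush => spazzolino".
    synonyms[query] = translation
    return search(query, synonyms)


learned = {}
print(search_with_fallback("toothbrush", learned))
print(learned)
```

Because the rule is stored, the next customer searching for the same keyword gets results on the first attempt, without another detection or translation step.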
Oh yes, this was fun and it’s still experimental!
I’m pretty confident you didn’t know that you can use emojis when you search on Everli. Give it a try and search ‘hamburger 🐤’!
We basically mapped the emoji characters as synonyms in our Elasticsearch analyzers, so
'🐤 = pollo', for example.
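As a rough illustration of the idea, the snippet below keeps a small emoji-to-word table, renders it as synonym rules in the `=>` notation, and applies it to a query. The function names are hypothetical and the whitespace split is only a toy replacement for a real Elasticsearch analyzer; the only mapping taken from this article is 🐤.

```python
# Hypothetical emoji table; only the chick entry comes from the article.
EMOJI_SYNONYMS = {"🐤": "pollo"}


def to_synonym_rules(mapping):
    """Render the table as explicit-mapping rules for a synonym filter."""
    return [f"{emoji} => {word}" for emoji, word in mapping.items()]


def expand_query(query, mapping):
    """Toy stand-in for the analyzer: swap each emoji token for its word."""
    return [mapping.get(token, token) for token in query.split()]


print(to_synonym_rules(EMOJI_SYNONYMS))
print(expand_query("hamburger 🐤", EMOJI_SYNONYMS))
```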
Based on your past searches and purchases we provide custom-tailored results for you, in both autocomplete and search, by boosting the products you like. 🚀
If you search a couple of times for “pizza funghi”, the next time it will pop up as the first suggestion when you start typing “pizz…”.
We won’t go deep into the tech details here, as it’s a secret recipe 🤫
Our information about the products is stored in MySQL and we synchronize it to Kiwi’s Elasticsearch indices via RabbitMQ.
So when a colleague from the catalogue team adds or edits a product, the information is instantly updated in our search engine.
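To make the synchronization step a bit more concrete, here is a minimal sketch of what a queue consumer could do with each message: turn it into lines for Elasticsearch's bulk API. The message shape (`action`, `id`, `product`) and the index name `products` are assumptions made for this example, and the RabbitMQ connection itself is left out.

```python
import json


def to_bulk_lines(message_body, index="products"):
    """Convert one queue message into Elasticsearch bulk API lines.

    The message format is hypothetical: a JSON object with an "action"
    ("upsert" or "delete"), the product "id" and, for upserts, the
    "product" document itself.
    """
    event = json.loads(message_body)
    meta = {"_index": index, "_id": str(event["id"])}
    if event["action"] == "delete":
        # A delete needs only the action line, no document.
        return [json.dumps({"delete": meta})]
    # An index action is followed by the document on the next line.
    return [
        json.dumps({"index": meta}),
        json.dumps(event["product"], ensure_ascii=False),
    ]


message = json.dumps({"action": "upsert", "id": 42, "product": {"name": "pizza funghi"}})
for line in to_bulk_lines(message):
    print(line)
```

Joining the lines from many messages with newlines gives a single bulk request, which keeps the index close to the database without sending one request per change.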
Always something fun. Dove?
Back in the day we found that if you searched for ‘dove’ (you know… the brand) there were no results. A big mystery! 🕵️‍♂️
After some fast research we found that “dove” (which means “where” in Italian) was included in the default Elasticsearch “stop-word” definition for Italian, which we used.
In computing, stop words are words which are filtered out before or after processing of natural language data. Though “stop words” usually refer to the most common words in a language, there is no single universal list of stop words used by all natural language processing tools, and indeed not all tools even use such a list.
Find out more about stop words on Wikipedia.
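Here is a toy reproduction of the ‘dove’ mystery. The stop word set is just a small illustrative excerpt, and carving brand names out of the list is one possible fix shown under that assumption, not necessarily the one we shipped.

```python
# Small illustrative excerpt of an Italian stop word list.
ITALIAN_STOP_WORDS = {"dove", "di", "il", "la", "e"}

# Hypothetical list of brand names that must stay searchable.
BRAND_NAMES = {"dove"}

# One possible fix: keep the default list, minus the brand names.
SEARCH_STOP_WORDS = ITALIAN_STOP_WORDS - BRAND_NAMES


def analyze(query, stop_words):
    """Lowercase, split on whitespace and drop stop words."""
    return [token for token in query.lower().split() if token not in stop_words]


print(analyze("Dove", ITALIAN_STOP_WORDS))        # nothing left to search for
print(analyze("sapone Dove", SEARCH_STOP_WORDS))  # the brand survives
```

With the default list the whole query is filtered away, which is exactly why the search came back empty.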
Have you ever wondered what the top 5 search terms in Italy are in 2020 so far? Here you are:
latte (milk) – 1.59% 🥛
yogurt – 1.58% 🥛
acqua (water) – 1.34% 🚰
biscotti (biscuits) – 1.23% 🍪
pane (bread) – 1.13% 🍞
What’s up next?
New challenges, daily! 💪
Since Kiwi was launched, Everli has kept growing and has also launched in Poland.
This means we are now handling far larger volumes of data and searches every day.
Right now we have nearly 500k products (compared to ~150k in the past), which brings new challenges and pushes us to keep optimizing the code and data.
Hope you enjoyed the story of Kiwi, cya!