Mar 24

Circle – Enhancing a Retailer’s Online Search Returns Using Machine Learning

By bradallred Uncategorized

Enhancing a Retailer’s Online Search Returns Using Machine Learning

By Yunxuan Wang, Tianhua Zhu, Tianyu Li, Yiqing Li, Yi Wu, Graduate Students, Integrated Marketing Communications Program, Medill, Northwestern University

February 18, 2020

From April to June 2019, a team of five graduate students at Northwestern University conducted a research project on ways to enhance the online search experience on a major retailer’s website. The project applied a variety of programming and machine learning techniques, including TF-IDF, text similarity, Latent Dirichlet Allocation (LDA), and k-nearest neighbors.

At the beginning of the project, the team identified two opportunities for improving search on the website. First, some search terms returned no results. For example, entering “Christmas gift for girlfriend” returned no products, even though a customer might plausibly use that search term. Second, some customers perceive a product differently from how the website management team does. For example, an item the retailer labeled a “casual dress” was described in a customer review as a “fancy dress.” This gap makes it harder to return matching search results and, in turn, to generate sales.

Both issues led to undesirable search results that were either irrelevant or empty. Only if the results were relevant and accurate would the customer reach the last step of a purchase decision. Otherwise, the customer might leave the site without a sales conversion. Working with the Retail Analytics Council AI Lab at Northwestern University, the retailer sought to resolve this discrepancy to optimize the conversion rate on its e-commerce site and achieve greater sales lift.

The team started with the premise that a model taking into account not only the objective product descriptions but also customers’ perceptions of the products would produce more satisfying search results. The goal, therefore, was to build a machine learning model that better captures the intended meaning of search terms by incorporating customer reviews. The first step was to prepare the text data for the machine learning models, since the performance of the final model depends on the quality of text pre-processing.

To convert text to a format that the “machine” can understand, the team first parsed the text to remove punctuation and stop words and then lemmatized it. After this initial cleaning, each product had a pool of words associated with it, referred to as a document. The team then experimented with two ways of vectorizing the text to flag the features of each product, defined by word occurrences. The first was a simple occurrence encoding, using CountVectorizer from the Python library scikit-learn to tokenize the documents, build a vocabulary, and encode each document. The encoded vector contains an integer count of the number of times each word appears in a document. The problem with word counts, however, is that commonly occurring words have large counts but carry little meaning. To address this, the team implemented the TF-IDF (term frequency-inverse document frequency) approach to feature generation. Essentially, TF-IDF takes into account word occurrence both locally within a document and globally across all documents, highlighting words that are more distinctive of a specific document.
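The cleaning and weighting steps described above can be sketched with the Python standard library alone. This is a minimal illustration, not the team's code: the stop-word list, the review snippets, and the function names are invented for the example, lemmatization is omitted, and the IDF used is the plain log(N / document frequency). scikit-learn's TfidfVectorizer applies a smoothed IDF and L2 normalization by default, so its numbers would differ.

```python
import math
import string
from collections import Counter

# Hypothetical mini stop-word list; a real project would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "for", "is", "this", "my", "it", "to", "of"}

def preprocess(text):
    """Lowercase, strip punctuation, and drop stop words (no lemmatization)."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]

def count_vectors(docs):
    """Return the sorted vocabulary and one raw count vector per document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    counts = [Counter(doc) for doc in docs]
    return vocab, [[c[w] for w in vocab] for c in counts]

def tfidf_vectors(docs):
    """Weight each raw count by log(N / document frequency)."""
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return vocab, [[row[j] * idf[j] for j in range(len(vocab))] for row in counts]

# Invented review snippets, one "document" per product.
raw_docs = [
    "This dress is a great gift for my girlfriend.",
    "A casual dress, great for the office.",
    "Great fancy dress for a Christmas party!",
]
docs = [preprocess(d) for d in raw_docs]
vocab, counts = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)

i_dress, i_christmas = vocab.index("dress"), vocab.index("christmas")
print(counts[2][i_dress], tfidf[2][i_dress])          # "dress" occurs in every document
print(counts[2][i_christmas], tfidf[2][i_christmas])  # "christmas" occurs in only one
```

In this toy corpus, "dress" and "christmas" have the same raw count in the third document, but the TF-IDF weight of "dress" drops to zero because it appears in every document, while "christmas" keeps a positive weight.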

With all the documents encoded, the next step was to quantify the similarity between the documents and the search terms entered by customers. The team tried an unsupervised version of the nearest neighbor model to find the closest instances in terms of inter-document distances in a vector space. Because Euclidean distance is sensitive to document length, and the documents were of uneven lengths, the team used cosine similarity to find the nearest documents containing the features named in the search terms.
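The difference between the two measures can be shown on a toy example. The sketch below uses only the standard library; the three-word vocabulary, the vectors, and the helper names are hypothetical. scikit-learn's NearestNeighbors accepts a cosine metric, but whether the team used that class is an assumption.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest(query, doc_vectors, k=1):
    """Indices of the k documents most similar to the query by cosine similarity."""
    ranked = sorted(
        range(len(doc_vectors)),
        key=lambda i: cosine_similarity(query, doc_vectors[i]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical vocabulary: ["christmas", "gift", "office"]
long_doc = [10, 10, 0]   # long set of reviews about Christmas gifts
other_doc = [0, 1, 2]    # short set of reviews, mostly about the office
query = [1, 1, 0]        # search term "christmas gift"

print(euclidean_distance(query, long_doc), euclidean_distance(query, other_doc))
print(cosine_similarity(query, long_doc), cosine_similarity(query, other_doc))
print(nearest(query, [other_doc, long_doc]))
```

Euclidean distance places the short, off-topic document closer to the query simply because its counts are small, whereas cosine similarity compares only the direction of the vectors and ranks the long, on-topic document first.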

This model was able to match a search term like “great Christmas gift for girlfriend” with products related to festive occasions, gifting, or girlfriends, thanks to previous customer reviews that described how customers used the products after purchase. This test case demonstrates that even when the customer specified only the occasion and purpose of the purchase, and not the kind of product desired, the model in training was able to provide some relevant options for further review.

However, this model is still limited in that it processes each word individually, ignoring both the relationships among words and any words not included in the current corpus. To capture more of the ambiguity in search terms, the team further explored topic modeling, a technique that helps extract hidden topics from text. Topic modeling could help identify key factors in customers’ online shopping experiences so that the team could make recommendations on both search and non-search improvements.

The team chose to experiment with Latent Dirichlet Allocation (LDA), a topic modeling technique that is often useful for search engines, news article analysis, and similar applications. LDA assumes that each document is generated from a mixture of topics and that each topic is a probability distribution over words. Given a set of documents, LDA reverse engineers this process to find the topics that produced the documents in the first place. Using the Python package Gensim, the team implemented LDA and extracted several distinct and meaningful topics that reveal customer sentiment, preferences, and concerns.
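The generative assumption behind LDA can be made concrete with a short simulation. The sketch below runs the forward process only, using the standard library: it draws a topic mixture for one document and then draws each word by first picking a topic and then picking a word from that topic. The two topics, their word probabilities, and the function names are invented for illustration. Fitting an LDA model, for example with Gensim's LdaModel, works in the opposite direction and estimates the topics and mixtures from the observed words.

```python
import random

def sample_dirichlet(rng, alpha):
    """Draw from a Dirichlet distribution by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(rng, topics, alpha, length):
    """Generate one document under LDA's assumed generative process."""
    theta = sample_dirichlet(rng, alpha)  # this document's topic mixture
    topic_names = list(topics)
    words = []
    for _ in range(length):
        # Pick a topic according to the document's mixture.
        topic = rng.choices(topic_names, weights=theta)[0]
        # Pick a word according to that topic's word distribution.
        vocab, probs = zip(*topics[topic].items())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words

# Two invented topics, each a probability distribution over words.
topics = {
    "gifting": {"gift": 0.5, "christmas": 0.3, "girlfriend": 0.2},
    "fit": {"size": 0.5, "tight": 0.3, "return": 0.2},
}
rng = random.Random(0)  # fixed seed so the run is repeatable
theta, words = generate_document(rng, topics, alpha=[0.5, 0.5], length=8)
print(theta)
print(words)
```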

Overall, the team was able to identify search and non-search solutions to improve customers’ online shopping experience. The research has certain limitations: the models were built from only a few product categories, the website data covered only a short time period, and there was no objective, systematic way to test the models. Nevertheless, it provided a simple demonstration of using machine learning techniques to solve problems in an online retail scenario.

Sometimes the topic keywords alone do not provide enough information to make sense of a specific topic. When the keyword weights in a topic were not informative enough, the team pulled out and examined the most important documents for that topic. By manually going over the customer reviews, the team was able to capture important nuances between the topics and develop a more well-rounded understanding of the factors salient to the consumer online experience.
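Selecting the most important documents for each topic can be sketched as a ranking over per-document topic weights. The weights below are invented, and the function name is hypothetical; in practice the weights would come from the fitted topic model.

```python
def top_documents_per_topic(doc_topic_weights, n=2):
    """For each topic, return the indices of the n documents with the largest weight."""
    n_topics = len(doc_topic_weights[0])
    result = {}
    for t in range(n_topics):
        ranked = sorted(
            range(len(doc_topic_weights)),
            key=lambda d: doc_topic_weights[d][t],
            reverse=True,
        )
        result[t] = ranked[:n]
    return result

# Invented topic mixtures: one row per document, one column per topic.
weights = [
    [0.90, 0.10],
    [0.20, 0.80],
    [0.60, 0.40],
    [0.05, 0.95],
]
top_docs = top_documents_per_topic(weights, n=2)
print(top_docs)
```

The reviews behind the returned document indices are the ones to read manually when a topic's keywords are ambiguous.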


