Understanding the Consumer Voice using Natural Language Processing


In the past, the way companies and consumers interacted was simple, slow, and predictable.

Brands would research their market through traditional surveys and focus groups. Once a new product had been developed, brands would advertise it through traditional media such as TV, radio, print, and billboards, and we, the consumers, would go out and buy it.

A lot has changed in the past couple of decades. Consumers now research products in an instant via search engines, talk openly on social media about the brands and products they like or dislike, and leave immediate feedback in the form of reviews on eCommerce sites.

This means there is a huge swathe of data that companies can use to better understand the digital consumer. There are advantages in this new world: customer feedback arrives faster, and companies can identify consumer trends sooner and create products and services to suit consumers’ needs. However, without the right tools and techniques, this mass of unstructured data can easily turn into a cacophony from which it is difficult to derive useful insight.

Much of this consumer data, whether customer reviews, social media posts, or search engine queries, is in the form of natural language. Natural language processing (NLP) is a range of techniques for analysing and representing naturally occurring text.

Here we will talk through some of the natural language processing techniques and use-cases that brands are using to better understand the voice of the digital consumer.

Customer Data

Customers are leaving a huge digital footprint online. Some forms of this unstructured data that can be analysed with NLP include:

Product (eCommerce) Reviews

Consumers leave direct feedback on the products they purchase via ratings and reviews. In a single day, 200,000 product reviews are written on Amazon. Brands can use this data to understand what consumers like or dislike about their products and their competitors’ products, and where they can improve.

Video Reviews

More and more, influencers and consumers are reviewing products they have purchased in the form of online videos. These videos can be automatically transcribed using AI-driven speech-to-text, so the content can be analysed by brands.

Customer Satisfaction & Loyalty Surveys

Frequently, when customers have made a purchase, received a product or service, or interacted with a customer service agent online, they are prompted to answer a satisfaction survey. Some of this feedback is in the form of a structured response (e.g., a rating), but much of the subtlety of their specific experience can only be captured in their unstructured free-form feedback.

Call transcripts

When customers call customer service lines, their conversations can be transcribed using AI-based speech-to-text, so that the transcripts can be used for compliance. For some companies, these calls number in the thousands every day, and it would be slow and laborious to read through the transcripts or listen to the original calls. Natural language processing can instead be used to identify the common themes and topics in these customer conversations so that they can be addressed.

Chat transcripts

Alongside call centres, many companies interact with customers via live chat. Again, this unstructured conversation can be analysed using NLP.

Chatbot conversations

More recently, companies have turned to AI-based chatbots to automate their interactions with customers. As well as using this data to iteratively improve the accuracy of the chatbot, companies can analyse the natural language data in these chat logs to understand how they can improve their products or services.

Email messages

Many interactions between companies and customers still take place over email, and these email conversations can equally be analysed using NLP.

Social media posts

Consumers are discussing brands and products, and conversing with them, all the time on social media, whether that be sharing a product they love on Instagram, commenting on a brand’s latest video on TikTok, or complaining about a dysfunctional product on Facebook. In one day, 500 million tweets are written, 95 million photos and videos are shared on Instagram, and 720,000 hours of fresh video content are uploaded to YouTube.

Many brands use social listening to gather and analyse this data about their brands and products, but manually reading through posts or using word clouds doesn’t cut it when the posts number in the thousands. NLP can be used to make better sense of this data and find actionable insight.


Forums

Consumers have discussions within different communities on online forums, and there are forums for every niche group you could think of, whether that’s Mumsnet or a subreddit dedicated to video games. What they are saying about particular products and brands can be incredibly insightful.

News Articles & Blogs

Brands and products are often discussed in news articles or blog posts (think “Top 10 Wireless Headphones of 2022”), and brands can analyse this data to understand how their own and their competitors’ products are being rated by industry experts. PR issues also emerge in online news in an instant, and brands can track this data to detect and respond to issues quickly and effectively.

Search Terms

Finally, customers are writing natural language search queries on search engines, eCommerce stores, and company websites. Customers use search to find products that can solve their problems (think “How can I remove this sauce stain on my carpet?”), research a new product they have heard of (think “What is kombucha?”), or compare different brands (think “What is the best brand of washing machine?”). This is one of the most untapped sources for better understanding the mind of consumers. An enormous 5.6 billion searches are made on Google every day, and NLP can be used to analyse search terms by volume and growth.
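
As a rough illustration of analysing search terms by volume and growth, the sketch below counts normalised queries in two periods and reports the relative change. The function name, the sample queries, and the choice to report growth as a simple ratio are assumptions made for this example, not a description of any particular tool.

```python
from collections import Counter


def term_volume_and_growth(previous_queries, current_queries):
    """Return {query: (current_volume, growth)}.

    growth is the relative change against the previous period,
    or None when the query did not appear previously.
    """
    # Normalise case and whitespace so trivially different queries are merged.
    prev = Counter(q.strip().lower() for q in previous_queries)
    curr = Counter(q.strip().lower() for q in current_queries)

    report = {}
    for term, volume in curr.items():
        before = prev.get(term, 0)
        growth = (volume - before) / before if before else None
        report[term] = (volume, growth)
    return report


# Made-up query logs for two consecutive months.
last_month = ["what is kombucha", "best washing machine", "best washing machine"]
this_month = [
    "What is kombucha",
    "what is kombucha",
    "what is kombucha",
    "best washing machine",
    "remove sauce stain carpet",
]

report = term_volume_and_growth(last_month, this_month)
for term, (volume, growth) in sorted(report.items()):
    print(term, volume, growth)
```

In practice, queries would usually be grouped by meaning rather than exact wording before counting, which is where the NLP techniques described below come in.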


Natural Language Processing techniques applied to Customer Data

Text Classification

Text classification, also known as text tagging or text categorisation, is the process of sorting text into organised groups.

Let’s say, for example, that a global eCommerce retailer had a large set of customer feedback responses and wanted to automatically categorise them into predefined groups, such as “complaint about packaging”, “complaint about delivery”, “complaint about faulty product”, “praise for product received”, and “praise for delivery service”. A trained text classification model would allow the retailer to categorise these feedback responses into the different groups automatically.
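
To make the idea concrete, here is a minimal sketch of a text classifier: a small multinomial Naive Bayes model written with only the Python standard library. The training examples are invented, and only four of the categories are used to keep it short. A production system would more likely use a much larger labelled dataset and a fine-tuned language model, so treat this as an illustration of the principle only.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter(labels)      # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            # Start from the log prior, then add the log likelihood of each word.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training data for illustration.
training_data = [
    ("The box arrived crushed and the packaging was torn", "complaint about packaging"),
    ("Packaging was damaged and the box was open", "complaint about packaging"),
    ("Delivery was late and the courier never called", "complaint about delivery"),
    ("My parcel arrived a week late, terrible delivery", "complaint about delivery"),
    ("The product stopped working after two days", "complaint about faulty product"),
    ("Faulty product, it will not switch on", "complaint about faulty product"),
    ("Love the product, it works perfectly", "praise for product received"),
    ("Great product, excellent quality", "praise for product received"),
]
texts, labels = zip(*training_data)
classifier = NaiveBayesClassifier().fit(texts, labels)

print(classifier.predict("The packaging was torn"))
print(classifier.predict("Parcel was late"))
```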

Named Entity Recognition

Named entity recognition (NER) is the process of identifying and categorising key information (entities) in text. These entities could be things such as people, organisations, or locations.

Let’s say, for example, that a company wanted to extract all the brands mentioned in online forums around a particular topic such as skin care. Named entity recognition would allow the company to quickly extract the brands mentioned, which would be a slow process if done manually.
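
The sketch below shows the brand-extraction use case in its simplest form: matching forum posts against a known list of brand names (a dictionary or “gazetteer” approach). This is a deliberately simplified stand-in for NER. A statistical NER model can also recognise brands that are not on any list, which this sketch cannot. The brand names and posts are made up for the example.

```python
import re
from collections import Counter

# Hypothetical skin care brands, used purely for illustration.
BRANDS = ["GlowLab", "Pure Derma", "SkinNova"]


def build_brand_pattern(brands):
    """Compile one case-insensitive pattern that matches any listed brand."""
    # Longest names first, so multi-word names win over shorter overlaps.
    ordered = sorted(brands, key=len, reverse=True)
    alternatives = "|".join(re.escape(brand) for brand in ordered)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def extract_brands(posts, brands):
    """Count how often each listed brand is mentioned across the posts."""
    pattern = build_brand_pattern(brands)
    canonical = {brand.lower(): brand for brand in brands}
    counts = Counter()
    for post in posts:
        for match in pattern.findall(post):
            counts[canonical[match.lower()]] += 1
    return counts


posts = [
    "Switched from GlowLab to Pure Derma and my skin feels great.",
    "Has anyone tried the pure derma night cream?",
    "SkinNova serum broke me out, going back to GlowLab.",
]

mentions = extract_brands(posts, BRANDS)
print(mentions.most_common())
```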

Topic Modelling

Topic modelling refers to the task of discovering the topics that occur in a collection of documents.

Take, for example, a brand that wanted to better understand the topics of conversation about it on social media. Topic modelling could identify topics within the posts that mention the brand or are directed at it, enabling the brand to understand how it is being discussed online.

Text Summarisation

Text summarisation is the task of condensing a piece of text into a shorter version, generating a summary that preserves the meaning while reducing the size of the text. Companies can use text summarisation to take long pieces of text, for example a news article, and summarise the key information so that readers can digest it more quickly.
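
One simple way to illustrate summarisation is an extractive approach: score each sentence by how frequent its words are across the whole text and keep the top-scoring sentences. The sketch below assumes this frequency-based method, a small hand-picked stop-word list, and an invented three-sentence article. Modern systems more often generate new (abstractive) summaries with large language models.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "was", "were", "is",
             "are", "to", "have", "has", "it", "for", "on"}


def content_words(text):
    """Return lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarise(text, max_sentences=1):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average word frequency, so long sentences are not favoured unfairly.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


article = (
    "The new headphones have excellent battery life. "
    "Reviewers praised the battery and the sound quality of the headphones. "
    "The launch event was held in London."
)

summary = summarise(article, max_sentences=1)
print(summary)
```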

Machine Translation

Machine translation is the task of automatically translating natural language from one language to another. Most people will have experienced this first-hand using Google Translate, but machine translation can also be used to translate online conversation in different languages. Many companies sell their products and services across countries, where the customers will provide feedback in a different language. Machine translation can translate this conversation into the company’s main language, so that they are less reliant on foreign language speaking employees or translation services in serving these customers.

Sentiment & Emotion Analysis

Sentiment analysis is a subset of text classification and is the task of categorising text as having positive, negative, or neutral sentiment. Companies can use sentiment analysis to, for example, classify survey responses as positive, negative, or neutral to better understand whether consumers had a good or bad experience with a product or service. Emotion analysis takes this one step further and allows text to be classified into more fine-grained emotions, such as anger, excitement, sadness, or relief.
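
As a minimal sketch of the idea, the example below classifies survey responses with a hand-written sentiment lexicon and a basic negation rule. The word lists and responses are assumptions chosen for illustration. Real systems typically rely on much larger lexicons or on trained classification models, which cope far better with sarcasm, context, and domain-specific language.

```python
import re

# Tiny illustrative lexicons; real ones contain thousands of entries.
POSITIVE = {"love", "great", "excellent", "happy", "fast", "perfect"}
NEGATIVE = {"hate", "terrible", "broken", "slow", "late", "disappointed"}
NEGATORS = {"not", "never", "no"}


def classify_sentiment(text):
    """Label text as positive, negative, or neutral using word polarity."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)  # +1, -1, or 0
        # Flip the polarity when the previous word negates it ("not happy").
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


responses = [
    "I love it, delivery was fast",
    "Not happy, the item arrived broken",
    "The parcel arrived on Tuesday",
]
for response in responses:
    print(classify_sentiment(response), "<-", response)
```

The same structure extends to emotion analysis by replacing the two polarity lists with one word list per emotion.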

Building a platform

A modern data platform makes it possible to ingest, analyse, and present useful insights from text data at scale and in real time. Fortunately, the barrier to entry is lower than ever, with cloud service providers offering an ever-growing list of NLP solutions and open-source companies such as Hugging Face sharing free language models.

It used to be the case that to develop and deploy state-of-the-art NLP models, companies had to train models on huge volumes of annotated data on many expensive GPUs at a huge cost.
Now, with advances in transfer learning, companies can fine-tune huge pretrained models on small volumes of data and apply them to their specific use-case.

In the next part of this series, we will dive into what it takes to develop a modern text analysis data platform.

In summary

The interaction between customers and brands is faster and more complex than ever before, and without the right approach brands can easily become overwhelmed by the amount of data and lose the ability to authentically engage with their customers and respond to their needs. Natural language processing provides the ability for companies to make sense of this data automatically and at scale, so that they can provide better products, better services, and a better overall experience for their customers, and pull ahead of the competition.