Observing Consumer Online Word of Mouth On Social Media Under Scarcity

Investigating the online discourse around financial hardship and product shortages on social media, our research aims to reveal the nuanced ways the cost-of-living crisis and supply chain disruptions affect public sentiment and well-being, highlighting a need for targeted interventions to support those most vulnerable.

Project Overview

Financial and product scarcity not only precipitate economic challenges but also contribute to declines in psychological health (Sommet & Spini, 2022). When experiencing scarcity, individuals tend either to make efforts to acquire scarce products (for example, by hoarding) or to seek a sense of control through means such as consuming high-calorie foods or purchasing luxury items they cannot afford (Cannon, Goldsmith & Roux, 2019). This project seeks to find solutions to alleviate these adverse effects by analysing the evolution of discourse around financial and product scarcity on social media, with the goal of 1) identifying specific concerns and vulnerable consumer groups, and 2) understanding how discussions around product and financial scarcity evolve over time. This pioneering project is the first to harness extensive social media datasets and a wide range of text mining techniques to investigate the multifaceted concept of scarcity.

Figure 1: A Conceptual Model of Scarcity's Impact on Consumer Behaviour and Well-being.

Data and Methods

In our study, we explored British grocery customers’ online word of mouth on social media throughout 2022, from week 1 to week 46, when UK inflation reached 11.1%, a 41-year high (Harari et al., 2022).

Our datasets contain more than 600,000 social media posts, each mentioning at least one of four major British grocery retailers within this observation period.

Here is an overview of our methodological approach:

1) Identifying Scarcity

Our focus was on identifying posts related to scarcity, specifically targeting conversations pertaining to product scarcity, financial scarcity, and price increases. To achieve this, we employed two methodologies: Machine Learning and Large Language Models (LLMs). Each approach provided unique insights and enabled us to accurately identify scarcity messages in the social media posts.

Machine Learning

Strategy: We employed a semi-supervised learning approach using a Random Forest model, a machine learning technique that excels in handling complex datasets with non-linear relationships (Schonlau & Zou, 2020). When Hartmann and colleagues (Hartmann, Huppertz, Schamp & Heitmann, 2019) compared ten approaches to classifying the text of social media posts, the Random Forest classification model exhibited consistently high performance. We therefore chose a Random Forest model.

Data Preparation: The initial phase involved manually labelling 6,000 social media posts to serve as training data for the Random Forest model. These posts were classified based on their discussion of product scarcity, financial scarcity, or price increases.

Feature Engineering: To enable the Random Forest model to identify scarcity-related posts, we engineered features to facilitate pattern recognition within the labelled data, guiding its predictions across the rest of the dataset. We utilised document-level word embeddings to capture the semantic similarities between posts, and an LDA topic model to reveal thematic similarities.

Model Optimisation: Faced with the challenge of imbalanced data - a common issue where discussions of scarcity are relatively infrequent - we upsampled the less common categories. This adjustment prevented our model from learning a bias towards the majority class and enabled it to learn the relationships within the minority classes (i.e. scarcity) more effectively.
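To illustrate the upsampling step, the following is a minimal sketch of random oversampling in Python, not the project's actual pipeline. The function name, label names and toy posts are hypothetical, and it assumes that minority-class rows are simply duplicated at random until every class matches the size of the largest one.

```python
import random
from collections import Counter

def upsample_minority(rows, label_key="label", seed=0):
    """Randomly duplicate rows of smaller classes until every class matches the largest."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap to the majority-class size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced

# Hypothetical toy data: eight general posts and one post per scarcity class.
posts = (
    [{"text": f"general post {i}", "label": "not_scarcity"} for i in range(8)]
    + [{"text": "no milk on the shelves again", "label": "product_scarcity"}]
    + [{"text": "can't afford the weekly shop", "label": "financial_scarcity"}]
)
balanced = upsample_minority(posts)
counts = Counter(row["label"] for row in balanced)
print(counts)
```

In a setup like this, upsampling would normally be applied to the training split only, so that duplicated posts do not also appear in the data used to evaluate the model.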

Large Language Model

Crafting Intuitive Prompts: Our LLM strategy was prompt engineering - a method that involves designing targeted prompts to guide the language model in analysing text data. By crafting prompts that specifically asked the LLM to identify and categorise discussions about product scarcity, financial scarcity, and price increases, we directed the focus and improved its ability to label relevant conversations.

Iterative Refinement: The process of prompt engineering was iterative. We started with broad prompts and gradually refined them based on the LLM's responses, homing in on prompts that produced the most accurate labels.
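The prompt-refinement loop can be sketched as follows. This is an illustrative Python outline under stated assumptions rather than the prompts used in the study: the label set (including a "none" option), the prompt wording and the helper names are all hypothetical, and the call to the language model itself is left out, with its replies simulated as plain strings.

```python
# Hypothetical label set; the "none" fallback is an assumption of this sketch.
LABELS = ("product scarcity", "financial scarcity", "price increase", "none")

def build_prompt(post):
    """Assemble a labelling prompt for a single post."""
    options = ", ".join(LABELS)
    return (
        "You are labelling social media posts about UK grocery retailers.\n"
        f"Choose exactly one label from: {options}.\n"
        "Reply with the label only.\n\n"
        f"Post: {post}\nLabel:"
    )

def parse_label(reply):
    """Map a free-text model reply onto a known label, defaulting to 'none'."""
    cleaned = reply.strip().lower().rstrip(".")
    return cleaned if cleaned in LABELS else "none"

def accuracy(predicted, gold):
    """Share of predictions that match the hand-assigned labels."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

# Simulated model replies for three hand-labelled posts (no real LLM is called here).
replies = [" Product scarcity. ", "financial scarcity", "I am not sure"]
gold = ["product scarcity", "financial scarcity", "price increase"]
predicted = [parse_label(r) for r in replies]
print(build_prompt("No eggs anywhere this week"))
print(accuracy(predicted, gold))
```

Scoring each prompt variant against the same hand-labelled sample gives a simple basis for deciding which refinement to keep.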

2) Discovering Topics

To investigate the conversations about scarcity, we took a closer look at the topics emerging from posts categorised as relating to product scarcity, financial scarcity, and price increases. Our goal was to uncover the specific themes and nuances of these discussions. An LDA topic model - an unsupervised statistical method for discovering the abstract topics among a collection of documents - served as our tool for this exploration.

Targeting Bigrams: To refine our topic model, we employed part of speech tagging, a process that identifies the grammatical categories of words. This enabled us to focus on bigrams - pairs of words that frequently occur together and are significant in our context, such as noun-noun and adjective-noun combinations. This approach allowed us to extract more precise insights that are specifically relevant to the context of grocery shopping.
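As a minimal sketch of this filtering step, the Python example below keeps only noun-noun and adjective-noun bigrams. It assumes the posts have already been tokenised and tagged by a part-of-speech tagger; the tag names (NOUN, ADJ and so on), the function name and the example post are hypothetical.

```python
# Tag patterns to keep: noun-noun and adjective-noun pairs.
KEEP_PATTERNS = {("NOUN", "NOUN"), ("ADJ", "NOUN")}

def extract_bigrams(tagged_tokens):
    """Return adjacent word pairs whose part-of-speech tags match a kept pattern."""
    bigrams = []
    for (w1, t1), (w2, t2) in zip(tagged_tokens, tagged_tokens[1:]):
        if (t1, t2) in KEEP_PATTERNS:
            bigrams.append(f"{w1.lower()}_{w2.lower()}")
    return bigrams

# Hypothetical pre-tagged post (the tagger itself is outside this sketch).
tagged_post = [
    ("The", "DET"), ("delivery", "NOUN"), ("slot", "NOUN"), ("was", "AUX"),
    ("cancelled", "VERB"), ("and", "CCONJ"), ("the", "DET"),
    ("empty", "ADJ"), ("shelves", "NOUN"), ("were", "AUX"), ("shocking", "ADJ"),
]
print(extract_bigrams(tagged_post))  # ['delivery_slot', 'empty_shelves']
```

Joining each pair with an underscore lets a topic model treat the bigram as a single term.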

Refining Analytical Lens: We compiled a custom stop-word lexicon to filter out words that are not related to scarcity. This step was crucial in maintaining the relevance and accuracy of our topics.

Determining the Optimal Number of Topics: A key aspect of our work with the LDA topic model was establishing the optimal number of topics. This determination was guided by an evaluation of the model's perplexity (Grün & Hornik, 2011) - a measure of its predictive performance - and its coherence (Friedman, 2022), which assesses the meaningfulness of the identified topics. Additionally, we employed the R package ‘ldatuning’ (Nikita, 2020) for a comprehensive assessment, utilising metrics such as Arun2010, CaoJuan2009, Deveaud2014, and Griffiths2004 (Arun et al., 2010; Cao et al., 2009; Deveaud, SanJuan & Bellot, 2014; Griffiths & Steyvers, 2004). These evaluations, combined with a qualitative review of the topics for interpretability, informed our decision-making process.
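For readers unfamiliar with perplexity, the short Python sketch below shows how it can be computed from a held-out log-likelihood and used to compare candidate topic counts. It uses the standard definition, the exponential of the negative average log-likelihood per token. The candidate topic counts and log-likelihood values are made up for illustration and are not results from this study.

```python
import math

def perplexity(total_log_likelihood, n_tokens):
    """Perplexity = exp(-log-likelihood per token); lower is better."""
    return math.exp(-total_log_likelihood / n_tokens)

# Made-up held-out log-likelihoods (natural log) for three candidate topic counts.
held_out_log_likelihood = {5: -7200.0, 10: -6900.0, 15: -6950.0}
n_tokens = 1000

scores = {k: perplexity(ll, n_tokens) for k, ll in held_out_log_likelihood.items()}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

Perplexity alone rarely settles the choice, which is why it is combined here with coherence, the other tuning metrics and a qualitative review.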

Fine-tuning for Distinctiveness: By setting the alpha parameter to 0.01, we aimed to promote a more pronounced differentiation among topics. A lower alpha value encourages the model to describe each document with a small number of dominant topics rather than spreading it thinly across many, which makes the topics more distinct and enhances both their interpretability and their visualisation.
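The effect of alpha can be seen by sampling document-topic mixtures directly from a symmetric Dirichlet prior. The Python sketch below is a simulation of the prior only and does not fit a topic model. The comparison value of 1.0, the number of simulated documents and the function names are assumptions made for illustration.

```python
import random

def sample_dirichlet(alpha, n_topics, rng):
    """Draw one topic mixture from a symmetric Dirichlet(alpha) via normalised gamma draws."""
    while True:
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(n_topics)]
        total = sum(draws)
        if total > 0:  # guard against every draw underflowing to zero
            return [d / total for d in draws]

def mean_top_share(alpha, n_topics=20, n_docs=500, seed=42):
    """Average share of the single largest topic across simulated documents."""
    rng = random.Random(seed)
    shares = (max(sample_dirichlet(alpha, n_topics, rng)) for _ in range(n_docs))
    return sum(shares) / n_docs

sparse = mean_top_share(0.01)   # small alpha: one topic tends to dominate each document
diffuse = mean_top_share(1.0)   # larger alpha: weight is spread across many topics
print(round(sparse, 2), round(diffuse, 2))
```

With the small alpha, the largest topic accounts for most of each simulated document, whereas with alpha set to 1.0 it accounts for a much smaller share.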

3) Sentiment and Thematic Comparison

As part of our exploration into scarcity-related discussions, our analysis extended beyond identifying topics to extracting and comparing sentiments within these conversations. This involved two steps: sentiment analysis and the development of an interactive visualisation to examine the dynamics of topics and sentiments over time.

For the sentiment analysis, we chose the ‘sentimentr’ package (Rinker, 2019), which is tailored for sentence-level sentiment detection in text data. This package is designed to account for valence shifters, such as negations and intensifiers, as well as emoticons, making it proficient at understanding the sentiment expressions common in social media posts.
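To show what accounting for valence shifters means in practice, here is a deliberately simplified Python sketch. It is not the algorithm implemented in ‘sentimentr’: the tiny lexicon, the two-word look-back window and the weights are all assumptions chosen for illustration.

```python
# Toy lexicons; the words and weights are illustrative assumptions.
POLARITY = {"good": 1.0, "fresh": 0.5, "bad": -1.0, "mouldy": -1.0}
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "really": 1.5, "extremely": 2.0}

def sentence_score(sentence):
    """Sum word polarities, flipping or scaling them when shifters precede the word."""
    tokens = sentence.lower().replace(",", " ").replace(".", " ").split()
    score = 0.0
    for i, token in enumerate(tokens):
        if token not in POLARITY:
            continue
        value = POLARITY[token]
        # Inspect up to two preceding words for negators and intensifiers.
        for prev in tokens[max(0, i - 2):i]:
            if prev in NEGATORS:
                value = -value
            elif prev in INTENSIFIERS:
                value *= INTENSIFIERS[prev]
        score += value
    return score

print(sentence_score("The bread was good."))           # 1.0
print(sentence_score("The bread was not good."))       # -1.0
print(sentence_score("The bread was not very good."))  # -1.5
```

A plain word-count approach would score all three sentences as equally positive, which is the weakness that valence-shifter handling addresses.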

To provide insight into the evolving sentiment and themes, we developed an interactive visualisation using the web application framework package ‘shiny’ (Chang et al., 2023). This tool displays the percentage distribution of topics and the sentiment scores for the selected type of scarcity (i.e. product, financial, or both) and the topics chosen for exploration, with a focus on changes from week to week. This temporal analysis offers a dynamic view of how public conversations about each type of scarcity change, revealing trends and patterns in scarcity discussions.
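The weekly figures behind a view like this can be produced with a simple aggregation. The Python sketch below assumes each post already carries a week number, an assigned topic and a sentiment score. The field names, function name and toy values are hypothetical, and the interactive front end is not reproduced.

```python
from collections import Counter, defaultdict

def weekly_summary(posts):
    """Per week: percentage share of each topic and the mean sentiment score."""
    topic_counts = defaultdict(Counter)
    sentiments = defaultdict(list)
    for post in posts:
        topic_counts[post["week"]][post["topic"]] += 1
        sentiments[post["week"]].append(post["sentiment"])
    summary = {}
    for week in sorted(topic_counts):
        total = sum(topic_counts[week].values())
        summary[week] = {
            "topic_share_pct": {t: 100.0 * n / total for t, n in topic_counts[week].items()},
            "mean_sentiment": sum(sentiments[week]) / len(sentiments[week]),
        }
    return summary

# Hypothetical labelled posts.
posts = [
    {"week": 1, "topic": "Delivery Issues", "sentiment": -0.5},
    {"week": 1, "topic": "Delivery Issues", "sentiment": -0.25},
    {"week": 1, "topic": "Delivery Issues", "sentiment": 0.0},
    {"week": 1, "topic": "Cost of Living Crisis", "sentiment": -0.25},
    {"week": 2, "topic": "Delivery Issues", "sentiment": 0.5},
    {"week": 2, "topic": "Cost of Living Crisis", "sentiment": -0.5},
]
summary = weekly_summary(posts)
print(summary[1])
```

Filtering the posts by scarcity type before calling the function would give the per-type views described above.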

Additionally, the tool provides a comparative mode that allows users to compare topic distributions between product and financial scarcity over time. This comparative analysis uncovers how the thematic focus and public interest may shift between different forms of scarcity as events unfold or as public awareness changes.

Key findings

The classification model and large language model approaches each identified 20 topics. The classification model found 12 topics related to scarcity, whereas the large language model found 14. The two approaches overlapped, with 9 comparable topics. Notable differences also emerged, with the LLM identifying discussions related to the broader impacts on scarcity. Table 1 presents the topics associated with scarcity as identified by each approach.

Table 1. Overview of Topics Identified Through Classification and Language Model (LLM) Approach.

| Classification: Topic | Classification: Themes | LLM: Topic | LLM: Themes |
| --- | --- | --- | --- |
| Economic Hardship | Individuals discussing low pay and insecure contracts, leading to hardship and difficulties in affording healthy food, not receiving a living wage, and observing inequality in pay alongside food waste during societal struggles. | Living Wage and Income Disparity | Discussions focus on the living wage, cost-of-living increases, energy bills, rent, and income inequality, drawing comparisons between consumer struggles and corporate profits. |
| Online Product Availability | Queries about stock levels and product availability highlight issues with finding specific items, requests for product returns, online unavailability, and limited stock. | Product Availability | Consumers face challenges with stock availability, difficulty finding stores selling specific products, and questions about discontinued items, while discussing broader supply chain issues. |
| Limited Choices for Customers with Dietary Restrictions | Challenges in finding vegan, gluten-free, and dairy-free products, including those with meat for non-vegan individuals with gluten intolerance. | Dietary Preferences and Accessibility | The lack of options for gluten-free, vegan, and other dietary preferences, along with changes in product recipes. |
| Order and Delivery Issues | Customers face refund issues, including not being paid in full, charges for unordered products, missing items, and problems with substitutions. | Delivery Issues | Customers report recurring problems with delivery, including missing items, difficulty obtaining refunds, and unsatisfactory substitutions, particularly impacting housebound individuals and those with specific needs. |
| Delivery Cancellations | Issues with cancelled or unarrived deliveries and a lack of notification while still being charged. The impact on vulnerable consumers who are housebound, including those with disabilities, COVID-19, or pregnancy, is a recurring theme. | Delivery Cancellations | Delivery service issues, including delayed or cancelled orders without notification and failed payments, lead to inconvenience and immediate food access problems, especially for vulnerable groups. |
| Charity and Support | Discussions about food banks, donations, local community support, providing back-to-school items for those in need, addressing homelessness, and supporting baby banks. | Food Insecurity and Community Support | Themes revolve around food banks, donations, volunteering, shoplifting to survive, the reliance on charity, and prioritising those with dietary restrictions. |
| Product Quality: Expiration | Concerns about short-dated, mouldy, and out-of-date products. | Product Quality: Expiration | Frequent issues with short-dated, mouldy, and out-of-date food lead to waste, meal preparation challenges, and dissatisfaction due to perceived money loss. |
| Product Price and Quantity | Specific price increases, reductions in product quantity, and counter-intuitive offers, such as larger packs costing more. | Promotional Problems | Shrinkflation, deceptive discount practices, and inconsistent pricing strategies across stores lead to consumer frustration and scepticism towards promotions. |
| Discount | Dissatisfaction with promotions, such as reduced items being scanned at full price, paying the same amount as other stores despite discounts, and price increases before promotions, alongside discussions on price matching and loyalty schemes. | Retail Operations | Discussions cover a range of issues, including the availability of open tills, store stock levels, store appearance, and cleanliness. |
| In Store Product Shortage / Unavailability | Customers cannot find desired products, face empty shelves, and encounter closed counters such as fish, deli, and cafes, and stores closing early. | Macro Impacts | The impact of Brexit and the war in Ukraine on supply, quality, and cost is a concern, alongside a desire for support for British farmers. |
| Milk Scarcity | Discussions primarily focus on concerns about milk sell-by dates being removed (relying on the sniff test), supply chain issues affecting milk, and general food waste. | Socio-Political Concerns | Concerns range from suppliers violating policies, the impact of sacking unvaccinated NHS staff, to biofuel's environmental impact. |
| Financial Strain: Fuel Prices | Mainly focusing on fuel-related financial strains, themes include issues with fuel cards, extra fuel charges, affordability of fuel, and incidents of paying twice for groceries, leading to customers leaving their shopping. | Cost of Living Crisis | Increasing prices and supermarket pricing strategies are central to conversations about the cost-of-living crisis. |
| | | Product Quality: Safety and Integrity | Incidents of food smelling or tasting off, contamination with pests or foreign objects, and changes in recipes not matching advertisements. |
| | | Product Quality: General | Complaints include the general quality of products such as taste, damaged items and mouldy products. |

Our research found that the priority groups experiencing difficulty in relation to scarcity are those who are housebound and rely on delivery services (e.g. because of COVID-19, old age, pregnancy, or disability), those on low incomes who rely on food banks, and those with restricted diets.

An examination of the prominence of topics within product scarcity and financial scarcity revealed distinct patterns within the online discourse. While there is a broad engagement across all topics in discussions pertaining to both forms of scarcity, our analysis identified a differential prevalence of certain topics within each category. Conversations about product scarcity tend to focus on the following topics: product quality (general, expiration, and safety), product availability, delivery issues and cancellations, dietary preferences, and retail operations. In contrast, discussions about financial scarcity gravitate towards the following topics: macro impacts, socio-political concerns, living wages and income disparity, cost-of-living crisis, food insecurity and community support, and promotional problems. For a detailed visual representation of these findings, refer to Figures 2 and 3.

Figure 2: Product Scarcity - Topic Distribution and Sentiment Over Time

Figure 3: Financial Scarcity - Topic Distribution and Sentiment Over Time

When comparing the proportion of discussions by product and financial scarcity over time, posts about the cost of living predominantly align with financial scarcity, reflecting economic concerns, while macro impact posts relate more to product scarcity, hinting at global supply issues. Notably, macro impact discussions spiked in the context of financial scarcity during week 46, perhaps reflecting a shift in economic focus. Peaks in discourse - macro impacts in week 16 for product scarcity and the cost of living in week 26 for financial scarcity - may relate to specific events or policy changes. These trends are visually detailed in Figure 4.

Figure 4: Weekly Fluctuations in Discussions of 'Macro Impacts' and 'Cost of Living Crisis' by Type of Scarcity.

Value of the research

Our research offers multiple insights and strategies aimed at empowering consumers and improving their health and well-being by fostering a greater sense of personal control. The key recommendations from our findings include:

Building Trust through Transparency and Communication: Our findings indicate a significant consumer need for clear information, especially during times of scarcity. Transparent communication regarding promotions, product availability, changes, and issues addresses consumers' expressed frustrations. This approach is critical in enabling consumers to make informed decisions and provides them with a sense of personal control to manage their resources more effectively. To address this, policies could mandate standardised units for product quantities and clear, unambiguous labelling, allowing easy product comparison and informed decisions. Furthermore, policies requiring retailers to offer accurate and clear information on online stock levels, delivery disruptions, and substitutions would boost consumers' sense of control, trust, and satisfaction.

Personalised System Improvements: The identified topics around delivery issues, dietary preferences, and product expiration highlight a consumer demand for more tailored shopping experiences. By developing systems that learn to align product offerings with individual dietary needs and preferences, improve the reliability and personalisation of delivery services, offer suitable substitutions with an option for consumers to decline, and manage the sale of short-dated items to reduce food waste, retailers can better cater to the challenges and needs of consumers. This approach is especially important for those with dietary restrictions or people who are dependent on delivery services.

Enhanced Refund Processes: The recommendation for enhanced refund processes is driven by the need to alleviate financial stress among consumers, particularly those living paycheck to paycheck. This need becomes important when retailer errors result in individuals being left without food and without additional financial resources. Quick and accessible refunds can provide immediate relief in such situations, ensuring consumers can afford food and are not penalised for errors beyond their control. Implementing policy mandates for refund processes ensures consistency across retailers, safeguarding consumer well-being.

Positive Framing in Communications: Advertisements and public messaging that emphasise positive actions and solutions can boost a personal sense of control and encourage proactive behaviour. Focusing on empowerment and the steps individuals can take to improve their scarcity situations may instil a personal sense of control and direct attention towards managing scarce resources effectively.

Digital Self-Management Tools: Enhancing consumer empowerment and control can be achieved by integrating financial literacy and consumer rights education into digital self-management tools. This approach would offer users accessible education on financial planning, consumer rights, and informed purchasing decisions through apps that assist with budget management, expense tracking, and accessing affordable products. By combining these elements, individuals gain the knowledge to mitigate economic challenges and optimise spending, while also being equipped with the tools to take charge of their financial and shopping decisions. This solution promotes autonomy and consumers' personal sense of control.

These recommendations may be particularly effective when combined. For example, our suggestion for personalised system improvements can be integrated with promoting healthy, affordable eating.

Insights

  • Differentiated Scarcity Discussions: Our research demonstrates that the discourse surrounding financial and product scarcity on social media differs, pinpointing specific topics and concerns relevant to each area. These delineations highlight the need for customised strategies to effectively tackle the distinct challenges presented by each type of scarcity.
  • Vulnerable Consumers: Our findings uncover the needs of vulnerable groups, including those reliant on food banks and delivery services and those with dietary restrictions, highlighting the importance of prioritising these consumers in interventions.
  • Strategic Recommendations: We propose actionable recommendations such as enhancing transparency, personalising shopping experiences, and promoting accessible, healthy food options, aimed at empowering consumers to navigate scarcity with greater autonomy.

References

Arun, R., Suresh, V., Veni Madhavan, C. E., & Narasimha Murthy, M. N. (2010). On finding the natural number of topics with latent Dirichlet allocation: Some observations. In Advances in Knowledge Discovery and Data Mining: 14th Pacific-Asia Conference, PAKDD 2010, Hyderabad, India, June 21-24, 2010, Proceedings, Part I (pp. 391-402). Springer Berlin Heidelberg.

Cannon, C., Goldsmith, K., & Roux, C. (2019). A self‐regulatory model of resource scarcity. Journal of Consumer Psychology, 29(1), 104-127.

Cao, J., Xia, T., Li, J., Zhang, Y., & Tang, S. (2009). A density-based method for adaptive LDA model selection. Neurocomputing, 72(7-9), 1775-1781.

Chang, W., Cheng, J., Allaire, J., Sievert, C., Schloerke, B., Xie, Y., Allen, J., McPherson, J., Dipert, A., & Borges, B. (2023). shiny: Web Application Framework for R. R package version 1.7.5.1, <https://CRAN.R-project.org/package=shiny>.

Deveaud, R., SanJuan, E., & Bellot, P. (2014). Accurate and effective latent concept modeling for ad hoc information retrieval. Document numérique, 17(1), 61-84.

Friedman, D. (2022). topicdoc: Topic-Specific Diagnostics for LDA and CTM Topic Models. R package version 0.1.1, <https://CRAN.R-project.org/package=topicdoc>.

Griffiths, T. L., & Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences, 101(suppl_1), 5228-5235.

Grün, B., & Hornik, K. (2011). topicmodels: An R package for fitting topic models. Journal of Statistical Software, 40(13), 1-30.

Harari, D., Francis-Devine, B., Bolton, P., & Keep, M. (2022). Rising cost of living in the UK. London: House of Commons Library. https://commonslibrary.parliament.uk/research-briefings/cbp-9428

Hartmann, J., Huppertz, J., Schamp, C., & Heitmann, M. (2019). Comparing automated text classification methods. International Journal of Research in Marketing, 36(1), 20-38.

Nikita, M. (2020). Select number of topics for LDA model. CRAN R Project.

Rinker, T. W. (2019). sentimentr: Calculate Text Polarity Sentiment. http://github.com/trinker/sentimentr

Schonlau, M., & Zou, R. Y. (2020). The random forest algorithm for statistical learning. The Stata Journal, 20(1), 3-29.

Sommet, N., & Spini, D. (2022). Financial scarcity undermines health across the globe and the life course. Social Science & Medicine, 292, Article 114607.

Research theme

  • Health
  • Societies

Programme theme

  • Statistical Data Science
  • Artificial Intelligence

People

Liam Bailey, Data Scientist, Leeds Institute of Data Analytics (LIDA)

Dr Boshuo Guo, Lecturer, School of Design, The University of Leeds

Dr Aulona Ulqinaku, Associate Professor, Business School, The University of Leeds

Partners

A major British grocery retailer

Funders

Funded by Consumer Data Research Centre (CDRC)