Holly Clarke is an intern on our Data Scientist Internship Programme. She is working on the LifeInfo project with Michelle Morris, researching attitudes towards novel lifestyle and health data linkages and how access to this information could improve public health. This is the first in a series of blogs she has written for the Consumer Data Research Centre.
The ongoing COVID-19 pandemic means governments have been looking for technological solutions to reduce the spread of the virus. Contact-tracing apps are now being used, from Singapore’s ‘TraceTogether’ to ‘StopKorona!’ in North Macedonia. As restrictions on movement are eased in many countries, these apps aim to identify whether an individual has been in contact with an infected person through Bluetooth and/or GPS signals. They can then alert users and give early warning of new outbreaks. As these apps have been adopted, a huge amount of online discussion has followed about the benefits of, and concerns around, sharing personal data for the sake of public health.
So much of this conversation seems novel. Several months ago, most people in the UK would have balked at the possibility of a government app privy to information about who they come into contact with. Yet the phrase “we are in unprecedented times” has been difficult to escape in recent weeks.
For me, the onset of the pandemic has coincided with a new research position with the LifeInfo project, under the supervision of Michelle Morris, an expert in Health Analytics. This project focuses on people’s attitudes towards sharing their lifestyle data – from supermarket loyalty cards to fitness apps – and linking this to health records to drive research into the risk factors of non-communicable diseases such as diabetes, heart disease and certain cancers. Access to these data could be of immense benefit, as millions of deaths each year can be attributed to poor diet and physical inactivity.
At the heart of this project is also the vital recognition that we must understand people’s concerns about such initiatives and adapt research accordingly. Part of my role is analysing free-text survey responses about the circumstances under which people would share different types of lifestyle data for health research and factors that might impact their decision to do so. While the conversation about contact tracing apps and their place in our lives is certainly novel, many of the words and topics about these apps mimic those that come out of my analysis.
This made me wonder how I could tap into the conversation about contact tracing apps and the insights this could give about data sharing, privacy, surveillance and public health. For the past two weeks I have been scraping tweets about coronavirus apps and will continue to do so as they are developed, trialed, and used in countries around the world.
This is the first of a series of short blog posts about attitudes towards contact tracing apps and data-sharing for public health. Using text analysis and Natural Language Processing (NLP), I will be answering questions about the conversation around these apps. What topics are prevalent and how do people feel about sharing their data? How does this look in different countries and what role does context play? How does this relate to more general attitudes about data-sharing for public health benefit and what might the impacts be going forward? Twitter is by no means a direct expression of public opinion, but analysing tweets can give us important insights about people’s attitudes, news stories that shape narratives, and shifts in opinion over time.
The first thing to establish is whether people actually care about contact tracing apps. Here, the answer is an undeniable “yes!”. A total of 12,593 English-language tweets about COVID-19 apps were collected over the two-and-a-half-week period between 24 April 2020 and 12 May 2020¹. Governments need around 60% of the population (80% of UK smartphone users) to enable contact tracing apps for them to be effective, which could prompt many people who have not given much thought to data-sharing before to consider their relationship with it.
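As a rough illustration of the collection criteria described here and in footnote 1, the filter could be sketched as follows. The regular expression, the `keep_tweet` function and its parameters are assumptions made for this example; the post does not give the exact query that was used.

```python
import re
from datetime import date

# Hypothetical pattern: the post only says the search covered
# 'corona/coronavirus/covid/covid-19 app' as a single phrase,
# allowing alternative punctuation and spacing.
APP_PATTERN = re.compile(
    r"\b(?:corona(?:[\s-]*virus)?|covid(?:[\s-]*19)?)[\s-]*apps?\b",
    re.IGNORECASE,
)

# Collection window stated in the post.
START, END = date(2020, 4, 24), date(2020, 5, 12)


def keep_tweet(text, created, lang):
    """Return True if a tweet is English, inside the window, and mentions a COVID-19 app."""
    return (
        lang == "en"
        and START <= created <= END
        and APP_PATTERN.search(text) is not None
    )
```

For example, `keep_tweet("Downloaded the COVID-19 app today", date(2020, 5, 5), "en")` passes the filter, while a tweet that mentions "covid apple pie" does not, because the pattern requires "app" to end at a word boundary.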
Tweets about coronavirus apps have gone from relatively low numbers (just 203 tweets on 25 April, the first full day of collection) to peaks of over 1,300 tweets per day on 27 April and 5 May. These peaks can be linked to the ‘COVIDSafe’ app release in Australia and the announcement that the NHS ‘track and trace’ app was to be trialed on the Isle of Wight in the UK (see figure 1a).
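Daily counts like these can be produced by grouping tweet timestamps by calendar day. The sketch below is a minimal example under assumed inputs: the function names are hypothetical, the timestamps are made up for illustration, and the threshold is arbitrary.

```python
from collections import Counter
from datetime import datetime


def tweets_per_day(timestamps):
    """Count tweets per calendar day from ISO 8601 timestamp strings."""
    return Counter(datetime.fromisoformat(ts).date() for ts in timestamps)


def peak_days(daily_counts, threshold):
    """Return the days whose tweet count exceeds the threshold, in date order."""
    return sorted(day for day, n in daily_counts.items() if n > threshold)


# Made-up timestamps for illustration only, not the collected data.
sample = [
    "2020-04-25T09:15:00",
    "2020-04-27T08:00:00",
    "2020-04-27T12:30:00",
    "2020-04-27T18:45:00",
    "2020-05-05T10:00:00",
    "2020-05-05T21:10:00",
]
counts = tweets_per_day(sample)
peaks = peak_days(counts, threshold=1)
```

With the real dataset, a threshold of around 1,300 would pick out the two peak days mentioned above.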
Some tweets are geo-located, indicating the country and even the city in which they were produced. Although these tweets make up only a small proportion (2.5%) of the total collected, they act as a sample to indicate where people were tweeting about COVID-19 apps.
Most tweets are shown to be produced in the UK and Australia. In these countries contact tracing apps have been nationally introduced and promoted (in the case of Australia) or locally trialed (in the case of the UK). Canada and the US currently constitute only a small proportion of tweet locations; however, this could change in the forthcoming weeks as these countries are yet to announce apps.
India, where the contact tracing app ‘Aarogya Setu’ has been introduced amid controversies about personal privacy, is the third most common tweet location. Many more tweets about this app have likely been written in languages other than English, and so are not included in the dataset. This is important to consider, as insights gained from analysing these tweets will reflect a majority Western perspective. Some of the first countries to introduce contact-tracing apps are non-English speaking (for example South Korea) and, in some cases, also restrict access to Twitter (as in China).
Over 80% of the geo-located tweets were produced in two countries – the UK and Australia. Yet this is not consistent across time. As shown in figure 1b, during the first week of data collection the conversation was dominated by the Australian context (shown in blue), which is consistent with the first peak of tweets related to the roll-out of the Australian contact tracing app. In the second week of data collection, the conversation shifted towards the UK context (shown in orange) as the NHS app was trialed on the Isle of Wight.
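The week-by-week comparison could be computed along the following lines. This is a sketch only: `country_share_by_week`, the `(date, country)` input format and the sample rows are assumptions for illustration and do not reflect the collected data.

```python
from collections import Counter, defaultdict
from datetime import date


def country_share_by_week(tweets, start):
    """Return {week_index: {country: share}} for geo-located tweets.

    `tweets` is an iterable of (created_date, country) pairs, where
    country is None for tweets that are not geo-located.
    """
    weekly = defaultdict(Counter)
    for created, country in tweets:
        if country is None:
            continue  # only geo-located tweets contribute
        week = (created - start).days // 7
        weekly[week][country] += 1
    return {
        week: {c: n / sum(counter.values()) for c, n in counter.items()}
        for week, counter in weekly.items()
    }


# Made-up rows for illustration only.
sample = [
    (date(2020, 4, 27), "Australia"),
    (date(2020, 4, 28), "Australia"),
    (date(2020, 4, 29), "UK"),
    (date(2020, 4, 30), None),
    (date(2020, 5, 5), "UK"),
    (date(2020, 5, 6), "UK"),
    (date(2020, 5, 6), "Australia"),
    (date(2020, 5, 7), "UK"),
]
shares = country_share_by_week(sample, start=date(2020, 4, 24))
```

Here week 0 starts on the first day of collection, so the shares in `shares[0]` and `shares[1]` correspond to the first and second weeks respectively.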
Next week’s blog post will focus on what people are saying about COVID-19 apps, whether attitudes are positive or negative, and whether this differs by country and context. The wordcloud below² gives an initial insight into the current conversation around these apps. Two findings stand out. First, context appears to play a large role in shaping the conversation. Words referring to key places and actors (both technological and state) are frequently included in tweets. These include ‘government’, ‘nhs’, ‘apple’, ‘google’, ‘India’, ‘Australia’ and ‘Isle [of] Wight’. Second, it is striking that ‘privacy’ and ‘trust’ are amongst the most frequently used words, showing data management and personal privacy to be at the forefront of discussion.
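The word frequencies behind a wordcloud like this can be sketched as a simple count that skips stop words and the search terms, as described in footnote 2. The stop-word list, the tokenising pattern and the sample tweets below are illustrative assumptions; a real analysis would use a fuller stop-word list from an NLP library.

```python
import re
from collections import Counter

# Small illustrative stop-word list, not the one used for the wordcloud.
STOP_WORDS = {"the", "is", "and", "a", "to", "of", "i", "my",
              "with", "for", "in", "do", "not"}
# Search terms excluded because they appear in almost every tweet.
SEARCH_TERMS = {"corona", "covid", "app"}


def word_frequencies(tweets):
    """Count words across tweets, skipping stop words and search terms."""
    counts = Counter()
    for text in tweets:
        # Lower-case and keep runs of letters (and apostrophes) as words.
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOP_WORDS and word not in SEARCH_TERMS:
                counts[word] += 1
    return counts


# Made-up tweets for illustration only.
sample = [
    "I do not trust the government with my data",
    "Privacy is key for the covid app",
    "NHS app and privacy: trust in the government",
]
freqs = word_frequencies(sample)
```

The resulting counts can be passed to any wordcloud tool that accepts a word-to-frequency mapping.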
¹Search terms included any reference to ‘corona/coronavirus/covid/covid-19 app’ as a single phrase, inclusive of alternative punctuation and spacing.
²Note: common ‘stop words’, for example ‘is’ or ‘and’, are excluded. The words ‘corona’, ‘covid’ and ‘app’ are also excluded, as these were the search terms and thus highly frequent.