Promoting Recruitment for Open Science and Pipelines of Employability & Retention (PROSPER)
By Kylie Norman (LIDA Data Scientist Development Programme Manager) & Dr Emily Ennis (Research Culture Manager), University of Leeds
Data professionals are key to delivering on a wide range of public goods, for example the international SDGs, better healthcare, or understanding how people interact in an environment to improve emergency planning or urban infrastructures. By rigorously profiling, cleaning, analysing and visualising data, data professionals tell us not only what is important, missing, biased or possible with data, but also what is actionable and insightful. This is of key importance for researchers and policymakers. We need more data professionals and more diversity in the data workforce, so that data methods, insights and products are more representative and rigorous.
Yet the UK supply of data professionals currently falls far short of demand (a graduate supply of ca. 10,000 against a demand of 178,000 data roles),[1] and there is widespread misunderstanding about what data professionals need – from training, management or team members – in order to do their jobs.[2] The Technician Commitment[3] goes some way towards addressing the gap between technicians’ progression and recognition pathways and those of academics, but this commitment is not specific to data professionals, who are currently subject to an even sharper set of circumstances:
- The number of data professional roles available exceeds the supply of candidates, and the differences between data roles, and between the requirements of one role and another, are not well defined (e.g. Research Software Engineer vs Data Scientist).[4] This makes the recruitment terrain difficult to navigate for emerging data professionals.
- The international data workforce and field of data science are expanding at pace and ahead of sector-recognised authorities or standards of practice, so that it’s possible for bad data practice to go unchallenged “in the wild”.[5]
- With the rapid evolution of data methods and practices, the skillset of a data professional is at risk of being subject to infinite demand, which can make specialising seem risky in terms of future job opportunities.[6] If the commitment to give data professionals the requisite training time is not followed through, they may feel that their methods and approaches are obsolete or that they are falling behind in terms of their own development and knowledge of data tools.
- There is an acknowledged underrepresentation of people with protected characteristics (and of those from low socioeconomic backgrounds) in data professional roles.[7] This means that there is a risk that (a) biased datasets may not be as widely or frequently challenged or understood and (b) data methodologies or models may not account for a representative spectrum of human lived experience (for example where facial recognition systems have been trained using white men only and cannot therefore distinguish between the characteristics of those who do not fit this category).[8]
- The peri- and post-pandemic periods have seen an uptick in instances of mental ill health in the UK, particularly amongst young people.[9] In the UK, there is a gap between the perceived scale of mental ill health, which is underestimated, and actual instances of mental ill health.[10] Early career data professionals, whose needs are not fully understood or supported “in the wild”, could therefore be at higher risk of mental ill health.
Project overview
PROSPER arose out of a growing awareness of the need to achieve the following three aims for LIDA’s Data Scientist Development Programme (DSDP):
- A more diverse and inclusive recruitment process so that the Programme, and data science conducted through it, could be more representative;
- A learning environment that would equip early-career data scientists for the challenges endemic in the data workforce;
- Up-skilling the DSDP leadership team to deliver an employability-building programme that responds to (and helps shape) the needs of the current UK data workforce.
Not much attention has yet been paid to the diversity of needs of early-career data professionals in practice and outside of academic literature.[11] Presented with a need to address this for LIDA’s own DSDP, project team members Norman and Ennis applied for Research England Enhancing Research Culture (ERC) funding, and the project was awarded £18k for 2023-24 to deliver on the following objectives:
- Establish and resource a DSDP-specific mentoring service and establish an M&E (monitoring and evaluation) framework for assessing its effectiveness;
- Understand more about the interpersonal skill-building needs of early-career data professionals to enhance their abilities to self-manage; grow a training curriculum for this;[12]
- Provide greater wellbeing support for DSDP data scientists;
- Assess the impact of positive action for DSDP recruitment of women and Global Majority candidates on the data science enacted on the Programme;
- Assemble a positive action case for those from low socioeconomic backgrounds. Although this is not a protected characteristic under the 2010 Equality Act, it is an underrepresented demographic in the UK data workforce.
Fig. 1 Project Metrics Summary: the who, what, when, where and how.
The project team and DSDP mentors, Kim Wright and Karen Fletcher, undertook necessary training at the outset of PROSPER in order to undertake research and to deliver on the objectives, including Mental Health First Aid (Norman and Wright), Research Methods (Norman and Ennis) and ADHD & Anxiety (Norman, Wright and Fletcher). Norman was able to implement this training in designing and delivering new DSDP training; Norman and Ennis used qualitative research methods training to structure surveys and interview questions and to identify emerging themes; and Norman, Wright and Fletcher used accessibility training to assist in building a resource library on supporting neurodivergent people on the DSDP.
Ten data subjects from the 2023-24 DSDP data scientist cohort self-selected to become part of the PROSPER research study, consenting under the project’s ethics. These data subjects were surveyed four times throughout the year via anonymised pulse surveys to ascertain their thoughts over time on DSDP recruitment, mentoring, and employability and wellbeing provision. PROSPER PI, Norman, was able to use survey results to sense check the DSDP interpersonal curriculum and mentoring provision in real time, adapting materials and approaches based on recommendations in feedback.
To establish the DSDP mentoring partnership, the 2023-24 data scientist cohort took part in a mentoring initiation workshop in October 2023, at which they co-produced the DSDP mentoring mission with their mentoring service leads and agreed a set of hero themes for mentoring. These themes included skills such as Advocacy & Allyship, Leadership and Confidence-building, and perception of their importance across the cohort of data scientists was tracked over time through the quarterly pulse surveys.
The mentoring team supported each mentoring hero theme by co-producing a mentoring resource library of worksheets, materials and reading per theme, led by PI Norman, who is CMI L5 Coaching certified. Through mentoring peer support sessions, the mentoring team agreed strategies for appropriate mentoring partnership contracting, goal-setting, safeguarding and service evaluation. One of the ideas that arose as a result of these sessions, which project funding was able to realise, was a mentoring card deck for the DSDP, designed to be used between mentor and mentee in-session. The ‘suits’ for these cards were determined by feedback received through the project’s research and were directly informed by its data subjects. Two example cards from the deck can be seen below in Fig. 2.
Mental Health First Aid training for Norman and Wright helped inform the design of wellbeing-themed cards in the mentoring card deck, for example on understanding workable range and de-escalating anxiety. This training also enabled Norman to design a new wellbeing induction to the Programme, grounded in clear signposting to services, supportive line-manager ‘traffic light’ meetings and safe spaces into which to bring emergent issues affecting mentees such as interpersonal – and even international – conflict (for example the war in the Middle East, which affected some data scientists in 2023-24).
Fig. 2 Mentoring card deck sample cards © Nifty Fox.
The success of these safe spaces relied on them being properly and conscientiously established from the outset of the Programme through discussion of values, ways of working and research culture. Integral to this was the fact that all data scientists should feel included in the DSDP team and able to show up as their authentic selves: to feel that this was a place of belonging.
This project also saw and assessed DSDP 2023-24 data scientists tackling the leadership challenge of designing and delivering primary school data skills engagement as part of LIDA Open Data Science for Schools (LODSS).
Fig. 3 Live-scribed pictogram of LIDA Open Data Science for Schools local challenges discussion with Victoria Primary schoolchildren, July 2024.
Carrying on the Programme’s previous work on inclusive recruitment through positive action, PROSPER PI, Norman, and the DSDP equity, diversity and inclusion Chair, Dr Sajid Siraj, collaborated on the University’s first evidence base from the UK data workforce for positive action recruitment of those from a low socioeconomic background. Positive action is the stipulation in the 2010 Equality Act that an applicant of equal merit, who has a protected characteristic underrepresented in the workforce, can be preferred over a candidate of equal merit who does not share this characteristic. The evidence of the underrepresentation was compelling[13] and the case passed, resulting in the 2024-25 recruitment campaign attracting a high number of applicants self-identifying as from low socioeconomic backgrounds. This resulted in 3 appointments to the Programme 2024-25 of those identifying as being from a low socioeconomic background.
Data and methods
PROSPER Co-I, Ennis, conducted 5 recorded interviews with 2023-24 DSDP data scientists who self-selected to be data subjects, and 5 recorded interviews with industry/public sector (non-academic) data professionals, all of whom had previously worked as LIDA Data Scientists. The questions asked at interview were devised from the PROSPER project brief and Ennis used the Research Methods training to ensure a safe and inclusive interview space. These interviews were then transcribed and subjected to initial topic analysis (see Fig. 4 below) by Norman.
Fig. 4 Word cloud demonstrating the prominence of words in PROSPER interviews with international data scientists 2023-24.
The Fig. 4 cloud demonstrates the extent to which visa issues and sponsorship to work in the UK dominate conversations around career progression for international data scientists hoping to secure data roles in the UK. “Rejection” and its related word forms appear throughout, which is indicative of the negative lens through which many 2023-24 employability conversations took place.
Fig. 5 Word cloud demonstrating the common themes in discussions of data science, mentoring and leadership in industry contexts. People, mentoring and communication come through strongly as significant for early career data professionals in these contexts.
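A word cloud such as those in Figs. 4 and 5 is, at root, a picture of word counts. As a minimal sketch of that underlying step, and not the method actually used for PROSPER, the Python below counts words across transcripts after removing common filler words. The function name, the stop-word list and the two sample sentences are hypothetical and for illustration only.

```python
import re
from collections import Counter

# Hypothetical stop-word list and transcript snippets, for illustration only;
# they are not PROSPER data.
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "i", "my", "is", "for", "in", "was", "it"}


def word_frequencies(transcripts, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across all transcripts."""
    counts = Counter()
    for text in transcripts:
        # Lower-case the text and keep alphabetic tokens (apostrophes allowed).
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(token for token in tokens if token not in stop_words)
    return counts


example_transcripts = [
    "The visa process was stressful and the visa rules changed.",
    "Mentoring helped my confidence; sponsorship is hard to find.",
]
frequencies = word_frequencies(example_transcripts)
top_words = frequencies.most_common(3)  # the most frequent words appear largest in a cloud
```

Counting in this way treats “reject”, “rejected” and “rejection” as separate words, so a fuller analysis would probably group related word forms before drawing the cloud.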
PROSPER pulse survey data show that all hero themes agreed with service users at the outset of mentoring were held to be important, validating confidence in their initial selection, with no theme scoring below 60% at any point (see Fig. 6 below). Of note is the way in which the perceived importance of certain themes appreciates over time (e.g. Influencing & Negotiating and Self-awareness), whereas others (e.g. Leadership) undergo ‘seasonal’ variations, depending on the stage of the Programme and DSDP project and on which skills the data scientist needs to use during that time. The DSDP leadership team can use such ‘seasonal’ shifts to reinforce learning on certain hero themes in anticipation of a ramping-up of activity.
Other themes show a depreciation in significance over time, notably Understanding Neurodiversity and Confidence Building. Whilst both still score as important, this drops over time, from which it’s possible to infer either that a skill may lose perception of importance when in competition with other skills (that are perhaps used more frequently), or that the Programme needs to rethink how and when it delivers learning on these themes.
Fig. 6 Graph describing the perceived importance of DSDP mentoring hero themes over a 12 month period.
Data Constraints
The following may have impacted the data:
- UK Visas & Immigration (UKVI) policy changes (Dec 2023) meant that employability of international DSDP data scientists for the UK data workforce became a critical and time-sensitive challenge. The changes to policy engendered understandable panic amongst the 2023-24 cohort of data scientists, 63% of whom were employed under skilled worker visas and who were at the vanguard of understanding changes to policy at the University.
- Conflict in the Middle East and UKVI policy changes acted as confounding factors in answers to the mental wellbeing control question in the PROSPER quarterly pulse survey, as indicated by free text survey responses. DSDP research interview answers indicate this may also have impacted confidence in the perceived effectiveness of mentoring. Interviewee B states that there is a glass ceiling through which mentoring is unable to propel international data professionals seeking salaried jobs that would meet the new UKVI minimum salary thresholds.
- There was a change in the DSDP mentoring team due to staff illness in May 2024, resulting in some mentees being reallocated to other mentors. This may have had an effect on the mentoring experience of the three data scientist mentees directly impacted. However, the pulse survey data indicate a steady confidence in mentoring provided, with “mentoring contributes to my wellbeing” receiving ratings in a range of 87-91% over the course of the survey period. Furthermore, confidence in the structures and resources for DSDP mentoring remained consistently high at 93-94% throughout the year.
Key Data Findings
- Confidence-building is the most spoken-about desirable interpersonal skill in the PROSPER interviews. Perception of importance of confidence-building as a hero theme stayed within a range of 80-89% in pulse surveys throughout the survey period.
- The pulse survey results reveal that the mentoring hero theme perceived as most important is Communication & Networking, which is consistently valued higher than the other themes (see Fig. 6).
- There is a persistent misunderstanding of what positive action is and seeks to achieve. Some still cannot distinguish it from its illegal predecessor, positive discrimination, not recognising that positive action has its basis in law and operates only in cases of equal merit. This was especially borne out by interview subject D, who shared a concern that they might be selected for roles based on background rather than merit as a result of positive action. Keeping positive action on the DSDP as ‘opt in’ at application stage is therefore necessary so that applicants have a choice as to whether or not to engage with it.
- Increasing mental health support provision doesn’t decrease instances of mental ill health, but rather increases resilience over time and means that mental health incidents are appropriately and promptly triaged. In Fig. 7 below, the median shows a steady rise in mental wellbeing amongst PROSPER survey respondents throughout 2023-24.
Fig. 7 Box plot describing the range in responses to the scaling question in PROSPER pulse surveys asking data scientists about their mental wellbeing. In this question, mental wellbeing is defined as encompassing thoughts, behaviours and feelings in general, both in and out of work. The box plot and whiskers indicate the range in responses per survey, in particular in February 2024, the period during which the international subset of the Programme was experiencing the negative effects of UKVI restrictions and ramping-up of conflict in the Middle East.
- Providing early career data scientists with leadership opportunities in a supportive environment enables them to develop leadership skills in a safe and accelerated way. Data scientists invited to take part in LIDA Open Data Science for Schools (LODSS) gave a confidence rating of 93-100% over 6 months in pulse survey responses on the leadership-building value of taking part.
- Feedback from sector-based data scientists at interview reveals that mentoring/reflective time is seen as “nice to have” in wider data professional teams, rather than part of business as usual. This was seen as a disadvantage.
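The Fig. 7 finding above rests on two different summaries of each pulse survey: the median, which tracks the typical response, and the range, which shows how far responses spread. As a minimal sketch of why both matter, the Python below computes each for two surveys. The ratings, the survey labels and the function name are hypothetical and are not PROSPER results.

```python
from statistics import median

# Hypothetical wellbeing ratings (0-100) keyed by survey;
# illustrative values only, not PROSPER results.
example_responses = {
    "Survey A": [60, 70, 75, 80],
    "Survey B": [30, 65, 80, 95],
}


def summarise(responses):
    """Return the median and the min-to-max range of ratings for each survey."""
    return {
        survey: {
            "median": median(scores),
            "range": max(scores) - min(scores),
        }
        for survey, scores in responses.items()
    }


summary = summarise(example_responses)
```

In this made-up example both surveys share the same median, yet the second has a much wider range, which is the kind of spread a box plot makes visible and a median alone would hide.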
Initial Insights
- Interviews with 2023-24 data scientists and alumnae reveal that two common elements of career progression are (1) having reflective space to convert experience into knowledge and (2) having relatable role models (“you can’t be what you can’t see”).
Fig. 8 Box plot showing the general tight clustering (with a couple of outliers) of high confidence in the relationship between mentoring and belonging on the DSDP, in particular in February-May 2024.
- By co-creating a mentoring mission, service users’ needs were included as part of service design upfront and there was an 88% survey confidence rating in the mentoring mission.
- Developing tailor-made mentoring, co-designed with service users, not only increases mentees’ professional & self-management skills, but also their belonging.
- Positive action (PA) recruitment increases diversity in data science; we were able to make a case for low socioeconomic background as grounds for positive action in data careers due to the current underrepresentation.
- Misinformation about what PA is and seeks to achieve persists across the University community in spite of a 91% confidence rating in PA messaging from the pulse survey results. This could be addressed through an institutional EDI communications campaign.
- Interpersonal skills have achieved parity with technical skills training in terms of perception of value on the DSDP, with PROSPER interviewees saying that sector progression is impossible without them.[14]
- Data on the impact of UKVI policy changes for skilled worker visas demonstrate a significant negative impact on data scientists’ well-being as well as (a) increased difficulty in finding jobs and (b) the highest rate of DSDP early leavers to date 2023-24 (36%) due to perceived job insecurity. Note, again, the wide range of responses in Feb 2024 to the mental wellbeing survey question (Fig. 7 above). Note also (Fig. 8) that Feb 2024 was generally a period of feeling belonging for the cohort. Further analysis of the data may reveal an interesting relationship here.
- Designing cohort training focused on belonging and collaboration strengthens research culture and perception of employability. Pulse survey respondents rated their confidence in the DSDP to upskill them for future data careers at a mean of 84%.
- Confidence in mentoring as a tool to enhance employability is borne out in survey respondents reporting 80% of mentoring goals were achieved after eleven months.
- Mentoring supports an upwards trajectory in data scientists refining their career next steps (see Fig. 9 below). The 100% pulse survey confidence rating in the Programme providing enough opportunities to discuss career progression, and the importance placed by PROSPER interviewees on reflective space to support determining career next steps (in particular interviewee 5), show that mentoring plays a vital role in development.
Fig. 9 Line graph showing the average confidence rating of 2023-24 DSDP data scientists in their clarity over career next steps.
Value of the research
This project has been led by the DSDP leader, PI Norman, who has been able to respond to feedback in real time and who has been well-placed to implement a mentoring service at pace fitted to the emerging needs of service users. Conducting service-based research in this way, by those traditionally seen as research-enablers, demonstrates that there is a place for research to be conducted by professional services as well as technical and academic staff.
The DSDP’s mentoring service as a model for early-career data professionals can be “lifted and shifted” to apply more widely across data roles at the University to deliver on the Technician Commitment. PROSPER’s positive findings from mentoring for data professionals suggest that mentoring would benefit other technicians if it were to become part of normative practice, in particular in supporting and signposting better recognition, training and career development opportunities.
It is hoped that mentoring can also be used as a means of encouraging more data professionals to step into leadership roles, or at the least to be able to advocate for their own critical thinking and problem-solving abilities in diverse teams which may comprise academics, other technicians and industry professionals. This would help to mitigate some of the perceived bias affecting data professionals expressed in the quotation below.
“As a technician in an academic team without a PhD, it’s difficult to be recognised and to be given space. Your judgement and identification of project scope are not valued as highly. There’s a sense of you existing in a box which is limited by people’s bias about what technicians can and can’t contribute. It means the technician has to build perception of their value from scratch each time with each project and team.”
LIDA Data Scientist 2023-24
More senior data professional role models representing a diversity of backgrounds are needed at the University in order to engage, inspire and enable access for a diversity of early career data scientists into the data workforce. Positive action is an effective initial tool to ensure that recruitment into senior posts is diverse and inclusive.
Given the current gap between supply and demand of data professionals in the UK, the rise in skilled worker visa minimum salary threshold as a result of UKVI policy change in December 2023 is highly damaging to the sustainability of the UK data workforce.[15] The policy change actively discourages international data talent which the UK workforce needs because of the ineligibility of many entry-level data professional roles for a skilled worker visa under the new policy.
Initial Impacts from PROSPER
- In spite of early leavers and UKVI policy restrictions, 2023-24 has seen the greatest retention of DSDP data scientists for the University to date (45%) thanks to active career mentorship.
- More time is now protected on the DSDP for reflective learning as this was a key need identified both by the 2023-24 DSDP cohort and alumnae interviewed by PROSPER.
- The DSDP has developed a flexible model of leadership, which is not static, but which enables everyone to consider themselves capable of leading at appropriate times where they hold the expertise and can drive the team and project forward.
- Students from Victoria Primary school in Keighley, who were the subjects of DSDP digital skills leadership engagement, reported changed views on coding as “fun” and “accessible” and said that they would like to attend more ‘DigiFun’ events in the future.
- Conducting research interviews with DSDP alumnae has consolidated relationships with them and their organisations and proved an effective (though protected) means of sharing knowledge of skills and the data landscape.
- 5 DSDP practice changes have been initiated as a result of data findings from this project, including: line manager ‘traffic light’ meetings to triage emergent situations, a new Ways of Working induction and a Career Story Workshop.
- There is a case and precedent now, at least in the data workforce, for considering socioeconomic background as a protected characteristic.
Research themes
- Education & Training
- Equity, Diversity & Inclusion
- Digital Futures
- Research Culture and the Technician Commitment
Thanks to
The 2023-24 cohort of LIDA Data Scientists who consented to be a part of this study, and who generously took part in helping to establish mentoring and fed back on this and other DSDP initiatives.
DSDP alumnae who generously consented to give time to attend research interviews in service of this project.
The DSDP mentoring team, who have been instrumental in establishing a new service and ensuring that it is co-created with its end users.
Generation UK & Ireland (in particular Jess Young) – who contributed thoughts and ideas on how to make DSDP recruitment more accessible to those coming from a low socioeconomic background.
The team at Nifty Fox Creative (in particular Claire Hubbard and Laura Evans-Hill) – for their collaborative vision in bringing the mentoring card deck to life.
Funder – This project was funded by the Research England Enhancing Research Culture open call, 2023-2024.
[1] Fearns, Harriss and Lally, UK Parliament POST report, Data Science Skills in the UK Workforce, June 2023, p.1.
[2] See Harvard Data Science Review article: https://hdsr.mitpress.mit.edu/pub/mrf00vp9/release/2, especially section 6, ‘Myriad of Roles and Broad Access’.
[3] See https://www.techniciancommitment.org.uk/ for the national-level UK Technician Commitment.
[4] See https://hdsr.mitpress.mit.edu/pub/jhy4g6eg/release/9, 2019 and Hernandez’s ‘Learn to differentiate these data roles’ from 2022.
[5] Which has given rise to new standards since 2022 such as the UK Alliance for Data Science Professionals.
[6] See Harvard Data Science Review article: https://hdsr.mitpress.mit.edu/pub/mrf00vp9/release/2, especially section 6, ‘Myriad of Roles and Broad Access’.
[7] See the Royal Society’s 2014 Executive Summary, “A picture of the UK scientific workforce: Diversity Data Analysis for the Royal Society”. The other protected characteristics examined were gender, disability and ethnicity. See also the ONS ‘Young people in the labour market by socio-economic background, UK: 2014-2021’, by Taylor, Santiago and Stripe (May 2022).
[8] See https://news.mit.edu/2018/study-finds-gender-skin-type-bias-artificial-intelligence-systems-0212.
[9] See House of Commons Library, Mental Health Statistics report, March 2024. For findings on mental health of children and young people, see NHS Digital’s Mental Health of Children and Young People in England, 2023 report.
[10] See Minh Kieu, Global Risk modelling 2024: https://leminhkieu.github.io/.
[11] See Geiger et al. ‘Career Paths and Prospects in Academic Data Science: Report of the Moore-Sloan Data Science Environments Survey’, 2018.
[12] See figure 21 from the ‘Quantifying the UK Data Skills Gap’ report that shows results from a survey of UK employers on the importance to them of various ‘soft’ skills for their data professionals. It ranks professionalism as most important, closely followed by communication, problem-solving and collaboration.
[13] Those from low socioeconomic backgrounds are more than five times less likely to enter the UK data workforce. See the Royal Society’s ‘A Picture of the UK Scientific Workforce’ report, p.3.
[14] See, again, figure 21 from the ‘Quantifying the UK Data Skills Gap’ report.
[15] See section 5.2.3 of the UK National Data Strategy which specifically refers to not jeopardising flow of international data professionals into the UK workforce.