Data Study Group Principal Investigator

Alan Turing Institute, London, United Kingdom (Academic Engagement)
Warning! Vacancy expired

Company Description

Named in honour of Alan Turing, the Institute is a place for inspiring, exciting work and we need passionate, sharp, and innovative people who want to use their skills to contribute to our mission to make great leaps in data science and AI research to change the world for the better.



Data Study Groups (DSGs) are the Turing’s version of a hackathon, but much more collaborative in nature. DSG events are held several times a year. They give postgraduate students and above (participants), as well as industry, government and third sector organisations (Challenge Owners, COs), a way to engage with the Turing Institute. For participants, a DSG is a training activity, primarily peer-to-peer learning, in which they work on real-life data science problems. For COs, it is an entry-level route to working with the Turing, and our objective is to develop the outputs of the DSG into further, larger research projects.

DSGs require both logistical planning and data science expertise to help prepare challenges for the events, which are provided by a multidisciplinary team coordinated by the Institute.


The DSG Principal Investigator (DSG PI) will be required to take academic ownership of a single DSG challenge. They are responsible for:

  • Scoping the overall DSG challenge into something that will be suitable for DSG participants
  • Supporting the DSG participants during the event and acting as quality control on what they write, the code they produce and the solutions they develop (e.g., completeness and scientific integrity)
  • Reporting to ensure the final report is of publishable quality for the Turing website, qualifying the outcomes and suggesting how the DSG project can be developed into a broader and longer-term research project.

This is an opportunity for early career researchers to gain valuable real-world experience working collaboratively with industry, government or the third sector. Support and training will be offered on ethics, communicating with industry and project design.

DSG Challenges:

  • Dstl – The Discovery project seeks to enable semi-automated horizon scanning to understand the shifting science and technology (S&T) landscape. Topic modelling is used to find topics and trends in content from academic journals for monitoring purposes, allowing the user to identify emerging technology
  • Leading Pharmaceutical Company – The project aims to develop a solution that can allow researchers to extract and consolidate actionable insights from published sources in near real-time leveraging natural language processing (NLP). This will be a proof-of-concept project focusing on extracting actionable insights
  • Biodiversity Challenge - This challenge presents an exciting opportunity for researchers to work on the UK’s largest publicly accessible source of biodiversity data. The data set is considerably challenging: it draws on heterogeneous data sources and calls for both spatial and temporal analysis.
  • Manufacturing Challenge - The project aims to develop an effective machine learning methodology capable of dynamic predictions that will enable, at minimum, more effective optimisation of dynamic plants, ideally leading to better-optimised chemical plants that reduce environmental impact by consuming fewer resources per unit of production.


Scoping: The DSG PI leads on the academic design of a challenge, working closely with the Challenge Owner. The DSG PI is the academic lead; the CO provides the problem and its context. The DSG PI will scope the problem, taking it from an industrial/commercial problem and turning it into a multi-directional academic challenge that can be presented to DSG participants and tackled in four days (nine days if the event is online). This includes ensuring that the problem is novel, that the data is sufficient to support a solution, and that potential solutions are not too complex to implement.

Supporting: During the event itself, the DSG PI should provide academic input and suggestions to the group about the challenge. They should not direct but support the group in what they are investigating. The DSG PI will be supported by a facilitator (taken from the group cohort) who will manage the day-to-day group coordination during the DSG event. The DSG PI will also need to review the contents of the report, ensuring that the narrative is coherent and well-organised, relevant to the DSG question, and scientifically rigorous (e.g., with assumptions and shortcomings clearly stated, and achievements not over-stated).

Reporting: The project will conclude with a report published on the Turing website, co-authored by the DSG group and finalised by the DSG PI. As part of this, the DSG PI will further evaluate the results and expand on potential follow-on engagement opportunities to continue the work started during the DSG.

DSG PIs will also need to keep a work diary of what they are doing and for how long, and log it in the HR Portal. This will be used for monitoring and payment.

For a more detailed overview of the role, prospective candidates should review the DSG PI supplementary doc.


  • A PhD (or equivalent experience and/or qualifications) in a relevant area including Statistics, Mathematics, Engineering, Computer Science, or related discipline.
  • Familiarity with a wide range of data science and AI techniques
  • Fluency in one or more modern programming languages used in research in data science and artificial intelligence (e.g. Python)
  • Experience coordinating and editing a multi-author academic paper or report
  • Proven experience with data science techniques and real-world datasets particularly relevant to the challenge at hand
  • Experience designing an academic study with experiments
  • Experience with messy real-world data (dependent on challenge)
  • Experience managing, structuring, and analysing research data

Other information

If you are interested in this opportunity, please click the apply button below. You will need to register on the applicant portal and complete the application form including your CV and covering letter.

The cover letter (up to 2 pages) should demonstrate your ability to suggest multiple potential methodological approaches to the challenge being applied for, as well as demonstrate:

  • Experience in applied data science
  • Willingness for multi-disciplinary collaborative work
  • Enthusiasm for working with industry, government and third sector to take their business problems and convert them into data science research projects

If you have questions about the role or would like to apply using a different format, please contact us on 020 3970 2148 or 0203 862 3340, or email [email protected].

CLOSING DATE FOR APPLICATIONS: Sunday 26 February 2023 at 23:59

If you are applying for more than one role at the Turing, please note that only one Cover Letter can be visible on your profile at one time. If you wish to apply for multiple roles and do not want to overwrite your existing Cover Letter, please apply for the role using the button below and forward your additional cover letter directly to [email protected] quoting the job title.

If you are an internal applicant and wish to apply, please send your CV and Cover Letter directly to [email protected] and your application will be considered.

We are committed to making sure our recruitment process is accessible and inclusive.

This includes making reasonable adjustments for candidates who have a disability or long-term condition. Please contact us at [email protected] to find out how we can assist you.


This is a zero-hours position, engaged per project/challenge. The hourly rate is £25.

Time commitment is an average of 135 hours (however, this could increase or decrease depending on the challenge), typically spread over 25 days in a 4–6 month period, but not evenly distributed.

Please consult the DSG PI supplementary doc for a detailed breakdown.

The time commitment can be roughly broken down as follows:

  • Pre-event stage: c.70 hours spread over 10/11 working days to prepare for the event.
  • Event stage: 3-4 hours per day, total of 20 hours for the week, to take place outside of normal working hours (e.g. 5-9pm Monday-Friday)
  • Post-event stage: c.10 hours spread over 8 weeks to complete and finalise the report.
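
For candidates who want to sanity-check the figures, the short Python sketch below works through the arithmetic using only the numbers quoted in this advert (the £25 hourly rate, the 135-hour average and the approximate per-stage hours). The variable and function names are illustrative assumptions, not part of any Turing system, and the stage hours are approximate ("c.") values, so the result is an estimate only.

```python
# Figures quoted in this advert. Stage hours are approximate ("c.") values,
# and the names used here are illustrative, not an official schema.
HOURLY_RATE_GBP = 25
AVERAGE_TOTAL_HOURS = 135
STAGE_HOURS = {"pre_event": 70, "event": 20, "post_event": 10}


def estimate_pay(hours, rate=HOURLY_RATE_GBP):
    """Return the estimated gross pay in pounds for a number of hours."""
    return hours * rate


# Hours covered by the three listed stages, and the remainder of the
# quoted average that the list does not itemise.
itemised_hours = sum(STAGE_HOURS.values())
unitemised_hours = AVERAGE_TOTAL_HOURS - itemised_hours

if __name__ == "__main__":
    for stage, hours in STAGE_HOURS.items():
        print(f"{stage}: {hours} h, about £{estimate_pay(hours)}")
    print(f"Itemised stages: {itemised_hours} h")
    print(f"Not itemised in the list: {unitemised_hours} h")
    print(f"Average total: {AVERAGE_TOTAL_HOURS} h, about £{estimate_pay(AVERAGE_TOTAL_HOURS)}")
```

On these figures, the 135-hour average at £25 per hour comes to roughly £3,375 per challenge. The three listed stages account for about 100 of those hours; the DSG PI supplementary doc is the place to check how the remainder is allocated.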

The Alan Turing Institute is based at the British Library, in the heart of London’s Knowledge Quarter.

We are currently assessing the results of our hybrid working trial, which ran for six months. We will soon publish a long-term workplace policy; as a guide, we anticipate this will involve between 2 and 4 days in the office per month. Some roles may require the jobholder to spend a greater number of days in the office, but the hiring manager will be able to confirm this during the interview.


The Alan Turing Institute is committed to creating an environment where diversity is valued and everyone is treated fairly. In accordance with the Equality Act, we welcome applications from anyone who meets the specific criteria of the post regardless of age, disability, ethnicity, gender reassignment, marital or civil partnership status, pregnancy and maternity, religion or belief, sex and sexual orientation.

Reasonable adjustments to the interview process will be made for any candidates with a disability.

Please note that all offers of employment are subject to obtaining and retaining the right to work in the UK and to satisfactory pre-employment security screening, which includes a DBS check.

Full details on the pre-employment screening process can be requested from [email protected].