We are seeking a talented bioinformatician or genomics data scientist to deliver genome data QC and high-throughput assembly for the Moore Foundation Aquatic Symbiosis project.
The Tree of Life Programme at the Wellcome Sanger Institute is dedicated to the generation and analysis of high-quality genome sequences from across eukaryotic biodiversity. The Tree of Life team is initiating a new, three-year project on the Genomics of Symbiosis in Aquatic Systems, funded by the Gordon and Betty Moore Foundation. We will use advanced genomics toolkits to describe and decipher the biology of thousands of species that live in intimate association. The project will collaborate internationally to deliver exciting research outcomes and further our understanding of the importance of symbiosis in generating diversity and maintaining function in aquatic ecosystems. To deliver this new project we are recruiting a team of molecular biologists, computing scientists and project management support staff.
About the Role:
You will contribute to the development of methods and software for genome data QC and assembly. Sequencing technologies are constantly evolving in the type and volume of sequence data they produce. Recent progress in long-read sequencing means that we can now efficiently deliver high-quality genome assemblies for species that previously had no such resource. We will produce thousands of genome assemblies for symbiotic organisms and analyse them to understand their biology. Designing scalable and robust informatics solutions for tracking, storing, and analysing these data presents both opportunities and challenges. One of the most challenging aspects of this role will be producing high-quality scientific results on a large scale while adapting to rapid developments in sequencing technology and software.
You will have some previous experience with genome bioinformatics or other large-scale scientific data analysis, or be a newly qualified graduate student with data science skills and an interest in DNA sequence data. While desirable, previous experience with DNA sequencing data is not strictly necessary for the position. We have a strong publication record and a culture of producing open data resources and open source software. This role requires an investigative and solution-oriented mindset and excellent communication skills to work effectively within large national and international consortia.
- A degree in a scientific discipline related to bioinformatics, or equivalent experience
- Record of multiple years of computational scientific data analysis
- Expert knowledge of the Unix computing environment
- Proficiency in one or more scripting languages, preferably Python and Perl
Competencies and Behaviours:
- Excellent critical-thinking and problem-solving skills
- Attention to detail and the ability to meet deadlines
- Ability to quickly adapt to new problems and ideas
- A high level of communication skills to be able to elicit complex requirements from, and convey complex information to, groups with different levels of technical knowledge
- Experience of managing and motivating junior staff
- Knowledge of new sequencing data and technologies
- Experience in genome assembly
- Experience with the Git version control system
- Experience with running software on a compute farm, cluster, or cloud environment
- Previous experience with managing large volumes of data
- Experience with a compiled programming language such as C or C++
- Experience with workflow markup languages
- Experience with database management in MySQL or similar
- Web development experience