We are seeking a Data Scientist to support infrastructure for harmonization and integration of omics-related data. Data comes in different formats, and different programs typically have to be carefully assembled into workflows where the output format of one method must match the input format of downstream methods. Furthermore, data formats in biomedical research are highly heterogeneous. This heterogeneity necessitates laborious manual processing, which is both time-consuming and error-prone and hampers interoperability. “Workflow wrappers” exist for some tasks, but they are mainly suitable for sufficiently standardized tools using standardized data formats. Even with these wrappers, most of a bioinformatician’s time in a project goes into manually gluing different tools to the data at hand, and there is a demand for improved interoperability. This job is about tackling these challenges.
The Data Scientist will be located in the Gorodkin lab in the Section of Data Health Science and AI, Department of Public Health, Faculty of Health and Medical Sciences at the University of Copenhagen, and will interact closely with a range of national bioinformatics research groups and, in the context of ELIXIR, with both the national node (https://elixir-denmark.org/) and the international hub (https://elixir-europe.org/). The Data Scientist will be part of the Novo Nordisk Foundation-funded BioGLUE infrastructure project.
Start date is May 15, 2026 or as soon as possible thereafter. The position is permanent.
Our research and research group
The bioinformatics group has a strong profile in data analysis, algorithm development and building bioinformatics tools, many involving computational RNA biology. We have excellent computational infrastructure and access to supercomputing when needed. Our research environment is highly dynamic, international and stimulating, with a wide range of activities including seminars, workshops, summer schools and retreats.
Your role and key responsibilities
In this job, you will address the "gluing" of diverse and heterogeneous datasets from both a general perspective and in specific use cases matching parts of the Danish national bioinformatics environment. A key focus will be to leverage Large Language Models (LLMs) to facilitate this "gluing", enabling intelligent and scalable integration workflows. The specific use cases concern RNA structure, CRISPR, mass spectrometry, microbiome, genomics, transcriptomics and personalized medicine. The role will also involve server and software maintenance tasks, including ensuring a seamless computational infrastructure.
Your key responsibilities will be to
- Utilize LLMs to enable homogenization of heterogeneous datasets for seamless integration at a general level.
- Document implementations, pipelines, methods and infrastructure configurations.
- Collaborate with the ELIXIR bioinformaticians and their research groups to evaluate the integrated data resources and their applicability.
- Manage, maintain, and improve data storage solutions and computational infrastructure for bioinformatics workflows. This includes contributing to building server capacity, with connections to external supercomputing when needed.
- Incorporate user experience into the tools and workflows developed.
- Provide training events both online and in person.
You and your qualifications
If you are enthusiastic about the described role, are highly dedicated and possess strong skills, you may well be the right candidate and key to obtaining successful results. You will work in a multi-disciplinary bioinformatics environment and collaborate with other researchers internally in the group and with the collaborating groups in the Danish ELIXIR node and the European ELIXIR hub.
Essential experience and skills:
- You have completed a Master's degree in bioinformatics, computer science or a similar area
- You are highly experienced in Python and R
- You have strong experience with the Linux/Unix environment, command lines and shell scripting
- Experience with running local LLMs, ideally through frameworks such as LangChain or LlamaIndex.
- Experience with machine learning.
- Familiar with omics data analysis
- Familiar with data standards & API systems
- You have proficient communication skills
- You have excellent written and spoken English skills
- You are empathetic and a strong and supportive team player who can navigate a multi-disciplinary context.
Key selection criteria:
- A PhD degree in bioinformatics, computer science or in a similar area
- Professional qualifications relevant for the position
- Relevant work experience
- Publications
- Language skills
- Creativity
What we offer
- Great opportunities to grow professionally.
- Being at the forefront of tomorrow's data tools for seamless data integration in a range of rapidly growing bioinformatics areas.
- Unique network possibilities within the Danish and European bioinformatics communities.
- Opportunity to work in a highly international network (ELIXIR).
Place of employment
The place of employment is at the Department of Public Health, University of Copenhagen, Øster Farimagsgade 5, 1353. The physical workplace is currently at Frederiksberg campus.
Terms of employment
The average working week is 37 hours.
The starting date is May 15, 2026 or as soon as possible thereafter.
The position is a permanent employment.
Employment will be as Data scientist (research consultant). Salary, pension and other conditions of employment are set in accordance with the Agreement between the Ministry of Taxation and AC (Danish Confederation of Professional Associations) or another relevant organisation.
Questions
For further information about the scientific content of the position, please contact Professor Jan Gorodkin, email gorodkin@sund.ku.dk, phone +45 23375667. For questions about the application procedure and formalities, please contact the HR Officer, email hr-ifsv@adm.ku.dk.
Foreign applicants may find this link useful: www.ism.ku.dk (International Staff Mobility).
Application procedure
Your online application must be submitted in English by clicking ‘Apply now’ below. Furthermore, your application must include the following documents/attachments, all in PDF format:
- Motivation letter of application. In this letter you must briefly detail, one by one, how you meet the requirements for each item listed under “Essential experience and skills” and “Desirable experience and skills”.
- CV incl. education, work/research experience, language skills and other skills relevant for the position.
- A certified/signed copy of the Master's certificate including transcripts and, if obtained, also the PhD certificate.
- List of publications.
- Personal recommendations. If none, please explain why it has not been possible to obtain them.
Deadline for applications: 6 April 2026, 23:59 CET
We reserve the right not to consider material received after the deadline, and not to consider applications that do not live up to the above-mentioned requirements.
The University of Copenhagen wishes to reflect the diversity of society and encourages all qualified candidates to apply regardless of personal background.