The aim of survival analysis – also referred to as reliability analysis in engineering – is to analyse the time until one or more events happen. Examples from the medical domain are the time until death, until onset of a disease, or until pregnancy. In engineering, the time until the failure of a mechanical system is a common application. In a typical clinical study, the exact time of an event will remain unknown for a subset of individuals, simply because they remained event-free until the study ended or decided to withdraw from the study. For these patients, it is unknown whether they experienced an event after they were last observed. The only valid information is that any (unobserved) event must have occurred after their last follow-up; such observations are called right-censored. This property needs to be considered when applying machine learning to this type of data.
In this talk, I will give an introduction to survival analysis and demonstrate how to analyse survival data using scikit-survival (https://github.com/sebp/scikit-survival), a Python module for survival analysis built on top of scikit-learn. I will introduce survival data from various domains and explain why traditional regression and classification methods are unsuitable. Using practical examples, I will demonstrate how scikit-survival can be used to estimate the time until an event and how additional variables can be used to improve prediction. Finally, I will give an outlook on more advanced methods that are suitable for analysing high-dimensional clinical data.
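To make the role of censoring concrete, here is a minimal sketch of the Kaplan-Meier estimator, a standard way to estimate the probability of remaining event-free over time when some observations are censored. The abstract does not name a specific estimator, so this choice is an assumption, as are the function name `kaplan_meier` and the toy data. scikit-survival provides its own estimator (`kaplan_meier_estimator` in `sksurv.nonparametric`); the plain-Python version below only illustrates the idea and is not the library's implementation.

```python
from itertools import groupby


def kaplan_meier(times, events):
    """Return [(t, S(t))] for each distinct time at which an event was observed.

    times  -- follow-up time of each individual
    events -- True if the event was observed, False if the record is censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)   # individuals still under observation
    survival = 1.0        # running estimate of P(T > t)
    curve = []
    for t, group in groupby(data, key=lambda pair: pair[0]):
        group = list(group)
        n_events = sum(1 for _, observed in group if observed)
        if n_events:
            # Only observed events lower the survival estimate.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored records both leave the risk set after time t.
        at_risk -= len(group)
    return curve


# Hypothetical toy data: follow-up time in months and event indicator.
times = [2, 3, 3, 5, 8, 8, 9, 12]
events = [True, True, False, True, False, True, False, False]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: estimated survival probability {s:.3f}")
```

Censored individuals still count towards the number at risk up to the time they were last seen, which is the information an ordinary regression on the observed times would either discard or misread as an event.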
In recent years, Docker has become an essential tool for software development. We demonstrate that Docker containers together with the GitLab platform can be a useful tool for researchers too. This combination enables them to easily catch problematic code, automate analysis workflows, archive results, and share their software and its dependencies across platforms. While a Docker image bundles the whole development stack and enables its cross-platform sharing, it is often cumbersome and repetitive to build, run, and deploy an image. GitLab is a software development platform built on top of the Git version control system with built-in support for Docker. Using GitLab’s continuous integration pipelines, most tasks related to managing Docker images can be automated. In addition, utilising tools from software development, we can perform automatic code analysis to identify faulty or problematic code as early as possible. We explain how to set up a Docker-powered project in GitLab and how to automate certain tasks to ease the development workflow:
How to automatically build a new Docker image once a project has been updated.