scikit-survival: machine learning for time-to-event analysis

Dec 29, 2016

scikit-survival is a Python module for survival analysis built on top of scikit-learn. It lets you perform survival analysis while using scikit-learn's tooling, e.g., for pre-processing or cross-validation.

About Survival Analysis

The objective in survival analysis (also referred to as reliability analysis in engineering) is to establish a connection between covariates and the time of an event. What distinguishes survival analysis from traditional machine learning is that parts of the training data can only be partially observed: they are censored.

For instance, in a clinical study, patients are often monitored for a particular time period, and events occurring in this period are recorded. If a patient experiences an event, the exact time of the event can be recorded, and the patient's record is uncensored. In contrast, right-censored records refer to patients who remained event-free during the study period; it is unknown whether an event occurred after the study ended. Consequently, survival analysis demands models that take this characteristic of the data into account.
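To make the effect of censoring concrete, here is a minimal sketch in plain Python of the Kaplan-Meier estimator, a standard non-parametric survival estimate that keeps censored patients in the risk set until they leave the study. It is an illustration only, not scikit-survival's implementation; the function name kaplan_meier and the toy data are made up for this example.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  -- observed time per patient (event time or censoring time)
    events -- True if the event was observed, False if right-censored
    """
    # Number of observed events at each time point.
    deaths = Counter(t for t, e in zip(times, events) if e)
    # Number of patients leaving the risk set at each time point,
    # whether through an event or through censoring.
    exits = Counter(times)

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Patients censored at t still count as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: six patients, follow-up time in months.
times = [2, 3, 3, 5, 8, 8]
events = [True, True, False, True, False, False]

km = kaplan_meier(times, events)

# Naive alternative that simply discards censored patients.
uncensored = [t for t, e in zip(times, events) if e]
naive = sum(t > 5 for t in uncensored) / len(uncensored)

for t, s in km:
    print(f"S({t}) = {s:.3f}")
print(f"naive estimate of surviving beyond month 5: {naive:.3f}")
```

On this toy data, discarding the three censored patients would suggest that nobody survives beyond month 5, whereas the Kaplan-Meier estimate at month 5 is 4/9, because censored patients still count as event-free for as long as they were under observation.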

Sebastian Pölsterl

AI Researcher

My research interests include machine learning for time-to-event analysis, causal inference and biomedical applications.

Posts

scikit-survival 0.23.0 released

I am pleased to announce the release of scikit-survival 0.23.0.

This release adds support for scikit-learn 1.4 and 1.5, which includes missing value support for RandomSurvivalForest. For more details on missing values support, see the section in the release announcement for 0.23.0.

Moreover, this release fixes critical bugs. When fitting SurvivalTree, the sample_weight is now correctly considered when computing the log-rank statistic for each split. This change also affects RandomSurvivalForest and ExtraSurvivalTrees which pass sample_weight to the individual trees in the ensemble. Therefore, the outputs produced by SurvivalTree, RandomSurvivalForest, and ExtraSurvivalTrees will differ from previous releases.

Sebastian Pölsterl

Jun 30, 2024

Publications

scikit-survival: A Library for Time-to-Event Analysis Built on Top of scikit-learn

scikit-survival is an open-source Python package for time-to-event analysis fully compatible with scikit-learn. It provides implementations of many popular machine learning techniques for time-to-event analysis, including penalized Cox model, Random Survival Forest, and Survival Support Vector Machine. In addition, the library includes tools to evaluate model performance on censored time-to-event data. The documentation contains installation instructions, interactive notebooks, and a full description of the API. scikit-survival is distributed under the GPL-3 license with the source code and detailed instructions available at https://github.com/sebp/scikit-survival

Sebastian Pölsterl
