Sepsis-3 criteria in AmsterdamUMCdb: open-source code implementation

Sepsis is a major healthcare problem with substantial mortality and a common reason for admission to the intensive care unit (ICU). For this reason, the management of sepsis is an important area of ICU research. A number of large-scale, freely-accessible ICU databases are available for observational research, and the robust identification of septic patients in such data sets is crucial, particularly for comparative studies between critical care sub-populations, which may vary around the world. However, data structures are poorly standardised owing to inevitable variation in clinical electronic health record system vendors and implementations, as well as in research database design choices. Robust and well-documented cohort selection (such as patients with sepsis) is therefore essential for reproducible research. In this work, we operationalise the Sepsis-3 definition on AmsterdamUMCdb, a recently published large European ICU database, and publish open-access code for wider use by critical care researchers.


Sepsis-3
The third consensus definition of sepsis, by the Sepsis-3 Task Force [1], recommended a revised definition to address and ameliorate previous limitations and to allow for greater consistency in operationalising the definition criteria across different centres [3]. Nevertheless, there is not unanimous agreement within the intensive care community on the utility of the new definition [5,6]. The Sepsis-3 clinical criteria are an acute increase in Sequential Organ Failure Assessment (SOFA) [4] of at least 2 points, accompanying a suspected or documented infection, with the criteria for septic shock further requiring both use of vasopressors and a lactate level >2 mmol/L. SOFA measures the severity of organ dysfunction across the six domains of the respiratory, neurological, cardiovascular, liver, coagulation and renal systems.

Amsterdam UMC database
The most seriously ill patients with sepsis are treated on ICUs, which are perhaps the most data-dense clinical environments. Continuous multimodality monitoring, together with clinical expertise, forms the bedrock of patient care. However, the need to carefully balance the potential research benefits against patient privacy, ethics and legal concerns has limited the number of openly-accessible, de-identified large-scale databases of critical care patients to a small handful, such as Medical Information Mart for Intensive Care (MIMIC) [7] and eICU [8]. Differences in ICU demographics, resources, admission criteria and treatment strategies across different countries restrict the ability to generalise knowledge from these databases to other ICU populations, and so comparative studies are crucial. Robust and well-documented cohort definition (sepsis, in the case considered here) is crucial to such reproducible large-scale observational data research. However, the lack of a uniform standard for data collection across different vendors and implementations of electronic health records hampers the re-use of models and code, which are therefore necessarily database-specific.
Amsterdam University Medical Centers Database (AmsterdamUMCdb) [9] is a new, freely-accessible European ICU database, released in collaboration with the Society of Critical Care Medicine (SCCM) and the European Society of Intensive Care Medicine (ESICM). Compliant with both the U.S. Health Insurance Portability and Accountability Act (HIPAA) [10] and the European General Data Protection Regulation (GDPR) [11] through iterative risk-based patient de-identification, this database contains close to 1 billion data points from 20,109 critically ill patients admitted to Amsterdam UMC between 2003 and 2016. The database consists of patients admitted both to ICU and to the 'medium care unit' (MCU) in Amsterdam UMC. This data, comprised of seven comma-separated value tables, is combined from multiple systems in a 'data lake' structure linked through anonymised identifiers. AmsterdamUMCdb has already been the focus of several multidisciplinary research events, including two ESICM datathons [12] and a Neural Information Processing Systems (NeurIPS) privacy challenge [13].

Sepsis-3 in Amsterdam UMC database
We provide a single script that computes the following: daily SOFA scores (individual components and total score) for each admission, antibiotic escalation on a daily basis, and finally sepsis/septic shock episodes (where one 'day' corresponds to each 24 h period after admission). Our definition of each SOFA component score follows the AmsterdamUMCdb SOFA script, and we extend this computation from just the 24 h period post admission to a longer time period spanning multiple 'days'. This time period may be specified early within our script. Where no SOFA scores were available prior to ICU admission, the SOFA components were assumed to be zero, as per the Sepsis-3 recommendation. However, where at least three SOFA components were missing, no classification (sepsis or no sepsis) was made for that admission/day. A sepsis episode within a 24 h period was then defined as an increase in total SOFA score of at least two points between the previous and current, previous and subsequent, or current and subsequent 24 h periods, alongside an antibiotic escalation within that 24 h period. Finally, antibiotic use that accompanied admission after elective surgery was assumed to be prophylactic and as such was not classified as sepsis either on that 24 h period or the subsequent 24 h period. Any later 24 h period that met the Sepsis-3 definition was, however, identified as a sepsis episode. Septic shock was defined as a subset of sepsis episodes with a cardiovascular SOFA score of at least 3 (i.e. using vasopressors) and a maximum lactate of at least 2 mmol/L. We assumed that vasopressors were administered if required to maintain a mean arterial pressure of at least 65 mmHg, assuming adequate fluid administration.

                     Unique first admissions     All admissions
                     Current                     Current
                     True        False           True        False
Sepsis-3   True      2114        4319            2410        5145
Sepsis-3   False     838         12,820          996         14,533

There are 25 admissions in total that have insufficient data for a Sepsis-3 diagnosis.
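As a check on the counts tabulated above, sensitivity and specificity of the current criteria can be recomputed with Sepsis-3 taken as the reference standard. The following is a minimal sketch; the function name and argument names are illustrative and are not taken from the published script.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) with Sepsis-3 as the reference.

    tp: Sepsis-3 True,  current True
    fn: Sepsis-3 True,  current False
    fp: Sepsis-3 False, current True
    tn: Sepsis-3 False, current False
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Counts as reported in the table above
first_admissions = sensitivity_specificity(tp=2114, fn=4319, fp=838, tn=12820)
all_admissions = sensitivity_specificity(tp=2410, fn=5145, fp=996, tn=14533)

print([round(x, 2) for x in first_admissions])  # [0.33, 0.94]
print([round(x, 2) for x in all_admissions])    # [0.32, 0.94]
```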
Sensitivity for first admissions only is 0.33 and specificity is 0.94. Sensitivity for all admissions is 0.32 and specificity is 0.94.

The column 'time' denotes the 'day' of admission, which is the 24 h period after the ICU/MCU admission. A negative 'time' indicates data prior to ICU admission (i.e. a partial SOFA score from when the patient was in a general ward prior to transfer to ICU). NaN indicates that this SOFA component was not measured or could not be calculated from the data available. The total SOFA score is the sum of the components, with NaN values replaced by 0, as per [1,4].
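The handling of missing components can be sketched as follows. This is a minimal illustration under stated assumptions, not the published script: the component keys, the function names and the `max_missing` parameter are hypothetical, and the rule that three or more missing components make an admission/day unclassifiable is taken from the description of the method.

```python
import math

# Hypothetical component names; the column names in the real script may differ.
SOFA_COMPONENTS = ("respiratory", "neurological", "cardiovascular",
                   "liver", "coagulation", "renal")


def is_missing(value):
    """True for None or NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def summarise_sofa(components, max_missing=2):
    """Return (total, assessable) for one admission/day.

    total      : sum of the six components, missing values counted as 0
    assessable : False when three or more components are missing
    """
    values = [components.get(name) for name in SOFA_COMPONENTS]
    n_missing = sum(is_missing(v) for v in values)
    total = sum(0 if is_missing(v) else v for v in values)
    return total, n_missing <= max_missing


# Illustrative values only, not taken from AmsterdamUMCdb
day = {"respiratory": 2, "neurological": float("nan"), "cardiovascular": 3,
       "liver": 0, "coagulation": float("nan"), "renal": 1}
print(summarise_sofa(day))  # (6, True)
```

Returning the total and the assessability flag separately keeps the score itself available even for days on which no sepsis classification is attempted.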

Unique first admissions
period after admission could be made via the Sepsis-3 definition, the sensitivity of the above current criteria compared to the Sepsis-3 in the first 24 h is poor (Table 1).

The column 'time' denotes the 'day' of admission, which is the 24 h period after the ICU/MCU admission. NaN in the column 'antibiotic_escalation' indicates that this 24 h period occurs before any antibiotics were first administered. Antibiotic escalation in elective post-operative admissions was assumed to be a prophylactic increase in antibiotic administration, rather than an antibiotic escalation due to an infection. Post-operative patients are also likely to have a high SOFA score because of surgery.

AVAILABILITY OF SOURCE CODE AND REQUIREMENTS
A sepsis episode is defined as antibiotic escalation accompanied by an increase in SOFA score of 2 or more. This increase in SOFA can either be over the previous and current 'days', the current and subsequent 'days', or the previous and subsequent 'days'. For the sepsis episode in this table (admissionid 2, time −1), the increase in SOFA from 1 to 6 from day −1 to day 0 is accompanied by an antibiotic escalation on day −1.
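The episode rule above can be sketched in a few lines. This is an illustrative reading rather than the published implementation: the function and argument names are hypothetical, a missing score on a day before admission is treated as 0, a missing score on any other day means that comparison is skipped, and the elective-surgery exclusion is read here as applying to days 0 and 1.

```python
def sepsis_episode_days(total_sofa, escalation, elective_surgery=False,
                        min_increase=2):
    """Return the sorted 'days' flagged as sepsis episodes for one admission.

    total_sofa : dict mapping day -> total SOFA score
    escalation : dict mapping day -> True if antibiotics were escalated
    """
    def score(day):
        if day in total_sofa:
            return total_sofa[day]
        # Assumption: no score before admission is treated as SOFA 0
        return 0 if day < 0 else None

    def rose(earlier, later):
        a, b = score(earlier), score(later)
        return a is not None and b is not None and b - a >= min_increase

    episodes = []
    for day in sorted(total_sofa):
        if not escalation.get(day, False):
            continue  # no antibiotic escalation in this 24 h period
        if elective_surgery and day in (0, 1):
            continue  # assumed prophylactic (interpretation, see lead-in)
        if rose(day - 1, day) or rose(day, day + 1) or rose(day - 1, day + 1):
            episodes.append(day)
    return episodes


# Worked example from the text: SOFA rises from 1 (day -1) to 6 (day 0),
# with an antibiotic escalation on day -1. The day 0 escalation value is
# a hypothetical placeholder.
print(sepsis_episode_days({-1: 1, 0: 6}, {-1: True, 0: False}))  # [-1]
```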

DATA AVAILABILITY
The dataset supporting the results of this article (AmsterdamUMCdb) is freely-accessible.

ETHICAL APPROVAL
Ethical approval for the data collection, de-identification and governance is described in [9]. No additional ethical approval was required for this manuscript.

COMPETING INTERESTS
The authors declare that they have no competing interests.