serojump: A Bayesian tool for inferring infection timing and antibody kinetics from longitudinal serological data

Abstract

Understanding acute infectious disease dynamics at individual and population levels is critical for informing public health preparedness and response. Serological assays, which measure a range of biomarkers relating to humoral immunity, can provide a valuable window into immune responses generated by past infections and vaccinations. However, traditional methods for interpreting serological data, such as binary seropositivity and seroconversion thresholds, often rely on heuristics that fail to account for individual variability in antibody kinetics and timing of infection, potentially leading to biased estimates of infection rates and post-exposure immune responses. To address these limitations, we developed serojump, a novel probabilistic framework and software package that uses individual-level serological data to infer infection status, timing, and subsequent antibody kinetics. We validated serojump using simulated serological data and real-world SARS-CoV-2 datasets from The Gambia. In simulation studies, the model accurately recovered individual infection status, population-level antibody kinetics, and the relationship between biomarkers and immunity against infection, demonstrating robustness under observational noise. Benchmarking against standard serological heuristics in real-world data revealed that serojump achieves higher sensitivity in identifying infections and greater precision in inferred infection timing than static threshold-based methods. Application of serojump to longitudinal SARS-CoV-2 serological data collected during the Delta wave provided additional insights into i) missed infections based on sub-threshold rises in antibody level and ii) antibody responses to multiple biomarkers post-vaccination and infection. Our findings highlight the utility of serojump as a pathogen-agnostic, flexible tool for serological inference, enabling deeper insights into infection dynamics, immune responses, and correlates of protection. The open-source framework offers researchers a platform for extracting information from serological datasets, with potential applications across various infectious diseases and study designs.

Publication
In PLOS Computational Biology


Serological data, Bayesian inference, Antibody kinetics, Infection timing, SARS-CoV-2, Computational biology