Jonathan Lindbloom (U): A Bayesian Gaussian Process Model for COVID-19

Winner: Undergraduate Top 3
Winner: Dedman III (Undergraduate)

https://youtu.be/q6II2XvecSM

We present the Bayesian approach to parameter inference for SIR ODE models using Markov chain Monte Carlo (MCMC) methods, along with its computational implementation in the PyMC3 probabilistic programming library. We show how changes in the transmission rate over time can be captured by change-point models. However, these change-point models fail to learn the underlying dynamics of the time-dependent transmission rate. To overcome this pitfall, we demonstrate how using Gaussian processes to place a functional prior over the time-dependent transmission rate better characterizes uncertainty in forecasts. Our approach removes the need to specify priors over change-points, captures uncertainty in the dynamics of the effective reproduction number, and flexibly fits county- or state-level data without modification. To validate our model, we evaluate the accuracy of its forecasts using scoring rules and compare its performance with that of competing models submitted to the Centers for Disease Control and Prevention (CDC).
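To make the idea of a functional prior over the transmission rate concrete, here is a minimal sketch in plain Python. It is not the PyMC3 model from the talk and performs no MCMC inference: it only draws sample paths of a time-dependent transmission rate beta(t) from a Gaussian process prior on log beta(t) and pushes each draw through a discretized SIR model. The squared-exponential kernel, the forward-Euler step of one day, the function names, and every numeric value (mean rate, amplitude, length scale, recovery rate, population) are assumptions chosen for illustration.

```python
import math
import random


def rbf_kernel(t1, t2, amplitude, length_scale):
    """Squared-exponential covariance between two time points (in days)."""
    return amplitude ** 2 * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_transmission_rate(days, mean_beta, amplitude, length_scale, rng):
    """Draw beta(t) = exp(f(t)) with f ~ GP(log(mean_beta), rbf_kernel)."""
    times = list(range(days))
    jitter = 1e-6  # small diagonal term for numerical stability
    cov = [
        [rbf_kernel(a, b, amplitude, length_scale) + (jitter if a == b else 0.0)
         for b in times]
        for a in times
    ]
    lower = cholesky(cov)
    noise = [rng.gauss(0.0, 1.0) for _ in times]
    # Correlate the white noise with the Cholesky factor, then exponentiate
    # so that every sampled transmission rate is positive.
    return [
        math.exp(math.log(mean_beta)
                 + sum(lower[i][k] * noise[k] for k in range(i + 1)))
        for i in range(days)
    ]


def simulate_sir(betas, gamma, population, initial_infected):
    """Forward-Euler SIR with one-day steps; returns daily (S, I, R, R_t)."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    path = []
    for beta in betas:
        r_t = beta / gamma * s / population  # effective reproduction number
        path.append((s, i, r, r_t))
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
    return path


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    for draw in range(3):
        betas = sample_transmission_rate(60, 0.25, 0.3, 14.0, rng)
        path = simulate_sir(betas, gamma=0.1, population=1_000_000,
                            initial_infected=100)
        peak = max(state[1] for state in path)
        print(f"draw {draw}: peak infected = {peak:,.0f}, "
              f"final R_t = {path[-1][3]:.2f}")
```

Each draw gives a different smooth trajectory for beta(t), and hence for R_t, without any change-points being specified. In a full Bayesian treatment these prior draws would be conditioned on observed case counts.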

Jonathan Lindbloom
Majors: Mathematics, Finance
Faculty Mentor: Alejandro Aceves

3 thoughts on “Jonathan Lindbloom (U): A Bayesian Gaussian Process Model for COVID-19”

  1. Jonathan, very informative talk. I’ve come across so many different models for COVID-19, and I wish we could all eventually agree on one that works best. Does your model take various interventions into account, such as mask wearing or enforced lockdowns? It seems there are a lot of variables to consider!

    1. Hi Dr. Son,

      You might be interested in ensemble models, in which many different forecasts are weighted or averaged in some way to produce a new forecast. Ensemble models usually perform better than any single individual model, so I don’t think we’ll get to the point where a single model works best per se.

      To answer your second question, my model takes these interventions into account by proxy. The flexibility of GPs lets the effective reproduction number R_t “bend” to fit the effects of these interventions in the past, as well as the effects of other changes in human activity (not necessarily due to interventions). In terms of forecasting, I am using GPs to “learn” the dynamics of R_t from past data in order to quantify, with error bounds, how it might change in the future. So my forecasting model looks at the past (inferred) changes in R_t to predict how it will change in the future, rather than looking at the interventions themselves. Of course, there are strengths and weaknesses associated with this approach. This is why it’s good to have many different types of models using different methodologies!

      Best,
      Jonathan Lindbloom

      1. Thank you! To me, COVID-19 seems very difficult to model, which perhaps explains the numerous models out there. Yes, I think it would be good to consult ensemble models, which I suppose “smooth” out the variations of the individual models.
