Mary Lena Bleile: Imputation of Counterfactual Tumor Volumes

Co-authors: Steve Jiang, Dan Nguyen, Debabrata Saha, Michael Story, Casey Timmerman, Robert Timmerman, Yixun Xing

https://youtu.be/uF44O6k4y7s

Dropout is a statistical problem that occurs when an experimental unit under serial measurement becomes unavailable for further measurement before the end of the study. One common instance arises in tumor growth experiments performed on animal subjects: some animals are sacrificed when they are in too much pain or in poor condition. Unlike traditional missing data problems in time series data, dropout poses unique statistical problems because it produces a monotone missingness pattern: if a measurement is missing at time t, then it is necessarily missing at all timepoints t* > t. We introduce a novel method for imputation of tumor volume counterfactuals: we build a multivariate growth curve with random effects, inspired by Heitjan et al. (1993), and apply Bayesian methods to draw a random sample for each parameter, in concordance with the literature on multiple imputation. One can then leverage conditional distribution theory to obtain a complete dataset from each of the random posterior samples. We additionally supply an R package for ease of execution of our method.
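
The monotone missingness pattern and the conditional-distribution step can be illustrated with a minimal Python sketch. It is not the authors' R package or their growth curve model: it assumes, purely for illustration, that two consecutive measurements are bivariate normal with known mean and covariance, and the function names are hypothetical.

```python
import math
import random

def is_monotone_missing(series):
    """True if every value after the first missing one (None) is also missing."""
    seen_missing = False
    for value in series:
        if value is None:
            seen_missing = True
        elif seen_missing:
            return False  # an observed value after a gap: not monotone
    return True

def conditional_normal(mu, sigma, y1):
    """Mean and variance of y2 given y1 for a bivariate normal (assumed model).

    mu is (mu1, mu2); sigma is ((s11, s12), (s12, s22)).
    """
    mu1, mu2 = mu
    (s11, s12), (_, s22) = sigma
    cond_mean = mu2 + s12 / s11 * (y1 - mu1)
    cond_var = s22 - s12 ** 2 / s11
    return cond_mean, cond_var

def impute_next(mu, sigma, y1, rng):
    """Draw one imputed value for the missing measurement y2 given observed y1."""
    cond_mean, cond_var = conditional_normal(mu, sigma, y1)
    return rng.gauss(cond_mean, math.sqrt(cond_var))

# Hypothetical parameter values standing in for a single posterior draw.
rng = random.Random(0)
draw = impute_next((0.0, 0.0), ((1.0, 0.5), (0.5, 1.0)), 2.0, rng)
```

In a multiple-imputation setting, one such draw would be repeated for each posterior sample of the parameters, giving one completed dataset per sample.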

Mary Lena Bleile
Program: PhD in Biostatistics
Faculty mentor: Daniel Heitjan

Shuang Jiang: BayesSMILES: Bayesian Segmentation Modeling for Longitudinal Epidemiological Studies

Winner: Biostatistics (Graduate)

Co-authors: Quan Zhou, Xiaowei Zhan, Qiwei Li

https://www.youtube.com/watch?v=hac49ntMlLQ

The coronavirus disease of 2019 (COVID-19) is a pandemic. To characterize its disease transmissibility, we propose a Bayesian change point detection model using daily actively infectious cases. Our model builds on a Bayesian Poisson segmented regression model that 1) captures the epidemiological dynamics under the changing conditions caused by external or internal factors; 2) provides uncertainty estimates of both the number and locations of change points; and 3) has the potential to adjust for any time-varying covariate effects. Our model can be used to evaluate public health interventions, identify latent events associated with spreading rates, and yield better short-term forecasts.
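
To make the idea of a change point in a Poisson count series concrete, here is a deliberately simplified Python sketch. It is not the BayesSMILES model: it assumes a piecewise-constant Poisson rate, uses maximum likelihood instead of Bayesian inference, and searches for a single change point only. All function names are hypothetical.

```python
import math

def poisson_loglik(counts, rate):
    """Log-likelihood of independent Poisson counts with a common rate."""
    if rate == 0:
        return 0.0 if all(y == 0 for y in counts) else float("-inf")
    return sum(y * math.log(rate) - rate - math.lgamma(y + 1) for y in counts)

def segmented_loglik(counts, change_points):
    """Log-likelihood with a separate rate (the segment mean) in each segment.

    change_points are indices strictly between 0 and len(counts); a segment
    runs from one change point up to, but not including, the next.
    """
    bounds = [0] + sorted(change_points) + [len(counts)]
    total = 0.0
    for start, end in zip(bounds, bounds[1:]):
        segment = counts[start:end]
        rate = sum(segment) / len(segment)  # maximum likelihood rate
        total += poisson_loglik(segment, rate)
    return total

def best_single_change_point(counts):
    """Index of the single change point that maximizes the log-likelihood."""
    candidates = range(1, len(counts))
    return max(candidates, key=lambda t: segmented_loglik(counts, [t]))

# Toy series (made up for illustration) with an obvious shift in level.
toy_counts = [2, 2, 2, 10, 10, 10]
change_point = best_single_change_point(toy_counts)
```

A Bayesian treatment would place priors on the number and locations of change points and report posterior uncertainty, which this sketch does not attempt.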

Shuang Jiang
Program: PhD in Biostatistics
Faculty mentor: Xiaowei Zhan

Micah Thornton: Examining Uses of DFT distance metrics in SARS-CoV-2 Genomes

Co-authors: Monnie McGee

https://youtu.be/bFW9xMpdSp0

The Fourier Coefficients (FC) of a genomic sequence can be calculated according to a method proposed earlier this decade by Yin et al. Here we are concerned with the efficacy of these coefficients in capturing useful information about viral sequences. The FCs are rapidly computable and comparable, which allows for speedy real-time numerical analyses of sequences. In this work we investigate using the FCs as summaries of SARS-CoV-2 sequences by applying regional classification procedures and graphical examination. Specifically, we extract the geographic submission location from sequences submitted to the GISAID Initiative, and attempt to use the FCs to classify these sequences, in addition to displaying them visually using dimensionality reduction. We show that the FCs may serve as useful numerical summaries for sequences, allowing manipulation, identification, and differentiation via classical mathematical and statistical methods not readily applicable to character strings. Further, we argue that subsets of the FCs may be usable for the same purposes, indicating a reduction in storage requirements. We conclude by offering extensions of the research and potential future directions for subsequent analyses and further theoretical development of techniques specific to the FCs, and by suggesting different kinds of series transforms for discretely indexed signals like genomes.
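
As a rough illustration of how a character sequence becomes a numerical summary, the sketch below computes a discrete Fourier transform power spectrum in Python. It assumes one common encoding, four binary indicator sequences (one per nucleotide), which may differ in detail from the method of Yin et al. and from the authors' implementation. The function names are hypothetical, and the naive O(N^2) transform is for clarity only.

```python
import cmath
import math

def indicator(seq, base):
    """Binary indicator sequence: 1 where seq has the given base, else 0."""
    return [1 if ch == base else 0 for ch in seq]

def dft(x):
    """Naive discrete Fourier transform of a real-valued list."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def power_spectrum(seq):
    """Sum over A, C, G, T of the squared DFT magnitudes at each frequency."""
    seq = seq.upper()
    total = [0.0] * len(seq)
    for base in "ACGT":
        for k, coeff in enumerate(dft(indicator(seq, base))):
            total[k] += abs(coeff) ** 2
    return total

def spectrum_distance(seq_a, seq_b):
    """Euclidean distance between power spectra of two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch only compares sequences of equal length")
    pa, pb = power_spectrum(seq_a), power_spectrum(seq_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))
```

For genome-length sequences a fast Fourier transform (for example `numpy.fft.fft`) would replace the naive loop, and sequences of different lengths would need a rescaling step that this sketch omits.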

Micah Thornton
Program: PhD in Biostatistics
Faculty mentor: Monnie McGee

Yuqui Ian Yang: A decentralized sparse fare splitting algorithm

https://youtu.be/hAgmXErY-xI

We propose a fare splitting algorithm that is guaranteed to remove all redundant transactions. We prove that the absence of redundant transactions is equivalent to minimizing the total transaction amount in the system. We also show that although the resulting total transaction amount is unique, the final transaction graph is not. By selecting a basic feasible solution, however, we can achieve a sparse solution.
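
The link between redundant transactions and the total transaction amount can be seen in a small Python sketch. It is a centralized greedy settlement, not the decentralized algorithm of the talk: it nets out each person's balance and then pairs debtors with creditors in order, in the spirit of the northwest-corner rule for transportation problems. Names and the integer-cents convention are assumptions made for illustration.

```python
def net_balances(transactions):
    """Net position per person from (debtor, creditor, amount) triples.

    Amounts are integers (e.g. cents). Negative balance means the person owes.
    """
    balance = {}
    for debtor, creditor, amount in transactions:
        balance[debtor] = balance.get(debtor, 0) - amount
        balance[creditor] = balance.get(creditor, 0) + amount
    return balance

def settle(transactions):
    """Return a reduced list of (debtor, creditor, amount) payments."""
    balance = net_balances(transactions)
    debtors = sorted((p, -b) for p, b in balance.items() if b < 0)
    creditors = sorted((p, b) for p, b in balance.items() if b > 0)
    payments = []
    i = j = 0
    while i < len(debtors) and j < len(creditors):
        debtor, owed = debtors[i]
        creditor, due = creditors[j]
        pay = min(owed, due)
        payments.append((debtor, creditor, pay))
        debtors[i] = (debtor, owed - pay)
        creditors[j] = (creditor, due - pay)
        if owed == pay:
            i += 1  # this debtor is fully settled
        if due == pay:
            j += 1  # this creditor is fully repaid
    return payments

# A owes B 10 and B owes C 10: B's two transactions cancel out.
reduced = settle([("A", "B", 10), ("B", "C", 10)])
```

In this example the original transactions move 20 units in total, while the settled version moves 10, which equals the sum of the positive net balances. Sorting debtors and creditors differently would give a different payment graph with the same total, matching the abstract's remark that the amount is unique but the graph is not.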

Yuqui Ian Yang
Program: PhD in Biostatistics
Faculty mentor: Daniel Heitjan