This blog is the last in the series on the SIRD model and machine learning. Here, I walk through the detailed procedure outlined in the introduction blog, using the data from the exploratory analysis blog.

Model Fitting

After the data clean-up explained in the previous blog, I constructed a clean dataset of 198 days, from April 24th to November 7th. I set the first 100 days as the training set and the remaining 98 days as the test set. Specifically, the training set comprises 51…
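The chronological split described above can be sketched as follows. This is only an illustration: the year 2020 is an assumption (the excerpt gives no year), and the names `start`, `end`, `train_days` and `test_days` are hypothetical rather than taken from the original analysis.

```python
from datetime import date, timedelta

# Assumption: the year is 2020; the excerpt only gives the month and day.
start = date(2020, 4, 24)
end = date(2020, 11, 7)

# One entry per calendar day, with both endpoints included.
days = [start + timedelta(days=k) for k in range((end - start).days + 1)]

# Chronological split: the first 100 days train the model, the rest test it.
train_days = days[:100]
test_days = days[100:]

print(len(days), len(train_days), len(test_days))
```

Splitting by position rather than at random keeps the test set strictly after the training set in time, which matches how a forecast would be used in practice.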

This blog continues the previous introduction blog. Here, I explore the patterns in COVID data in conjunction with demography and mobility, which are then used in an enhanced SIRD model in the next blog.

State Demography

Since COVID may affect people with different demographic backgrounds differently, I first analyze the demographic patterns across states. I collected the demographic dataset from the United States Census Bureau, which includes the age, sex and race characteristics of each state. …

Estimating the spread of disease is a crucial topic in epidemiology, and stochastic process models have been favored by the community of mathematicians and (bio-)statisticians. Among these models, the SIRD (Susceptible-Infected-Recovered-Deceased) model is a popular choice owing to its general yet expressive definition, which captures the major transitions in patient status. Evolving the SIRD model requires a set of pre-defined parameters, which measure the likelihood of each transition in patient status. Traditionally, these parameters are either elicited empirically from epidemiological knowledge or inferred numerically using mathematical models. …
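To make the role of the transition parameters concrete, here is a minimal sketch of one common SIRD formulation, written as a deterministic daily update rather than the stochastic version discussed above. The parameter names `beta` (infection rate), `gamma` (recovery rate) and `mu` (death rate), the function names, and the numbers in the usage example are all assumptions for illustration, not the settings used in this series.

```python
def sird_step(s, i, r, d, beta, gamma, mu):
    """Advance the four compartments by one day (simple Euler update)."""
    n = s + i + r + d                   # total population, constant in this sketch
    new_infections = beta * s * i / n   # susceptible -> infected
    new_recoveries = gamma * i          # infected -> recovered
    new_deaths = mu * i                 # infected -> deceased
    return (
        s - new_infections,
        i + new_infections - new_recoveries - new_deaths,
        r + new_recoveries,
        d + new_deaths,
    )


def simulate_sird(s0, i0, r0, d0, beta, gamma, mu, days):
    """Return a list of (S, I, R, D) states, starting with the initial state."""
    states = [(s0, i0, r0, d0)]
    for _ in range(days):
        states.append(sird_step(*states[-1], beta, gamma, mu))
    return states


# Hypothetical numbers, chosen only to show the call.
trajectory = simulate_sird(990.0, 10.0, 0.0, 0.0,
                           beta=0.3, gamma=0.1, mu=0.01, days=50)
print(trajectory[-1])
```

Every flow leaves one compartment and enters another, so the total population stays constant. The three rates are exactly the "pre-defined parameters" that must be elicited or inferred before the model can be run.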

