Master's Project Title:

Evaluation of Missing Data in Chlamydia Cases Surveilled by the Minnesota Department of Health from 2012-2014

MCH Student:

Jessica Jensen

Date of Defense:

May 4, 2016



In 2014, the Centers for Disease Control and Prevention reported that chlamydia was the most commonly reported notifiable disease in the United States. Chlamydia is a bacterial infection that is treatable once diagnosed, but because it is frequently asymptomatic it often goes undiagnosed and unreported. When a case is diagnosed, the medical provider or laboratory identifying it is legally obligated to report it to the Minnesota Department of Health. Unfortunately, many case reports are incomplete with regard to key demographic, diagnosis, and treatment variables. A 2013 preliminary study completed by the Minnesota Department of Health found that 40% of female cases were missing a value for pregnancy status, 40% of cases were missing a value for gender of sexual partner, and 8% were missing a value for address. This project aims to expand on the preliminary study by using logistic regression to characterize the missingness observed in the 2012-2014 data.


Race, age, sex, and facility type were used as predictors of three binary outcomes: missing zip code, missing pregnancy status, and missing gender of sexual partner. Interactions were assessed in each of the three logistic regression models. Adjusted odds ratios and 95% confidence intervals are reported. The same four predictors were also used in a multinomial regression model of the total number of missing values.
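
The outcome coding described above can be illustrated with a minimal sketch. It assumes hypothetical toy records: the field names, facility categories, and counts are invented for illustration and are not project data. It codes a missingness indicator for one variable and computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2x2 table. The adjusted estimates in this project come from logistic regression models with all four predictors, which this sketch does not attempt to reproduce.

```python
import math

def missing_indicator(record, field):
    """Return 1 if the field is absent or blank in the record, else 0."""
    value = record.get(field)
    return 1 if value is None or str(value).strip() == "" else 0

def two_by_two(records, exposure_field, exposed_value, outcome_field):
    """Cross-tabulate exposure against missingness of outcome_field.

    Returns (a, b, c, d):
      a = exposed and missing,   b = exposed and complete,
      c = unexposed and missing, d = unexposed and complete.
    """
    a = b = c = d = 0
    for record in records:
        exposed = record.get(exposure_field) == exposed_value
        missing = missing_indicator(record, outcome_field)
        if exposed and missing:
            a += 1
        elif exposed:
            b += 1
        elif missing:
            c += 1
        else:
            d += 1
    return a, b, c, d

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval (z=1.96 for 95%)."""
    if min(a, b, c, d) == 0:
        raise ValueError("Wald interval is undefined when a cell count is zero")
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

def make_records(facility, n_missing, n_complete):
    """Build hypothetical records; "00000" is a placeholder zip code."""
    return ([{"facility": facility, "zip_code": None}] * n_missing
            + [{"facility": facility, "zip_code": "00000"}] * n_complete)

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
records = make_records("hospital", 30, 70) + make_records("clinic", 10, 90)
table = two_by_two(records, "facility", "hospital", "zip_code")
estimate, lower, upper = odds_ratio_wald_ci(*table)
print(f"OR = {estimate:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

With a single binary predictor, the odds ratio from a logistic regression equals this 2x2 odds ratio. Adjusting for the remaining predictors requires fitting the full model with a statistical package.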


Between 2012 and 2014, there were 56,580 cases of chlamydia. Of the eligible cases, 31.16% identified as male, and the mean age was 23.6 years. Missing values were most frequent for lab report received (n=26,911) and gender of sexual partner (n=16,075). Statistically significant associations were observed between missingness and age and between missingness and gender. Age, race, facility type, and sex were all found to be independent predictors of the modeled outcomes.


The potential future impact of this work is extensive, beginning with the identification of current gaps in case reporting. Once these gaps are identified, materials and trainings could be developed to better prepare medical facility staff for tracking cases. A surveillance system that accurately captures all information from diagnosed cases would allow quick identification of regional or demographic characteristics of outbreaks. Improved understanding of these gaps could foster targeted interventions that prevent future infections.