Human-learned lessons about machine learning in public health surveillance

Presented December 13, 2018.

For public health surveillance, is machine learning worth the effort? What methods are relevant? Do you need special hardware? This talk was motivated by these and other questions asked by ISDS members. It will focus on providing practical—and slightly opinionated—advice about how to determine whether machine learning could be a useful tool for your problem.


December 21, 2018

Digital Epidemiology: designing machine learning approaches to combine Internet-based data sources to monitor and forecast disease activity in multiple locations and spatial resolutions

Presented May 24, 2018.

Mauricio Santillana, MS, PhD, describes machine learning methodologies that leverage Internet-based information from search engines, Twitter microblogs, crowd-sourced disease surveillance systems, electronic medical records, and historical synchronicities in disease activity across spatial regions to monitor and forecast disease outbreaks in multiple locations around the globe in near real-time.


May 24, 2018

Opioid Surveillance using Social Media: How URLs are shared among Reddit members

Nearly 100 people per day die from opioid overdose in the United States. Further, prescription opioid abuse is assumed to be responsible for a 15-year increase in opioid overdose deaths. However, with increasing use of social media comes increasing opportunity to seek and share information. For instance, 80% of Internet users obtain health information online, including from popular social interaction sites like Reddit, which had more than 82.5 billion page views in 2015.

January 21, 2018

A Suite of Mechanistic Epidemiological Decision Support Tools

We present the EpiEarly, EpiGrid, and EpiCast tools for mechanistically-based biological decision support. The range of tools covers coarse-, medium-, and fine-grained models. The coarse-grained tool (EpiEarly), which uses only aggregated time-series data, provides a statistic quantifying epidemic growth potential and associated uncertainties. The medium-grained, geographically-resolved model (EpiGrid) is based on differential-equation simulations of disease and epidemic progression in the presence of various human interventions geared toward understanding the role of infection control, early vs.

January 25, 2018

Qualitative and Quantitative Predictions of Infectious Diseases in Shirak Marz

The frequency of disease outbreaks varies as a result of complex biological processes. Analysis of these frequencies can reveal patterns that can serve as a basis for predictions.


The goal of this study was to identify the periodicity of seven zooanthroponoses in humans, and set epidemic thresholds for future occurrences.

January 21, 2018

Niche Modeling of Dengue Fever Using Remotely Sensed Environmental Factors and BRT

Dengue fever (DF) is a vector-borne disease caused by a virus of the flavivirus family and carried by the Aedes aegypti mosquito, and it is one of the leading causes of illness and death in tropical regions of the world. Nearly 400 million people become infected each year, while roughly one-third of the world’s population lives in areas of risk. Dengue fever has been endemic to Colombia since the late 1970s and is a serious health problem for the country, with over 36 million people at risk.

January 25, 2018

Free-Text Mining to Improve Syndrome Definition Matching Across Emergency Departments

Standard syndrome definitions for ED visits in ESSENCE rely on chief complaints. Visits with more words in the chief complaint field are more likely to match syndrome definitions. While using ESSENCE, we observed geographic differences in chief complaint length, apparently related to differences in electronic health record (EHR) systems, which resulted in disparate syndrome matching across Idaho regions.

January 21, 2018

Quantifying Model Form Uncertainty of Epidemic Forecasting Models from Incidence Data

Uncertainty Quantification (UQ), the ability to quantify the impact of sample-to-sample variations and model misspecification on predictions and forecasts, is a critical aspect of disease surveillance. While quantifying the impact of stochastic uncertainty in the data is well understood, quantifying the impact of model misspecification is significantly harder. For the latter, one needs a "universal model" to which more restrictive parametric models are compared.


January 25, 2018

Tracking Neonatal Abstinence Syndrome in Missouri: Trends and the ICD-CM Transition

Neonatal Abstinence Syndrome (NAS) rates have tripled for Missouri residents in the past three years. NAS is a condition infants suffer soon after birth due to withdrawal after becoming opioid-dependent in the womb. NAS poses significant immediate health concerns and can have long-term effects on child development and quality of life.

January 21, 2018



This Knowledge Repository is made possible through the activities of the Centers for Disease Control and Prevention Cooperative Agreement/Grant #1 NU500E000098-01, National Surveillance Program Community of Practice (NSSP-CoP): Strengthening Health Surveillance Capabilities Nationwide, which is in the interest of public health.
