Digital Epidemiology: designing machine learning approaches to combine Internet-based data sources to monitor and forecast disease activity in multiple locations and spatial resolutions

Presented May 24, 2018.

Mauricio Santillana, MS, PhD, describes machine learning methodologies that leverage Internet-based information from search engines, Twitter microblogs, crowd-sourced disease surveillance systems, electronic medical records, and historical synchronicities in disease activity across spatial regions to successfully monitor and forecast disease outbreaks in multiple locations around the globe in near real time.


May 24, 2018

Opioid Surveillance using Social Media: How URLs are shared among Reddit members

Nearly 100 people per day die from opioid overdose in the United States. Further, prescription opioid abuse is assumed to be responsible for a 15-year increase in opioid overdose deaths. However, with increasing use of social media comes increasing opportunity to seek and share information. For instance, 80% of Internet users obtain health information online, including from popular social interaction sites like Reddit, which had more than 82.5 billion page views in 2015.

January 21, 2018

A Suite of Mechanistic Epidemiological Decision Support Tools

We present the EpiEarly, EpiGrid, and EpiCast tools for mechanistically based biological decision support. The range of tools covers coarse-, medium-, and fine-grained models. The coarse-grained tool (EpiEarly), which uses only aggregated time-series data, provides a statistic quantifying epidemic growth potential and associated uncertainties. The medium-grained, geographically resolved model (EpiGrid) is based on differential-equation-type simulations of disease and epidemic progression in the presence of various human interventions geared toward understanding the role of infection control, early vs.

January 25, 2018

Qualitative and Quantitative Predictions of Infectious Diseases in Shirak Marz

The frequency of disease outbreaks varies as a result of complex biological processes. Analysis of these frequencies can reveal patterns that can serve as a basis for predictions.


The goal of this study was to identify the periodicity of seven zooanthroponoses in humans, and set epidemic thresholds for future occurrences.

January 21, 2018

Niche Modeling of Dengue Fever Using Remotely Sensed Environmental Factors and BRT

Dengue Fever (DF) is a vector-borne disease of the flavivirus family carried by the Aedes aegypti mosquito, and one of the leading causes of illness and death in tropical regions of the world. Nearly 400 million people become infected each year, while roughly one-third of the world’s population live in areas of risk. Dengue fever has been endemic to Colombia since the late 1970s and is a serious health problem for the country with over 36 million people at risk.

January 25, 2018

Free-Text Mining to Improve Syndrome Definition Matching Across Emergency Departments

Standard syndrome definitions for ED visits in ESSENCE rely on chief complaints. Visits with more words in the chief complaint field are more likely to match syndrome definitions. While using ESSENCE, we observed geographic differences in chief complaint length, apparently related to differences in electronic health record (EHR) systems, which resulted in disparate syndrome matching across Idaho regions.
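The relationship between chief complaint length and syndrome matching described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the keyword pattern `FEVER_TERMS` is a hypothetical stand-in and not an actual ESSENCE syndrome definition, and the sample visits are invented, not study data.

```python
import re
from collections import defaultdict

# Hypothetical keyword-based syndrome definition (NOT an ESSENCE definition).
FEVER_TERMS = re.compile(r"\b(fever|febrile|chills)\b", re.IGNORECASE)


def word_count(chief_complaint: str) -> int:
    """Number of whitespace-separated words in the chief complaint text."""
    return len(chief_complaint.split())


def match_rate_by_length(visits):
    """Map each chief-complaint word count to the fraction of visits
    with that word count that match the hypothetical definition."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for text in visits:
        n = word_count(text)
        totals[n] += 1
        if FEVER_TERMS.search(text):
            matches[n] += 1
    return {n: matches[n] / totals[n] for n in sorted(totals)}


# Invented example chief complaints, for illustration only.
visits = [
    "fever",
    "pain",
    "cough and fever",
    "abd pain",
    "headache chills body aches",
    "fall",
]
rates = match_rate_by_length(visits)
for n, rate in rates.items():
    print(f"{n} word(s): {rate:.2f} of visits matched")
```

Grouping the same calculation by region or EHR system, rather than by word count alone, would be one way to look for the kind of disparity in matching the abstract reports.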

January 21, 2018

Quantifying Model Form Uncertainty of Epidemic Forecasting Models from Incidence Data

Uncertainty Quantification (UQ), the ability to quantify the impact of sample-to-sample variations and model misspecification on predictions and forecasts, is a critical aspect of disease surveillance. While quantifying the impact of stochastic uncertainty in the data is well understood, quantifying the impact of model misspecification is significantly harder. For the latter, one needs a "universal model" to which more restrictive parametric models are compared.


January 25, 2018

Tracking Neonatal Abstinence Syndrome in Missouri: Trends and the ICD-CM Transition

Neonatal Abstinence Syndrome (NAS) rates have tripled for Missouri residents in the past three years. NAS is a condition infants suffer soon after birth due to withdrawal after becoming opioid-dependent in the womb. NAS raises significant immediate health concerns and can have long-term effects on child development and quality of life.

January 21, 2018

Electronic case reporting of STIs: Are non-existent codes the reason for missing information?

Under the CDC STD Surveillance Network (SSuN) Part B grant, WA DOH is testing eICR of sexually transmitted infections (STIs) with a clinical partner. For each previously identified information gap, either existing standard vocabulary codes were found to represent it or the need for new codes or concepts was noted.


January 21, 2018

HL7 Terminology Management for Disease Surveillance

In 2013, the Utah Department of Health (UDOH) began working with hospital and reference laboratories to implement electronic laboratory reporting (ELR) of reportable communicable disease data. Laboratories utilize HL7 message structure and standard terminologies such as LOINC and SNOMED to send data to UDOH. These messages must be evaluated for validity, translated, and entered into Utah’s communicable disease surveillance system (UT-NEDSS), where they can be accessed by local and state investigators and epidemiologists.

January 21, 2018



This Knowledge Repository is made possible through the activities of the Centers for Disease Control and Prevention Cooperative Agreement/Grant #1 NU500E000098-01, National Surveillance Program Community of Practice (NSSP-CoP): Strengthening Health Surveillance Capabilities Nationwide, which is in the interest of public health.
