EPIAIDEA integrates epidemiologic theory with AI-enabled methods to translate large-scale digital, clinical, and administrative data into validated evidence for public health action. Every output is grounded in causal reasoning — not just prediction.
I study population health through digital traces (Google Search trends, social platform signals, electronic health records, and administrative data), applying epidemiologic methods to derive valid inferences from noisy, real-world sources.
I build machine learning and NLP pipelines with epidemiologic awareness: confounding control, external validation, and interpretable outputs that decision-makers can actually use, not just models that perform well on held-out test sets.
My work doesn't end at publication. I build live dashboards, surveillance platforms, and decision-support tools that translate findings into policy guidance, resource allocation, and health system response — at scale.
Most AI applied to health data is built backwards — starting with a model architecture and asking what data fits it. I work the other way: starting with a population health question, identifying the right data source, and only then choosing the method that preserves valid inference.
Epidemiologic grounding is non-negotiable. That means addressing confounding, specifying temporality, accounting for selection bias, and validating outputs against known benchmarks before any result is called decision-grade evidence.
The result is analytics that hold up under scrutiny — from peer review, from policy review, and from the communities whose health decisions depend on them.
Every project begins with a causal or descriptive question, not a dataset. The question defines data requirements, method selection, and validity criteria.
Digital signals, clinical records, administrative data, and geospatial layers are linked, with each source validated independently before integration.
Models are stress-tested for bias, calibration, and generalizability. Outputs are formatted for policy and clinical audiences — not just academic ones.
Findings ship as dashboards, maps, and surveillance platforms — not just PDFs. Deployed tools are monitored for drift and updated as conditions change.
Using predictive asset sensing and digital twin modeling, I identified geographic zones in the Southwest United States where extreme heat events cause a complete collapse of cooling and shelter resources. I term these zones "Critical Deserts."
The analysis quantified a $49.25M systemic risk across affected counties by modeling the gap between heat-induced demand for emergency shelter and the actual structural capacity available. The digital twin framework allows real-time scenario modeling — planners can simulate how different intervention investments change risk exposure before committing resources.
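The gap logic described above can be sketched in a few lines. This is a minimal illustration, not the analysis itself: the county names, demand and capacity figures, the cost per unmet shelter place, and the function names are all hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class County:
    name: str
    shelter_demand: int    # people needing shelter in a heat event (hypothetical)
    shelter_capacity: int  # shelter places available (hypothetical)


def unmet_need(county, added_capacity=0):
    """Demand not covered by capacity; floored at zero."""
    return max(0, county.shelter_demand - (county.shelter_capacity + added_capacity))


def total_risk(counties, cost_per_unmet_place, added_capacity=None):
    """Sum monetized unmet need, optionally under an intervention scenario."""
    added_capacity = added_capacity or {}
    return sum(
        unmet_need(c, added_capacity.get(c.name, 0)) * cost_per_unmet_place
        for c in counties
    )


counties = [
    County("County A", shelter_demand=1200, shelter_capacity=400),
    County("County B", shelter_demand=300, shelter_capacity=500),
]
COST = 1000.0  # hypothetical dollar cost per unmet shelter place

baseline = total_risk(counties, COST)
# Scenario: add 300 shelter places in County A before committing resources.
scenario = total_risk(counties, COST, added_capacity={"County A": 300})
print(baseline, scenario)
```

Comparing `baseline` with `scenario` is the simplest form of the scenario modeling mentioned above: the same risk function is evaluated under a proposed capacity investment.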
A closed-loop surveillance framework that links population-level digital demand signals — specifically Google Search interest in fibroid treatment — with the structural care capacity available in each California county.
The observatory detects mismatches between where women are actively seeking fibroid care and where gynecologic surgical capacity actually exists. These mismatches identify counties where health system investment is most needed. Traditional utilization data alone would miss this evidence, because it captures only care that was actually accessed.
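One simple way to express a demand-capacity mismatch is to standardize both measures across counties and take the difference. The sketch below assumes this z-score approach purely for illustration; the observatory's actual scoring method is not specified here, and the county names and values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def mismatch_scores(search_interest, capacity_per_100k):
    """Positive score: search demand is high relative to surgical capacity."""
    counties = sorted(search_interest)
    demand_z = z_scores([search_interest[c] for c in counties])
    supply_z = z_scores([capacity_per_100k[c] for c in counties])
    return {c: d - s for c, d, s in zip(counties, demand_z, supply_z)}


# Hypothetical inputs: relative search interest and surgeons per 100k women.
search_interest = {"County A": 80.0, "County B": 50.0, "County C": 20.0}
capacity_per_100k = {"County A": 2.0, "County B": 5.0, "County C": 8.0}

scores = mismatch_scores(search_interest, capacity_per_100k)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # counties ordered from largest to smallest mismatch
```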
An animated public-health command center fusing DMA-resolution Google Health Trends v1alpha search probabilities with a live ArcGIS FeatureServer overlay (Tracking_Hantavirus_2026). The dashboard introduces a Grouped Expression Methodology that defeats Google's k-anonymity suppression in rural Western US DMAs — recovering the "Rural Silence" that single-term queries cannot reach. Features include a virtual-time playback engine with interpolated map markers, a Digital EKG with a moving NOW sweep, breathing high-risk halos over the Four Corners and Pacific NW reservoir corridors, a DMA × Date intensity matrix with active-column tracking, and a 60-second polling overlay of CONFIRMED / DECEASED / SUSPECTED / MONITORING case features from the MV Hondius / Andes virus cluster.
Launch Command Center →
A time-aligned policy surveillance platform tracking telehealth utilization before, during, and after COVID-19 alongside a 14-feature payment parity index scored 0–14 across US states. The index captures whether Medicaid and private payers have matched telehealth reimbursement to in-person rates, a structural equity issue that determines whether low-income patients can actually access virtual care. Built to support state Medicaid directors and policy analysts comparing parity trajectories across states.
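A 0–14 index over 14 features is most naturally read as one point per feature present. The sketch below assumes that equal-weight binary scoring, which is not confirmed by the platform description, and uses placeholder feature names rather than the index's real ones.

```python
N_FEATURES = 14
# Placeholder names; the real index's feature definitions are not listed here.
FEATURES = [f"feature_{i:02d}" for i in range(1, N_FEATURES + 1)]


def parity_score(state_features):
    """Count how many of the 14 parity features a state satisfies (0 to 14)."""
    missing = [f for f in FEATURES if f not in state_features]
    if missing:
        raise ValueError(f"missing features: {missing}")
    return sum(1 for f in FEATURES if state_features[f])


# Hypothetical state that satisfies the first nine features only.
state_x = {f: False for f in FEATURES}
for f in FEATURES[:9]:
    state_x[f] = True

score = parity_score(state_x)
print(score)
```

Raising on missing features, rather than treating them as absent, keeps an incomplete policy record from being silently scored as low parity.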
Launch Platform →
An API-driven geospatial surveillance platform mapping the structural overdose prevention gap across North Dakota. Using centroid-based travel time modeling, I calculated realistic drive times from population centers to the nearest harm reduction service, including syringe service programs, naloxone distribution sites, and MOUD providers. The platform surfaces counties where structural access is so limited that individual behavior change cannot realistically be expected to reduce overdose risk without system-level investment.
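The centroid-to-nearest-service idea can be approximated without a routing API. The sketch below substitutes great-circle distance, an assumed average speed, and an assumed road detour factor for real drive times, so it is a rough stand-in for the platform's travel time modeling and not a reproduction of it. The coordinates are illustrative and do not represent real service locations.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_site_minutes(centroid, sites, speed_kmh=80.0, detour_factor=1.3):
    """Estimated minutes to the closest site.

    speed_kmh and detour_factor are assumptions for this sketch, not
    calibrated values.
    """
    km = min(haversine_km(centroid[0], centroid[1], s[0], s[1]) for s in sites)
    return km * detour_factor / speed_kmh * 60.0


# Illustrative coordinates only (lat, lon).
centroid = (46.0, -100.0)
sites = [(47.0, -100.0), (48.5, -103.0)]

minutes = nearest_site_minutes(centroid, sites)
print(round(minutes, 1))
```

A routing service that accounts for the actual road network would replace `haversine_km` and the speed assumption in a production version.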
Launch Platform →
A structured misinformation surveillance system analyzing fact-checked claims across GLP-1 receptor agonist therapies (Ozempic, Wegovy, Mounjaro) and unregulated weight-loss supplement markets. Using the ClaimReview schema and Google Fact Check API, I built a pipeline that categorizes claim types, tracks outlet-level misinformation velocity, and maps the information environment patients encounter before talking to a clinician. Relevant to FDA risk communication strategy and direct-to-consumer advertising oversight.
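As a rough illustration of the counting step in such a pipeline, the sketch below tallies fact-check reviews per publisher per month from an already-parsed response. The field names (`claims`, `claimReview`, `publisher.name`, `reviewDate`, `textualRating`) follow the Fact Check Tools API claim search response as I understand it and should be verified against the API reference. Reading "velocity" as reviews per outlet per month is an assumption, and the sample payload is placeholder data, not real claims.

```python
from collections import Counter

# Placeholder payload shaped like a parsed claim search response (assumed shape).
sample_response = {
    "claims": [
        {
            "text": "Placeholder claim text 1",
            "claimReview": [
                {"publisher": {"name": "Outlet A"},
                 "reviewDate": "2024-01-15T00:00:00Z",
                 "textualRating": "False"},
            ],
        },
        {
            "text": "Placeholder claim text 2",
            "claimReview": [
                {"publisher": {"name": "Outlet A"},
                 "reviewDate": "2024-01-28T00:00:00Z",
                 "textualRating": "Misleading"},
                {"publisher": {"name": "Outlet B"},
                 "reviewDate": "2024-02-03T00:00:00Z",
                 "textualRating": "False"},
            ],
        },
    ]
}


def reviews_per_publisher_month(response):
    """Count reviews keyed by (publisher name, YYYY-MM)."""
    counts = Counter()
    for claim in response.get("claims", []):
        for review in claim.get("claimReview", []):
            publisher = review.get("publisher", {}).get("name", "unknown")
            month = review.get("reviewDate", "")[:7] or "unknown"
            counts[(publisher, month)] += 1
    return counts


counts = reviews_per_publisher_month(sample_response)
print(counts.most_common())
```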
View Analytics →
A statewide geospatial analysis mapping the density of licensed vape and tobacco retailers relative to K-12 school locations across North Dakota, with a Fargo-specific view at the census block level. Proximity scores were calculated using buffer zone analysis to identify retailers within state-defined restricted distances from school grounds. The platform provides a ready-to-use evidence base for local health departments pursuing retailer licensing reform, enforcement prioritization, or youth access prevention policy.
View State Map →
A state-level surveillance summary drawing on CDC WONDER and FARS toxicology data to characterize cannabis positivity rates and polysubstance involvement (including concurrent opioids, stimulants, and alcohol) in injury fatalities across the United States. Designed for rapid policy scanning, the platform allows cross-state comparison of polysubstance patterns as cannabis legalization expands. The data surface is particularly relevant to traffic safety policy, forensic toxicology standards, and state medical examiner reporting requirements.
View Live Map →
A five-year longitudinal analysis (2021–2026) quantifying healthcare access collapse during North Dakota's major winter weather events. Using a Friction Index framework, I modeled how road closures, power outages, and snowfall accumulation progressively isolate rural populations from hospital and clinic services. The platform includes a deployment blueprint for climate-resilient Sentinel Nodes: pre-positioned care capacity that can activate when primary access routes fail. Directly applicable to state emergency management and CMS rural health investment planning.
Launch Platform →
An agent-based AI framework for large-scale epidemiological analysis, integrating real-world data, automated inference, and decision-support workflows. The framework coordinates specialized agents across data ingestion, bias assessment, causal modeling, and policy translation, enabling reproducible, transparent population-health analytics at a scale that traditional hand-curated pipelines cannot match.
View on GitHub →