Data Science Seminar
Hosted by Department of Mathematical Sciences

  • Date: Tuesday, April 16, 2019
  • Time: 12:00pm – 1:00pm
  • Room: WH-100E
  • Speaker: David Madigan (Columbia University)
  • Title: Towards honest inference from real-world healthcare data


In practice, our learning healthcare system relies primarily on observational studies generating one effect estimate at a time using customized study designs with unknown operating characteristics and publishing – or not – one estimate at a time. When we investigate the distribution of estimates that this process has produced, we see clear evidence of its shortcomings, including an apparent over-abundance of estimates where the confidence interval does not include one (i.e. statistically significant effects). We propose a standardized process for performing observational research that can be evaluated, calibrated and applied at scale to generate a more reliable and complete evidence base than previously possible, fostering a truly learning healthcare system. We demonstrate this new paradigm by generating evidence about all pairwise comparisons of treatments for depression for a relevant set of health outcomes using four large US insurance claims databases. In total, we estimate 17,718 hazard ratios, each using a comparative effectiveness study design and propensity score stratification on par with current state-of-the-art, albeit one-off, observational studies. Moreover, the process enables us to employ negative and positive controls to evaluate and calibrate estimates ensuring, for example, that the 95% confidence interval includes the true effect size approximately 95% of the time. The result set consistently reflects current established knowledge where known, and its distribution shows no evidence of the faults of the current process.

seminars/datasci/190416.txt · Last modified: 2018/09/14 15:21 by qiao