seminars:datasci:031825

seminars:datasci:031825 [2025/03/11 16:00]
mwang46 created
seminars:datasci:031825 [2025/03/11 16:03] (current)
mwang46
<WRAP centeralign>**//Abstract//**</WRAP>
\\
The vast repositories of Electronic Health Records (EHR) and medical claims data hold untapped potential for studying rare but critical events, such as suicide attempts. Conventional setups often model suicide attempt as a univariate outcome and exclude "single-record" patients, those with only one documented encounter, because they lack historical information. However, patients diagnosed with a suicide attempt at their only encounter can, surprisingly, represent a substantial proportion of all attempt cases in the data, as high as 70-80%. We develop a hybrid and integrative learning framework that leverages concurrent outcomes as surrogates and harnesses the typically excluded yet valuable information in single-record data. Our framework employs a supervised learning component to learn the latent subspace that connects primary (e.g., suicide) and surrogate (e.g., mental disorders) outcomes to historical information. It simultaneously employs an unsupervised learning component to utilize the single-record data through the shared latent subspace. Our general formulation covers various outcome types and model specifications, including reduced-rank regression and autoencoders. Theoretically, we show that utilizing single records leads to a faster convergence rate in recovering the shared subspace. With hospital inpatient data from Connecticut, we demonstrate that single-record data and concurrent diagnoses indeed carry valuable information and that utilizing them can substantially improve suicide risk modeling. \\
  
  
seminars/datasci/031825.1741723258.txt · Last modified: 2025/03/11 16:00 by mwang46