Yilin Ning
Biostatistician, Data Scientist
Currently
Senior Research Fellow @ Centre for Quantitative Medicine, Duke-NUS Medical School.
Specialized in
Biostatistics.
Research interests
Explainable AI, Biostatistics, Epidemiology, Statistical programming.
Education
2016-2020
PhD, NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore.
2010-2014
B.Sc. (Hons. 2nd upper) in Statistics, Department of Statistics and Applied Probability, National University of Singapore, Singapore.
Awards
2021
Khoo Postdoctoral Fellowship Award, Duke-NUS Medical School, Singapore.
Selected Publications
See my Google Scholar page for a complete list.
Book Chapter
Ning Y, Liu N, Ong MEH (2022). Types of Quantitative Data (Continuous, Categorical, Distributions, Skewness). Introducing, Designing and Conducting Research for Paramedics, p.115. Elsevier Health Sciences.
Journals
AI Fairness & Ethics
Viewpoint
Liu M#, Ning Y#, Teixayavong S, Mertens M, et al (2023). A translational perspective towards clinical AI fairness. npj Digital Medicine, 6:172. (#: equal contribution)
Review
Ning Y#, Teixayavong S#, Shang Y, Savulescu J, et al (2023). Generative Artificial Intelligence in Healthcare: Ethical Considerations and Assessment Checklist. arXiv preprint, arXiv:2311.02107. (#: equal contribution)
Liu M#, Ning Y#, Teixayavong S, Liu X, et al (2024). Towards Clinical AI Fairness: Filling Gaps in the Puzzle. arXiv preprint, arXiv:2405.17921. (#: equal contribution)
Method
Liu M, Ning Y, Ke Y, Shang Y, et al (2024). Fairness-Aware Interpretable Modeling (FAIM) for Trustworthy Machine Learning in Healthcare. arXiv preprint, arXiv:2403.05235.
Interpretable Machine Learning: Method
Shapley Value
Ning Y, Ong MEH, Chakraborty B, Goldstein BA, et al (2022). Shapley variable importance cloud for interpretable machine learning. Patterns, 3(4):100452.
- Python library + R package: ShapleyVIC: Shapley Variable Importance Cloud for Interpretable Machine Learning
Ning Y, Li S, Ng YY, Chia MY, et al (2024). Variable importance analysis with interpretable machine learning for fair risk prediction. PLOS Digit Health, 3(7):e0000542.
Liu M, Ning Y, Yuan H, Ong MEH, Liu N (2022). Balanced background and explanation data are needed in explaining deep learning models with SHAP: An empirical study on clinical decision making. arXiv preprint, arXiv:2206.04050.
Scoring System
Xie F#, Ning Y#, Liu M, Li S, et al (2023). A universal AutoScore framework to develop interpretable scoring systems for predicting common types of clinical outcomes. STAR Protocols, 4(2):102302. (#: equal contribution)
- R package: AutoScore: An Interpretable Machine Learning-Based Automatic Clinical Score Generator
- Python library: https://github.com/nliulab/AutoScore-Python
Ning Y, Li S, Ong ME, Xie F, et al (2022). A novel interpretable machine learning system to generate clinical risk scores: An application for predicting early mortality or unplanned readmission in a retrospective cohort study. PLOS Digit Health, 1(6):e0000062.
- Press release: Improving Risk Scores With Machine Learning
Saffari SE#, Ning Y#, Xie F, Chakraborty B, et al (2022). AutoScore-Ordinal: An Interpretable Machine Learning Framework for Generating Scoring Models for Ordinal Outcomes. BMC Medical Research Methodology, 22:286. (#: equal contribution)
- R package: AutoScore: An Interpretable Machine Learning-Based Automatic Clinical Score Generator
- Python library: https://github.com/nliulab/AutoScore-Python
Xie F, Ning Y, Yuan H, Goldstein BA, et al (2022). AutoScore-Survival: Developing interpretable machine learning-based time-to-event scores with right-censored survival data. Journal of Biomedical Informatics, 125:103959.
Yuan H, Xie F, Ong MEH, Ning Y, et al (2022). AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data. Journal of Biomedical Informatics, 129:104072.
Li S, Ning Y, Ong ME, Chakraborty B, et al (2023). FedScore: A privacy-preserving framework for federated scoring system development. Journal of Biomedical Informatics, 146:104485.
Li S, Shang Y, Wang Z, Wu Q, Hong C, Ning Y, et al (2024). Developing Federated Time-to-Event Scores Using Heterogeneous Real-World Survival Data. arXiv preprint, arXiv:2403.05229.
Interpretable Machine Learning: Application
Shapley Value
Deng X#, Ning Y#, Saffari SE, Xiao B, et al (2023). Identifying clinical features and blood biomarkers associated with mild cognitive impairment in Parkinson’s Disease using machine learning. European Journal of Neurology, 00:1–9. (#: equal contribution)
Scoring System
Liu N, Liu M, Chen X, Ning Y, et al (2022). Development and validation of an interpretable prehospital return of spontaneous circulation (P-ROSC) score for patients with out-of-hospital cardiac arrest using machine learning: A retrospective study. eClinicalMedicine, 48:101422.
Xie F, Liu N, Yan L, Ning Y, et al (2022). Development and validation of an interpretable machine learning scoring tool for estimating time to emergency readmissions. eClinicalMedicine, 45:101315.
General
Liu N, Wnent J, Lee JW, Ning Y, et al (2022). Validation of the CaRdiac Arrest Survival Score (CRASS) for predicting good neurological outcome after out-of-hospital cardiac arrest in an Asian emergency medical service system. Resuscitation, 176:42-50.
Biostatistics and Clinical Epidemiology
Method
Ning Y, Lam A, Reilly M (2022). Estimating risk ratio from any standard epidemiological design by doubling the cases. BMC Medical Research Methodology, 22:157.
Chen Y#, Ning Y#, Thomas P, Salloway MK, et al (2021). An open source tool to compute measures of inpatient glycemic control: translating from healthcare analytics research to clinical quality improvement. JAMIA Open, 4(2):ooab033. (#: equal contribution)
Ning Y, Ho PJ, Støer NC, Lim KK, et al (2021). A New Procedure to Assess When Estimates from the Cumulative Link Model Can Be Interpreted as Differences for Ordinal Scales in Quality of Life Studies. Clinical Epidemiology, 13: 53–65.
Ning Y, Tan CS, Maraki A, Ho PJ, et al (2020). Handling ties in continuous outcomes for confounder adjustment with rank-ordered logit and its application to ordinal outcomes. Statistical Methods in Medical Research, 29(2):437-454.
Ning Y, Støer NC, Ho PJ, Kao SL, et al (2020). Robust estimation of the effect of an exposure on the change in a continuous outcome. BMC Medical Research Methodology, 20(1):1-1.
Application
Li Y, Yip M, Ning Y, Chung J, et al (2023). Topical Atropine for Childhood Myopia Control: The Atropine Treatment Long-Term Assessment Study. JAMA Ophthalmology.
Liu N, Ning Y, Ong ME, Saffari SE, et al (2022). Gender disparities among adult recipients of layperson bystander cardiopulmonary resuscitation by location of cardiac arrest in Pan-Asian communities: A registry-based study. eClinicalMedicine, 44:101293.
- Press release: Women who suffer from an out-of-hospital cardiac arrest are less likely to receive CPR from a bystander
Chen B, Bernard JY, Padmapriya N, Ning Y, et al (2020). Associations between early-life screen viewing and 24 hour movement behaviours: findings from a longitudinal birth cohort study. The Lancet Child & Adolescent Health, 4(3):201-209.
Review and Perspective
Liu M, Li S, Yuan H, Ong ME, Ning Y, et al (2023). Handling missing values in healthcare data: A systematic review of deep learning-based imputation techniques. Artificial Intelligence in Medicine, 22:102587.
Li S, Liu P, Nascimento GG, Wang X, …, Ning Y, …, et al (2023). Federated and distributed learning applications for electronic health records and structured medical data: a scoping review. Journal of the American Medical Informatics Association, 2023:ocad170.
Xie F, Yuan H, Ning Y, Ong ME, et al (2022). Deep learning for temporal data representation in electronic health records: A systematic review of challenges and methodologies. Journal of Biomedical Informatics, 126:103980.
Conference Presentations
2023
Robust and interpretable machine learning assessment of variable importance with moderate to small sample sizes. AMIA 2023 Annual Symposium (2023), poster presentation
2022
A novel interpretable machine learning system to generate clinical risk scores: an application for predicting early mortality or unplanned readmission in a retrospective cohort study. AMIA 2022 Annual Symposium (2022), oral presentation
AutoScore-Ordinal: An interpretable machine learning framework for generating scoring models for ordinal outcomes. AMIA 2022 Annual Symposium (2022), poster presentation
2021
Weighted analyses of survival outcomes under complex study designs: An R implementation. 14th International Conference of the ERCIM WG on Computational and Methodological Statistics, invited oral presentation
2019
Conditional probit model for robust inference on change in continuous outcomes. 40th Annual Conference of the International Society for Clinical Biostatistics, oral presentation
2018
Handling ties in ranks in the rank-ordered logit model. The Joint International Society for Clinical Biostatistics and Australian Statistical Conference 2018, oral presentation
Statistical Software
SamplingDesignTools. Author of the R package that implements tools for working with various epidemiological study designs for studies of binary and survival outcomes. The package is cited in the book Controlled Epidemiological Studies.
Research Experience
Jan 2021-Dec 2023
Research Fellow, Centre for Quantitative Medicine, Duke-NUS Medical School
Mar-Oct 2021
Postdoctoral Research Assistant, Saw Swee Hock School of Public Health, National University of Singapore & Department of Medical Epidemiology and Biostatistics, Karolinska Institutet
2014-2016
Research Assistant, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
Teaching Experience
2022-2023
Lecturer, GMS 5204 Data Science + Healthcare, Duke-NUS Medical School
2022
Guest Lecturer, Beyond Classic Designs and Analysis for Health Data, Karolinska Institutet & National University of Singapore
Lecturer, R for Data Science, SingHealth Academy
May 2019
Lecturer, Introduction to R Commander, Saw Swee Hock School of Public Health, National University of Singapore
Sep 2018
Tutor, StatisticAlps 2018, 8th edition: Extended use of regression models for new epidemiological designs and analyses, Bicocca Summer School, University of Milano-Bicocca