Yilin Ning

Biostatistician, Data Scientist

Work email: yilin.ning AT duke-nus.edu.sg | Personal email: ningyilinnyl AT gmail.com


Research Fellow @ Centre for Quantitative Medicine, Duke-NUS Medical School.

Specialized in


Research interests

Explainable AI, Biostatistics, Epidemiology, Statistical programming.


Education

2016-2020 PhD, NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Singapore.

2010-2014 B.Sc. (Hons. 2nd upper) in Statistics, Department of Statistics and Applied Probability, National University of Singapore, Singapore.


Awards

2021 Khoo Postdoctoral Fellowship Award, Duke-NUS Medical School, Singapore.

Selected Publications

See my Google Scholar page for a complete list.

Book Chapter

2022 Ning Y, Liu N, Ong MEH (2022). Types of Quantitative Data (Continuous, Categorical, Distributions, Skewness). Introducing, Designing and Conducting Research for Paramedics, p.115. Elsevier Health Sciences.



2023 Ning Y, Volovici V, Ong ME, Goldstein BA, and Liu N (2023). A roadmap to fair and trustworthy prediction model validation in healthcare. arXiv preprint arXiv:2304.03779.

Liu M#, Ning Y#, Teixayavong S, Mertens M, Xu J, Ting DS, Cheng LT, Ong JC, Teo ZL, Tan TF, Narrendar RC, Wang F, Celi LA, Ong MEH, and Liu N (2023). Towards clinical AI fairness: A translational perspective. arXiv preprint arXiv:2304.13493. (#: equal contribution)

Xie F#, Ning Y#, Liu M, Li S, Saffari SE, Yuan H, Volovici V, Ting DSW, Goldstein BA, Ong MEH, Vaughan R, Chakraborty B, and Liu N (2023). A universal AutoScore framework to develop interpretable scoring systems for predicting common types of clinical outcomes. STAR Protocols 4(2):102302. (#: equal contribution)

Deng X#, Ning Y#, Saffari SE, Xiao B, Niu C, Ng SYE, Chia N, Choi X, Heng DL, Tan YJ, Ng E, Xu Z, Tay KY, Au WL, Ng A, Tan EK, Liu N, and Tan LCS (2023). Identifying clinical features and blood biomarkers associated with mild cognitive impairment in Parkinson’s Disease using machine learning. European Journal of Neurology 00:1–9. (#: equal contribution)

Li S, Ning Y, Ong ME, Chakraborty B, Hong C, Xie F, Yuan H, Liu M, Buckland DM, Chen Y, and Liu N (2023). FedScore: A privacy-preserving framework for federated scoring system development. arXiv preprint arXiv:2303.00282.


2022 Ning Y, Ong MEH, Chakraborty B, Goldstein BA, Ting DSW, Vaughan R, and Liu N (2022). Shapley variable importance cloud for interpretable machine learning. Patterns 3(4):100452.

Ning Y, Li S, Ong ME, Xie F, Chakraborty B, Ting DS, Liu N (2022). A novel interpretable machine learning system to generate clinical risk scores: An application for predicting early mortality or unplanned readmission in a retrospective cohort study. PLOS Digital Health 1(6):e0000062.

Saffari SE#, Ning Y#, Xie F, Chakraborty B, Volovici V, Vaughan R, Ong MEH, Liu N (2022). AutoScore-Ordinal: An Interpretable Machine Learning Framework for Generating Scoring Models for Ordinal Outcomes. BMC Medical Research Methodology 22:286. (#: equal contribution)

Ning Y, Lam A, Reilly M (2022). Estimating risk ratio from any standard epidemiological design by doubling the cases. BMC Medical Research Methodology, 22:157.

Liu N, Ning Y, Ong ME, Saffari SE, Ryu HH, Kajino K, Lin CH, Karim SA, Rao GR, Ho AFW, Lim SL (2022). Gender disparities among adult recipients of layperson bystander cardiopulmonary resuscitation by location of cardiac arrest in Pan-Asian communities: A registry-based study. eClinicalMedicine, 44:101293.

Liu M, Ning Y, Yuan H, Ong MEH, Liu N (2022). Balanced background and explanation data are needed in explaining deep learning models with SHAP: An empirical study on clinical decision making. arXiv preprint arXiv:2206.04050.

Xie F, Yuan H, Ning Y, Ong ME, Feng M, Hsu W, Chakraborty B, Liu N (2022). Deep learning for temporal data representation in electronic health records: A systematic review of challenges and methodologies. Journal of Biomedical Informatics, 126:103980.

Xie F, Ning Y, Yuan H, Goldstein BA, Ong ME, Liu N, and Chakraborty B (2022). AutoScore-Survival: Developing interpretable machine learning-based time-to-event scores with right-censored survival data. Journal of Biomedical Informatics, 125:103959.

Xie F, Liu N, Yan L, Ning Y, Lim KK, Gong C, Kwan YH, Ho AF, Low LL, Chakraborty B, Ong MEH (2022). Development and validation of an interpretable machine learning scoring tool for estimating time to emergency readmissions. eClinicalMedicine, 45:101315.

Yuan H, Xie F, Ong MEH, Ning Y, Chee ML, Saffari SE, Abdullah HR, Goldstein BA, Chakraborty B, and Liu N (2022). AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data. Journal of Biomedical Informatics, 129:104072.

Liu N, Wnent J, Lee JW, Ning Y, Ho AF, Siddiqui FJ, Lim SL, Chia MY, Tiah L, Mao DR, Gräsner JT (2022). Validation of the CaRdiac Arrest Survival Score (CRASS) for predicting good neurological outcome after out-of-hospital cardiac arrest in an Asian emergency medical service system. Resuscitation, 176:42-50.

Liu N, Liu M, Chen X, Ning Y, Lee JW, Siddiqui FJ, Saffari SE, Ho AFW, Shin SD, Ma MHM, Tanaka H, Ong MEH, PAROS Clinical Research Network Investigators (2022). Development and validation of an interpretable prehospital return of spontaneous circulation (P-ROSC) score for patients with out-of-hospital cardiac arrest using machine learning: A retrospective study. eClinicalMedicine, 48:101422.


2021 Chen Y#, Ning Y#, Thomas P, Salloway MK, Tan MLS, Tai ES, Kao SL, and Tan CS (2021). An open source tool to compute measures of inpatient glycemic control: translating from healthcare analytics research to clinical quality improvement. JAMIA Open, 4(2):ooab033. (#: equal contribution)

Ning Y, Ho PJ, Støer NC, Lim KK, Wee HL, Hartman M, Reilly M, and Tan CS (2021). A New Procedure to Assess When Estimates from the Cumulative Link Model Can Be Interpreted as Differences for Ordinal Scales in Quality of Life Studies. Clinical Epidemiology, 13:53–65.

2020 Ning Y, Tan CS, Maraki A, Ho PJ, Hodgins S, Comasco E, Nilsson KW, Wagner P, Khoo EYH, Tai ES, Kao SL, Hartman M, Reilly M, and Støer NC (2020). Handling ties in continuous outcomes for confounder adjustment with rank-ordered logit and its application to ordinal outcomes. Statistical Methods in Medical Research, 29(2):437-454.

Ning Y, Støer NC, Ho PJ, Kao SL, Ngiam KY, Khoo EYH, Lee SC, Tai ES, Hartman M, Reilly M, and Tan CS (2020). Robust estimation of the effect of an exposure on the change in a continuous outcome. BMC Medical Research Methodology, 20(1):1-1.

Chen B, Bernard JY, Padmapriya N, Ning Y, Cai S, Lança C, Tan KH, Yap F, Chong YS, Shek L, Godfrey KM, Saw SM, Chan SY, Eriksson JG, Tan CS, and Müller-Riemenschneider F (2020). Associations between early-life screen viewing and 24 hour movement behaviours: findings from a longitudinal birth cohort study. The Lancet Child & Adolescent Health, 4(3):201-209.

2019 Chen Y, Ning Y, Kao SL, Støer NC, Müller-Riemenschneider F, Venkataraman K, Khoo EYH, Tai ES, and Tan CS (2019). Using marginal standardisation to estimate relative risk without dichotomising continuous outcomes. BMC Medical Research Methodology, 19(1):1-14.

Tan CS, Støer NC, Chen Y, Andersson M, Ning Y, Wee HL, Khoo EYH, Tai ES, Kao SL, and Reilly M (2019). A stratification approach using logit-based models for confounder adjustment in the study of continuous outcomes. Statistical Methods in Medical Research, 28(4):1105-1125.

2018 Zhao X, Ning Y, Chen MI, and Cook AR (2018). Individual and population trajectories of influenza antibody titers over multiple seasons in tropical Singapore. American Journal of Epidemiology, 187(1):135–143.

2017 Luo M, Lim WY, Tan CS, Ning Y, Chia KS, van Dam RM, Tang WE, Tan NC, Chen R, Tai ES, and Venkataraman K (2017). Longitudinal trends in HbA1c and associations with comorbidity and all-cause mortality in Asian patients with type 2 diabetes: a cohort study. Diabetes Research and Clinical Practice, 133:69-77.

2016 Salloway MK, Deng X, Ning Y, Kao SL, Chen Y, Schaefer GO, Chin JJL, Tai ES, and Tan CS (2016). A de-identification tool for users in medical operations and public health. In 2016 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI):529-532.

Conference Presentations

2022 A novel interpretable machine learning system to generate clinical risk scores: an application for predicting early mortality or unplanned readmission in a retrospective cohort study. AMIA 2022 Annual Symposium (2022)

AutoScore-Ordinal: An interpretable machine learning framework for generating scoring models for ordinal outcomes. AMIA 2022 Annual Symposium (2022)

2021 Weighted analyses of survival outcomes under complex study designs: An R implementation. 14th International Conference of the ERCIM WG on Computational and Methodological Statistics, invited oral presentation

2019 Conditional probit model for robust inference on change in continuous outcomes. 40th Annual Conference of the International Society for Clinical Biostatistics, oral presentation

2018 Handling ties in ranks in the rank-ordered logit model. The Joint International Society for Clinical Biostatistics and Australian Statistical Conference 2018, oral presentation

Statistical Software

SamplingDesignTools. Author of the R package that implements tools for working with various epidemiological study designs for studies of binary and survival outcomes.

Research Experience

Mar-Oct 2021 Postdoctoral Research Assistant, Saw Swee Hock School of Public Health, National University of Singapore & Department of Medical Epidemiology and Biostatistics, Karolinska Institutet

2014-2016 Research Assistant, Yong Loo Lin School of Medicine, National University of Singapore, Singapore

Teaching Experience

2022 Lecturer, GMS 5204 Data Science + Healthcare, Duke-NUS Medical School

Guest Lecturer, Beyond Classic Designs and Analysis for Health Data, Karolinska Institutet & National University of Singapore

Lecturer, R for Data Science, SingHealth Academy

May 2019 Lecturer, Introduction to R Commander, Saw Swee Hock School of Public Health, National University of Singapore

Sep 2018 Tutor, StatisticAlps 2018, 8th edition: Extended use of regression models for new epidemiological designs and analyses, Bicocca Summer School, University of Milano-Bicocca