Course Catalog

Full Day Courses
Title
Instructors
Haoda Fu, Amgen
Morning Half-Day Courses
Title
Instructors
Li Wang, AbbVie; Yunzhao Xing, AbbVie; Sheng Zhong, AbbVie
Birol Emir, Pfizer Inc; Michael Gaffney, Independent Consultant; Demissie Alemayehu, Pfizer Inc
Afternoon Half-Day Courses
Title
Instructors
Jing Qian, University of Massachusetts Amherst
Jerry Li, BMS; Ivan Chan, BMS; Inna Perevozskaya, BMS; Hao Sun, BMS
Gaohong Dong, Sarepta Therapeutics
Weijie Su, University of Pennsylvania; Jiancong Xiao, University of Pennsylvania; Xiang Li, University of Pennsylvania
Claude Petit, Astellas Pharma

Full-Day Courses:

Tutorial on Deep Learning and Generative AI:

Instructors: Haoda Fu, Amgen

Target Audience: People with at least master’s-level training in statistics

Prerequisites for participants: Familiarity with linear regression and basic programming

Computer and software requirements: Python, PyTorch

In an era where AI technologies are transforming industries, understanding their foundations and applications is essential for a wide range of professionals. This course is designed to equip statisticians, biostatisticians, researchers, and decision-makers with the mental models needed to navigate and leverage AI effectively. Whether you’re a decision-maker aiming to make informed choices about AI tools, a researcher seeking to integrate AI into your work, or a statistician looking to build advanced models, this course offers a valuable gateway to the world of AI.

Focusing on deep learning and generative AI, participants will gain hands-on experience with PyTorch, learn foundational concepts, and explore state-of-the-art architectures such as CNNs, GNNs, ResNet, U-Net, and transformers. The course also delves into applications of these models in medical imaging and drug discovery, as well as cutting-edge generative AI techniques like GANs, VAEs, DDPM, score-based models, and the mechanics behind large language models (LLMs).

By bridging technical knowledge with practical insights, this course empowers participants to apply AI in healthcare, research, and beyond, making it an indispensable resource for those seeking to understand and shape the future of AI in their fields.

The following is the outline of the short course.

Why: History of Deep Learning and Generative AI
Build Our First Neural Network Model from Scratch
(Break 1)
Let Us Code Together
Build Our First Deep Learning Model for Computer Vision
(Lunch break)
Sequence Classification Model
Nuts and Bolts for LLM: Sequence-to-Sequence Models
(Break 2)
Generative AI Family
Advanced Topics and Extensions: Generative AI on Smooth Manifolds
Final Thoughts

Short Bio: Dr. Haoda Fu is Head of Exploratory Biostatistics at Amgen. Before that, he was an Associate Vice President and Enterprise Lead for Machine Learning and Artificial Intelligence at Eli Lilly and Company. Dr. Fu is a Fellow of the American Statistical Association (ASA) and a Fellow of the Institute of Mathematical Statistics (IMS). He is also an adjunct professor of biostatistics at the University of North Carolina at Chapel Hill and the Indiana University School of Medicine. Dr. Fu received his Ph.D. in statistics from the University of Wisconsin-Madison in 2007 and joined Lilly afterward. Since joining Lilly, he has been very active in statistics and data science methodology research. He has more than 100 publications in areas such as Bayesian adaptive design, survival analysis, recurrent event modeling, personalized medicine, indirect and mixed treatment comparison, joint modeling, Bayesian decision making, and rare events analysis. In recent years, his research has focused on machine learning and artificial intelligence. His research has been published in various top venues, including JASA, JRSS-B, Biometrika, Biometrics, ACM, IEEE, JAMA, and Annals of Internal Medicine. He has taught machine learning and AI topics at large industry conferences, including an FDA workshop. He has served on the boards of directors of statistical organizations and as program chair and committee chair for groups such as ICSA, ENAR, and the ASA Biopharmaceutical Section. He is a COPSS Snedecor Award committee member for 2022-2026, and has served as an associate editor for JASA Theory and Methods since 2023 and for JASA Applications and Case Studies for 2025-2027.

Category: Technology Training


 

Morning Half-Day Courses:

 

Unleashing the power of machine learning and deep learning to accelerate clinical development:

Instructors: Li Wang, AbbVie; Yunzhao Xing, AbbVie; Sheng Zhong, AbbVie

Time: Morning Session
Target Audience: Professionals in statistics, biostatistics, or related disciplines who have an interest in utilizing machine learning, large language models, and computer vision to improve their work
Prerequisites for participants: Basic understanding of statistics and familiarity with programming concepts, such as those used in Python. Prior exposure to machine learning concepts will be beneficial.
Computer and software requirements: The laptop should have Python 3.x installed, along with standard data science libraries such as NumPy, pandas, scikit-learn, or PyTorch. Participants should also have access to a text or code editor, such as Jupyter Notebook or Visual Studio Code, to facilitate hands-on exercises during the course.

With the rapid advancement of machine learning (ML) and deep learning (DL) methodology in the last decade, performance on prediction tasks in many computer science fields (e.g., natural language processing) has greatly improved. However, the impact of ML/DL in the field of clinical development has been relatively limited. This short course therefore aims to motivate and encourage the use of ML/DL in clinical development. The course starts with an overview of how ML/DL methodology has evolved over time and of the related key concepts (e.g., back-propagation and hyperparameter tuning). The latest developments in image processing and natural language processing are then introduced, together with their novel applications in clinical development from our recent projects and submitted papers.

The course materials are divided into three sections: I. General ML/DL methodology, II. Image processing and applications, and III. Natural language processing and applications.

The following is the outline of the short course.

Part I – Machine Learning (ML) and Deep Learning (DL) Basics (45 minutes)

Overview of similarities and differences between traditional statistics and ML/DL.

Introduction to fundamental neural network concepts: Data transformation (text, speech, images into numerical formats like vectors, matrices, and tensors). Key components of neural networks (neurons, weights, bias, activation functions). Explanation of feedforward data flow, loss functions, and backpropagation using numeric examples. Optimization techniques, such as gradient descent, to understand how models learn and improve.
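
As an illustration of the kind of numeric example mentioned above, the following minimal sketch traces the feedforward pass, loss, backpropagation, and gradient descent for a single sigmoid neuron. It is not taken from the course materials; the squared-error loss, the starting weights, the learning rate, and all function names are assumptions chosen only to keep the arithmetic easy to follow.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, b, x):
    """Feedforward pass of one neuron: weighted sum plus bias, then activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)


def loss(y_hat, y):
    """Squared-error loss for a single example (an assumed choice of loss)."""
    return 0.5 * (y_hat - y) ** 2


def backward(w, b, x, y):
    """Backpropagation by the chain rule: returns dL/dw (list) and dL/db."""
    y_hat = forward(w, b, x)
    dloss_dyhat = y_hat - y              # derivative of 0.5 * (y_hat - y)^2
    dyhat_dz = y_hat * (1.0 - y_hat)     # derivative of the sigmoid
    delta = dloss_dyhat * dyhat_dz       # dL/dz
    return [delta * xi for xi in x], delta


def train(x, y, w, b, lr=0.5, steps=200):
    """Plain gradient descent on one example; returns fitted w, b, and the loss trace."""
    history = []
    for _ in range(steps):
        history.append(loss(forward(w, b, x), y))
        grad_w, grad_b = backward(w, b, x, y)
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b = b - lr * grad_b
    return w, b, history


# Hypothetical toy input: two features, target 1.0, arbitrary starting weights.
x_example, y_example = [1.0, 2.0], 1.0
w_fit, b_fit, losses = train(x_example, y_example, w=[0.1, -0.2], b=0.0)
print(f"initial loss = {losses[0]:.4f}, final loss = {losses[-1]:.4f}")
```

Each training step moves the weights against the gradient, so the loss trace shrinks toward zero. Deep learning frameworks such as PyTorch automate the `backward` step through automatic differentiation.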

Part II – Deep Convolutional Neural Networks (DCNNs) for Computer Vision (1 hour and 30 minutes including the break)

Provide an understanding of DCNNs’ pivotal role in computer vision, introducing image datasets, core operations, and significant architectures like VGGNet.

Discuss the evolution of object detection frameworks from Faster R-CNN to YOLO.

Introduce image segmentation techniques including Mask R-CNN and U-Net. Explore the use of Generative Adversarial Networks for generating realistic data.

Provide a practical session demonstrating U-Net’s application in medical imaging.

Part III – Natural Language Processing (NLP) and Applications (1 hour and 15 minutes)

Show the pipeline workflow of a case study that applies NLP to detect adverse drug events using data from the X platform (formerly known as Twitter).

Cover fundamental NLP concepts, including word embeddings with Word2Vec. Provide a historical overview of language model advancements, including RNNs, LSTMs, and transformers. Address transformer-based LLMs, focusing on their architecture, self-attention, and efficiency improvements.

Apply these concepts to the case study, comparing model performances, followed by a code review.

Short Bio: Li Wang, PhD, is currently Senior Director and Head of the Statistical Innovation group at AbbVie. Li leads Design Advisory, which provides strategic and quantitative consulting on request to all Development teams in all Therapeutic Areas to facilitate innovative thinking and the evaluation of complex innovative designs. Li also co-leads the Development Advanced Analytics capability at AbbVie to drive Machine Learning and Advanced Analytics research and application in Development. Prior to this senior leadership role, he led Immunology and Solid Tumor statistical design and strategy discussions and multiple ML, RWE, and Bayesian innovation projects from 2017 to 2019. From 2006 to 2017, he contributed to and subsequently led several NDAs and sNDAs, including the blockbusters Eliquis, Onglyza, and Rinvoq. He is enthusiastic about teaching statistical courses to non-statisticians and about investigating and promoting novel statistical and machine learning methodologies.

He is very active in statistical communities, serving as a board member of ICSA (2023-2025), Chair-elect of the Development Committee of SCT (2024), Chair of the Education Committee of DIA, a member of the Cytel Innovation Advisory Board and of the DIA Biostatistics Industry and Regulator Forum planning committee, and past chair of the ICSA Midwest Chapter.

Li received his B.S. in Applied Mathematics from Peking University and his Ph.D. in Statistics from Virginia Tech.

Li taught this short course at the 2024 Regulatory-Industry Statistics Workshop and received positive feedback from statisticians at both pharmaceutical companies and the FDA.

Dr. Yunzhao Xing is an associate director of Statistical Innovation at AbbVie, boasting a PhD in Material Science from the University of North Carolina at Chapel Hill and a background in Physics. Prior to AbbVie, he served as a senior scientist at Halliburton, focusing on sensor modeling and simulation. Since joining AbbVie in 2018, Yunzhao has led numerous successful projects in machine learning, deep learning, and image processing. His skill set encompasses web scraping, simulation modeling, and interactive web application development, making him a pivotal contributor to AbbVie’s Statistical Innovation Group. Yunzhao is recognized for his commitment to pushing the boundaries of statistical innovation.

Dr. Sheng Zhong is the Director of Statistics at AbbVie Inc. He received his Ph.D. in Statistics from the University of Chicago. At AbbVie, he has led multiple innovative predictive modeling projects across different fields, such as clinical trial enrollment duration forecasting, virtual controls based on targeted learning in single-arm trials, and predictive clinical safety monitoring based on structured and text data. His recent work has led to multiple publications and manuscripts under review. Before joining AbbVie in 2016, Dr. Zhong worked at a big data analytics start-up for heavy machine equipment maintenance, where his work led to 3 US patents.

Category: Methodology and technology training


Interface between Regulation and Statistics in Drug Development:

Instructors: Birol Emir, Pfizer Inc; Michael Gaffney, Independent Consultant; Demissie Alemayehu, Pfizer Inc

Time: Morning Session
Target Audience: This course is particularly aimed at statisticians who are relatively new to the pharmaceutical industry and wish to broaden their knowledge and understanding of the interplay between statistics and regulatory science in drug development.
Prerequisites for participants: The material is mostly written at a level that is accessible to an audience with intermediate knowledge of statistics.
Computer and software requirements: NA

This course is aimed primarily at statisticians who are relatively new to the pharmaceutical industry and wish to broaden their knowledge and understanding of the interplay between statistics and regulatory science in drug development. Its main focus is raising awareness of the intersection of statistics and regulatory affairs, with special emphasis on salient features of traditional and emerging issues and methodologies in the design, conduct, analysis, and reporting of clinical trials or observational studies intended for regulatory purposes. While the course is aimed at statisticians with limited experience in this area, it may also benefit more experienced statisticians who wish to refresh their knowledge of current topics or keep up to date on best practices and regulatory developments. The course consists of four sections, each dedicated to a specific topic in regulatory affairs and statistics. In each case, the topic will be discussed from the statistical and regulatory perspectives. We will highlight current and emerging trends and suggest appropriate best practices. Notably, the course will also cover recent progress in machine learning and Big Data analytics and the prevailing regulatory thinking on the integration of these new techniques in drug development. Issues of processing, analyzing, and reporting multi-dimensional data will be highlighted, with special reference to data integrity, privacy, and confidentiality.

The following is the outline of the short course.

Part 1 introduces basic statistical and regulatory issues, with special reference to the role of regulations and guidance documents and the evolving role of the statistician vis-à-vis the changing regulatory and healthcare landscapes.
Part 2 discusses major statistical issues that commonly arise in the course of drug development and regulatory interactions, outlining measures that should be taken to ensure the validity of inferential results intended to be the basis for regulatory decisions. The discussion will be illustrated with reference to regulatory guidance documents and best practices.
Part 3 highlights the role of the statistician in the course of drug development, with special emphasis on the skills required to ensure effective interactions with regulatory and other external bodies.
Part 4 addresses trending topics in drug development, with emphasis on current regulatory thinking and the associated challenges and opportunities.
By the end of the course, participants should have a good understanding of the statistical and regulatory issues that commonly arise in the course of drug development, and a thorough appreciation of the current state of the statistical and regulatory sciences in the context of pharmaceutical research. In addition, attendees will be exposed to the behaviors and capabilities that are essential in their interactions with internal stakeholders and external partners, including DMCs and regulatory bodies.

Short Bio: Birol Emir, PhD, is Executive Director and Head of Real-World Evidence (RWE) Statistics at Pfizer Inc. He is a Fellow of the American Statistical Association and has served as Adjunct Professor of Statistics and Lecturer at Columbia University in New York. His primary focus has been on real-world evidence generation, predictive modeling, and genomic data analysis. He has numerous publications in refereed journals. He recently co-authored “Interface between Regulation and Statistics in Drug Development” (Alemayehu, Emir, and Gaffney, 2021, CRC Press) and co-edited a book to fill the gap in health economics and outcome research (Alemayehu et al., 2017, CRC Press). He has given many invited talks and short courses at statistical and clinical conferences.

Michael Gaffney, PhD, is a retired Vice President, Statistics, at Pfizer. He received his PhD from the New York University School of Environmental Medicine, with a dissertation in the area of multistage models of cancer induction. Dr. Gaffney has spent his 43-year career in pharmaceutical research, concentrating on the design and analysis of clinical trials and on regulatory interaction for drug approval and product defense. He has interacted with the FDA, EMA, MHRA, and regulators in Canada and Japan on over 25 distinct regulatory approvals and product issues in many therapeutic areas. Dr. Gaffney has published 40 peer-reviewed articles and has presented at numerous scientific meetings in diverse areas, including modeling cancer induction, variance components, harmonic regression, factor analysis, propensity scores, meta-analysis, large safety trials, and sample size re-estimation. Dr. Gaffney was recently a member of the Council for International Organizations of Medical Sciences (CIOMS) X committee and was a co-author of CIOMS X: Evidence Synthesis and Meta-Analysis for Drug Safety.

Demissie Alemayehu, PhD, is Vice President and Head of the Statistical Research and Data Science Center at Pfizer Inc. He is a Fellow of the American Statistical Association, has published widely, and has served on the editorial boards of major journals, including the Journal of the American Statistical Association and the Journal of Nonparametric Statistics. Additionally, he has been on the faculties of both Columbia University and Western Michigan University. He has co-authored a monograph entitled Patient-Reported Outcomes: Measurement, Implementation and Interpretation and co-edited another, Statistical Topics in Health Economics and Outcome Research, both published by Chapman & Hall/CRC Press.

Category: Methodology and career development

 

Afternoon Half-Day Courses:

Statistical methods for time-to-event data subject to truncation:

Instructors: Jing Qian, University of Massachusetts Amherst

Time: Afternoon Session

Target Audience: students, practitioners or researchers with an interest in understanding statistical analysis of time-to-event data subject to truncation.

Prerequisites for participants: knowledge of statistical inference; basic knowledge of survival analysis.

Computer and software requirements: a computer/laptop installed with R (and RStudio) is recommended

Truncated time-to-event data arises in various fields, including biomedical sciences, public health, epidemiology, and astronomy. It involves biased sampling where the event time is observed only if it falls within a certain interval. This short course reviews statistical methods for time-to-event data subject to left, right, and sequential truncation, exploring both classical and advanced techniques.

The first half introduces classical risk-set adjustment methods for estimating event time distributions and conducting regression analysis with left-truncated data, with or without additional right censoring. Methods for right-truncated data will also be discussed. The assumption of quasi-independence between truncation and event times, which is crucial for the validity of classical methods, will be emphasized, along with hypothesis tests for assessing this assumption.
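
To make the idea of risk-set adjustment concrete, here is a minimal sketch of a product-limit estimator for left-truncated, right-censored data. It is an illustration only, not the course's own code (the course demonstrates methods in R). The function name, the toy data, and the convention that a subject is at risk at time t when entry < t <= observed time are all assumptions made for this example, and the estimate is only meaningful under the quasi-independence assumption described above.

```python
def left_truncated_product_limit(entry, time, event):
    """Product-limit survival estimate with risk sets adjusted for left truncation.

    entry[i]: left-truncation (study entry) time of subject i
    time[i]:  observed event or censoring time, assumed to satisfy entry[i] < time[i]
    event[i]: 1 if the event was observed at time[i], 0 if right-censored

    Assumed convention: subject i is at risk at time t when entry[i] < t <= time[i],
    so a subject does not count toward risk sets before entering the study.
    Returns a list of (event_time, estimated_survival) pairs.
    """
    event_times = sorted({t for t, d in zip(time, event) if d == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for l, x in zip(entry, time) if l < t <= x)
        n_events = sum(1 for x, d in zip(time, event) if x == t and d == 1)
        survival *= 1.0 - n_events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy data: four subjects, one of them right-censored.
entry_times = [0, 1, 2, 0]
observed_times = [3, 4, 5, 2]
event_flags = [1, 1, 0, 1]

adjusted = left_truncated_product_limit(entry_times, observed_times, event_flags)
# Setting every entry time to 0 removes the truncation adjustment and
# gives the ordinary Kaplan-Meier estimate on the same observed times.
unadjusted = left_truncated_product_limit([0, 0, 0, 0], observed_times, event_flags)

for (t, s_adj), (_, s_km) in zip(adjusted, unadjusted):
    print(f"t = {t}: risk-set adjusted = {s_adj:.3f}, unadjusted = {s_km:.3f}")
```

Comparing the two curves shows how ignoring delayed entry inflates the early risk sets and changes the estimate.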

The second half covers recent methodological advances for analyzing truncated time-to-event data. Topics include methods for estimation and regression under dependent truncation, sequential truncation in observational cohort studies with complex sampling schemes, and techniques for estimation and regression under sequential truncation. Discussions will be supplemented with real-world data examples. R software will be used to demonstrate the implementation of the techniques.

The following is the outline of the short course.

Teaching plan for a half-day course entitled “Statistical methods for time-to-event data subject to truncation”
(using morning time as an illustration)

8:30-9:30am: Part I: Introduction to time-to-event data subject to truncation, highlighting the difference between censoring and truncation. Classical risk-set adjustment methods for estimating event time distribution with left-truncated time-to-event data, with or without additional right censoring. Estimation of event time distribution with right-truncated time-to-event data.

9:30-10:15am: Part II: Regression analysis with left-truncated and right-censored time-to-event data, including Cox model and accelerated failure time model. Regression analysis with right-truncated data.

10:15-10:30am: 15-minute break

10:30-11:45am: Part III: Hypothesis tests for assessing quasi-independence between truncation and event times. One-sample estimation and regression analysis methods under dependent truncation.

11:45am-12:30pm: Part IV: The concept of sequential truncation in observational cohort studies with complex sampling schemes. Methods for estimating event time distributions and performing regression analysis in the presence of sequential truncation.

Short Bio: Dr. Jing Qian is a Professor of Biostatistics in the Department of Biostatistics and Epidemiology at the University of Massachusetts Amherst, with extensive experience in statistical methodology and its applications to public health and biomedical research. Dr. Qian’s research focuses on the development of statistical methods for survival analysis of biomedical outcomes subject to complex censoring or sampling, biomarker evaluation and risk prediction, and covariates subject to censoring and truncation. His collaborative research spans neurodegenerative diseases such as Alzheimer’s and Parkinson’s diseases, breast cancer epidemiology, and health services research. He has served as the Principal Investigator on multiple NIH-funded grants and has published extensively in leading statistical and biomedical journals.

Dr. Jing Qian is an experienced educator, having taught graduate-level courses on introductory, intermediate, and advanced biostatistical methods for over a decade at the University of Massachusetts Amherst. His teaching portfolio includes introductory and applied biostatistics courses for public health students, such as Introduction to Biostatistics and Intermediate Biostatistics; intermediate-level theory and methods courses for biostatistics graduate students, such as Fundamentals of Probability and Statistical Inference and Topics in Health Data Science; and advanced statistical theory and methods courses for Ph.D. students in biostatistics, such as Applied Statistical Learning and Advanced Statistical Inference. In all of these settings, Dr. Qian emphasizes the challenge and importance of engaging students and fostering active learning in the classroom.

Beyond classroom teaching, Dr. Qian has served as the primary dissertation advisor to more than 10 postdoctoral research fellows, doctoral students, and master’s students.

Category: Methodology and application


Introduction of Dynamic Borrowing in Clinical Trials and Regulatory Submission:

Instructors: Jerry Li, BMS; Ivan Chan, BMS; Inna Perevozskaya, BMS; Hao Sun, BMS

Time: Afternoon Session

Target Audience: Statisticians working on clinical trials

Prerequisites for participants: NA

Computer and software requirements: NA

Clinical trials account for a significant portion of drug development in both budget and duration. While randomized clinical trials remain the gold standard, the sheer amount of available prior clinical trial data and real-world data/evidence has made it imperative to find innovative ways to design more efficient clinical trials and to supplement relevant data for regulatory submission, which has led to the increasing popularity of dynamic borrowing.

Dynamic borrowing can bring significant benefits, both in expediting drug development and for a company’s portfolio. Specifically, dynamic borrowing can overcome challenges when patients are difficult to enroll; reduce the size, duration, and risk of a new trial while ensuring adequate power; deliver substantial operational and cost-saving benefits; and boost the power and improve the efficiency of analysis for a trial with a limited sample size.

This short course will cover the common sources of data for borrowing and introduce the approaches of both frequentist and Bayesian borrowing as well as the regulatory landscape in this space.

The following is the outline of the short course.

The first part of this short course will introduce the rationale and overall benefits of dynamic borrowing and cover the general methods. After the break, the second part will go into more detail on the methods and the regulatory landscape, with possible case studies and a demo of an R Shiny app.

Short Bio: Dr. Jerry Li is currently a Director and TA Lead of Hematology Malignant Myeloid Diseases in Global Biostatistics and Data Sciences (GBDS), BMS. Jerry leads statistical support for the clinical development of multiple assets, including clinical trial design (such as phase 2/3 seamless designs), interactions with worldwide health authorities, and life cycle management of the assets. Jerry established and co-leads the Dynamic Borrowing Working Group at BMS. He also recently co-organized and co-moderated a whole-day Biostatistics Research and Innovation Network (BRAIN) meeting at BMS dedicated to the topic of dynamic borrowing.

Prior to BMS, Jerry worked at Merck and Daiichi Sankyo, following a period at the FDA. He has held positions of increasing responsibility in multiple therapeutic areas, including oncology, neurosciences, immunology, and infectious disease, and has demonstrated a track record of successful regulatory approvals.

In addition to dynamic borrowing, Jerry is also interested in dose optimization, phase 2/3 seamless design, statistical modeling of disease-modifying treatment effect, and properties of log-rank test following covariate-adaptive randomization in oncology trials. Jerry received his Ph.D. in statistics from the University of Maryland, College Park.

Dr. Ivan Chan has more than 25 years of experience in the pharmaceutical industry. He is currently a VP and interim Head of Global Biometrics & Data Sciences at Bristol Myers Squibb. Prior to joining BMS, Ivan was VP and Head of Statistical Sciences at AbbVie leading multiple therapeutic areas. In addition, he spent 21 years previously at Merck Research Laboratories where he led the global statistical support for vaccines and early oncology.

Ivan received his B.S. in Statistics from the Chinese University of Hong Kong and Ph.D. in Biostatistics from the University of Minnesota. He is an elected Fellow of the American Statistical Association (ASA) and an elected Fellow of the Society for Clinical Trials (SCT). Ivan was the 2021 recipient of the Deming Lecturer Award from ASA for his outstanding contributions to vaccine development. He currently serves as Executive Director of the International Society for Biopharmaceutical Statistics and Co-Chair of Deming Conference on Applied Statistics. Ivan has previously served as the President of the International Chinese Statistical Association and the Program Chair of the ASA Biopharmaceutical Section. He has 90+ publications in statistical and clinical journals.

Dr. Inna Perevozskaya is a Fellow of the American Statistical Association. She is a Senior Director and Senior Biometrics Fellow, Head of Statistical Methodology at BMS, and a co-lead of the Dynamic Borrowing Working Group at BMS.

Dr. Hao Sun is currently a senior manager in Global Biostatistics and Data Sciences (GBDS), BMS. Hao received his PhD in Statistics from Iowa State University in 2022. In addition to supporting clinical trials at BMS, Hao is a co-lead of the Methodology and Tools subteam within the Dynamic Borrowing Working Group at BMS. He is also involved in research on dose optimization. Hao has successfully mentored several summer interns in dynamic borrowing and dose optimization.

During his PhD, Hao was involved in multiple research projects, including providing a high-dimensional mixed graphical model, establishing the consistency of graph reconstruction under complex survey sample designs, developing a design-based BIC for neighbor selection with group lasso to recover the true neighborhood, and optimizing the survey pseudo composite likelihood with coordinate gradient descent to estimate edge parameters. Other projects that Hao worked on include road change detection, Shiny App development for National Resource Inventory, and mixture responses for small area estimation.

Category: Methodology


Win Statistics (Win Ratio, Win Odds, and Net Benefit): Theories and Applications:

Instructors: Gaohong Dong, Sarepta Therapeutics

Time: Afternoon Session
Target Audience: Statisticians, PhD students, and Statistical researchers in academia, industry, and government
Prerequisites for participants: NA
Computer and software requirements: NA

Over the past decade, the win ratio (Pocock et al. 2012), the win odds (Dong et al. 2019), and the net benefit (Buyse 2010), defined as the ratio, odds, and difference of win proportions, respectively, have been comprehensively studied. The three win statistics hierarchically analyze prioritized multiple outcomes. Compared to the traditional “time to first event” analysis for multiple time-to-event outcomes, the win statistics allow the prioritization of multiple outcomes and effectively conduct a “time to worst event” analysis, which can be clinically more meaningful. Moreover, win statistics can incorporate multiple endpoints of the same or mixed data types (e.g., time-to-event, ordinal, …) and can handle repeated events, semi-competing risks, and non-proportional hazards situations.
The win ratio and the stratified win ratio (Dong et al., 2018) have been applied in the design and analysis of Phase III clinical trials, and have supported regulatory approvals, such as tafamidis and Attruby, respectively. The win odds has also been applied in practice.
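
The definitions above can be illustrated with a small worked example. The sketch below is a simplified illustration, not the instructor's software: it assumes fully observed outcomes with no censoring, assumes a larger value is better for every outcome, and uses hypothetical toy data and function names. The win odds is computed by splitting ties evenly between the two arms.

```python
from itertools import product


def compare_pair(treated, control):
    """Compare one treated and one control patient on prioritized outcomes.

    Each patient is a tuple of outcomes ordered from highest to lowest priority,
    where a larger value is assumed to be better. A lower-priority outcome is
    consulted only when all higher-priority outcomes are tied.
    Returns 1 (treated wins), -1 (control wins), or 0 (tie).
    """
    for t_value, c_value in zip(treated, control):
        if t_value > c_value:
            return 1
        if t_value < c_value:
            return -1
    return 0


def win_statistics(treated_arm, control_arm):
    """Win ratio, win odds, and net benefit from all treated-control pairs."""
    results = [compare_pair(t, c) for t, c in product(treated_arm, control_arm)]
    n_pairs = len(results)
    p_win = results.count(1) / n_pairs
    p_loss = results.count(-1) / n_pairs
    p_tie = results.count(0) / n_pairs
    odds_denominator = p_loss + 0.5 * p_tie
    return {
        "p_win": p_win,
        "p_loss": p_loss,
        "p_tie": p_tie,
        # Ratio of win proportions (infinite if the control arm never wins).
        "win_ratio": p_win / p_loss if p_loss > 0 else float("inf"),
        # Odds of winning, with each tie counted as half a win for each arm.
        "win_odds": (p_win + 0.5 * p_tie) / odds_denominator
        if odds_denominator > 0 else float("inf"),
        # Difference of win proportions.
        "net_benefit": p_win - p_loss,
    }


# Hypothetical data: (first-priority outcome, second-priority outcome) per patient.
treated_arm = [(5, 2), (3, 1), (4, 3)]
control_arm = [(3, 1), (4, 1), (2, 2)]

stats = win_statistics(treated_arm, control_arm)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

In this toy data set, the 9 treated-control pairs give 7 wins, 1 loss, and 1 tie for the treated arm, so the win ratio is 7, the win odds is 5, and the net benefit is 6/9. Analyses of real trial data must additionally account for censoring and follow-up time, which this sketch ignores.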

The following is the outline of the short course.

Part 1: Introduction and Theoretical Foundations
1. Introduction of win statistics
1.1. Motivating examples and issues of conventional time-to-first-event analyses
1.2. Win ratio, net benefit, and Finkelstein-Schoenfeld test
1.3. Mann-Whitney parameter
1.4. Win odds
2. Point and variance estimators
3. Complement of win statistics

Part 2: Advanced Concepts and Methods
4. Impact of follow-up time and censoring, and IPCW adjustment
4.1. Impact of follow-up time and censoring
4.2. IPCW (inverse-probability-of-censoring weighting) adjustment
4.3. Other adjustments
4.4. Use of win statistics under non-proportional hazards

Break

Part 2: Advanced Concepts and Methods (continued)
5. Stratified win statistics and handling of noncollapsibility
6. Regression analyses
7. Sample size and power calculations

Part 3: Applications and Practical Considerations
8. Applications
8.1. Cardiovascular trials (focusing on the ATTRibute-CM trial)
8.2. COVID-19 trials
8.3. Pediatric benefit-risk
8.4. Evidence synthesis of efficacy outcomes in oncology trials
8.5. Other applications
9. Regulatory perspective of win statistics
10. Software
11. Limitations and advantages of win statistics
12. Summary

Key references:
Finkelstein and Schoenfeld (1999, 2019); Buyse (2010); Pocock et al. (2012); Dong et al. (2016, 2018, 2020a, 2020b, 2020c, 2021, 2023a, 2023b, 2024); Luo et al. (2015); Bebu and Lachin (2016); Oakes (2016); Peng (2020); Brunner, Vandemeulebroecke, and Mütze (2021); Mao et al. (2021, 2022, 2023, 2024); Gasparyan et al. (2021, 2022); Yu and Ganju (2022); Matsouaka (2022); Yang et al. (2022); Cui, Dong, Kuan, and Huang (2023); Seifu et al. (2023); Wang, Zhou, Zhang, Kim et al. (2023); Maurer et al. (2018); Redfors et al. (2020); Lopes et al. (2021); Voors et al. (2022); Romiti et al. (2023); Weatherald et al. (2023); Kondo et al. (2023); Freund et al. (2023); Gregson et al. (2023); Barnhart et al. (2024); Pocock et al. (2024); Gillmore et al. (2024).

Short Bio: Gaohong Dong, PhD, has 20 years of experience in the pharmaceutical industry. He is a Director of Biostatistics at Sarepta Therapeutics. Prior to joining Sarepta, he worked at BeiGene and Novartis. Additionally, he worked as a consultant under his own entity, iStats Inc. Gaohong has supported drug development in multiple therapeutic areas, including rare diseases, solid organ transplant, stem-cell transplant, infectious diseases, and oncology. He is a co-author of many highly cited medical papers in transplant. Gaohong is deeply passionate about statistical research. He has published peer-reviewed statistical journal papers and book chapters on Bayesian-Frequentist design, adaptive design, missing data imputation, meta-analysis, and composites of prioritized multiple outcomes. In recent years, his research has focused on the win statistics (win ratio, win odds, and net benefit). His research on the stratified win ratio and the win odds has been applied to the design and analysis of clinical trials, including many Phase III studies across multiple disease areas. Notably, the stratified win ratio (Dong et al., 2018) was the primary analysis for the ATTRibute-CM trial, which was the basis for the FDA approval of Attruby in November 2024. Gaohong has been an Associate Editor of the Journal of Biopharmaceutical Statistics since 2017, and in recent years has served on the Scientific Program Committees for several major statistical conferences, such as the Regulatory-Industry Statistics Workshop (RISW), the ICSA Applied Statistics Symposium, and Statistics in Pharmaceuticals.

Category: Methodology


Statistical Inference in Large Language Models:

Instructors: Weijie Su, University of Pennsylvania; Jiancong Xiao, University of Pennsylvania; Xiang Li, University of Pennsylvania

Time: Afternoon Session
Target Audience: PhD students and faculty who are interested in generative AI
Prerequisites for participants: NA
Computer and software requirements: NA

Large Language Models (LLMs) have recently stood out as revolutionary AI tools for processing data in the form of text. However, when harnessing their potential for statistical decision-making, it becomes essential to understand the risks of their outputs. Evaluating the uncertainty and confidence levels associated with LLMs presents both challenges and intriguing opportunities for today’s statisticians. The aim of this half-day short course is to equip statisticians with the skills to integrate inferential concepts into the applications and advancement of LLMs. Course topics include: 1) a brief introduction to the fundamentals of LLMs, tailored for those new to transformers and deep learning; 2) a primer on statistical inference techniques specifically for text data using LLMs; and 3) an in-depth exploration of LLM applications in medical domains and the broader data science field. By the end of the course, attendees will possess the skills needed to empower LLMs with statistical inference. While this course offers a deep dive into the confluence of statistics and advanced AI, no prior knowledge of LLMs is required.

The following are the outlines of the short course.

Take the morning session as an example. The time arrangement would be similar for the afternoon session.

Morning Half-Day Course (8:30 a.m. – 12:30 p.m.)

Session 1: Understanding and Building LLM Foundations
8:30 a.m. – 10:15 a.m. (1 hour 45 minutes)

1. Understand the Evolution and Significance of LLMs
– Recognize the evolution and importance of Large Language Models in AI and data processing.

2. Grasp LLM Architectures and Mechanics
– Describe core principles and architectures of LLMs, including transformers and attention mechanisms.
– Learn how text data is processed, tokenized, and embedded in LLMs.

Break
10:15 a.m. – 10:30 a.m. (15 minutes)

Session 2: Challenges, Applications, and Ethics in LLMs
10:30 a.m. – 12:30 p.m. (2 hours)

1. Identify Challenges in LLMs
– Pinpoint common challenges and limitations such as overfitting and bias.
– Appreciate the importance of critically evaluating model outputs.

2. Analyze Real-World LLM Applications
– Evaluate the use of LLMs in healthcare settings, such as processing clinical notes and predicting patient outcomes.
– Identify other applications, including sentiment analysis and recommendation systems.

3. Navigate Ethical and Responsible Use of LLMs
– Recognize potential biases in medical text data and broader implications in data science.
– Advocate for the fair, ethical, and responsible use of LLMs across various domains.

Short Bio: Weijie Su is an Associate Professor in the Wharton Statistics and Data Science Department and, by courtesy, in the Departments of Computer and Information Science and Mathematics at the University of Pennsylvania. He is a co-director of the Penn Research in Machine Learning (PRiML) Center. Prior to joining Penn, he received his Ph.D. in Statistics from Stanford University in 2016 and a bachelor’s degree in Mathematics from Peking University in 2011. His research interests span the statistical foundations of generative AI, privacy-preserving machine learning, high-dimensional statistics, and optimization. He serves as an associate editor of the Journal of Machine Learning Research, Journal of the American Statistical Association, Foundations and Trends in Statistics, and Operations Research, and he is currently guest editing a special issue on Statistics for Large Language Models and Large Language Models for Statistics in Stat. His work has been recognized with several awards, including the Stanford Anderson Dissertation Award, NSF CAREER Award, Sloan Research Fellowship, IMS Peter Hall Prize, SIAM Early Career Prize in Data Science, ASA Noether Early Career Award, and the ICBS Frontiers of Science Award in Mathematics.

Jiancong Xiao is a postdoctoral researcher at the University of Pennsylvania, working with Professors Qi Long and Weijie Su. He received his Ph.D. from the Chinese University of Hong Kong, Shenzhen, an M.S. from the Chinese University of Hong Kong, and a B.S. from Sun Yat-sen University. His research interests lie in statistical and deep learning theory, with a focus on developing responsible and trustworthy machine learning models. His recent work explores statistical foundations of large language models. His research has been featured at top machine learning conferences, including NeurIPS, COLT, ICML, and ICLR.

Xiang Li is a postdoctoral researcher at the University of Pennsylvania, collaborating with Prof. Qi Long and Prof. Weijie Su. He received his Ph.D. in 2023 and B.S. in 2018 from the School of Mathematical Sciences at Peking University. His research lies at the intersection of statistics, stochastic optimization, and machine learning, with a recent focus on large language models. During his Ph.D., he made significant contributions to federated learning, stochastic approximation, online decision-making, and online statistical inference. His work has been featured at leading machine learning conferences, including ICML, ICLR, and NeurIPS, as well as in top journals such as JMLR and AOS.

Category: Methodology


An Outstanding Supervisor: Leading for Motivation, Innovation, and Retention:

Instructors: Claude Petit (Astellas Pharma) in collaboration with the Leadership in Practice Committee (LiPCom) of the Biopharmaceutical section of the ASA.

Time: Afternoon Session
Target Audience: If you lead a team or are considering a supervisory role, this course is for you.

This short course will bring to life the foundational concepts for becoming the ideal supervisor. Attendees will gain a deeper understanding of the essential leadership competencies that will empower them to grow a mentee or direct report, thus enabling them, in turn, to reach their full potential as well. The rewards of this development will cascade through the organization. Participants will learn the expectations and behaviors necessary for becoming a supervisor for whom employees will want to work, increasing their team's productivity through an elevated level of engagement. Engagement and fulfillment of employees are achievable when they feel motivated, are challenged to be the best they can be, and are able to accomplish more than they thought they could. This course will consist of lectures, videos, and interactive panel discussions where participants will hear from seasoned and successful leaders about how they have learned from their experiences and developed tips and tricks for growing their supervisory skill set. Finally, participants will learn how to measure the right outcomes for enabling sustained growth in this dimension. It is said that employees do not leave companies; they leave supervisors. While many other leadership courses provide advice to statisticians, statistical analysts, and data scientists on how to be effective leaders, this course focuses on the critical role supervisors, professors, and advisors play in their employees’ journeys to becoming strong leaders as well as individuals who propose and drive innovative ideas and solutions and effectively implement them. Strong supervisors model desired employee behaviors, act as sponsors as well as mentors, contribute to their employees’ career satisfaction, support their employees’ work/life balance, and generally retain good employees. If you are currently leading a team, managing a group, or considering a supervisory role, this course will help you be more effective.

This short course is being offered in collaboration with the Leadership in Practice Committee (LiPCom) of the Biopharmaceutical section of the ASA.

Short Bio: Claude Petit earned her PhD in Biostatistics, concurrent with a medical degree, in 1999 from the University of Kremlin Bicêtre (France), where she studied under Prof. Jean Maccario, applying Bayesian methods to clinical trials, specifically in the study and treatment of schizophrenia. She served as Adjunct Professor in Mathematics & Statistics at the University of Grenoble (1999), the Medical University of Paris (2004), and the Ecole Nationale de la Statistique et d’Administration Informatique (ENSAI). She was a lecturer at the Yale School of Public Health between 2012 and 2024.

Active in the field of statistics since 1994, Dr. Petit has worked at Sanofi-Aventis (formerly Rhone Poulenc Rorer), ESCLI (CRO), Laboratoires Servier, and Lincoln (CRO). She joined Boehringer Ingelheim France as Biostatistics and Programming Head in 2004. After her move to the US in 2007, she served as Executive Director of Biostatistics and then Vice President of Biostatistics and Data Management with Boehringer Ingelheim until July 2021. Currently VP of Statistical and Real World Data Science at Astellas, Claude leads a global team of talented statisticians and programmers in the US, Europe, Japan, and China.

An eternal learner, she has a passion for leadership, growth, and teaching. In 2021, she became a certified Executive Coach and founded Creating & Coaching Essential Leaders, LLC to empower one woman at a time.

Category: Career development
