Trustworthy Machine Learning

by Kush R. Varshney

Hello and welcome! I’m very happy that you’ve entrusted me and the book Trustworthy Machine Learning to accompany you on your journey toward creating trustworthy machine learning systems. Version 0.9 of the book, which is posted here, contains all of the chapters. I am making the book available at no cost because I do not want to limit its contents only to the most resourced. Some of the next steps I will be taking are to assemble a panel of diverse voices to suggest improvements, work on accessibility for the visually impaired, and create a print version that will be available at the lowest possible cost through an online retailer.

Accuracy is not enough when you’re developing machine learning systems for consequential application domains. You also need to make sure that your models are fair, have not been tampered with, will not fall apart in different conditions, and can be understood by people. Your design and development process has to be transparent and inclusive. You want the systems you create not to harm people but to help them flourish in ways they consent to. Many experts have described these considerations beyond accuracy, which make machine learning safe, responsible, and worthy of our trust, as the biggest challenge of the next five years. I hope this book equips you with the thought process to meet this challenge.

This book is most appropriate for technologists in high-stakes domains who care about the broader impact of their work, have the patience to think about what they’re doing before they jump in, and do not shy away from a little math.

In writing the book, I have taken advantage of the dual nature of my day job: I am an applied data scientist part of the time and a machine learning researcher the rest of the time. Each chapter focuses on a different use case that project managers, data scientists, and other practitioners tend to face when developing algorithms for financial services, healthcare, workforce management, social change, and other areas. These use cases are fictionalized versions of real engagements I’ve worked on. The contents bring in the latest research on trustworthy machine learning, including some that I’ve personally conducted.

I urge, urge, urge you to engage with me and critique the content, especially if you have lived experiences different from mine. I truly want to reflect the opinions of diverse voices in this work. Please contact me by email (krvarshn@us.ibm.com) or on Twitter (@krvarshney).

Thanks again and happy reading.

—Kush

Trustworthy Machine Learning Version 0.9 (the entire book; pdf)

Front Matter and Preface (pdf, html)

Part 1: Introduction and Preliminaries
Chapter 1: Establishing Trust (pdf, html)
Chapter 2: Machine Learning Lifecycle (pdf, html)
Chapter 3: Safety (pdf, html)

Part 2: Data
Chapter 4: Data Modalities, Sources, and Biases (pdf, html)
Chapter 5: Privacy and Consent (pdf, html)

Part 3: Basic Modeling
Chapter 6: Detection Theory (pdf, html)
Chapter 7: Supervised Learning (pdf, html)
Chapter 8: Causal Modeling (pdf, html)

Part 4: Reliability
Chapter 9: Distribution Shift (pdf, html)
Chapter 10: Fairness (pdf, html)
Chapter 11: Adversarial Robustness (pdf, html)

Part 5: Interaction
Chapter 12: Interpretability and Explainability (pdf, html)
Chapter 13: Transparency (pdf, html)
Chapter 14: Value Alignment (pdf, html)

Part 6: Purpose
Chapter 15: Ethics Principles (pdf, html)
Chapter 16: Lived Experience (pdf, html)
Chapter 17: Social Good (pdf, html)
Chapter 18: Filter Bubbles and Disinformation (pdf, html)

Shortcut (pdf, html)