Date: Wednesday, Nov. 29, 2023
9:00 am – 10:00 am Pacific Time
12:00 pm – 1:00 pm Eastern Time

Location: Weekly Seminar, Zoom
Title: Balanced Filtering via Disclosure-Controlled Proxies
Abstract:
We study the problem of collecting a sample of data balanced with respect to sensitive groups when group membership is unavailable or prohibited from use at deployment time. Specifically, our deployment-time collection mechanism does not reveal significantly more about the group membership of any individual sample than can be ascertained from base rates alone. To do this, we adopt a fairness pipeline perspective, in which a learner can use a small set of labeled data to train a proxy function that can later be used for this filtering task. We then associate the range of the proxy function with sampling probabilities; given a new data sample, we classify it using our proxy function and then select it for our sample with probability proportional to the sampling probability corresponding to its proxy classification. Importantly, we require that the proxy classification does not reveal significant information about the sensitive group membership of any individual sample (i.e., the level of disclosure should be controlled). We show that under modest algorithmic assumptions, we can find such a proxy in a sample- and oracle-efficient manner. Finally, we experimentally evaluate our algorithm and analyze its generalization properties.
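The filtering step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the speaker's algorithm: the names balanced_filter, toy_proxy, and toy_probs, the toy population, and the hand-picked sampling probabilities are all assumptions made for this example. It neither learns the proxy nor enforces the disclosure constraint.

```python
import random
from collections import Counter


def balanced_filter(samples, proxy, sampling_prob, rng):
    """Keep each sample with the probability assigned to its proxy class.

    The proxy only sees the sample's features; group labels are never
    consulted when deciding whether to keep a sample.
    """
    kept = []
    for x in samples:
        proxy_class = proxy(x)
        if rng.random() < sampling_prob[proxy_class]:
            kept.append(x)
    return kept


# Hypothetical toy population of (feature, group) pairs.
# Group labels are stored here only so the result can be audited afterwards.
population = (
    [(0, "A")] * 500 + [(0, "B")] * 100
    + [(1, "A")] * 100 + [(1, "B")] * 300
)


def toy_proxy(sample):
    """Hypothetical proxy: classify by the feature alone."""
    feature, _group = sample
    return feature


# Hand-picked probabilities (an assumption): with these, the expected number
# of kept samples is 0.5*500 + 1.0*100 = 350 for group A and
# 0.5*100 + 1.0*300 = 350 for group B, so the sample is balanced in expectation.
toy_probs = {0: 0.5, 1: 1.0}

rng = random.Random(0)
kept = balanced_filter(population, toy_proxy, toy_probs, rng)
group_counts = Counter(group for _feature, group in kept)
print(group_counts)
```

In this toy setup the proxy class is fairly informative about group membership, which is exactly what the disclosure requirement in the abstract is meant to limit; the sketch only shows how proxy classes map to acceptance probabilities.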
Bio:
Emily Diana is a Research Assistant Professor at the Toyota Technological Institute at Chicago, where her research focuses on the intersection of ethical algorithm design and socially responsible machine learning. Starting in September 2024, she will be an Assistant Professor in the Operations Research group at CMU’s Tepper School of Business. She received her Ph.D. in Statistics and Data Science from the Wharton School of the University of Pennsylvania, where she was advised by Michael Kearns and Aaron Roth. She is honored to be the recipient of the Wharton School’s 2022 J. Parker Bursk Memorial Prize for Excellence in Research and to have been recognized as both a Rising Star in EECS by MIT and a Future Leader in Data Science by the University of Michigan.
