Session

Beyond the Privacy-Utility Tradeoff: Differential Privacy in Tabular Data Synthesis

Overview

Experience: In Person
Type: Lightning Talk
Track: Artificial Intelligence
Industry: Enterprise Technology, Health and Life Sciences, Financial Services
Technologies: Llama, PyTorch
Skill Level: Intermediate

As organizations increasingly leverage sensitive data for AI applications, generating high-quality synthetic data with mathematical guarantees of privacy has become crucial. This talk explores the use of Gretel Navigator to generate differentially private synthetic data that maintains high fidelity to the source data and high utility on downstream tasks across heterogeneous datasets. Our analysis covers a framework for privacy-preserving synthetic data generation, illustrated with two use cases: patient events and e-commerce reviews. We present strategies for calibrating the privacy parameters ε and δ for mixed-modal data; applying record-level or user-level differential privacy depending on which entity in the dataset requires protection; maintaining statistical properties and high utility on downstream classification tasks under stringent privacy constraints (e.g., a difference of less than 0.05 in AUC when using DP); and quantifying resilience to membership inference and attribute inference attacks.
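
For readers new to the parameters ε and δ, the standard textbook definition of (ε, δ)-differential privacy is reproduced below. The abstract does not state it, so this is background rather than the speakers' own formulation; the symbols M (the randomized generation mechanism), D and D' (neighboring datasets), and S (a set of possible outputs) are notation assumed here.

```latex
% Standard (epsilon, delta)-differential privacy; notation assumed, not taken from the talk.
% M is a randomized mechanism, D and D' are neighboring datasets,
% and S ranges over all measurable sets of outputs.
\[
  \Pr\bigl[M(D) \in S\bigr] \;\le\; e^{\varepsilon}\,\Pr\bigl[M(D') \in S\bigr] + \delta
\]
% Record-level DP: D and D' differ in a single record.
% User-level DP:   D and D' differ in all records belonging to a single user.
```

Smaller values of ε and δ give a stronger guarantee. The choice between record-level and user-level protection changes only what counts as "neighboring": a patient with many events, or a customer with many reviews, is protected as a whole only under the user-level notion.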

Session Speakers
