Zach Schwartz - Databricks

Zach Schwartz

Deputy Division Chief, U.S. Census Bureau

Zack Schwartz serves as a Deputy Division Chief at the U.S. Census Bureau, where he is responsible for delivering critical systems in support of the 2020 Census. Schwartz leads the Trust & Safety Team, which protects the Bureau’s reputation and the American public from malicious efforts to undermine the count. Schwartz was the Program Manager over the Decennial IT Program Management Office, leading a complex budget and schedule in support of the 2020 Census systems. He holds a bachelor’s degree in business administration from American University and currently resides in Washington, D.C., where he is a sworn reserve police officer.


The Future of Surveys Leveraging Authoritative Sources and Data Science for Demographic Statistics

Summit 2020

The U.S. Census Bureau is the leading source of quality data about the people of the United States and its economy. The Decennial Census is the largest mobilization and operation conducted in the United States, and it requires years of research, planning, and development of methods and infrastructure to ensure an accurate and complete count of the U.S. population, currently estimated at 330 million. The 2020 Census is the first census to take place in the cloud, while also leveraging existing data sources to reduce operational costs, update the U.S. Master Address File (MAF), and ensure quality results. As we look into the future of surveys, the Census Bureau's vision includes the use of Administrative (data) Sources to further supplement responses with data that has been confirmed to be dependable for its purpose.

This innovation won't just change future decennial censuses, but also the Bureau's more than 100 other surveys, while preparing the public data to be consumed by its redesigned website. To execute the 2020 Census itself, the Bureau turned to cloud computing to spin up the thousands of compute nodes required for the 50+ systems that support dozens of operations. These range from geospatial workloads for address canvassing with Artificial Intelligence (AI) and address matching (conflation) using Machine Learning (ML) to complex data integration that detects fraud and supplements responses with Authoritative Sources. Apache Spark was used to support these workloads at scale.

This presentation will leave the audience with the following takeaways:

  1. An appreciation for the magnitude of the 2020 Census and cloud computing's role in it.
  2. A vision for the future of surveys, leveraging Authoritative Sources, which require heavy-duty data integration.
  3. An invitation to participate in the 2020 Census by self-enumerating over the internet.