I am a senior architect on the National Synchrotron Light Source II (NSLS-II) project at BNL, working on the integration of science-oriented algorithms with HPC and Big Data technologies to address the new challenges of data-intensive science. Before NSLS-II, I was involved in several accelerator projects, designing and building large-scale high-performance computational applications and three-tier model-based control systems. This experience was generalized into the Unified Accelerator Libraries (UAL) framework, which addressed composite modeling studies.
The talk will present an MPI-based extension of the Spark platform developed in the context of light source facilities. The background and rationale of this extension are described in the attached paper “Bringing the HPC reconstruction algorithms to Big Data platforms”, which was presented at the New York Scientific Data Summit (NYSDS), August 14-17, 2016 (talk: https://www.bnl.gov/nysds16/files/pdf/talks/NYSDS16%20Malitsky.pdf). Specifically, the paper highlighted a gap between two modern driving forces of the scientific discovery process: HPC and Big Data technologies. To close this gap, it proposed extending the Spark platform with inter-worker communication to support science-oriented parallel applications. The approach was illustrated with a Spark-based deployment of the SHARP MPI/GPU ptychographic solver. Aside from its practical value, this application serves as a reference use case that captures the major technical aspects of other reconstruction tasks. In the NYSDS’16 paper, the implementation followed the CaffeOnSpark RDMA peer-to-peer model and augmented it with an RDMA address exchange server. By the Spark Summit, we plan to advance this direction further with Spark-MPI, a generic solution based on the Hydra process management framework that supports two major MPI implementations, MPICH and MVAPICH.