Histological images provide tissue-specific context for cancer prognosis, categorization, and treatment strategy. Establishing an automatic image processing pipeline would significantly reduce the turnaround time of biopsy reports and allow physicians to make critical decisions in time. Moreover, digital image processing would enable unbiased quantitative analysis of tumor histology, in contrast with traditional histologic diagnosis by pathologists. The challenge of such analyses is that each tumor slide contains hundreds of thousands of objects: tumor cells, immune cells, blood vessels, and so on. To capture the context of the tumor microenvironment, it is essential to incorporate the spatial information of objects and structures. For example, the distances between objects can be used to form spatial clusters, which represent tissue regions composed of different proportions of cell types. Using the Spatial Spark library, we accomplished this task, generating spatial relations among objects within each tumor slide and between tumor slides of the same patient. These spatial relations helped quantify traits of the tumor microenvironment and opened a new area of research that links image-based tumor microenvironment information to the prediction of treatment response.
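The idea of forming spatial clusters from inter-object distances can be illustrated with a minimal sketch. It is not the pipeline described in this talk and does not use Spatial Spark: the function names, the sample coordinates, the cell-type labels, and the distance threshold are all assumptions made for illustration. It links any two objects closer than a threshold (single-linkage grouping via union-find) and then reports the cell-type proportions in each cluster.

```python
from collections import defaultdict
from math import hypot


def cluster_by_distance(points, max_dist):
    """Group 2-D points so that each point lies within max_dist of at
    least one other point in its cluster (single-linkage grouping)."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Naive all-pairs comparison; fine for a toy example only.
    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            if hypot(xi - xj, yi - yj) <= max_dist:
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i in range(len(points)):
        clusters[find(i)].append(i)
    return sorted(clusters.values())


def composition(cluster, labels):
    """Return the fraction of each cell type within one cluster."""
    counts = defaultdict(int)
    for i in cluster:
        counts[labels[i]] += 1
    return {label: n / len(cluster) for label, n in counts.items()}


# Hypothetical object centroids and cell-type labels (illustrative only).
cells = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.8), (10.0, 10.0), (10.5, 10.2)]
labels = ["tumor", "tumor", "immune", "immune", "vessel"]

clusters = cluster_by_distance(cells, max_dist=1.5)
for cluster in clusters:
    print(cluster, composition(cluster, labels))
```

The all-pairs loop is quadratic in the number of objects, so at the scale of hundreds of thousands of objects per slide a spatial index or a distributed spatial join would be needed instead.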
Wei-Yi Cheng is a data scientist at Pharmaceutical Research and Early Development Informatics, Roche Innovation Center New York. He specializes in integrated molecular data analysis, with a focus on disease risk and prognosis prediction. At Roche, he uses Spark and other tools from the Hadoop ecosystem to provide data-driven solutions for drug projects. Wei-Yi received his Ph.D. in Electrical Engineering from Columbia University in the City of New York, where his research focused on the development of genome-scale data mining algorithms for biological discovery and predictive modeling.