Steve Lee (Seungchul Lee) works in the Data Research Team at BISTel Inc., where he develops algorithms for distributed systems. He is interested in building big data applications for the manufacturing industry. His main research interests involve algorithm-based applications that use large-scale datasets.
November 18, 2020 04:00 PM PT
This talk presents a web application that calculates real-time health scores rapidly using Spark on Kubernetes. A health score represents a machine's lifetime and is commonly used as a reference point for deciding whether to replace the machine with a new one to maintain high productivity. Therefore, it is very important to observe the health scores of the large number of machines in a factory without delay. To address this, BISTel has applied stream processing with Spark and offers a real-time health score application.
While the batch-based system could calculate health scores only after a certain amount of time had passed, our application calculates and reports health scores as soon as data reaches our system. In particular, we have applied ML libraries running on the streaming engine on Kubernetes to produce accurate health scores in a cloud environment. In this presentation, we discuss the dataflow design and how to apply ML libraries to the Spark streaming engine so that it responds instantly as data arrives. Finally, for easy deployment and scalability, we wrapped the entire application using Kubernetes.
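The difference between batch and per-record scoring can be illustrated with a minimal sketch in plain Python. It is not the application described in the talk: the abstract does not define the health score, so the `HealthScorer` class, its `nominal`, `tolerance`, and `alpha` parameters, and the scoring formula are all hypothetical. In a real deployment, a Spark streaming job would play the role of the loop that feeds readings in.

```python
from dataclasses import dataclass, field


@dataclass
class HealthScorer:
    """Hypothetical per-machine scorer that updates on every incoming reading."""
    nominal: float      # assumed "healthy" sensor level (illustrative only)
    tolerance: float    # assumed deviation at which the score reaches 0
    alpha: float = 0.5  # smoothing weight for the moving average
    _ewma: dict = field(default_factory=dict)

    def update(self, machine_id: str, reading: float) -> float:
        # Keep one exponentially weighted moving average per machine.
        prev = self._ewma.get(machine_id, reading)
        smoothed = self.alpha * reading + (1 - self.alpha) * prev
        self._ewma[machine_id] = smoothed
        # Map the deviation from the nominal level to a 0..100 score.
        deviation = abs(smoothed - self.nominal)
        return max(0.0, 100.0 * (1.0 - deviation / self.tolerance))


# A score is available immediately after each reading, not after a batch window.
scorer = HealthScorer(nominal=50.0, tolerance=10.0)
scores = [scorer.update("m1", r) for r in (50.0, 54.0, 70.0)]
```

Because the state kept per machine is a single number, each new reading can be scored in constant time, which is the property that makes a streaming design respond as data arrives.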
Speakers: Seungchul Lee and Daeyoung Kim
April 23, 2019 05:00 PM PT
As semiconductor devices have developed, manufacturing systems have improved the productivity and efficiency of wafer fabrication. Owing to this improvement, the number of wafers produced by the fabrication process has been increasing rapidly. However, current software systems for semiconductor wafers are not designed to process large numbers of wafers. To resolve this issue, BISTel (a world-class provider of manufacturing intelligence solutions and services for manufacturers) is building several big data products, such as Trace Analyzer (TA) and Map Analyzer (MA), using Apache Spark.
TA analyzes raw trace data from a manufacturing process. It captures details on all variable changes, big and small, and gives a statistical summary of each trace (e.g., min, max, slope, average). Several BISTel customers, which are among the top-tier semiconductor companies in the world, use TA to analyze massive raw trace data from their manufacturing processes.
In particular, TA can manage terabytes of data by applying Apache Spark's APIs. MA is an advanced pattern recognition tool that sorts wafer yield maps and automatically identifies common yield loss patterns. Some semiconductor companies use MA to identify clustering patterns for more than 100,000 wafers, which can be considered big data in the semiconductor area. This talk introduces these two products, which are built on Apache Spark, and presents the software techniques used to handle large-scale semiconductor data.
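The per-trace statistical summary mentioned for TA (min, max, slope, average) can be sketched in plain Python. This is an illustration only, not TA's implementation: the function name `summarize_trace` is hypothetical, and the slope is assumed here to be the ordinary least-squares slope of value against time, which the abstract does not specify.

```python
from statistics import mean


def summarize_trace(times, values):
    """Return min, max, average, and least-squares slope for one trace."""
    if len(times) != len(values) or len(values) < 2:
        raise ValueError("need at least two (time, value) samples")
    t_mean = mean(times)
    v_mean = mean(values)
    # Ordinary least-squares slope: covariance(t, v) / variance(t).
    cov = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    var = sum((t - t_mean) ** 2 for t in times)
    if var == 0:
        raise ValueError("all time stamps are identical; slope is undefined")
    return {
        "min": min(values),
        "max": max(values),
        "average": v_mean,
        "slope": cov / var,
    }


# Example trace: four samples rising by 2 units per time step.
summary = summarize_trace([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0])
```

At the data volumes described above, the same four aggregates would be computed per trace in a distributed engine such as Spark rather than in a single Python process.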