At CERN, the organization that does fundamental particle physics research, Daniel works on developing and providing Big Data solutions that involve data analytics and machine learning techniques. During his two degrees and two master's, in which he studied computer science, telecommunications, and Big Data, he also became interested in evolutionary computation, a field in which he has several publications. His responsibilities range from the deployment of Big Data tools to the development of machine learning algorithms.
At CERN, the biggest physics laboratory in the world, large volumes of data are generated every hour, which poses serious challenges for storing and processing it all. The storage group, which manages more than 200 petabytes, plays an essential role in helping the organisation overcome this challenge. Data is mostly stored in tape libraries, where tens of drives have thousands of tapes available for long-term storage. Since tape systems are critical, alerts and actions need to be triggered as soon as anomalies arise; to that end, we have developed several tools on top of the monitoring infrastructure. One of them, ExDeMon, will be introduced: an open-sourced metrics monitor in which stateful processing implemented with Spark Structured Streaming plays a key role by applying machine learning techniques to collected logs and metrics. One of the machine learning techniques we aim to apply is Markov chains, a statistical model developed by Andrey Markov in the early 20th century. Today, Markov chains are at the core of Apple Siri, Google Assistant, and OCR software. Applied hierarchically, the model can reach deep levels of understanding of a system's status, a capability that could be notably exploited in monitoring environments. Challenges faced and decisions taken while developing the application and troubleshooting Apache Spark in production will be shared. Session hashtag: #SAISExp14
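To give a flavour of the idea behind Markov chains in monitoring, here is a minimal, self-contained Python sketch. It is a hypothetical illustration, not ExDeMon's actual implementation: transition probabilities between system states are learned from sequences observed during normal operation, and a new sequence is scored by how improbable its transitions are.

```python
from collections import defaultdict

class MarkovChainMonitor:
    """Hypothetical sketch of Markov-chain-based anomaly scoring:
    learn transition probabilities from 'normal' state sequences,
    then flag sequences containing unlikely transitions."""

    def __init__(self):
        # counts[prev][cur] = number of observed prev -> cur transitions
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, sequence):
        # Count transitions between consecutive states in a normal run.
        for prev, cur in zip(sequence, sequence[1:]):
            self.counts[prev][cur] += 1

    def transition_prob(self, prev, cur):
        total = sum(self.counts[prev].values())
        if total == 0:
            return 0.0
        return self.counts[prev][cur] / total

    def anomaly_score(self, sequence):
        # The rarest transition in the sequence drives the score:
        # 0.0 means every step was the most likely one, 1.0 means
        # the sequence contains a never-before-seen transition.
        probs = [self.transition_prob(p, c)
                 for p, c in zip(sequence, sequence[1:])]
        return 1.0 - min(probs, default=1.0)

# Example usage (state names are made up for illustration):
# monitor = MarkovChainMonitor()
# monitor.train(["OK", "OK", "OK", "WARN", "OK"])
# monitor.anomaly_score(["OK", "ERROR"])  # -> 1.0, unseen transition
```

In a streaming deployment, the per-state transition counts would be the state kept and updated by Spark Structured Streaming's stateful operators; here they live in a plain dictionary to keep the sketch self-contained.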
At CERN, the biggest physics laboratory in the world, large volumes of data are generated every hour, which poses serious challenges for storing and processing it all. An important part of this responsibility falls to the database group, which provides not only RDBMS services but also scalable systems such as Hadoop, Spark, and HBase. Since databases are critical, they need to be monitored; for that purpose we have built a highly scalable, secure, and central repository that stores consolidated audit data as well as listener, alert, and OS log events generated by the databases. This central platform is used for reporting, alerting, and security policy management. The database group wants to further exploit the information available in this central repository to build an intrusion detection system that enhances the security of the database infrastructure, and to build pattern detection models that flush out anomalies using the monitoring and performance metrics available in the repository. Finally, this platform also helps us with capacity planning for the database deployment. The audience will get first-hand experience of how to build a real-time Apache Spark application that is deployed in production, and will hear about the challenges faced and decisions taken while developing the application and troubleshooting Apache Spark and Spark Streaming in production. Session hashtag: #EUde13
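As a flavour of the kind of pattern detection described above, here is a minimal sketch of one common baseline technique: flagging metric samples whose z-score deviates strongly from the rest of the series. This is a hypothetical illustration, not the actual models used on the central repository.

```python
import statistics

def zscore_anomalies(values, threshold=3.0):
    """Return the indices of samples whose z-score (distance from
    the mean in standard deviations) exceeds the threshold.
    A simple baseline for anomaly detection on a metric series."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        # A perfectly flat series has no outliers by this measure.
        return []
    return [i for i, v in enumerate(values)
            if abs(v - mean) / stdev > threshold]
```

In production such a check would typically run over sliding windows of a metric stream; the function above operates on a plain list to stay self-contained.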