Veysel Kocaman

Senior Data Scientist, John Snow Labs

Veysel is a seasoned data scientist with over ten years of experience and a strong background in every aspect of data science, including machine learning, artificial intelligence, and big data. He is currently a Senior Data Scientist at John Snow Labs, improving the Spark NLP for Healthcare library and delivering hands-on projects in Healthcare and Life Science. He’s the instructor of advanced NLP courses on Experfy and Udemy. Veysel has broad experience consulting for start-ups, bootcamps, and companies around the globe on Statistics, Data Science, Software Architecture, DevOps, Machine Learning, and AI.

Past sessions

Summit Europe 2020 Advanced Natural Language Processing with Apache Spark NLP

November 17, 2020 04:00 PM PT

NLP is a key component in many data science systems that must understand or reason about text. This hands-on tutorial uses the open-source Spark NLP library to explore advanced NLP in Python. Spark NLP provides state-of-the-art accuracy, speed, and scalability for language understanding by delivering production-grade implementations of some of the most recent research in applied deep learning. It's the most widely used NLP library in the enterprise today.

You'll edit and extend a set of executable Python notebooks by implementing these common NLP tasks: named entity recognition, sentiment analysis, spell checking and correction, document classification, and multilingual and multi-domain support. The discussion of each NLP task covers the latest advances in deep learning used to tackle it, including prebuilt BERT embeddings within Spark NLP, tuned embeddings, and "post-BERT" research results like XLNet, ALBERT, and RoBERTa. Spark NLP builds on the Apache Spark and TensorFlow ecosystems, and as such it's the only open-source NLP library that can natively scale to use any Spark cluster, as well as take advantage of the latest processors from Intel and Nvidia. You'll run the notebooks locally on your laptop, but we'll also walk through a complete case study and benchmarks showing how to scale an NLP pipeline for both training and inference.

Speakers: David Talby and Veysel Kocaman

The speaker will review case studies from real-world projects that built AI systems using Natural Language Processing (NLP) in healthcare. These case studies cover projects that deployed automated patient risk prediction, automated diagnosis, clinical guidelines, and revenue cycle optimization. He will also cover why and how NLP was used, what deep learning models and libraries were used, and what was achieved. Key takeaways for attendees will include important considerations for NLP projects, such as how to build domain-specific healthcare models and how to use NLP as part of larger, scalable machine learning and deep learning pipelines in a distributed environment.