Francesca Lazzeri, PhD, is an experienced scientist and machine learning practitioner with over 12 years of combined academic and industry experience. She is the author of the book “Machine Learning for Time Series Forecasting with Python” (Wiley) and of many other publications, including articles in technology journals and conference papers.
Francesca is an Adjunct Professor of AI and machine learning at Columbia University and Principal Cloud Advocate Manager at Microsoft, where she leads an international team (across the USA, Canada, the UK and Russia) of cloud AI developer advocates and engineers, managing a large portfolio of customers in the research, academic and education sectors and building intelligent automated solutions on the cloud. Before joining Microsoft, she was a research fellow at Harvard University in the Technology and Operations Management Unit.
In this session, we show how to leverage the CORD dataset, which contains more than 400,000 scientific papers on COVID and related topics, together with recent advances in natural language processing and other AI techniques, to generate new insights in support of the ongoing fight against this infectious disease. The idea explored in our talk is to apply modern NLP methods, such as named entity recognition (NER) and relation extraction, to article abstracts (and, possibly, full text) in order to extract meaningful insights from the text and to enable semantically rich search over the paper corpus. We first investigate how to train an NER model using the Medical NER dataset from Kaggle, with a specialized version of BERT (PubMedBERT) as a feature extractor, to allow automatic extraction of entities such as medical condition names, medicine names and pathogens. Entity extraction alone can provide some interesting findings, such as how approaches to COVID treatment evolved over time in terms of the medicines mentioned. We demonstrate how to use Azure Machine Learning to train the model. To take this investigation one step further, we also investigate the use of pre-trained medical models, available as the Text Analytics for Health service on the Microsoft Azure cloud. In addition to many entity types, it can also extract relations (such as the dosage of a medicine), entity negation, and mappings from entities to some well-known medical ontologies. We investigate the best way to use Azure ML at scale to score a large paper collection and to store the results.
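As a rough illustration of how entity extraction can surface treatment trends over time, the sketch below collapses token-level NER tags into entities and counts medicine mentions per publication month. It is not the session's actual pipeline: the BIO tagging scheme, the `MEDICINE` and `CONDITION` label names, the `publish_time`, `tokens` and `tags` field names, and the two toy papers (with the placeholder name `DrugA`) are all assumptions made up for the example.

```python
from collections import Counter, defaultdict


def bio_to_entities(tokens, tags):
    """Collapse BIO-tagged tokens into (entity_text, entity_type) pairs."""
    entities, current, current_type = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; flush the one being built, if any.
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            # Continuation of the entity currently being built.
            current.append(token)
        else:
            # "O" tag or an inconsistent "I-" tag closes the current entity.
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append((" ".join(current), current_type))
    return entities


def medicine_mentions_by_month(papers, medicine_label="MEDICINE"):
    """Count medicine mentions per 'YYYY-MM' publication month."""
    counts = defaultdict(Counter)
    for paper in papers:
        month = paper["publish_time"][:7]  # assumes ISO dates, e.g. 2020-03-15
        for text, label in bio_to_entities(paper["tokens"], paper["tags"]):
            if label == medicine_label:
                counts[month][text.lower()] += 1
    return counts


# Invented toy input standing in for tagged abstracts (hypothetical data).
papers = [
    {"publish_time": "2020-03-15",
     "tokens": ["Patients", "received", "DrugA", "."],
     "tags": ["O", "O", "B-MEDICINE", "O"]},
    {"publish_time": "2020-10-02",
     "tokens": ["DrugA", "shortened", "recovery", "from", "severe", "pneumonia"],
     "tags": ["B-MEDICINE", "O", "O", "O", "B-CONDITION", "I-CONDITION"]},
]

for month, counter in sorted(medicine_mentions_by_month(papers).items()):
    print(month, dict(counter))
```

In practice the tags would come from the trained NER model rather than being written by hand, and the per-month counts could then be plotted to compare medicines over time.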
Machine learning model fairness and interpretability are critical for data scientists, researchers and developers to explain their models and understand the value and accuracy of their findings. Interpretability is also important to debug machine learning models and make informed decisions about how to improve them. In this session, Francesca will go over a few methods and tools that enable you to "unpack" machine learning models, gain insights into how and why they produce specific results, assess your AI system's fairness and mitigate any observed fairness issues.
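To make "assessing fairness" a little more concrete, here is a minimal hand-rolled sketch of one common group fairness measure, the demographic parity difference: the gap between the highest and lowest rate of positive predictions across groups. It does not reproduce any particular package's API; the function names, the group labels `"a"` and `"b"`, and the toy predictions are assumptions invented for illustration.

```python
from collections import defaultdict


def selection_rates(y_pred, groups):
    """Fraction of positive (== 1) predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(y_pred, groups):
    """Largest gap in selection rate between any two groups (0 means parity)."""
    rates = selection_rates(y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Invented toy predictions for two hypothetical groups.
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(selection_rates(y_pred, groups))
print(demographic_parity_difference(y_pred, groups))
```

A value near zero suggests the model selects members of each group at similar rates; a large gap is a signal to investigate further, not by itself proof of unfair treatment.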
Using open source fairness and interpretability packages, attendees will learn how to: