Alexis Roos

Director, Data Science and Machine Learning, Salesforce

At Salesforce, Alexis leads a team of data engineers and scientists focused on deriving intelligence from activity data for the Einstein platform. Alexis has over 20 years of software engineering experience and has been building Spark-based production applications for the last three years using Scala, Spark batch and streaming, GraphX, NLP and ML. Previously, Alexis worked for Radius Intelligence, Concurrent Inc, Couchbase, Sun Microsystems/Oracle for 13 years, and two large SIs in Europe. Alexis holds a Master’s Degree in CS with a focus on Cognitive Sciences and has completed many online courses in data science and engineering.

SESSIONS

Deep Learning for Natural Language Processing Using Apache Spark and TensorFlow

When interacting with customers, being able to extract relevant communications information in real time is critical for success. This presentation will illustrate how Salesforce is using Apache Spark and TensorFlow to monitor customer activities in real time and surface insights. Long Short-Term Memory (LSTM) networks have proven to be an effective technology for achieving state-of-the-art results on a variety of Natural Language Processing (NLP) tasks. When coupled with word embedding models, they naturally capture the temporal information and the semantic meaning of human language. LSTM networks can be readily built using any of today's deep learning packages. However, most popular deep learning packages use Python as their native language, which presents a real challenge in productizing such technology, because production environments often rely on other technology stacks. In this talk, we will explain how to build an LSTM classifier using the TensorFlow framework and combine the deep learning apparatus of TensorFlow with the distributed data processing power of Spark. We will discuss how to reuse existing Scala data preparation libraries in a TensorFlow training pipeline, how to unify them in a single notebook, and which strategies are available for scoring at runtime. We will show that pre-trained Word2Vec embeddings reduce the demand for large volumes of labeled data. The end result is a fast and accurate machine learning model for text classification that can be integrated into a structured streaming production environment.

Building a Graph

Radius Intelligence (www.radius.com) uses data science to deliver a unique marketing intelligence platform used by over a hundred US companies. This presentation will explain how Radius is using Spark along with GraphX, MLlib and Scala to create a comprehensive and accurate index of US businesses from dozens of different sources. In particular, I will address problems related to clustering records together using a graph approach and explain how to resolve the graph into a set of US businesses. I will discuss some of the models used to clean out noise, rank the best values and impute missing values, and I will share some best practices.

Using Apache Spark for Intelligent Services

Salesforce is developing Einstein, an artificial intelligence (AI) capability built into the core of the Salesforce Platform. Einstein helps power the world’s smartest CRM to deliver advanced AI capabilities to sales, services, and marketing teams, helping them discover new insights, predict likely outcomes to power smarter decision making, recommend next steps, and automate workflows so users can focus on building meaningful relationships with every customer. Salesforce is using Apache Spark (batch, streaming, GraphX and ML) to power the Einstein platform and services. In this keynote and demo, Alexis will highlight how Salesforce is building intelligent services for Einstein using activity data by leveraging Spark and Databricks to scale data science and engineering.

Using AI for Providing Insights and Recommendations on Activity Data

In the customer age, being able to extract relevant communications information in real time and cross-reference it with context is key. Learn how Salesforce Inbox is using data science and engineering to enable salespeople to monitor their emails in real time and surface insights and recommendations. Salesforce is developing Einstein, an artificial intelligence capability built into the core of the Salesforce Platform. Einstein helps power the world's smartest CRM to deliver advanced AI capabilities to sales, services, and marketing teams, allowing them to discover new insights, predict likely outcomes to power smarter decision making, recommend next steps, and automate workflows so users can focus on building meaningful relationships with every customer. Find out how Salesforce Einstein Inbox combines activity data, such as emails, with contextual and CRM data to provide real-time insights and recommended actions. Learn about use cases, architecture, and how a variety of technologies including data engineering, data science, graph processing, NLP, machine learning and deep learning are combined to support the application. This session will include an interactive demo where you'll get to see the associated code using notebooks running Spark. Session hashtag: #SFds6