INSTRUCTOR-LED

Hands-On Deep Learning with Keras, TensorFlow, and Apache Spark™

DB 401

Overview

This course is aimed at practicing data scientists who are eager to get started with deep learning, as well as software engineers and technical managers who want a thorough, hands-on overview of deep learning and its integration with Apache Spark.

The course covers the fundamentals of neural networks and how to build distributed TensorFlow models on top of Spark DataFrames. Throughout the class, you will use Keras, TensorFlow, Deep Learning Pipelines, and Horovod to build and tune models. This course is taught entirely in Python.

Each topic includes lecture content along with hands-on labs in the Databricks notebook environment.

Learning Objectives

After taking this class, students will be able to:

 

  • Build a neural network with Keras
  • Explain the difference between various activation functions and optimizers
  • Track experiments with MLflow
  • Apply models at scale with Deep Learning Pipelines
  • Perform transfer learning
  • Build distributed TensorFlow models with Horovod
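
As a rough illustration of the activation-function objective above, the following sketch implements four common activations in plain Python. It is not course material: the function names are chosen here for illustration, and Keras provides its own built-in versions.

```python
import math


def sigmoid(x):
    """Squash x into the range 0..1, avoiding overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x):
    """Squash x into the range -1..1, centered on zero."""
    return math.tanh(x)


def relu(x):
    """Pass positive values through unchanged and clamp negatives to zero."""
    return max(0.0, x)


def softmax(xs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(xs)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, sigmoid(x), tanh(x), relu(x))
    print(softmax([1.0, 2.0, 3.0]))
```

Sigmoid and tanh saturate for large inputs, while ReLU stays linear for positive inputs. Softmax is typically used on the output layer of a classifier.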

 

Prerequisites

  • Python (numpy and pandas)
  • Background in data science (very helpful, recommended)
  • Basic knowledge of Spark DataFrames

Delivery Requirements

  • A computer or laptop
  • Chrome or Firefox web browser – preferably Chrome
  • Internet access with unfettered connections to the following domains:
    • *.databricks.com – required
    • keras.io – required
    • spark.apache.org – required

Topics

  • Intro to Neural Networks with Keras
    • Neural network architectures
    • Activation functions
    • Evaluation metrics
    • Batch sizes, epochs, etc.
  • MLflow
    • Reproducible ML/DL
  • Convolutional Neural Networks
    • Convolutions
    • Batch Normalization
    • Max Pooling
    • ImageNet Architectures
  • Deep Learning Pipelines
    • Model inference at scale
  • Horovod
    • Distributed TensorFlow training
    • Ring all-reduce
  • NLP
    • Deep Learning for natural language processing
    • RNN, LSTM, GRU
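
To make the ring all-reduce topic listed under Horovod more concrete, here is a minimal single-process simulation in plain Python. It is a sketch under stated assumptions, not Horovod's implementation: the function name `ring_all_reduce` is hypothetical, the workers are simulated as lists, and "sending" is simulated by copying values between them.

```python
def ring_all_reduce(worker_values):
    """Simulate a sum all-reduce over workers arranged in a ring.

    worker_values[i] is the local vector (e.g. gradients) held by worker i.
    Returns a new list in which every worker holds the elementwise sum.
    """
    n = len(worker_values)
    length = len(worker_values[0])
    if any(len(v) != length for v in worker_values):
        raise ValueError("all workers must hold vectors of the same length")
    data = [list(v) for v in worker_values]  # copy so inputs stay untouched

    # Split the index range into n contiguous chunks (sizes differ by at most 1).
    bounds = [(c * length) // n for c in range(n + 1)]
    chunks = [range(bounds[c], bounds[c + 1]) for c in range(n)]

    # Phase 1: scatter-reduce. In step s, worker i sends chunk (i - s) % n to
    # its right neighbour, which adds the values to its own copy.
    for step in range(n - 1):
        # Snapshot outgoing values so all sends in a step happen "at once".
        outgoing = [[data[i][k] for k in chunks[(i - step) % n]] for i in range(n)]
        for i in range(n):
            dst = (i + 1) % n
            for k, value in zip(chunks[(i - step) % n], outgoing[i]):
                data[dst][k] += value
    # Now worker i holds the fully summed chunk (i + 1) % n.

    # Phase 2: all-gather. In step s, worker i forwards chunk (i + 1 - s) % n,
    # and the receiver overwrites its stale copy with the finished values.
    for step in range(n - 1):
        outgoing = [[data[i][k] for k in chunks[(i + 1 - step) % n]] for i in range(n)]
        for i in range(n):
            dst = (i + 1) % n
            for k, value in zip(chunks[(i + 1 - step) % n], outgoing[i]):
                data[dst][k] = value
    return data


if __name__ == "__main__":
    grads = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50], [100, 200, 300, 400, 500]]
    print(ring_all_reduce(grads))
```

Each worker only ever talks to its right neighbour and sends one chunk per step, so the amount of data each worker transmits stays roughly constant as workers are added.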

Course Syllabus

 

Module Lecture Hands-On
Intro to Neural Networks with Keras I
  • Intro to Feed-Forward Neural Networks
    • Basic architecture
    • Batch sizes and epochs
    • Evaluation metrics
  • Keras API
  • Build simple neural network with Keras
Intro to Neural Networks with Keras II
  • Optimize neural network
    • Activation functions
    • Data Normalization
    • Optimizers
    • Custom metrics
    • Validation dataset
  • Callbacks/checkpointing
  • Apply lecture concepts to improve upon performance from previous lab
  • Early stopping
  • Saving/loading models
MLflow
  • Experiment tracking
    • Record which model + hyperparameters performed best
  • Add MLflow to track experiments
Convolutional Neural Networks
  • Working with image data
    • Convolutions
    • Max pooling vs. average pooling
    • Strides
  • ImageNet Architectures
  • DeepLearningPipelines: Apply pre-trained models in parallel
  • Apply pre-trained CNN to own images
Transfer Learning
  • Freeze lower layers of pre-trained neural network
  • Retrain final layers of neural network
  • Tune pre-trained CNN to build cat vs dog classifier
Horovod
  • Horovod
    • Distributed TensorFlow model training
    • All-reduce technique
    • HorovodEstimator to combine Spark DataFrames with Horovod
  • Combine Spark pre-processing with distributed neural network training
  • TensorFlow Estimator API
  • Convert single-node code into distributed code
  • Use TensorFlow Estimator
NLP
  • NLP Techniques
    • Tokenizing
    • Word Embeddings
    • etc.
  • RNN vs LSTM vs GRU
  • Predict user reviews
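
The early stopping item in the syllabus can be sketched without any deep learning library. The helper below is hypothetical and written for illustration only. It applies a patience rule to a list of per-epoch validation losses, similar in spirit to the `patience` and `min_delta` options of the Keras `EarlyStopping` callback.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a list of validation losses.

    Training stops once the loss has failed to improve by more than
    min_delta for `patience` consecutive epochs. Epochs are 0-indexed.
    If the rule never triggers, stop_epoch is the last epoch.
    """
    best_loss = float("inf")
    best_epoch = None
    waited = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


if __name__ == "__main__":
    losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.60]
    print(early_stopping_epoch(losses, patience=2))
```

With a small patience the run halts before the later improvement at the final epoch, which illustrates the trade-off between saving training time and stopping too early. Checkpointing the best model so far is the usual complement.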

Details

  • Duration: 2 days
  • Hours: 9:00 a.m. – 5:00 p.m.

Target Audience

Data scientists, analysts, architects, software engineers, and technical managers who want to learn deep learning and apply it at scale using Apache Spark.

Lab Requirements

  • Chrome or Firefox browser. Internet Explorer, Edge, and Safari are not supported.
  • Internet (web access)