Deploy and Serve Model from Azure Databricks onto Azure Machine Learning


We demonstrate how to deploy a PySpark-based multi-class classification model trained on Azure Databricks onto Azure Kubernetes Service (AKS) using Azure Machine Learning (AML), and expose the model as a web service. This presentation covers the end-to-end development cycle, from training the model to using it in a web application.

Machine Learning problem formulation

Current solutions for detecting the semantic types of tabular data mostly rely on dictionaries/vocabularies, regular expressions, and rule-based lookups. These solutions are (1) not robust to dirty and complex data and (2) not generalizable to diverse data types. We formulate this as a machine learning problem by training a multi-class classifier to automatically predict the semantic type of tabular data.

Model Training on Azure Databricks

We chose Azure Databricks to perform the featurization and model training using PySpark SQL and the Spark Machine Learning Library. To speed up the featurization process, we register and distribute the featurization functions as PySpark user-defined functions (UDFs). For model training, we pick random forests as the classification algorithm and optimize the model hyperparameters using PySpark MLlib.

Model Deployment using Azure Machine Learning

Azure Machine Learning provides reusable and scalable capabilities to manage the lifecycle of machine learning models. We developed an end-to-end deployment pipeline on Azure Machine Learning including model preparation, compute initialization, model registration, and web service deployment.

Serving as Web Service on Azure Kubernetes

Azure Kubernetes Service provides fast response and autoscaling capabilities for serving the model as a web service, together with security and authorization. We customized the AKS cluster with a PySpark runtime to support PySpark-based featurization and model scoring. Our model and scoring service are deployed onto the AKS cluster and served as HTTPS endpoints with both key-based and token-based authentication.


 


Video Transcript

– Hello and welcome to Deploy and Serve Model from Azure Databricks onto Azure Machine Learning.

Deploy and serve model from Azure Databricks onto Azure Machine Learning

I’m Reema Kuvadia, Software Engineer on the AI Platform team at Microsoft. – Hi, I’m Tao Li, I’m from Microsoft, working as a Senior Applied Scientist. – We have divided our talk into three sections: model training and experimentation, model deployment, and model consumption with Azure website deployment. The first two modules will be covered by Tao, and the last one will be covered by me.

Before we move on to our main agenda, I just want to give a quick overview of all the Azure resources that we are using and why. First, Azure Databricks: in Azure Databricks, we are using notebooks to run the PySpark code and attaching a cluster to do the processing required to train the model. Second, Azure Blob Storage: once the model is trained, we store it in Azure Blob Storage. You can even use Azure Data Lake. Third, Azure Machine Learning: in Azure Machine Learning, we prepare the model for deployment. Fourth, Azure Kubernetes Service: in Azure Kubernetes Service, we create an endpoint for the model, which we will consume from the web application. Fifth, Azure App Service: here we deploy a web application in which we consume our Kubernetes endpoint. All of the above resources can be deployed with just one click using ARM templates. Let me give you a quick demo. This demo shows how to deploy Azure resources using ARM templates. I have written a simple PowerShell script to deploy all the Azure resources we need for this demo, like the Databricks workspace, storage account, machine learning workspace, etc. I have added my ARM templates to the templates folder, and I got these templates from the GitHub repository called Azure Quickstart Templates.

To customize my script further, I have added a parameters.json, where I can set the names I want, the location in which I want the Azure resources to be deployed, and manage the access policies and security on each of these resources.

To deploy the Azure resources, all you have to do is call the template in the PowerShell script. For example, to deploy the Databricks workspace, I call the Databricks template and pass the required parameters. So if you see here, it just needs the location, the resource group, and so on, which I have passed in the Databricks parameters. And all you now have to do is run the script. To run this script, navigate to the file location in PowerShell and just run the script. This will start deploying the Azure resources in the resource group it is going to create, in the required subscription and location. For this demo, I already ran the script beforehand, so I have my resources already deployed in my resource group.
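For reference, a minimal PowerShell sketch of this kind of ARM template deployment is shown below; the template file names, resource group name, and location are illustrative assumptions rather than the exact values from the demo.

```powershell
# Minimal sketch: deploy one ARM template (here, a Databricks workspace) with
# the Az PowerShell module. File, group, and location values are assumptions.
$resourceGroup = "rg-semantic-type-demo"
$location      = "eastus"

# Create the resource group if it does not already exist.
New-AzResourceGroup -Name $resourceGroup -Location $location -Force

# Deploy the Databricks template, passing the values from parameters.json.
New-AzResourceGroupDeployment `
    -ResourceGroupName $resourceGroup `
    -TemplateFile ".\templates\databricks.json" `
    -TemplateParameterFile ".\templates\parameters.json"
```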

To see whether your script ran successfully and all your Azure resources were created, you can check them here. But if something failed, you can go to the Deployments tab and click here; it will show you the resource that failed and the error. Here it shows that this key vault name was already in use. So what I did is change the key vault name and re-run the script, and that deployed the key vault for this resource group. That’s how simple this is. So yeah, that’s it for me. Now let me hand it over to Tao to talk more about model training and experimentation.

Introduction to the problem

– Okay, now let me look at the problem we want to solve in this project. We want to solve a machine learning problem: identifying the semantic type of one kind of data, which is tabular column data. This problem is critical for data cleaning, normalization, data matching, and other data enrichment problems. Right now, most of the current solutions rely on dictionaries, vocabularies, or rule-based lookups. All these solutions are not robust to dirty and complex data; what’s more, they cannot be generalized to diverse data. So in our project, we want to build a machine learning solution so the model can learn from the data and make the right prediction. Here are several examples. The first example is a name column: we want the machine learning model to capture the information, especially the semantic meaning in the data, and make the right prediction, Name, for that column. Similarly, for the second and third columns, we want to make the right predictions: Location for the column of location values, and Date for the last column.

Okay, now let’s talk about the model end-to-end flow. In the model flow, as I mentioned before, we treat this as a machine learning problem. We start with data gathering, and then we do the model training, which happens on Azure Databricks. In this stage, we do the feature engineering and the model training using PySpark machine learning. Then, once we have the model, we move to the next stage: we publish the model into Azure Blob Storage, where it can be consumed later. Once this is done, we move on to model deployment. We define the model environment and its dependencies. Another important piece is the scoring script, which is used as the entry function for the model to make predictions. Once all this is done, we go to the final stage of model deployment, which is first registering the model into Azure Machine Learning, then creating a model image, and then deploying that image onto Azure Kubernetes Service as a web service. Once all this is done, we can move to the serving and consumption stage: the model is served as a web service on Azure, and applications can consume the model through the REST API endpoint. This becomes the application stage.

Model Architecture and Training

Okay, now let me talk about some details of the model architecture and training. In this stage, we treat this as a multi-class classification problem using random forests. Here is our architecture. We start with data gathering, using Excel data, public web tables, and some research tables from papers. We also leverage some customer data. All the data is processed into tabular data, with a header and value examples.

Model Architecture and Training Multi-class Classification using Random Forest

We do the feature engineering to generate features for both the header and the values. We also generate labels based on the header, to arrive at a label for each category. Once all this is done, we have one feature vector per column and one label, which is the semantic type. We do the model training on Azure Databricks and publish a model which can be consumed in production through Azure Machine Learning. In this presentation, I will only talk about some of the details. In the featurization stage, we leverage Spark DataFrames so the embeddings can be looked up easily. Secondly, we leverage Spark SQL so the featurization code can run as UDF functions for faster computation. Okay, now let’s talk about another component, which is the model training. For modeling, we leverage Spark MLlib for the model experimentation. In our example, we use a random forest as our model. What’s more, we also leverage MLflow for model logging and selection. Now, let’s move on to the second demo, which is training the model within Azure Databricks. In this session, I will demo the model training on Azure Databricks. Azure Databricks provides an easy-to-use interface to manage resources on Azure. It also provides cluster management, which enables us to easily create, configure, and manage clusters. For example, I can specify the Databricks runtime version when creating clusters. It also allows us to easily add the packages or libraries needed for experimentation. What’s more, it provides easy-to-use notebooks for model training and experimentation. In this notebook, we need to specify all the libraries we need for featurization and model training. Secondly, we need to link to our Azure Blob Storage using our storage account and key, together with all the configuration parameters we need in this experiment. Now we can load all the pre-trained embeddings into memory, which makes them very efficient to use in the featurization part. After that, we need to define all the UDF functions for the featurization. These are the functions related to featurizing the tabular data being studied. We then use lambda expressions to wrap the functions, together with their return types, into each UDF. All the UDF functions are registered into the system so the featurization can run in parallel for even more efficiency. Then here is the featurization code: basically we call each UDF by using the withColumn function to compute the features. Once the featurization is done, we go to the model training part, which trains a random forest model by using the pipeline.fit function. Once the model has been trained, we save it to storage by using the model.write().overwrite().save() function. Okay, once the model has been trained and saved, we continue with model evaluation and measurement by generating the precision numbers per type, and the weighted precision and recall numbers overall. After all this is done, we publish the model into Azure Blob Storage for the downstream pipeline to consume and deploy. What’s more, Databricks also provides tooling we can use, like MLeap, which allows us to package the model into a deployable bundle.
For the model training and selection part, MLflow is also used for hyperparameter sweeping and logging. Yeah, so thank you.
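To make the pattern concrete, here is a minimal PySpark sketch of the flow just described: registering a featurization function as a UDF, applying it with withColumn, training a random forest pipeline, and saving the model. The column names, paths, and the feature itself are illustrative assumptions, not the actual featurization code from the talk.

```python
# Minimal sketch of the training flow described above: a featurization UDF,
# withColumn feature generation, random forest training, and model saving.
# Column names, paths, and the feature itself are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf, col
from pyspark.sql.types import DoubleType
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler, StringIndexer
from pyspark.ml.classification import RandomForestClassifier

spark = SparkSession.builder.getOrCreate()

# Assumed input: one row per table column, with an array of cell values
# ("values") and a labeled semantic type ("semantic_type").
df = spark.read.parquet("/dbfs/training_data/columns.parquet")

def fraction_numeric(values):
    # Example feature: fraction of cell values that look numeric.
    vals = values or []
    return float(sum(v.replace(".", "", 1).isdigit() for v in vals)) / max(len(vals), 1)

fraction_numeric_udf = udf(fraction_numeric, DoubleType())

# Apply the UDF column by column with withColumn, as in the talk.
featurized = df.withColumn("frac_numeric", fraction_numeric_udf(col("values")))

# Assemble features, index the semantic-type label, and train a random forest.
assembler = VectorAssembler(inputCols=["frac_numeric"], outputCol="features")
indexer = StringIndexer(inputCol="semantic_type", outputCol="label")
rf = RandomForestClassifier(featuresCol="features", labelCol="label", numTrees=100)
pipeline = Pipeline(stages=[assembler, indexer, rf])

model = pipeline.fit(featurized)

# Persist the trained pipeline (model.write().overwrite().save(), as mentioned).
model.write().overwrite().save("/dbfs/models/semantic_type_rf")
```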

Hello, in this session, I will continue with the model deployment.

Model Deployment

Now, let’s look at what model deployment is. In the model deployment part, as we mentioned, the model is trained on Azure Databricks and then published to Azure Blob Storage.

So, to highlight: the model is trained on Azure Databricks, and then published and saved onto Azure Blob Storage. Once this is done, the model is deployed as a service onto Azure, running on Azure Kubernetes Service. Here are some prerequisites: an Azure Machine Learning workspace, an Azure Kubernetes Service cluster, plus the SDKs for Azure Machine Learning and Azure Storage. The first step before model deployment is model registration, which registers the model into the workspace and makes sure the model can be stored, tracked, and versioned. Okay, once the model is registered, we can go to the planning stage: define a scoring script, usually called score.py. This loads the model when the service is deployed onto Azure Kubernetes Service. Secondly, this script also needs to handle the data coming from the endpoint, pass it to the model for featurization and prediction, and then return the results as the response to the endpoint. Plus, we also need to define the AML environment, which includes the software and library dependencies. Once this has been defined, we can go to the next stage, the model deployment itself. In this stage there are two important parts. One is creating an image: basically, configure the entry script and the environment, and then configure the runtime, which here is spark-py, to make sure the image runs with a Spark runtime. It also provides the flexibility for more configuration, like CPU and memory. Once this image is defined and created, we can deploy the image as a web service onto the Azure Kubernetes cluster and get an endpoint. Now, once the model has been deployed and the endpoint is ready, we can easily consume the model by using both the SDK and the endpoint as a REST API service.
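A minimal sketch of these steps with the classic azureml-core SDK mentioned in the talk might look like the following; the model, cluster, and service names and paths are assumptions.

```python
# Minimal sketch of the deployment flow described above, using the classic
# azureml-core SDK: register the model, build a Spark container image, and
# deploy it onto an existing AKS cluster. Names and paths are assumptions.
from azureml.core import Workspace, Model
from azureml.core.compute import AksCompute
from azureml.core.image import ContainerImage
from azureml.core.webservice import AksWebservice, Webservice

ws = Workspace.from_config()

# 1. Register the trained model so it is stored, tracked, and versioned.
model = Model.register(workspace=ws,
                       model_path="./models/semantic_type_rf",
                       model_name="semantic-type-rf",
                       description="PySpark random forest for semantic types")

# 2. Create the image: entry script, Spark runtime, and conda environment.
image_config = ContainerImage.image_configuration(execution_script="score.py",
                                                  runtime="spark-py",
                                                  conda_file="env.yml")

# 3. Deploy the image as a web service onto the existing AKS cluster.
aks_target = AksCompute(ws, "aks-cluster")
aks_config = AksWebservice.deploy_configuration(cpu_cores=2, memory_gb=4)

service = Webservice.deploy_from_model(workspace=ws,
                                       name="semantic-type-service",
                                       models=[model],
                                       image_config=image_config,
                                       deployment_config=aks_config,
                                       deployment_target=aks_target)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)
```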

Scoring File (score.py)

One important file is called the scoring file, or score.py. This is the entry script, which is used to receive the data, do the featurization and prediction, and return the result from the endpoint back to the client. This script contains two important functions. The first one is the init function. This function loads the model as a global parameter; we can also use this function to define some other global parameters. This function runs only once, when the model is deployed to the Docker image or the cluster.

Here is one example.

If you look here, this init function defines a Spark session, our machine learning model, plus a word-embedding in-memory lookup frame. This lets the model, global parameters, and in-memory resources be loaded once and fully reused. The other important function, which is called whenever the endpoint receives data, is the run function. This function basically receives data and makes a prediction on it. The interface, both input and output, follows a JSON format for serialization and deserialization. Here is one example. This function receives the data in JSON format, parses it, does the featurization, and makes a prediction. Once all this is done, we pack the results and send them back to the endpoint. Okay, now let’s talk about more details of how to deploy the model. In this session, I will demo how to do the model deployment using Azure Machine Learning. It provides an easy-to-use portal to access notebooks, compute, models, and endpoints. Here under Compute, you can access the compute instance, which is used to execute the script, and you can also use this to manage your clusters. In this case, we have one Kubernetes cluster used to deploy our machine learning models. You can also use this portal to manage, monitor, and optimize all your pipelines. Most importantly, you have the notebooks, which allow you to do the model deployment. In this pipeline, we created one folder, and the only thing needed here is a single notebook, which has all the logic to do the model preparation and the model deployment. So firstly, let’s satisfy all the prerequisites by importing all the libraries from the Azure Machine Learning SDK, and let’s also initialize a workspace to persist all the configuration and the models. Now let’s go to the first step, which prepares the deployment. Here, we just need to connect to Azure Blob Storage and download the model, and also the embedding file. Let’s look at the model download, which just uses the datastore.download function. Similarly, we download the embedding file as well. Then, in a second step, we register these two artifacts into the Azure Machine Learning workspace. So let me take a look at the models. Here, if you look at Models, you can access all the models that have been registered into the AML workspace.
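As a rough sketch of this preparation step, assuming the artifacts live in the workspace's default blob datastore (all paths and names are hypothetical):

```python
# Minimal sketch of the deployment preparation step described above: download
# the trained model and the embedding file from the workspace's default blob
# datastore, then register both. All paths and names are illustrative assumptions.
from azureml.core import Workspace, Model

ws = Workspace.from_config()
datastore = ws.get_default_datastore()

# Download the saved PySpark pipeline model and the pickled embedding lookup.
datastore.download(target_path="./artifacts",
                   prefix="models/semantic_type_rf", overwrite=True)
datastore.download(target_path="./artifacts",
                   prefix="embeddings/word_embedding.pkl", overwrite=True)

# Register both artifacts into the Azure Machine Learning workspace so they
# can be resolved from score.py with Model.get_model_path().
Model.register(workspace=ws,
               model_path="./artifacts/models/semantic_type_rf",
               model_name="semantic-type-rf")
Model.register(workspace=ws,
               model_path="./artifacts/embeddings/word_embedding.pkl",
               model_name="word-embedding")
```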

Okay, once the model has been registered, you can go to the next cell, which deploys the model as a web service. Now let’s find a ComputeTarget; here we choose the AKS cluster that we’re going to use. And then let’s specify the Kubernetes configuration for the deployment, which here is two CPU cores and four GB of memory. It also gives you even more flexibility to configure the compute and the cluster. Once we finish the configuration, we go to the image creation part by using ContainerImage, and create an image with the execution script score.py, the runtime, which is spark-py, and the conda_file, which is env.yml. Now, let’s first look at the yml file before we do the deployment. The yml file is just a single file containing all the dependencies; it specifies Python and PySpark, along with all the packages that need to be installed on the cluster. Okay, another important piece for the deployment is the entry script, score.py. Let me take a look at this script. As I mentioned, it basically has two important functions. The init function basically just sets up all the global parameters. In this case, we define the Spark environment, the model, plus the word embedding, and also some other global parameters we want to use. For Spark, we define our Spark session, which is used for each model prediction. For the model loading, we use the Azure SDK to get the model path, and then load the model using Spark’s PipelineModel.load. Similarly, we do the same thing for the embedding, but using a different function, pickle.load, to load the embedding file into memory. Once this is done, we also have similar functions here used to do the featurization. And together with this, we have the other important function, which is the run function. This function receives the data from the endpoint, which is JSON data, and then does the featurization, which starts by parsing the data from the JSON payload. Once the featurization is done, we call createDataFrame to turn the data into a Spark DataFrame, and then continue with the model scoring and prediction. After the prediction, we may do some post-processing to adjust some predictions, for example to reject predictions with low probability. With all of that done, we return a result, serialized in JSON format, back to the web service. We can also handle exceptions by using the exception handling here. – Thank you Tao for getting the model ready for us to consume. Now let us see how we can consume this model in our web application.
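Putting the pieces together, a minimal score.py sketch along the lines described above could look like this; the registered model names, payload shape, and featurization mirror the earlier training sketch and are assumptions, not the talk's actual script.

```python
# score.py -- minimal sketch of the entry script described above. Registered
# model names, payload shape, and the featurization are assumptions that
# mirror the earlier training sketch, not the talk's actual code.
import json
import pickle

from azureml.core.model import Model
from pyspark.sql import SparkSession
from pyspark.ml import PipelineModel
from pyspark.sql.functions import udf, col
from pyspark.sql.types import DoubleType


def _fraction_numeric(values):
    # Same example feature as in the training sketch.
    vals = values or []
    return float(sum(v.replace(".", "", 1).isdigit() for v in vals)) / max(len(vals), 1)


def init():
    """Runs once when the service starts: load globals into memory."""
    global spark, model, embedding
    spark = SparkSession.builder.appName("scoring").getOrCreate()

    # Resolve the registered artifacts inside the deployed image.
    model = PipelineModel.load(Model.get_model_path("semantic-type-rf"))
    with open(Model.get_model_path("word-embedding"), "rb") as f:
        embedding = pickle.load(f)


def run(json_data):
    """Runs per request: parse JSON, featurize, predict, return JSON."""
    try:
        rows = json.loads(json_data)["data"]          # e.g. [{"values": [...]}, ...]
        df = spark.createDataFrame(rows)
        # Re-apply the same featurization used at training time.
        featurized = df.withColumn(
            "frac_numeric", udf(_fraction_numeric, DoubleType())(col("values")))
        predictions = model.transform(featurized)
        result = [row["prediction"] for row in predictions.collect()]
        return json.dumps({"predictions": result})
    except Exception as exc:
        return json.dumps({"error": str(exc)})
```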

Model Consumption and Website Deployment

To consume the model, we first need to register it in our machine learning workspace. Here is the code snippet to register the model. We will need the path, the name of the model, a description of the model, and the workspace. Once the registration is complete, we are ready to create an endpoint. We have a few dependencies, which we declare in the environment config file. I have added a snippet of how this looks. In the next demo, I will walk you through how to register the model, create your workspace, and how to consume the yml file.
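As an illustration of such an environment config file, here is a hedged sketch that generates the yml file with the SDK's CondaDependencies helper; the exact package list is an assumption.

```python
# Minimal sketch of generating the env.yml environment config described above.
# The package list is an assumption; PySpark itself is assumed to come from
# the spark-py runtime rather than from this file.
from azureml.core.conda_dependencies import CondaDependencies

conda_deps = CondaDependencies.create(conda_packages=["numpy", "pandas"],
                                      pip_packages=["azureml-defaults"])

# Write the env.yml file that the image configuration references as conda_file.
conda_deps.save_to_file(base_directory=".", conda_file_path="env.yml")
```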

Once we deploy the web application, this is how our web application will look.

Application Demo

So to test our end-to-end model, if you give an input,

the model can detect what type of column it is. As you see in the example, for the first column, the names, it detects First Name. In the third column, it detects Company Name. To complete this model end-to-end, let me walk you through a quick demo. In the Azure Machine Learning workspace, I have created a script called model-register-and-deploy. I first initialize the workspace. Then I register the model. This is the same code snippet that you saw in the slide, where I need the model path, the model name, the description, and the workspace that we initialized in the earlier step. Before we proceed to the next step, we will need two things: the scoring script, and our environment config file, the yml file, where we declare all our dependencies. Going back to the script, the third step is creating the endpoint. To create this endpoint, we need three things: the scoring script, which will be our execution script; the runtime, which will be PySpark; and the environment config yml file, which will be our conda_file. When you run this script, you will see that it is creating an image. Once this image is successfully created, you navigate to the Endpoints tab in the machine learning workspace. And if you click on any of these endpoints, you will be able to see all the details of that endpoint.

Here, you will see the REST_endpoint. Just copy this, and keep it ready for the next step.
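For reference, a minimal sketch of calling such a scoring endpoint directly over HTTPS with key-based authentication (the URI, key, and payload shape are assumptions):

```python
# Minimal sketch of calling the deployed scoring endpoint over HTTPS with
# key-based authentication. The URI, key, and payload shape are assumptions.
import json
import requests

scoring_uri = "https://<aks-endpoint>/api/v1/service/semantic-type-service/score"
api_key = "<primary-or-secondary-service-key>"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}
payload = {"data": [{"values": ["Reema", "Tao"]}]}

response = requests.post(scoring_uri, headers=headers, data=json.dumps(payload))
print(response.status_code, response.json())
```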

I have created a simple .NET web application, and in my web config file, I have copied the REST_endpoint URL and pasted it here. Once your application is ready, all you have to do is go to your solution, right click, and click on Publish. Once again click on Publish; since we have already deployed an App Service plan, you can use that one, or you can create a new one by clicking on Azure, and App Service (Windows). And here you can select your subscription and the resource group; if it already has an App Service, you can select that, or you can create a new App Service plan. Since we already have one, I’ll just continue with that. And you can select one of the App Services. And then all you have to do is publish. Once you publish, it will give you a site URL, and when you click on that, it will open a website which will look exactly like this.

To check if your endpoint is properly configured, I’m going to show you how I can verify that. I go to Launch, and I give a manual input, and I put my name.

And Tao’s name. It should show First Name and Full Name, as you see.

Just another example,

I’m going to change the input from names to company.

And let’s see what output we get. So this is Company Name. This shows that my end-to-end flow, from model creation to model consumption, is working accurately. So this is my demo, thank you. To conclude our talk, I would like to say we have achieved end-to-end model training to model consumption using Spark APIs and Microsoft resources, along with third-party platforms like Databricks.

I have attached all the links for your references. Hope you guys have enjoyed our talk.


 
About Reema Kuvadia

Microsoft

Reema received a Masters in Computer Science from George Washington University and has over 4 years of experience as a Software Engineer, with expertise in full stack development. She is passionate about learning something new every day, from new languages to new technologies. She is currently working on the AI Platform team at Microsoft, contributing to the development of Microsoft's proprietary technology (Cosmos, Azure Machine Learning) and industry-leading open source technology (Spark, Databricks). She is looking forward to growing her career as a data engineer.

About Tao Li

Microsoft

Full stack data engineer and machine learning scientist with 8+ years of working experience in Bing Data Mining, Bing Predicts, and Business AI of C+E at Microsoft as an E2E applied scientist and data engineer. His technical areas span a wide spectrum, from data mining and machine learning to market intelligence, using both Microsoft proprietary technology (Cosmos, Azure Machine Learning) and industry-leading open source technology (Spark, Databricks).