Accelerating MLflow Hyper-parameter Optimization Pipelines with RAPIDS - Databricks


When combined with scale-out cloud infrastructure, modern hyperparameter optimization (HPO) libraries allow data scientists to deploy more compute power to improve model accuracy, running hundreds or thousands of model variants with minimal code changes. HPO has traditionally run into two barriers – complexity of model management and computational cost.

In this talk, we walk through a detailed example to address these challenges by combining two open source libraries. We use MLflow to simplify model management, and RAPIDS, a GPU-accelerated data science library, to reduce the compute time requirements. RAPIDS provides Pandas-compatible and scikit-learn-compatible APIs in Python that allow users to port existing code easily, while accelerating both data preprocessing and machine learning training scripts. The example builds a pipeline to predict flight delays from FAA data with random forests and gradient boosted decision trees, demonstrating a dramatic speedup in model building when compared to a non-accelerated version and using MLFlow APIs to select the best model and prepare it for deployment.



Video Transcript

– Hi, everybody, I’m John Zedlewski. I’m the director of RAPIDS machine learning at NVIDIA. Today I’ll be talking about how to use GPUs to accelerate your machine learning model tuning with RAPIDS and MLflow, all integrated with the Databricks platform. First, I just wanna say that all of the resources for this talk are available at the GitHub link up here at the top. That includes the notebook we’ll walk through live a little later, some READMEs, and other getting-started materials. So feel free to open that up and follow along in the background. Where’s this talk gonna go from here? We’re gonna start by motivating the problem: what is hyperparameter optimization, what are some of the challenges with it, and how can we solve them? Part of the answer is gonna be using RAPIDS machine learning, so we’re gonna dive into the details of RAPIDS ML and the RAPIDS platform, so everyone gets a basic grounding there. Then we’ll talk a little more about how we’ve integrated RAPIDS together with MLflow and Hyperopt to make this model tuning process, and all of model management, really seamless. But we’re gonna spend a lot of our time together not just looking at slides but really diving into a demo notebook, which again is also available at that repo. So we’ll save about half of this talk to jump into the code. So what problem really motivated this set of tools to come together? You know, I’ve been a data scientist for a while, and I think a lot of my life is kind of like this: we’ve prepped a lot of data we’re really excited about, we defined some key metrics for a model, we built a model, and we were really excited about it. But you know, maybe the accuracy isn’t quite good enough, maybe we have a feeling that we could do better. Or maybe we just know that it’s a model where increasing accuracy is gonna deliver a lot of business value. So what’s the next step here? Well, typically, we might look at changing some of the hyperparameters for the model.
Many machine learning models are quite complex and have a huge range of hyperparameters that we could tune. So we could play with some of them manually. If I look at XGBoost, for instance, a popular gradient boosted decision tree tool, it has 25-plus parameters that go into a model. And, you know, I work on XGBoost pretty much every day, but to be totally honest I still forget what half of these parameters mean, and it can be a real slog to dig through them. So we could manually tune everything; you could spend days tweaking parameters, trying out, you know, increasing tree depth, or whatever other parameters you wanna play with. But that quickly becomes a massive time sink, and it seems like maybe computers can do a little better. So that’s option B: automated hyperparameter search, best represented on Databricks by the Hyperopt package. It’s really an idea that goes back further in academia, and it’s starting to be increasingly used throughout industry. Hyperparameter optimization, or HPO, is pretty simple, really: we’re gonna train and evaluate many, many model variants in parallel, possibly hundreds of them, trying out all kinds of different combinations of hyperparameters. And we’re gonna use some process to automatically pick the hyperparameters that deliver the best model. It sounds simple, and in practice it’s really nice that we don’t have to do that much work to get better accuracy. Often you can take some existing scripts, layer them together with a hyperparameter optimization library, and just kick off hundreds of models, which is really exciting. The challenging part is that “hundreds of models” part that I mentioned: we’re gonna have to manage those models, we need to keep track of which are the best ones and what was the exact code used in each one, and we have to wait for hundreds of them to complete. So that can turn out to be a challenge.
Maybe it makes our workflow much more complicated if we don’t do this right. So I’m gonna try to convince you here that it is possible to do this well, that it is possible to make it easy. But we really need to integrate three components. We need an HPO framework that makes it simple, so you don’t have to go write a ton of new code in order to do hyperparameter optimization on your existing model. You need something that’s gonna manage the model lifecycle, so that when you train these hundreds of models, you can easily go back and find the best one, find how it was trained, pull it out, and then go use it for inference. And one of the key pieces that’s often missing is that you need really fast model training. If your model takes an hour to run, but you wanna run hundreds or thousands of variants of it, suddenly your lunch-break problem has become a week-long problem, and that may be absolutely unacceptable. So RAPIDS is really one of the key missing ingredients here, because it accelerates machine learning models pretty dramatically. One of the experiments we’ll walk through in a sec goes through a simple hyperparameter optimization example, building a random forest classifier for a decent-sized data set. We did that with just CPU in Scikit-learn, and it takes about seven hours to complete a simple hyperparameter sweep. With the same configuration in RAPIDS, that gets down to 17 minutes, again training the same sort of random forest and using the exact same hyperparameter optimization. So this turns training your model from a day-long problem into a coffee-break problem. We think this is the kind of thing that’s gonna make HPO much more accessible and much easier to integrate into your daily flow as a data scientist. But before this makes sense, we probably wanna talk a little about what RAPIDS is and how it fits in with the Python data science ecosystem. So, here at the Spark + AI Summit,
You’ll hear other talks, too, about how RAPIDS integrates directly with the Spark ecosystem and is available to Scala users as well. Here, we’re gonna focus primarily on the Python side of RAPIDS. It targets data scientists who are already familiar with a Python stack kinda like this. If you build models in Python, you probably know most of these libraries: Pandas for your ETL, Scikit-learn, NetworkX, deep learning libraries like PyTorch, maybe Dask for distributed computing, and all kinds of visualization libraries. When we went to build RAPIDS and GPU acceleration for data scientists, we said, why reinvent the wheel? Why make people learn a whole new set of APIs, when we can match these existing APIs that are already popular with millions of data scientists? That’s essentially what we’ve done. RAPIDS mirrors the existing PyData ecosystem, but builds GPU-based back ends to accelerate each of those libraries on GPU. So instead of using Pandas for ETL, now you’ll use cuDF for ETL. Instead of using Scikit-learn for machine learning, now you use cuML. But you have the same APIs that you’re familiar with; it just goes much, much faster, because everything is in GPU memory, all integrated with a common format based on Apache Arrow. Why do we care about moving data science onto the GPU? Well, the hardware advantages of being on a modern GPU are pretty enormous. Even with the Volta generation of NVIDIA GPUs, you can get about a terabyte a second of memory bandwidth, maybe 20x higher than you can get on a CPU platform, and 15 teraflops of performance in a single card. And you can have incredible interconnect speeds between cards, either on the same machine or across machines, so you can continue to scale your problem up. It’s really an amazing beast of compute power that deep learning scientists have been using since the beginning of the modern deep learning age.
But other data scientists haven’t always been able to take advantage of it, because the libraries weren’t there to use this sort of compute beast for your ETL and traditional machine learning. And that’s the problem that RAPIDS is really meant to solve. When we look at the core building blocks of doing Python data science, it usually starts with ETL, usually with something like a Pandas data frame. And I think we all know that, as much as many of us spent time in school learning machine learning, 90% of our time goes to ETL and data munging. We’ve all been there: your simple batch job to add a few more features to your large data set suddenly takes another 15 minutes, you find there was an error, and you’ve got to iterate and repeat. We can spend tons of time on these iterative ETL workflows, and that’s why we built cuDF to accelerate that. So if you’re familiar with Pandas, this API example over here on the right should hopefully look very familiar. This is actually using cuDF, and all of this work is happening on the GPU. So here we’ll load this file onto the GPU while it’s still compressed, decompress it on GPU, and do a lightning-fast GPU-based read of the CSV. It was kinda surprising the first time I started playing with this; I would have thought there’s not much more you can do to speed up reading CSVs. But even for something as simple as this, we’re talking 20x speedups over reading data on a CPU. Here you can do all the standard Pandas functions: string manipulations, simple column data type changes, turning something into an integer. These are the sorts of things we do every day as data scientists, but we want them to be much faster, and they are much faster when we switch things over to a GPU. Just running simple workloads like merging data frames together, cuDF can run 500 or 900 times faster than Pandas, and similarly for sorting columns or doing a GroupBy.
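To make the "same API, different back end" point concrete, here's a small sketch of that kind of Pandas-style ETL, written so the import is the only line that changes between CPU and GPU. The column names are made up for illustration, and we fall back to Pandas when cuDF isn't installed so the same script runs on CPU-only machines.

```python
# cuDF mirrors the Pandas API, so we can alias whichever one is available.
try:
    import cudf as xd  # GPU path: needs an NVIDIA GPU with RAPIDS installed
except ImportError:
    import pandas as xd  # CPU fallback: same API, just slower

left = xd.DataFrame({"key": [1, 2, 3], "fare": [10.0, 20.0, 30.0]})
right = xd.DataFrame({"key": [2, 3, 4], "dist": [5.0, 6.0, 7.0]})

# Merge, dtype conversion, and groupby all use the familiar Pandas calls,
# whether the backing library is cuDF or Pandas.
merged = left.merge(right, on="key")
merged["fare_int"] = merged["fare"].astype("int32")
totals = merged.groupby("key").fare.sum()
print(len(merged), int(totals.sum()))
```

The point of the alias trick is just to show how little existing Pandas code has to change; in a real notebook you'd simply `import cudf`.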
These are things we do every day as data scientists, and getting a factor-of-several-hundred speedup just by moving onto GPU, while staying API compatible, is something that enables you to do so many more experiments in your pipeline. And when we bring in the traditional machine learning from RAPIDS ML, that helps complete the picture. RAPIDS ML, or especially its central library cuML, fits in right where Scikit-learn does in the PyData ecosystem. It’s really a suite of machine learning models that are easy to use, all based on a reusable core CUDA foundation and the libraries that CUDA and NVIDIA have been optimizing over the past decade. So what does it look like to use cuML? Well, this is sort of a standard machine-learning-101 problem, all done in the Python ecosystem: load a data set, put it into Pandas, run a clustering algorithm, and then plot the results, so we can cluster these points into two different subcategories. Many of us have written this several times. Now, if you wanna port it to cuML and cuDF, we’ve highlighted all the changes that you’d make here, and you can see that essentially all you had to do was change a few imports. We really, really closely manage API compatibility with these existing libraries. If we’re incompatible, because we’re missing a feature or because we have a slight difference in our API, we consider it a bug; please file it and we’re gonna jump right on it. This is something we take very, very seriously. So what do we offer within machine learning? Pretty much the most popular versions of the traditional machine learning algorithms that data scientists use every day. That includes tree-based models like decision trees and random forests, linear regressions, nearest neighbors, and dimensionality reduction. Some of those, like UMAP and t-SNE, can be real beasts that take a huge amount of time on CPU, so having these speedups really transforms where you can use them.
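Here's a sketch of that "change a few imports" clustering example. With RAPIDS installed, the cuML import is the only change from the Scikit-learn version; this sketch falls back to Scikit-learn so it runs anywhere, and the toy data (two well-separated blobs) is my own stand-in for the data set on the slide.

```python
import numpy as np

try:
    from cuml.cluster import KMeans  # GPU-accelerated clustering
except ImportError:
    from sklearn.cluster import KMeans  # CPU fallback, same API

# Two well-separated blobs, so clustering into two subcategories
# is unambiguous.
rng = np.random.default_rng(0)
pts = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.1, size=(50, 2)),
]).astype(np.float32)

km = KMeans(n_clusters=2, random_state=0)
labels = km.fit_predict(pts)
print(sorted(np.bincount(labels).tolist()))  # two clusters of 50 points
```

Everything after the import, the `fit_predict` call included, is identical between the two libraries, which is exactly the compatibility promise the talk describes.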
We’ve talked about speedups anywhere from, you know, 5 or 6x for the simplest possible models, up to speedups by a factor of over 700 when you’re talking about support vector machines or dimensionality reduction. Typically, as your problem gets larger and more compute intensive, once you scale up to these 4 million, 20 million, 200 million row datasets, you start to see a much, much bigger gain from RAPIDS, as you can finally take advantage of that huge compute beast that is the GPU. One of the additional libraries we work closely with in RAPIDS ML is XGBoost, which is one of the most popular tree-based machine learning libraries. XGBoost, too, fits in this GPU ecosystem, and allows you to load data directly from cuDF or from other GPU-based data formats without having to round-trip through CPU memory. Again, just by enabling GPU acceleration, you can get huge speedups, up to 17x depending on the data set that you’re fitting on. So XGBoost fits into the rest of the ecosystem the same way cuML does. There’s way more about RAPIDS than I can possibly go into in this talk; I really encourage you to check it out for a much deeper dive. But here I wanna shift focus and talk more specifically about how RAPIDS integrates with MLflow and Hyperopt, and how we use all of those in a Databricks environment. I know there are many more talks at this summit about MLflow, but just so we’re on the same page: MLflow is really about machine learning model lifecycle management. It allows you to track your experiments, so you know exactly what code, exactly what git hash, was used when you trained a model. It saves the parameters and logs from that training, so you can reproduce it or analyze it if something went wrong. It lets you persist your past models in a centralized model management database, so you can query them again whenever you need them. You can visualize models side by side, like we’re doing over here.
And you can convert models so that they can be deployed to production, which I think is one of the most exciting features of MLflow. Finally, MLflow has deep integration with the Databricks platform, and now it has integration with the RAPIDS platform as well. You can dive much deeper into MLflow through these links here at the bottom, and we’ll also show more of it in the live demo. What does it look like, though, to integrate RAPIDS with MLflow? Well, since the beginning, MLflow has had support for Scikit-learn models. You can see over here on the left what a typical Scikit-learn flow would look like. Essentially, we’re training a model, and MLflow enters after we’ve done the training: we wanna log some metrics, so we measure how accurate our model is, and we log that so it’s persisted along with all the artifacts of this model, and we can query it later and figure out what that accuracy was. We’re also logging the model itself. Here, this model will be pickled, in Python terminology, and saved to a central model database, so you can always unpickle it and then go do inference, further inspection, or training later. It’s a really easy-to-use API; MLflow is very lightweight and easy to add to your workflow. If we’re gonna do this in a cuML workflow, well, we’ve really faithfully matched the Scikit-learn API, so we can do the exact same thing. You can use the accuracy scorers that come with cuML and log those metrics exactly the same way. And you can even use the exact same Scikit-learn interface to log a model, knowing that this is just a GPU-based model that was developed with RAPIDS. It works the same way as Scikit-learn, but it was trained much, much faster, as we’ll see.

You put all this together and integrate it with Hyperopt, which is the third piece of the puzzle that we talked about initially. We already have really fast model training with RAPIDS, and we already have a way of managing model lifecycles with MLflow; Hyperopt does the actual exploration of hyperparameters. It’s an academic project originally that’s gone on for a long time, but it’s become increasingly popular in industry, and again, it has great integration with Databricks. It uses a clever algorithm called Tree-structured Parzen Estimators to explore all the possible hyperparameters in an intelligent way. It balances the need for exploration, that is, trying very different hyperparameter settings you haven’t tried before, against exploitation, which is, once you have some pretty good hyperparameters, trying some pretty similar ones to see if you can get a little closer to a local optimum. So it’s a really cool algorithm to read up on in detail. But I think for the purpose of training a model, it’s actually really simple. You define a model function here that trains your model and evaluates it, and basically returns your evaluation score; it’s gonna take in hyperparameters as its params tuple. You choose the parameter space, that is, the range of hyperparameters you want Hyperopt to search over. In this case, imagine a hyperparameter called x, and we’re telling it to search between negative 10 and 10 as reasonable values for x. Then you just let fmin rip, and it’s gonna go run, in this case, 100 different versions of your model and find the one that delivers the best score as you’ve defined it here, and we can print that out. On Databricks, this can all be done in parallel over many workers, which makes it particularly cool and scalable. So now we wanna dive into the actual demo and really get into how this works in a more complex example.
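A toy version of that fmin example might look like the following. The objective here (x squared) is my own stand-in for "train and evaluate a model"; the real notebook returns the negative of model accuracy instead. If Hyperopt isn't installed, the sketch falls back to a plain random search over the same range, just so it runs anywhere.

```python
import random

def objective(x):
    # Hyperopt minimizes whatever this returns.
    return x ** 2

try:
    from hyperopt import fmin, tpe, hp

    best = fmin(
        fn=objective,
        space=hp.uniform("x", -10, 10),  # search x in [-10, 10]
        algo=tpe.suggest,                # Tree-structured Parzen Estimators
        max_evals=100,                   # try 100 candidate values
    )
    best_x = best["x"]
except ImportError:
    random.seed(0)
    candidates = [random.uniform(-10, 10) for _ in range(100)]
    best_x = min(candidates, key=objective)

print(round(best_x, 3))
```

Either way, after 100 evaluations `best_x` lands close to the true optimum at zero; with TPE, later candidates are concentrated near earlier good ones rather than drawn uniformly.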
But again, I’d encourage you to look at the GitHub repo here, which has the examples in much more detail. And if you wanna run RAPIDS on a different platform, try one of these resources below: either our GitHub page or the Anaconda packages to get started, or, if you work in a Docker-based workflow, NGC, the NVIDIA GPU Cloud, already provides optimized Docker images that you can build on. All right, now we get to the fun bit of the talk, where we actually dive into the code. I’ve started with the basic Databricks landing page that you’ve probably all seen in the UI, and now I wanna go and make sure I have a cluster set up that can run RAPIDS and actually let us launch a notebook. So I’ll go over to the Clusters UI here. I’ve already defined a couple of these clusters, but it’s worth looking at exactly how to configure a cluster so that you can use it for RAPIDS. The core of RAPIDS, of course, is being able to use GPUs for data science, so you wanna go here to the Databricks runtime version and make sure you’ve selected a runtime that’s compatible with GPUs. Here I’m using a 6.6 ML runtime, which clearly shows that it’s GPU compatible; you can select one of the other GPU-compatible runtimes too. It’s also critical to make sure that we’ve chosen workers that actually have GPUs attached to them. This is an AWS-backed cluster, so you’ll see some AWS instance type names. There are a ton of options, but you just need to focus on the GPU-accelerated section: any of the g4dn- or p3-based instances are fine to run RAPIDS, they’re all modern GPUs, and you can choose what fits your workload best based on cost and throughput of the GPU. Here I’ll use a p3.2xlarge, which contains one Volta GPU; it’s kind of a nice starting point for training. And I’ll make sure my driver nodes have the exact same instance setting. Now, RAPIDS doesn’t come pre-installed in the Python environment that ships with this Databricks runtime, but it’s really easy to set up.
In that cloud ML examples repo that I mentioned earlier, we offer an init script that does all the installation you need for your cluster. Just go into Advanced Options, go to Init Scripts, and make sure you’ve added the init script from the cloud ML examples repo that installs the version of RAPIDS you wanna use. Here, we’re installing RAPIDS 0.13. I just downloaded the script to DBFS, and it’ll run when the cluster node starts up and automatically do a bunch of conda installations, getting the RAPIDS packages you need. So again, it’s pretty seamless, just grab this init script from our repo. Now let’s actually look at the real code and walk through a more realistic example. This is gonna be a simple example of building a classifier for whether a flight arrives on time. I think a lot of us are kinda missing those days when we used to actually fly to conferences like Spark + AI and got to meet each other in person, so it’s nice to do something travel related here. We start with a couple of Python preliminaries, just importing the critical packages, and we’ll do both Scikit-learn and cuML versions so we can do some comparisons. We’ve added some logging functions just so that we can share the timings of various things. And we’ll start by loading some of the data. Here we have a 20 million row example of the airline data in Parquet format, and we also have a 1% subsample of that airline dataset. We’ll use the 1% subsample for the interactive demo, but we’ve also posted the larger version online, so you can see how much bigger the speedup gets as you go to larger data sizes. We can read this with cuDF the same as if it were Pandas or anything else; it’s much faster, and even though the read happens on GPU, we can print it out easily in a notebook. You can see it’s a pretty straightforward data set with metadata, one row for each flight, and a simple binary variable, which is what we’re gonna wanna predict: arrival delay binary.
This is one if our flight was delayed and zero if our flight arrived on time, so it’s gonna be kind of nice to know in advance. To predict this, let’s start with a random forest classifier. This is one of the most popular models in data science; it’s just gonna build an ensemble of trees and use those to predict. You can see, though, that the Scikit-learn version has quite a few options that you can dig into as you go about optimizing. We could dig into all of these manually, but it’d be a lot nicer to have Hyperopt solve the problem for us. Here, we’re gonna focus on a couple of the critical parameters. Max depth: how deep do you want each tree that gets built to be? Max features: every time you split a node, do you wanna look at all the features or just a subset of them? And also the number of estimators, which sets how many trees we’re actually gonna build, and can determine a bit about whether we overfit or underfit. Those are three really classic parameters that can make a big impact here. Let’s start by looking at how you do this in Scikit-learn and integrate Scikit-learn with MLflow and Hyperopt. We start with a really simple Scikit-learn pipeline here that could be in any script: we load in some input data, extract the target, split into train and test, build a classifier, and then evaluate it to find its accuracy. Probably something we’ve all done 100 times as data scientists. The parts here that are interesting are, one, we start by using MLflow. So here in MLflow, we’re gonna log the model after we’ve trained it. This means that the model we train will be persisted in a central model server through MLflow, which in this case is also managed by Databricks. That means that if we ever wanna do any additional debugging, or deploy this for inference, we can easily go extract it out of the model database, which comes in really handy.
We’re also gonna log a metric through MLflow. You can log as many metrics as you want for a model; you might care about, for instance, accuracy, precision, and recall. In this case, I’ll log just accuracy. And I’ll also have to return something that Hyperopt can optimize. Hyperopt just minimizes something for us, so we want Hyperopt to minimize the opposite of accuracy, which effectively maximizes accuracy. This is all wrapped up in a really simple function that takes in a tuple of params and can be called by Hyperopt; there’s really not a lot of wrapping that you have to do. You just kind of crack open these parameters and make sure you return a loss that Hyperopt understands. So before we run this, let’s look at what it’s gonna take to port it to GPU. And this is just a RAPIDS version of the exact same application. You can see the only differences, really, are that we’re using cuDF to read the data and we’re using cuML for preprocessing, training, and computing metrics. Otherwise, these are all just import changes: we haven’t changed any of the parameters, we haven’t had to do any additional work, it was really just literally a copy-paste and a mechanical transfer. Again, we can use the exact same MLflow functions. You can log a model to MLflow whether that model was trained really fast on GPU or much slower on CPU with Scikit-learn; it works the exact same way, which is part of the advantage of this sort of workflow. And again, we’ll return our loss function here. So let’s see how long it takes to do this on CPU. I just grabbed some totally arbitrary parameters: it’s about 100 trees of depth up to eight on this small data set. We’re printing out a couple of timing stats for each of the different stages here. The ETL is fast, the loading is pretty fast, and the fit took 7.7 seconds. It’s not the end of the world, but this is a 1% sample of our data.
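A sketch of the kind of objective function the notebook wraps for Hyperopt might look like this: train a random forest, log the model and its accuracy to MLflow, and return the negative accuracy as the loss. The data here is synthetic (the real notebook uses the airline dataset), the function and parameter names are my own, and the MLflow calls are skipped gracefully when MLflow isn't installed or there's no active run.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

try:
    import mlflow
    import mlflow.sklearn
except ImportError:
    mlflow = None  # logging becomes a no-op in this sketch

# Synthetic stand-in for the airline data.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def train_model(params):
    # Crack open the params tuple Hyperopt hands us.
    max_depth, max_features, n_estimators = params
    clf = RandomForestClassifier(
        max_depth=max_depth,
        max_features=max_features,
        n_estimators=n_estimators,
        random_state=0,
    )
    clf.fit(X_train, y_train)
    acc = accuracy_score(y_test, clf.predict(X_test))
    if mlflow is not None and mlflow.active_run() is not None:
        mlflow.log_metric("accuracy", acc)      # persisted with the run
        mlflow.sklearn.log_model(clf, "model")  # pickled to the model store
    # Hyperopt minimizes, so return the negative of accuracy.
    return {"loss": -acc, "status": "ok"}

result = train_model((8, 1.0, 100))
print(result["loss"])
```

Swapping this to the RAPIDS version would follow the talk's recipe exactly: change the imports to cuDF/cuML and leave the MLflow calls and the returned loss untouched.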
And we built a pretty small model on it. So we can see that this is gonna get to be a pain as we move to the full dataset, and it’s really gonna be a pain when we run Hyperopt and wanna run hundreds of versions of this. So let’s try the same trivial example with the RAPIDS version. How much faster is that? Okay, that returned a little faster, and here we can see the fit was a little more than 20 times faster on this small data set; it’s only about 0.35 seconds here, so it really feels like no waiting time. And this speedup is gonna scale up even further as we go to larger data sizes, which is a very typical thing we see with RAPIDS. So now we plug everything into Hyperopt, just as we showed in the slides. We’re defining a really simple space to search for parameters: we just tell it what’s a reasonable range of max depth, a reasonable range of max features, and a reasonable range of estimators, and Hyperopt will do the actual exploration within those limits for us. We can set this up with these parameters, and we’ll actually use the exact same parameters for both the CPU version and the GPU version. In either case, one key thing to know is that for the Databricks version of Hyperopt, you wanna start by defining a SparkTrials object. This is basically a placeholder that will keep track of all your trials and integrate them with the Databricks UI. Now, I’m not gonna run the CPU version, because it actually takes a super long time, unfortunately; let’s jump right in and try the GPU version here. Again, you can see the code for GPU is essentially identical to the CPU version; we’re just using different naming so we can keep these things straight in our model registry. So we kick this off, and it runs the fmin function from Hyperopt. Fmin really is the main driver that kicks off the search process over all these jobs, and we can see right away in the UI that it’s starting to run two jobs in parallel.
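A search space of that shape might be sketched as follows. The parameter names and ranges are illustrative, not the notebook's exact values; `hp.quniform` draws quantized values, which the objective casts to int before handing to the model. When Hyperopt isn't installed, the sketch still defines a plain-Python sampler with the same ranges so the objective can be smoke-tested.

```python
import random

try:
    from hyperopt import hp

    search_space = [
        hp.quniform("max_depth", 5, 20, 1),        # tree depth
        hp.uniform("max_features", 0.1, 1.0),      # fraction of features/split
        hp.quniform("n_estimators", 50, 500, 25),  # number of trees
    ]
except ImportError:
    search_space = None  # no Hyperopt; sample by hand below

def sample_params(rng=random):
    # Plain-Python equivalent of drawing once from the ranges above,
    # handy for smoke-testing an objective without Hyperopt.
    return (
        int(rng.uniform(5, 20)),
        rng.uniform(0.1, 1.0),
        int(rng.uniform(50, 500)),
    )

params = sample_params()
print(params)
```

On Databricks you'd hand `search_space` (plus a `SparkTrials()` object) to `fmin`, which then farms the trials out across the cluster's workers.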
These will automatically be tracked within MLflow, and you can see these have already completed here. We can grab them easily up here in the Runs section of our UI. So if I go to the top and click Runs, I can see all these past runs, and I don’t just get a job ID and the date; we’re actually seeing the features being explored by MLflow, which I think is pretty cool. That said, this interface can get a little bit confusing, especially if you have a large number of runs. There’s also the much more detailed experiment UI; you can get to that by clicking at the bottom here of the Runs UI. Pull it up, and it’s gonna show all of our past runs. So we’ve got lots of experiments here, but let me just take one kind of large set of runs that I ran previously.

I don’t wanna have to click each one of these by itself, so I can select this whole region of runs. These were all a bunch of different hyperparameters that we tried running, so they have different max depths, with different accuracies, and all the other hyperparameters we modified were varied too. The best way to see this, though, is not this UI, but rather to go to the Compare UI after selecting some of these. Now you get a deep dive into each of the past runs you did: these were all the parameters set for each, and here’s the accuracy we got out of it. And we can already see, just looking at the first couple of these, that you start to gain accuracy pretty quickly from some of these defaults, going from around 0.83 up to around 0.86 for the best models. It helpfully includes a scatterplot where you can look at one of the hyperparameters and see how accuracy varies; for instance, as we vary max depth, you can begin to see this upward trend here. Or we can look at all of the hyperparameters at once with this parallel-coordinates plot, where we certainly see a range of a couple of points of accuracy between a random or poor model and one of the best models. You can explore this UI in a lot more detail, and you can query these runs as well. But clearly, what we’re gonna do here is pick the best version of the model and run with that. So what does that look like once we have a good model? Where are we actually going with it? I’m gonna start by going over here to the Models tab. You can see there’s actually a nice interface built in, which takes the models that we trained with MLflow and registered under a registered model name into the model registry. Let me open this up, and it looks like we’ve trained a couple of past versions of this RAPIDS-based airline model.
I’m gonna go and take one of them and say, okay, this latest version, version five, I wanna use as my production model. So let’s transition that to Production. I feel really good about it, since we did this awesome hyperparameter sweep. If I have any questions about it, though, I can always drill down into the run behind it and see exactly what the loss was and what hyperparameters were used for it. All right, so it’s a big model, with a max depth of 15; makes a lot of sense. Now this is in production. Okay, what do we do with it? Well, it’s very simple: clearly, the next thing we’re gonna want is to deploy this for inference somewhere, and that gets really easy; that’s where the integration with MLflow is super helpful. Say I have some script that just wants to take whatever the latest version of this model is, without the inference script having to know about versioning and who’s tagged what as production. I can use the MLflow client to query based on this model name and say, just get me the version that’s in the current stage called Production. We can run that, and MLflow will automatically figure out the full name of the saved model that we’ve tagged as production. So this could be a completely separate data scientist using it in production, who doesn’t have to know how the lifecycle of the models was updated in the background here. So we can just pull that model into production: here, we’re gonna load a model using MLflow’s sklearn load_model, and we’re gonna load from that URI. And you can see that what we’ve gotten back is actually this cuML random forest, so exactly the random forest classifier that we trained previously is up and running in our driver here. Then we can just go load in some data, again drop the unnecessary target variable, and go ahead and run predictions and see what happens.
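The pattern for pulling the production model back out of the registry can be sketched like this. The model name is hypothetical, and the actual `load_model` call needs MLflow plus a tracking server with a registered model behind it, so it's shown here as the pattern rather than executed.

```python
def production_model_uri(model_name):
    # "models:/<name>/Production" resolves to whichever version is
    # currently in the Production stage, so the inference script never
    # needs to know version numbers or who promoted what.
    return "models:/{}/Production".format(model_name)

uri = production_model_uri("rapids_airline_rf")  # hypothetical model name
print(uri)

def load_production_model(model_name):
    # Requires MLflow and a tracking server; not run in this sketch.
    import mlflow.sklearn
    return mlflow.sklearn.load_model(production_model_uri(model_name))
```

Because cuML matches the Scikit-learn interface, `mlflow.sklearn.load_model` hands back the cuML random forest ready to call `predict` on, just as the demo shows.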
It’s gonna take a sec to read this in, okay, but we immediately get back a bunch of predictions, which are the predicted delay-or-not-delay binary variables here. So again, you can see that by hooking into MLflow, you don’t just get the ability to run Hyperopt; you start to get some of this model lifecycle management pretty much for free, with just a few lines of code. One of the really powerful things here is that, because we matched the Scikit-learn APIs carefully in cuML, we can leverage all this existing infrastructure that’s been baked into MLflow from the beginning, and that allows you to do model development with these much, much faster models, and lifecycle management, in the same way that you would with SKLearn. So hopefully this makes it easy to get started, but again, play with it on your own; you can get all of this from the RAPIDS AI GitHub.

About John Zedlewski


John Zedlewski is the director of GPU-accelerated machine learning on the RAPIDS team. Previously, he worked on deep learning for self-driving cars at NVIDIA, deep learning for radiology at Enlitic, and machine learning for structured healthcare data at Castlight. He has an MA/ABD in economics from Harvard with a focus in computational econometrics and an AB in computer science from Princeton.