Combining Rules-based and AI Models to Combat Financial Fraud
The financial services industry (FSI) is rushing towards transformational change, delivering transactional features and facilitating payments through new digital channels to remain competitive. Unfortunately, the speed and convenience that these capabilities afford also benefit fraudsters. Fraud in financial services remains the number one threat to organizations’ bottom line given the record-high increase in overall...
Bayesian Modeling of the Temporal Dynamics of COVID-19 Using PyMC3
In this post, we look at how to use PyMC3 to infer the disease parameters for COVID-19. PyMC3 is a popular probabilistic programming framework that is used for Bayesian modeling. Two popular methods to accomplish this are the Markov Chain Monte Carlo (MCMC) and Variational Inference methods. The work here looks at using the currently...
Personalizing the Customer Experience with Recommendations
Go directly to the Recommendation notebooks referenced throughout this post. Retail made a giant leap forward in the adoption of e-commerce in 2020: e-commerce as a percentage of total retail saw multiple years of progress in one year. Meanwhile, COVID, lockdowns and economic uncertainty have completely disrupted how we engage and retain customers. Companies need...
MLflow Model Registry on Databricks Simplifies MLOps With CI/CD Features
MLflow helps organizations manage the ML lifecycle through the ability to track experiment metrics, parameters, and artifacts, as well as deploy models to batch or real-time serving systems. The MLflow Model Registry provides a central repository to manage the model deployment lifecycle, acting as the hub between experimentation and deployment. A critical part of MLOps,...
How to Train XGBoost With Spark
XGBoost is currently one of the most popular machine learning libraries, and distributed training is increasingly required to accommodate the rapidly growing size of datasets. To utilize distributed training on a Spark cluster, the XGBoost4J-Spark package can be used in Scala pipelines but presents issues with Python pipelines. This article will go over...
MLflow 1.12 Features Extended PyTorch Integration
MLflow 1.12 features include extended PyTorch integration, SHAP model explainability, autologging MLflow entities for supported model flavors, and a number of UI and document improvements. Now available on PyPI and the docs online, you can install this new release with pip install mlflow==1.12.0 as described in the MLflow quickstart guide. In this blog, we briefly...
Quickly Deploy, Test, and Manage ML Models as REST Endpoints with MLflow Model Serving on Databricks
MLflow Model Registry now provides turnkey model serving for dashboarding and real-time inference, including code snippets for tests, controls, and automation. MLflow Model Serving on Databricks provides a turnkey solution to host machine learning (ML) models as REST endpoints that are updated automatically, enabling data teams to own the end-to-end lifecycle of a real-time machine...
Ten Simple Databricks Notebook Tips & Tricks for Data Scientists
Often, small things make a huge difference, hence the adage that "some of the best ideas are simple!" Over the course of a few releases this year, and in our efforts to make Databricks simple, we have added several small features in our notebooks that make a huge difference. In this blog and the accompanying...
Reputation Risk: Improving Business Competency and Nurturing Happy Customers by Building a Risk Analysis Engine
Why does reputation risk matter? When it comes to the term "risk management", Financial Service Institutions (FSI) have seen guidance and frameworks around capital requirements from Basel standards. But none of these guidelines mention reputation risk, and for years organizations have lacked a clear way to manage and measure non-financial risks such as reputation risk. Given...
Detecting At-risk Patients with Real World Data
With the rise of low-cost genome sequencing and AI-enabled medical imaging, there has been substantial interest in precision medicine. In precision medicine, we aim to use data and AI to come up with the best treatment for a disease. While precision medicine has improved outcomes for patients diagnosed with rare diseases and cancers, precision...