Quick link to the accelerator notebooks referenced throughout this post.
You are a security practitioner, a data scientist, or a security data engineer, and you've seen the Large Scale Threat Detection and Response talk with Databricks. But you're wondering, "How can I try Databricks in my own security operations?" In this blog post, you will learn how to detect a remote access trojan (RAT) using passive DNS (pDNS) and threat intel. Along the way, you'll learn how to store and analyze DNS data using Delta, Spark and MLflow.

As you well know, APTs and cyber criminals are known to use DNS. Threat actors use the DNS protocol for command and control, beaconing, and resolution of attacker domains. This is why academic researchers and industry groups advise security teams to collect and analyze DNS events to hunt, detect, investigate and respond to threats. But as you know, it's not as easy as it sounds.
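To make the idea of combining pDNS with threat intel concrete, here is a minimal sketch in plain Python. It is not the code from the accelerator notebooks: the record fields (`rrname`, `rrtype`, `rdata`), the indicator sets, and the helper names are all assumptions made up for illustration, and the domains and IP addresses are placeholder values.

```python
# Hypothetical pDNS records. The field names are assumptions for illustration,
# not the schema used in the accelerator notebooks.
pdns_records = [
    {"rrname": "www.example.com.", "rrtype": "A", "rdata": "93.184.216.34"},
    {"rrname": "Bad-Domain.example.net", "rrtype": "A", "rdata": "203.0.113.10"},
    {"rrname": "cdn.example.org", "rrtype": "A", "rdata": "198.51.100.7"},
]

# Hypothetical threat intel indicators (placeholder domain and test-range IP).
intel_domains = {"bad-domain.example.net"}
intel_ips = {"198.51.100.7"}


def normalize(name: str) -> str:
    """Lower-case a DNS name and drop the trailing root dot."""
    return name.strip().lower().rstrip(".")


def enrich(record: dict) -> dict:
    """Return a copy of the record with threat intel match flags added."""
    out = dict(record)
    out["domain_match"] = normalize(record["rrname"]) in intel_domains
    out["ip_match"] = record["rdata"] in intel_ips
    return out


enriched = [enrich(r) for r in pdns_records]

# Keep only the records that matched at least one indicator.
hits = [r for r in enriched if r["domain_match"] or r["ip_match"]]
for r in hits:
    print(r["rrname"], r["rdata"], r["domain_match"], r["ip_match"])
```

This sketch only does exact matching on the queried name and the resolved address. At scale, the same lookup would more likely be expressed as a join between a DNS table and an indicator table rather than a Python loop.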
Using the notebooks in this solution accelerator, you will be able to detect the Agent Tesla RAT. You will use analytics for domain generation algorithms (DGA) and typosquatting, along with threat intel enrichments from URLhaus. Along the way you will learn the Databricks concepts of:
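As a rough illustration of what DGA and typosquatting analytics look for, the sketch below scores a domain's left-most label by character entropy and compares it to a watch list by edit distance. This is a stand-in heuristic, not the detection logic from the notebooks. The watch list, both thresholds, and every function name are hypothetical, and taking the left-most label is a simplification that ignores subdomains and public suffixes.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def first_label(domain: str) -> str:
    """Left-most label of a domain name, lower-cased (a simplification)."""
    return domain.lower().rstrip(".").split(".")[0]


# Hypothetical watch list and thresholds, chosen only for illustration.
WATCHED_BRANDS = ["google", "paypal"]
ENTROPY_THRESHOLD = 3.5
MAX_TYPO_DISTANCE = 1


def looks_generated(domain: str) -> bool:
    """Flag labels whose characters look close to uniformly random."""
    return shannon_entropy(first_label(domain)) > ENTROPY_THRESHOLD


def typosquat_target(domain: str):
    """Return the watched brand this domain imitates, or None."""
    label = first_label(domain)
    for brand in WATCHED_BRANDS:
        if label != brand and edit_distance(label, brand) <= MAX_TYPO_DISTANCE:
            return brand
    return None


for name in ["google.com", "gooogle.com", "paypa1.com", "xk2q9vbz7tjw4hp.net"]:
    print(name, looks_generated(name), typosquat_target(name))
```

Entropy alone misclassifies short labels and dictionary-based DGAs, which is one reason a trained model is usually preferred for this step. The fixed thresholds here would need tuning against real traffic.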
Why use Databricks for this? Because the hardest thing about security analytics isn't the analytics. You already know that analyzing large-scale DNS traffic logs is complicated. Colleagues in the security community tell us that the challenges fall into three categories:
To address these issues, security teams need a real-time data analytics platform that can handle cloud scale, analyze data wherever it is, natively support streaming and batch analytics, and provide collaborative content development capabilities. And… if someone could make this entire system elastic to avoid hardware commitments… now wouldn't that be cool!
You can use this notebook in the Databricks Community Edition or in your own Databricks deployment. There are a lot of lines here, but the high-level flow is this:
Each section of the notebook has comments. We invite you to email us at [email protected] or submit issues on the GitHub repo. We look forward to your questions and suggestions for making this notebook easier to understand and deploy.
Now we invite you to log in to the Community Edition or your own Databricks account and run this notebook series. We look forward to your feedback and suggestions.
You can create a Community Edition account by going to this link. Then you can import the notebook:
Please refer to the docs for detailed instructions on importing and running the notebook.