High-profile cybersecurity breaches dominated headlines in 2017. In the first half of the year, over 1.9B records were stolen. That’s more than 7,000 records breached every minute. And the fallout from a single event can be staggering. Customer attrition, negative PR, and regulatory fines can amount to millions in financial losses. In fact, according to recent research from IBM, the average cost of a data breach is $3.62M.
With thousands of records being stolen every minute, this raises the question: why is this happening, and what can be done to help prevent it?
Cybercriminals have become more sophisticated over the years. No longer relying on a single tactic to penetrate the enterprise firewall, most criminals employ a coordinated, multi-pronged attack. Verizon recently published a list of the most common tactics used by cybercriminals, and it’s clear the methods are diverse:
What tactics do cybercriminals use?
Preventing one type of attack is simply not enough. To make matters more complicated, cybercriminals have begun to make use of AI-supported systems to rapidly scale attacks, personalize phishing emails, identify system vulnerabilities, and mutate malware and ransomware on the fly. Staying ahead of these increasingly complex attacks requires cybersecurity teams to monitor their network for a broad range of threats that may or may not resemble traditional threat patterns.
Staying abreast of the latest threat isn’t the only challenge. The increasing volume and complexity of threats require security teams to capture and mine mountains of data in order to avoid a breach. Yet the Security Information and Event Management (SIEM) and threat detection tools they’ve come to rely on were not built with big data in mind, resulting in a number of challenges:
In order to effectively detect and remediate threats in today’s environment, security teams need to find a better way to process and correlate massive amounts of real-time and historical data, detect patterns that exist outside pre-defined rules and reduce the number of false positives.
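To make "patterns that exist outside pre-defined rules" concrete, here is a minimal sketch of baseline-driven detection in plain Python. It is an illustration only, not a Databricks or SIEM API: the host names, the failed-login counts, and the `find_anomalies` helper are all hypothetical, and a z-score against each host's own history stands in for whatever model a real team would train.

```python
from statistics import mean, stdev

def find_anomalies(history, current, threshold=3.0):
    """Flag hosts whose current count sits far above their own historical baseline.

    history: dict of host -> list of past daily failed-login counts (hypothetical data)
    current: dict of host -> today's failed-login count
    Returns a dict of host -> z-score for hosts exceeding the threshold.
    """
    anomalies = {}
    for host, count in current.items():
        past = history.get(host, [])
        if len(past) < 2:
            continue  # not enough history to estimate a baseline
        mu = mean(past)
        sigma = stdev(past)
        if sigma == 0:
            continue  # flat baseline; a z-score is undefined, so skip in this sketch
        z = (count - mu) / sigma
        if z > threshold:
            anomalies[host] = round(z, 2)
    return anomalies

# Hypothetical per-host history of daily failed logins.
history = {
    "web-01": [2, 3, 1, 2, 4, 3, 2],
    "db-01": [0, 1, 0, 0, 1, 0, 1],
}
current = {"web-01": 3, "db-01": 9}

flagged = find_anomalies(history, current)
print(flagged)
```

In this made-up data, a static rule such as "alert on more than 10 failed logins" would stay silent for `db-01`, while the per-host baseline flags it because nine failures is far outside that host's normal range. The same comparison also leaves `web-01` alone, which is the kind of filtering that helps cut false positives.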
Databricks offers security teams a new set of tools to combat the growing challenges of big data and sophisticated threats. Where existing tools fall short, the Databricks Unified Analytics Platform fills the void with a platform for data scientists and cybersecurity analysts to easily build, scale, and deploy real-time analytics and machine learning models in minutes, leading to better detection and remediation.
Databricks complements existing threat detection efforts with the following capabilities:
A leading technology company employs a large cybersecurity operations center to monitor, analyze and investigate trillions of threat signals each day. Data flows in from a diverse set of sources including intrusion detection systems, network infrastructure and server logs, application logs and more, totaling petabytes in size.
When a suspicious event is identified, threat response teams need to run queries in real time against large historical datasets to verify the extent and validity of a potential breach. To keep pace with the threat environment, the team needed a solution capable of:
The Challenge
It took a team of twenty engineers over six months to build the company's legacy architecture, which consisted of various data lakes, data warehouses, and ETL tools assembled to try to meet these requirements. Even then, the team was only able to store two weeks of data in its data warehouses due to cost, limiting its ability to look backward in time. Furthermore, the data warehouses chosen were not able to run machine learning workloads.
The Solution
Using the Databricks Unified Analytics Platform, the company was able to put its new architecture into production in just two weeks with a team of five engineers.
The new architecture is simple and performant. End-to-end latency is low (seconds to minutes), and the threat response team saw up to 100x query speed improvements over open source Apache Spark on Parquet. Moreover, using Databricks, the team is now able to run interactive queries on all its historical data (not just two weeks' worth), making it possible to better detect threats over longer time horizons and conduct deep forensic reviews. The team also gained the ability to leverage Apache Spark for machine learning and advanced analytics.
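The value of a longer lookback can be shown with a small sketch. The example below is plain Python over an in-memory list, not the company's actual pipeline or any Spark API; the event fields (`ts`, `host`, `remote_ip`), the IP addresses, and the `trace_indicator` helper are hypothetical and chosen purely for illustration.

```python
from datetime import datetime, timedelta

def trace_indicator(events, indicator_ip, now, retention_days=None):
    """Find hosts that talked to a suspicious IP within the retained window.

    retention_days=None means the full history is searchable.
    Returns (sorted list of hosts, earliest sighting or None).
    """
    cutoff = None if retention_days is None else now - timedelta(days=retention_days)
    hits = [
        e for e in events
        if e["remote_ip"] == indicator_ip and (cutoff is None or e["ts"] >= cutoff)
    ]
    hosts = sorted({e["host"] for e in hits})
    first_seen = min((e["ts"] for e in hits), default=None)
    return hosts, first_seen

# Hypothetical connection log; addresses come from documentation-reserved ranges.
events = [
    {"ts": datetime(2017, 9, 15), "host": "hr-11", "remote_ip": "203.0.113.7"},
    {"ts": datetime(2017, 11, 28), "host": "app-03", "remote_ip": "203.0.113.7"},
    {"ts": datetime(2017, 11, 30), "host": "web-01", "remote_ip": "198.51.100.2"},
]
now = datetime(2017, 12, 1)

recent_hosts, recent_first = trace_indicator(events, "203.0.113.7", now, retention_days=14)
all_hosts, all_first = trace_indicator(events, "203.0.113.7", now)
print(recent_hosts, recent_first)
print(all_hosts, all_first)
```

With only two weeks retained, the search in this toy data finds a single affected host and dates the first contact to late November. Searching the full history surfaces a second host and an earlier first sighting in September, which changes both the scope and the timeline of the investigation.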
As cybercriminals continue to evolve their techniques, cybersecurity teams must evolve how they detect and prevent threats as well. Big data analytics and AI offer new hope for organizations looking to improve their security posture, but choosing the right platform is critical to success.
Download our Cybersecurity Analytics Solution Brief or watch the replay of our recent webinar Enhancing Threat Detection with Big Data and AI to learn how Databricks can enhance your security posture.