Proactive Threat Hunting in Duo Data
00. Catching Smoke
Duo Security processes over a billion authentications per month. Almost all of them are benign – a user logging in as part of their work or school routine to access their protected applications. A very small percentage of authentications, however, are what we'd consider malicious. Maybe a user clicked on a phishing link and handed over an MFA passcode. Maybe an attacker gained someone's first-factor credentials and sent them Push requests in the hopes that they would eventually hit accept. MFA attacks can unfold in many ways, and understanding what they look like within real authentication data is critical to detecting and preventing them effectively.
Machine learning can be an effective tool for identifying malicious activity in log data. Unsupervised methods (those not relying on a pre-labeled dataset for training) can detect anomalous behavior, but because users tend to be unpredictable, they often produce a high false positive rate. Supervised methods (those relying on a labeled dataset for training) can be effective at identifying known attacks; research has shown that neural networks like long short-term memory (LSTM) models can be effective at detecting malicious activity in imbalanced datasets. Other work has shown that classification of benign and malicious authentications can be achieved using more explainable approaches like random forest models. These supervised methods, however, require a robust training dataset containing labeled instances of malicious and benign authentications.
In short, we can't prevent attacks we aren't looking for, and we can't build effective detection mechanisms without an understanding of how attacks show up in our data.
01. Intelligent Threat Hunting in Big Data
Threat hunting is the act of looking for malicious activity in data. We at Duo have been ramping up our threat hunting activities so that we can better understand threats and create more effective models.
Reactive Threat Hunting
We classify threat hunting into two modalities: reactive and proactive. Reactive threat hunting typically starts with a customer notifying us that one or more of their users may have been compromised. This process tends to blur the lines between IR/investigation and threat hunting. When we receive a potential IOC (indicator of compromise), like a suspicious IP address making requests to the customer's user accounts, we can support the customer in investigating the immediate impact while also using that information to hunt for similar patterns across our entire customer base.
During an investigation, we can help confirm the threat and dig into the data to better understand things like the following:
- Scope of impact: Was it just one user impacted, or are we able to map things like the attacker's IP address to logs across many users? Do we see any signs of persistent access, like an attacker-registered authentication device?
- Root cause: Did a user get phished? Were they part of a brute force attack? Was a compromise possibly the result of misconfigured access policies?
- Immediate mitigation: What users were impacted, and do they need to take immediate action such as rotating credentials? Are there any attacker-owned devices that need to be removed from the customer's environment?
- Product-rooted prevention: What capabilities, if any, exist within Duo's current offering that could have prevented the attack? Are there any gaps in the product that, if closed, would enable more effective prevention in the future?
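The scope-of-impact questions above can be sketched as a simple lookup over log events. This is a minimal illustration only: the `LogEvent` fields, the `scope_of_impact` helper, and the sample addresses are all hypothetical and do not reflect Duo's actual log schema or tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    # Hypothetical fields, chosen for illustration only.
    kind: str        # "auth" or "enroll"
    user: str
    ip: str
    device_id: str


def scope_of_impact(events, attacker_ip):
    """Return the users touched by an IP and any devices enrolled from it."""
    users = sorted({e.user for e in events if e.ip == attacker_ip})
    enrolled = sorted(
        {(e.user, e.device_id) for e in events
         if e.ip == attacker_ip and e.kind == "enroll"}
    )
    return {"users": users, "attacker_enrolled_devices": enrolled}


# Made-up sample data using documentation-reserved IP ranges.
events = [
    LogEvent("auth", "alice", "203.0.113.7", "phone-a"),
    LogEvent("auth", "bob", "203.0.113.7", "phone-b"),
    LogEvent("enroll", "bob", "203.0.113.7", "phone-x"),
    LogEvent("auth", "carol", "198.51.100.2", "phone-c"),
]
report = scope_of_impact(events, "203.0.113.7")
print(report)
```

A device enrolled from the suspicious IP is only a candidate sign of persistent access; it would still need human review before removal.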
In cases where the customer may require assistance in investigating an ongoing breach, we will typically offer the above information and assist with any further questions that may require more visibility into our data. It is important to note that this assisted incident investigation is still in its early stages as a process and is not yet formally offered by Duo as a service.
These instances, in addition to aiding the customer, benefit us by yielding more labeled data. However, a reactive approach to threats falls short in a few areas. First, we are always going to be behind. At this stage in the game, the attacker has likely gained access to one or more user accounts, and our role is simply to help a customer understand the severity of the situation. Second, we fail to capture attacks that evade a customer's own detection. Relying on customer notifications to capture true positive attacks would limit the scope of the attack data collected and could lead to a dataset heavily biased toward certain attack types. Attacks that rely on specific policy configurations, for example, may look benign to an individual customer and become visible only in data that spans all Duo customers.
Proactive Threat Hunting
Proactive threat hunting is one way that we augment reactive investigation of customer-driven notifications. By definition, proactive threat hunting requires us to find attacks we don't yet know about. One approach we have found effective is to analyze patterns across our entire customer base. For example, we can observe time-series patterns of an attacker network crawling customer integrations and user accounts, as we did in our recent Talos collaboration on widespread VPN attacks. We can use high authentication failure rates to identify infrastructure responsible for large-scale brute force attacks, like these observed attacks against remote desktop servers. We can also identify suspicious authentication devices that are registered across multiple user accounts.
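As a rough illustration of the last pattern, the sketch below groups device enrollments by a device fingerprint and surfaces any fingerprint tied to several distinct accounts. The tuple layout, the `devices_shared_across_users` name, and the `min_users` threshold are assumptions made for this example, not a description of Duo's internal tooling.

```python
from collections import defaultdict


def devices_shared_across_users(enrollments, min_users=3):
    """Find device fingerprints registered to at least `min_users` accounts.

    enrollments: iterable of (fingerprint, customer, user) tuples
    (a hypothetical layout used only for this sketch).
    """
    owners = defaultdict(set)
    for fingerprint, customer, user in enrollments:
        owners[fingerprint].add((customer, user))
    return {
        fp: sorted(accounts)
        for fp, accounts in owners.items()
        if len(accounts) >= min_users
    }


# Made-up sample data.
enrollments = [
    ("fp-123", "cust-a", "alice"),
    ("fp-123", "cust-b", "bob"),
    ("fp-123", "cust-c", "carol"),
    ("fp-999", "cust-a", "dave"),
    ("fp-999", "cust-a", "dave"),  # a re-enrollment by the same user is not suspicious
]
suspicious = devices_shared_across_users(enrollments)
print(suspicious)
```

Counting distinct (customer, user) pairs rather than raw enrollment events keeps a single user who re-enrolls the same phone from being flagged.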
We then use these patterns to generate a triage queue of possible IOCs. We maintain that queue in a custom-built internal tool designed exclusively for large-scale exploration of our entire authentication data lake. This tool allows us to visualize threat actor activity at a variety of levels. An example plot, shown below, illustrates a high authentication failure rate from a single IP across multiple customers. Because these authentications are entirely Duo Push requests, the pattern indicates a Push Spray attack, in which an attacker targets a specific type of application, attempts guessable credentials, and sends Push requests in the hopes that at least one user will mistakenly accept the request.
This simple example shows the power of cross-customer, IOC-level analysis: it lets us see a story rather than a single data point. Many of these impacted customers saw only a handful of failed authentications and likely were unable to identify any malicious behavior. When the same activity is shown in a time-series plot, however, we are able to better understand the attacker's approach.
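The Push Spray signature described above (one IP, many customers, almost all failures, all Push) can be approximated with a per-IP aggregation. The following is a minimal sketch under stated assumptions: the record keys (`ip`, `customer`, `factor`, `result`), the `triage_push_spray` function, and every threshold are hypothetical and would need tuning against real data.

```python
from collections import defaultdict


def triage_push_spray(auths, min_customers=3, min_failure_rate=0.9, min_attempts=5):
    """Build a triage queue of IPs whose traffic resembles a Push Spray.

    auths: iterable of dicts with hypothetical keys
    "ip", "customer", "factor", and "result".
    """
    stats = defaultdict(
        lambda: {"total": 0, "failed": 0, "push": 0, "customers": set()}
    )
    for a in auths:
        s = stats[a["ip"]]
        s["total"] += 1
        s["failed"] += a["result"] != "success"   # True counts as 1
        s["push"] += a["factor"] == "push"
        s["customers"].add(a["customer"])

    queue = []
    for ip, s in stats.items():
        failure_rate = s["failed"] / s["total"]
        if (
            s["total"] >= min_attempts
            and len(s["customers"]) >= min_customers
            and failure_rate >= min_failure_rate
            and s["push"] == s["total"]            # traffic is entirely Push
        ):
            queue.append((ip, len(s["customers"]), round(failure_rate, 2)))
    # IPs touching the most customers go to the front of the queue.
    return sorted(queue, key=lambda row: row[1], reverse=True)


# Made-up sample data: one spraying IP and one ordinary IP.
auths = [
    {"ip": "203.0.113.7", "customer": f"cust-{i % 3}", "factor": "push", "result": "denied"}
    for i in range(6)
]
auths += [
    {"ip": "198.51.100.2", "customer": "cust-0", "factor": "push", "result": "success"}
    for _ in range(6)
]
queue = triage_push_spray(auths)
print(queue)
```

Each customer in the sample sees only two failures from the spraying IP, which would be easy to overlook in isolation; the IP only stands out once the rows are aggregated across customers.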
02. Leveling Up with Talos Threat Intelligence
As we identify possible IOCs in our data, we will often share them with our threat intelligence team at Talos for further investigation. This allows us to understand whether an IP address we see sending fraudulent requests to users, for example, might also be flagged in other sources like Talos honeypots or intelligence from other vendors such as GreyNoise. Cross-checking IOCs across multiple sources allows us to have high confidence when taking immediate action like blocking an IP from accessing Duo altogether.
This collaboration between Duo and Talos combines deep Duo product-specific expertise with broad, world-class threat research, resulting in improved iteration speed and improved investigation outcomes for both groups.
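The idea of corroborating an IOC across sources can be sketched as a simple membership count. Everything here is an assumption for illustration: the source names, the in-memory sets standing in for real intelligence feeds, the `corroborate` helper, and the two-source threshold. No real Talos or GreyNoise API is used or implied.

```python
def corroborate(ioc, sources, min_sources=2):
    """Report which sources flag an indicator and whether enough agree.

    sources: mapping of source name -> set of flagged indicators
    (hypothetical stand-ins for real intelligence feeds).
    """
    hits = sorted(name for name, flagged in sources.items() if ioc in flagged)
    return {
        "ioc": ioc,
        "seen_in": hits,
        "high_confidence": len(hits) >= min_sources,
    }


# Made-up feeds using documentation-reserved IP addresses.
sources = {
    "internal_hunting": {"203.0.113.7", "198.51.100.2"},
    "honeypot": {"203.0.113.7"},
    "vendor_feed": {"192.0.2.55"},
}
strong = corroborate("203.0.113.7", sources)
weak = corroborate("198.51.100.2", sources)
print(strong)
print(weak)
```

Requiring agreement from more than one independent source before blocking reduces the chance of cutting off a legitimate shared address, at the cost of acting later on indicators only one source has seen.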
03. Cataloging and Using Threat Data
Identifying a threat and letting customers know is a key first step, but our motivation for this work remains to use real-world attack data to better enable future detection and prevention development within the product. In an ideal world, we don't have to wait for an attacker to make access attempts across many customers before we can isolate and stop their activity.
To track attacks, we developed a pipeline that ingests IOCs (networks, device signatures) along with the timeframe in which they were active, and maps them to all relevant authentications. This enables us to maintain a high-confidence dataset of authentications known to be associated with malicious behavior. In six months, we have increased our volume of labeled authentications by over 300%, with attack patterns including phishing, brute force attacks, password spraying, and MFA fatigue. These labels lead to a better product and improved security for our customers. They can also empower future development of smarter machine learning models to detect and prevent attacks in real time.
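The mapping step described above can be sketched as a join between IOCs and authentications on network membership and time window. This is a minimal illustration, not the actual pipeline: the `IOC` fields, the `label_authentications` function, and the tuple layout of `auths` are hypothetical, and only network IOCs are handled (device signatures are left out for brevity).

```python
from dataclasses import dataclass
from datetime import datetime
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class IOC:
    # Hypothetical fields, chosen for illustration only.
    network: str          # CIDR block, e.g. "203.0.113.0/24"
    start: datetime       # first time the IOC was observed active
    end: datetime         # last time the IOC was observed active
    attack_type: str


def label_authentications(auths, iocs):
    """Label authentications that fall inside an IOC's network and time window.

    auths: iterable of (auth_id, ip, timestamp) tuples.
    Returns {auth_id: attack_type} for matching authentications only.
    """
    parsed = [(ip_network(ioc.network), ioc) for ioc in iocs]
    labels = {}
    for auth_id, ip, ts in auths:
        addr = ip_address(ip)
        for net, ioc in parsed:
            if addr in net and ioc.start <= ts <= ioc.end:
                labels[auth_id] = ioc.attack_type
                break  # the first matching IOC wins in this sketch
    return labels


# Made-up sample data.
iocs = [
    IOC("203.0.113.0/24", datetime(2024, 1, 1), datetime(2024, 1, 7), "push_spray"),
]
auths = [
    ("a1", "203.0.113.7", datetime(2024, 1, 3, 12)),  # in network, in window
    ("a2", "203.0.113.7", datetime(2024, 2, 1)),      # in network, outside window
    ("a3", "198.51.100.2", datetime(2024, 1, 3)),     # different network
]
labels = label_authentications(auths, iocs)
print(labels)
```

Bounding each IOC by its active timeframe matters because IP addresses are reassigned; without the window, later benign traffic from the same network would be mislabeled as malicious.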
04. Conclusions
Threat intelligence remains a cat-and-mouse game. Attackers change course rapidly, and tracking their behavior across billions of authentications can feel like a daunting task. By combining the expertise of data scientists, threat analysts, and customers, however, we have begun to catalog true positive attack data in a way that will help us proactively catch attacks as they unfold and improve product efficacy in the future.