MDR for Microsoft 365 Progress Report #1

As promised, this is the first of many updates on our progress addressing the issues we've identified and others have pointed out with our MDR for Microsoft 365 product. This list isn't exhaustive. There are other minor bug fixes and refactors happening, but we wanted to call some out since they are important.
Spur location and VPN data generating signals for our SOC analysts:
We've partnered with Spur to enrich the IP addresses associated with Microsoft 365 events. This lets us geolocate an IP address, and therefore the user who generated the event. Spur also tells us what services an IP address provides, so we can flag addresses that are part of VPN and anonymization networks. That's a good signal to our SOC analysts that they need to take a second look at what that login session has been doing.
We've been capturing this data for a little while, but we've finally finished the work needed to feed it into our detection engine, so our Detection Engineering team can build detectors around suspicious activity and our SOC analysts have better insight when investigating suspicious behavior. After deploying this to production, we identified and reported 90 cases of suspicious activity to partners last week. We even had a partner post about our SOC detecting them working remotely and locking their account to be safe.
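To make the idea concrete, here's a minimal sketch of how IP enrichment data like this can drive an analyst-facing signal. The field names (country, is_vpn, is_anonymizer) are illustrative assumptions, not Spur's actual API schema:

```python
# Hypothetical sketch: turning IP enrichment data into a SOC review signal.
# Field names are illustrative; Spur's real response schema may differ.

def login_needs_review(enrichment: dict, expected_countries: set) -> bool:
    """Flag a login session for analyst review based on IP enrichment."""
    if enrichment.get("is_vpn") or enrichment.get("is_anonymizer"):
        return True  # anonymized traffic warrants a second look
    if enrichment.get("country") not in expected_countries:
        return True  # login from an unexpected geography
    return False

# Example: a login through a VPN exit node gets flagged.
event = {"country": "US", "is_vpn": True, "is_anonymizer": False}
print(login_needs_review(event, expected_countries={"US"}))  # True
```

A real pipeline would feed signals like this into detectors rather than act on them directly, since VPN use alone is common and benign.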
Fixed a bug that caused us to miss existing malicious inbox rules for new M365 users:
We found that when we imported existing inbox rules for M365 users during Huntress onboarding, we were not generating alerts for our SOC analysts to report. A bug caused these events not to match our detectors, so we couldn't report on malicious inbox rules that existed before we were deployed and began receiving Microsoft 365 events from the audit log. After realizing this, we conducted a manual threat hunt for suspicious inbox rules we may have missed. In cases where a historical inbox rule detection occurred, we issued an incident report and sent a separate note with additional context to the impacted partner.
Obviously this is a big problem, and one we addressed very quickly. We're also putting testing in place to ensure this can't happen again. We already have fairly robust monitoring and observability in Datadog; we just need to expose metrics around this pipeline so we can tell if it stops working in the future.
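For readers curious what "malicious inbox rule" detection looks like in principle, here's an illustrative sketch. The rule shape and the specific predicates are assumptions for the example, not Huntress's actual detection logic:

```python
# Illustrative sketch: checking imported inbox rules against simple
# detector predicates. Rule shape and checks are assumptions, not
# Huntress's actual detectors.

HIDE_FOLDERS = {"rss subscriptions", "conversation history"}

def is_suspicious_rule(rule: dict, org_domain: str) -> bool:
    """Flag inbox rules commonly abused after a mailbox compromise."""
    actions = rule.get("actions", {})
    forward = actions.get("forward_to", "")
    if forward and not forward.endswith("@" + org_domain):
        return True  # auto-forwarding mail outside the organization
    if actions.get("delete"):
        return True  # silently deleting messages
    if actions.get("move_to_folder", "").lower() in HIDE_FOLDERS:
        return True  # hiding mail in rarely viewed folders
    return False

# A rule forwarding invoices to an outside address should be flagged.
rule = {"name": "invoices", "actions": {"forward_to": "drop@attacker.test"}}
print(is_suspicious_rule(rule, org_domain="example.com"))  # True
```

The bug above meant imported historical rules never reached checks like these; the fix routes onboarding-time rules through the same matching path as live audit-log events.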
Some detections being missed because we reached the maximum hits for an Elasticsearch query:
We found that in some cases we were missing detections because an Elasticsearch rule could return at most 100 hits. If a rule matched too many events in a short time period, the extra matches were silently dropped. This one was not obvious, because you don't know what you don't know, but we spotted events that we thought should have generated signals and didn't, and we've seen this issue with Elasticsearch before.
We've raised the maximum number of hits per query to 5,000 for now, and we're continuing the work of moving these detectors over to our new custom detection engine, which doesn't have this limitation.
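Raising the cap buys headroom, but the general fix for "too many hits in one page" in Elasticsearch is to paginate with search_after. Here's a sketch of that pattern; `search` stands in for a real client call (e.g. elasticsearch-py's `Elasticsearch.search`), and the sort fields are illustrative:

```python
# Sketch: paging through all matches with Elasticsearch's search_after,
# so no hit is dropped by a per-query size cap. `search` is a stand-in
# for the real client call; sort fields are illustrative.

def all_hits(search, query, page_size=1000):
    """Yield every hit for `query`, paging on a stable sort key."""
    search_after = None
    while True:
        body = {"query": query, "size": page_size,
                "sort": [{"@timestamp": "asc"}, {"event_id": "asc"}]}
        if search_after is not None:
            body["search_after"] = search_after
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        search_after = hits[-1]["sort"]  # resume after the last hit seen

# Tiny in-memory stand-in for the search call, to show the paging behavior.
docs = [{"sort": [i, i], "_source": {"n": i}} for i in range(12)]

def fake_search(body):
    after = body.get("search_after")
    start = 0
    if after is not None:
        start = next(i for i, d in enumerate(docs) if d["sort"] == after) + 1
    return {"hits": {"hits": docs[start:start + body["size"]]}}

print(sum(1 for _ in all_hits(fake_search, {"match_all": {}}, page_size=5)))  # 12
```

With a page size of 5, the 12 matches come back across three pages instead of being truncated at the first page's limit.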
Partners missing Microsoft 365 permissions due to Entra Application permissions update:
When you create an Entra Application (integration) to access Microsoft 365 tenant data, you have to specify the list of permissions (scopes) you'll need. The downside is that if you ever need to expand that list, everyone has to re-authorize your application. That's not ideal, since it requires user action, and getting the attention of busy administrators is challenging. A few months back we added new permissions to our Entra Application, and while we thought we got everyone to re-authorize it, we found that not everyone had granted the permissions our system needs to take the necessary remediations.
We've contacted the 71 affected partners, and our support team is working with them to restore the missing permissions. It's hard to call this one fully solved: we try to request only the permissions we need, so if we expand what we're doing and need more permissions later, we'll have to get everyone to re-authorize the application again. Going forward, we'll put checks in place to automatically nudge folks when a re-authorization is needed, and we'll try to get as many of the permissions right the first time as we can.
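The automated check amounts to a set difference between the scopes a tenant has granted and the scopes we require. A minimal sketch, with a hypothetical required-scope list (the exact scopes Huntress requests aren't specified here):

```python
# Sketch: verifying a tenant's granted scopes still cover what the
# application needs. The required-scope list is a hypothetical example.

REQUIRED_SCOPES = {
    "AuditLog.Read.All",
    "User.Read.All",
    "MailboxSettings.ReadWrite",  # e.g. needed to remove malicious inbox rules
}

def missing_scopes(granted: set) -> set:
    """Return scopes the tenant must re-consent to; empty if fully covered."""
    return REQUIRED_SCOPES - granted

# A tenant that never re-authorized after the permissions update:
granted = {"AuditLog.Read.All", "User.Read.All"}
print(sorted(missing_scopes(granted)))  # ['MailboxSettings.ReadWrite']
```

Running a check like this on a schedule, and alerting the partner when the result is non-empty, is what turns "we thought everyone re-authorized" into something verifiable.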
New detectors added:
I don't know how helpful listing the new detectors we're adding will be, but we've gotten a decent number of requests from folks to help them understand what types of things we're detecting, so here are a few new detectors we shipped:
  • Login from VPN
  • Login from proxy
  • Login from brute force IP
  • Login from TOR
  • Login from new region
  • Login from RDP
Why do these detectors matter? Because they close much of the gap we had in detection efficacy. We have a few more key detectors coming in the next couple of weeks.
Conclusion
While we identified issues with some of our detectors and infrastructure, we were able to fix them and put tests in place to keep them from happening again. Nothing will be perfect the first time. Progress is the name of the game. We're going to keep iterating on what we have to make it better and to chip away at the never-ending list of TODOs.
-- Chris