A Bird’s Eye View: DNS and Domain Logging
Introduction
Logs are the essential building blocks in security. During my time as a vendor-side evangelist for a log collection software suite, one of the common use cases I encountered was DNS log collection and integrations. Therefore, as part one of a multi-part series on log collection, I decided to share what I know about logging, with a focus on logs sourced from DNS (server and client) and other log sources holding valuable IP, hostname, and domain metadata.
In the inaugural post of this series, we will cover a bird’s eye view of logging—reasons for log collection, industry guidance, and logging statistics. The subsequent blog series will cover the role of targeted log collection in defense, with pointers toward research and use cases. We also keep in mind those involved in integrating log sources and SIEMs. We will be publishing a series on Linux and Windows DNS server and client log collection and deployment, as well as other log sources (cloud, auditing, mail logs, network defense logs, and more).
Reasons for Log Collection
What are the reasons to collect logs? An overview of the main reasons includes:
- Troubleshooting operational issues
- Maintaining event integrity
- Auditing and compliance purposes
- Building a centralized log management server
- Following best practices and security guidelines. For example, being able to answer these questions:
  - What are the important security events to monitor?
  - What are the notable domain and DNS events to monitor?
- Informing analysts with the data they need for investigations, alerts, reporting, and more.
- Finding MITM (Man in the Middle) spoofing or hijacking of DNS responses
- Finding evidence of communication with CnC (Command and Control)
- Finding evidence of Social Engineering/Phishing (such as active use of deceptive domains)
Reasons and rationale may not be enough, especially if you plan to convince your company to begin or improve logging. Luckily, industry guidance is already available, as covered in the next section.
Logging Guidance from the Industry
I recommend using the industry guidance as a starting point for log deployments while ensuring there is scope to capture not only DNS logs, but also other log sources that hold valuable metadata, such as auditing events and log metadata containing IP addresses and hostnames.
Palantir
Palantir’s WEF (Windows Event Forwarding) guidance includes both DNS client and server events in its GitHub repository for WEF.
NSA
The NSA published the “Windows Event Monitoring Guidance”, which includes Windows Event Log channels such as “Microsoft-Windows-DNS-Client” and “Microsoft-Windows-DNSServer”, in addition to the names of relevant Event Tracing for Windows (ETW) providers.*
Example configuration from the NSA Event Monitoring Guidance:
<Query Id="0" Path="Microsoft-Windows-DNS-Client/Operational">
<!-- 3008: DNS Client events Query Completed -->
<Select Path="Microsoft-Windows-DNS-Client/Operational">*[System[(EventID=3008)]]</Select>
<!-- Suppresses local machine name resolution events -->
<Suppress Path="Microsoft-Windows-DNS-Client/Operational">*[EventData[Data[@Name="QueryOptions"]="140737488355328"]]</Suppress>
<!-- Suppresses empty name resolution events -->
<Suppress Path="Microsoft-Windows-DNS-Client/Operational">*[EventData[Data[@Name="QueryResults"]=""]]</Suppress>
</Query>
<Query Id="1" Path="DNS Server">
<!-- 150: DNS Server could not load or initialize the plug-in DLL -->
<!-- 770: DNS Server plugin DLL has been loaded -->
<Select Path="DNS Server">*[System[(EventID=150 or EventID=770)]]</Select>
<!-- NOTE: The ACL for Microsoft-Windows-DNSServer/Audit may need to be updated to allow read access by Event Log Readers -->
<!-- 541: The setting serverlevelplugindll on scope . has been set to $dll_path -->
<Select Path="Microsoft-Windows-DNSServer/Audit">*[System[(EventID=541)]]</Select>
</Query>
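To make the Select and Suppress rules in the DNS client query more concrete, here is a minimal Python sketch that applies the same logic to events that have already been parsed into dictionaries. The dictionary layout and the sample values are assumptions made for illustration; only the event ID, the `QueryOptions` value, and the empty `QueryResults` check come from the configuration above. This is not how the Windows Event Log service evaluates the query, only a way to read its intent.

```python
# QueryOptions value the configuration uses to suppress local machine name resolution.
LOCAL_NAME_RESOLUTION = "140737488355328"


def keep_dns_client_event(event: dict) -> bool:
    """Return True if the event would survive the Select/Suppress rules."""
    # Select: only Event ID 3008 (DNS client query completed).
    if event.get("EventID") != 3008:
        return False
    data = event.get("EventData", {})
    # Suppress: local machine name resolution events.
    if data.get("QueryOptions") == LOCAL_NAME_RESOLUTION:
        return False
    # Suppress: events whose QueryResults field is present but empty.
    if data.get("QueryResults") == "":
        return False
    return True


# Hypothetical, hand-written sample events (not real log output).
events = [
    {"EventID": 3008, "EventData": {"QueryName": "www.example.com",
                                    "QueryOptions": "1",
                                    "QueryResults": "192.0.2.10;"}},
    {"EventID": 3008, "EventData": {"QueryName": "workstation01",
                                    "QueryOptions": LOCAL_NAME_RESOLUTION,
                                    "QueryResults": "192.0.2.20;"}},
    {"EventID": 3008, "EventData": {"QueryName": "empty.example.com",
                                    "QueryOptions": "1",
                                    "QueryResults": ""}},
    {"EventID": 3006, "EventData": {"QueryName": "other.example.com"}},
]

kept = [e["EventData"]["QueryName"] for e in events if keep_dns_client_event(e)]
print(kept)
```

Only the first sample event passes: the second is a local name resolution, the third has empty results, and the fourth is not Event ID 3008.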
OWASP (The Open Web Application Security Project)
The OWASP Top 10 Security Vulnerabilities for 2020 included Insufficient Logging and Monitoring, a category that also appeared in the 2017 report.
From OWASP:
Insufficient logging and monitoring, coupled with missing or ineffective integration with incident response, allows attackers to attack systems further, maintain persistence, pivot to more systems to tamper with, extract, or destroy data. Most breach studies demonstrate the time to detect a breach is over 200 days, typically detected by external parties rather than internal processes or monitoring.
SANS Critical Controls
SANS Critical Controls (CIS Controls 7.1) has Sub Control 8.7 (Network), which includes the recommendation: “Enable domain name system (DNS) query logging to detect hostname lookup for known malicious C2 domains.”
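As a rough illustration of what this control enables, the sketch below checks queried hostnames from DNS logs against a list of known malicious domains, matching the name itself and each of its parent domains. The blocklist entries and query names are hypothetical placeholders under the reserved `.example` and `.test` TLDs, and the function names are my own; a real deployment would normally do this matching in the SIEM or with a threat intelligence feed.

```python
def normalize(name: str) -> str:
    """Lowercase a DNS name and drop surrounding whitespace and the trailing dot."""
    return name.strip().rstrip(".").lower()


def matches_blocklist(query_name: str, blocklist: set) -> bool:
    """Return True if the queried name or any parent domain is on the blocklist."""
    labels = normalize(query_name).split(".")
    # Check "a.b.c", then "b.c", then "c".
    for i in range(len(labels)):
        if ".".join(labels[i:]) in blocklist:
            return True
    return False


# Hypothetical blocklist of known malicious C2 domains.
BLOCKLIST = {"bad-c2.example", "evil.test"}

# Hypothetical queried names as they might appear in DNS query logs.
queries = ["beacon.bad-c2.example.", "WWW.Evil.Test", "intranet.corp.example"]

hits = [q for q in queries if matches_blocklist(q, BLOCKLIST)]
print(hits)
```

Matching on parent domains is a design choice: it catches subdomains used for beaconing, at the cost of flagging every host under a listed domain.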
Open Source
SwiftOnSecurity GitHub repository for Sysmon configuration: https://github.com/SwiftOnSecurity/sysmon-config
Other guidance: general log collection guidance
- CREST Cyber Security Monitoring and Logging Guide
- UK NCSC Logging Made Easy
- NIST SP 800-81-2 Secure Domain Name System (DNS) Deployment Guide
- PCI on Effective Daily Log Monitoring
Log Deployment Numbers: A Scenario
To fulfill the needs of analyzing IOCs (Indicators of Compromise) and other threat-hunting activities, as well as the operational requirements to fulfill SOC work (such as alerting and triage), companies require log data. And mountains of it. DNS and domain-related telemetry events work with other log data sources, such as authentication or auditing logs.
More information and data are required: once you know the exfiltration point (a malicious domain) via the logs, what entry point did the attackers use? The data needs to be specific as well. That is why, for example, the Equifax 2017 post-mortem indicated that one of the early signals of the attack was the whoami command being run on a target server.
Splunk Instance Deployment Example
The following details come from a real deployment on a Splunk instance:
- Central Splunk Instance with the capacity of 12TB
- Up to 20TB effective capacity by the end of 2020
- 50+ Splunk Universal Forwarders on Windows
- 300+ Splunk Universal Forwarders expected by the end of 2020
- Actual Volume of Data: 200 GB/Day
- 300+ GB/Day expected by the end of 2020
- 100 Source types
- 650 independent log sources
One interesting point is the upward trend toward widening log source coverage and, in turn, increased log collection.
A Deployment Example – Potential Logging Stats
Based on the figure of 300+ GB/Day from the earlier example, the following is a potential set of logging statistics:
- 7400 EPS (Events per Second)
- 639,360,000 EPD (Events Per Day)
- 90 Days Raw Log Retention
- 90 Days SIEM Log Retention
- 1:1 SIEM Storage Compression
- 298 Total Raw Log Data (GB/day)
- 893 Total Normalized Log Data (GB/day)
- 26,820 GB Raw Storage Requirement
- 80,370 GB Total SIEM Storage Requirement
Where:
- Total Raw Log Data (GB/day) value assumes 500 bytes per log event.
- Total Normalized Log Data (GB/day) value assumes 1500 bytes per stored record.
The value of 300+ GB/Day was used to calculate backward to the number of log events it may represent. A SIEM storage calculator (BuzzCircuit SIEM Storage Calculator*) was used to arrive at roughly 7,400 log events per second, or about 639 million log events per day. Remember that these are log events, not the full telemetry data, as not all sources get logged. There may also be additional parsing rules for these logs, such as rules to deduplicate logs or drop unimportant metadata.
*These calculators are a useful utility for deployments to figure out how much budget and infrastructure are needed to meet capacity to collect logs, in addition to baseline tests.
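The reverse calculation can be approximated in a few lines of Python. This is a sketch under my own assumptions, not the calculator's actual method: it assumes 500 bytes per raw event (as stated above) and treats a gigabyte as 1024^3 bytes, which is the unit that reproduces the 298 GB/day figure.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_GB = 1024 ** 3  # assumption: binary gigabytes


def events_per_second(gb_per_day: float, bytes_per_event: int = 500) -> float:
    """Work backward from a daily log volume to an average event rate."""
    events_per_day = gb_per_day * BYTES_PER_GB / bytes_per_event
    return events_per_day / SECONDS_PER_DAY


def raw_gb_per_day(eps: float, bytes_per_event: int = 500) -> float:
    """Work forward from an event rate to a daily raw log volume."""
    return eps * SECONDS_PER_DAY * bytes_per_event / BYTES_PER_GB


# 300 GB/day at 500 bytes per event lands between 7,400 and 7,500 EPS.
print(round(events_per_second(300)))
# 7,400 EPS is 639,360,000 events per day, or about 298 GB/day of raw logs.
print(7400 * SECONDS_PER_DAY, round(raw_gb_per_day(7400)))
```

The per-event size dominates the result, so it is worth measuring the average event size from a sample of your own logs before relying on numbers like these.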
By the time a user reaches the stage where they are considering additional help, such as DomainTools integrations or Iris Investigate, to aid in their threat detection and analysis of domains, they may have this scale of data to process.
What About DNS Server Logging Stats?
You may also want to get an idea of the log generation numbers for DNS server events. The DNS server sample from a SolarWinds paper on estimating log generation states that, for them, “2 Windows DNS Server” for 1000 employees can generate 100 EPS, or Events per Second (peak, average peak)*. By my calculation, that is up to 8,640,000 EPD, or Events per Day (based on peak).
Based on my calculations, using the same record sizes and 90-day retention as in the earlier example:
- 4 GB/day Total Raw Log Data
- 12 GB/day Total Normalized Log Data
- 360 GB Raw Storage Requirement
- 1,080 GB Total SIEM Storage Requirement
Only the events-per-second figure comes from the SolarWinds paper; the rest are approximate numbers I calculated based on the known peak EPS.
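For readers who want to redo this estimate with their own numbers, the following Python sketch reproduces the figures from the peak EPS. It is my own reconstruction rather than any calculator's algorithm, and it assumes 500 bytes per raw event, 1500 bytes per normalized record, 90 days of retention, 1:1 SIEM compression, gigabytes of 1024^3 bytes, and daily volumes rounded before multiplying by the retention period.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400
BYTES_PER_GB = 1024 ** 3  # assumption: binary gigabytes


@dataclass
class SizingEstimate:
    events_per_day: int
    raw_gb_per_day: int
    normalized_gb_per_day: int
    raw_storage_gb: int
    siem_storage_gb: int


def estimate_sizing(eps: int, retention_days: int = 90,
                    raw_bytes: int = 500,
                    normalized_bytes: int = 1500) -> SizingEstimate:
    """Estimate daily volume and retained storage from an event rate."""
    events_per_day = eps * SECONDS_PER_DAY
    # Daily volumes are rounded to whole gigabytes before applying retention.
    raw_gb = round(events_per_day * raw_bytes / BYTES_PER_GB)
    normalized_gb = round(events_per_day * normalized_bytes / BYTES_PER_GB)
    return SizingEstimate(
        events_per_day=events_per_day,
        raw_gb_per_day=raw_gb,
        normalized_gb_per_day=normalized_gb,
        raw_storage_gb=raw_gb * retention_days,
        # 1:1 compression: SIEM storage equals normalized volume times retention.
        siem_storage_gb=normalized_gb * retention_days,
    )


dns_servers = estimate_sizing(eps=100)
print(dns_servers)
```

Because the input is a peak rate, the output is an upper bound; substituting an average EPS would give a lower, more typical estimate.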
One thing to keep in mind is that these estimates exclude other log sources. Those additional sources are also of interest and would provide valuable metadata, such as:
- DNS Client events
- Network connection logs, such as from Windows Firewall
- FQDN metadata from proxy logs
- Hostname (source and destination) from message tracking logs
- DNS Query events
More information about these log sources, including log samples, will be covered in a future blog post.
Conclusion
Defenders should pay special attention to the internals of their own DNS servers, DNS query logs from clients, network perimeter activities, mail logs, proxy logs, and more. Industry guidance illustrates the importance of collecting these (i.e., DNS, IP, hostname, and domain logs). Deploying log collection at this scale is no small feat for an enterprise, as shown by the log deployment statistics.
In upcoming parts of the series, we will show how log sources provide the backbone for defense, walk through deployment examples, and explain how relevant metadata from these logs can be used for threat hunting and analysis. When relevant logs are collected, analysts can build a far more comprehensive picture of IOCs, from a potential unauthorized intrusion via a deceptive phishing portal to malicious exfiltration beyond the network to an external IP, and more.
Leverage your Logs from Source to SIEM (and more)
Make the most of relevant logging sources in your environments and networks. Armed with the right metadata from your server, client, and network endpoints, you can improve and strengthen defenses using DomainTools Iris Detect, Iris Investigate, and numerous APIs.
Additional Resources
Use APIs to enrich your log data programmatically. The Iris Investigate API and Iris Enrich API, which process the same data sources as the Iris Investigate UI, provide Domain Risk Scores from Proximity and Threat Profile algorithms. See the API documentation to learn more. In addition to APIs, use DomainTools integrations to find threat intelligence in your domain metadata.
*Event Forwarding Guidance on Github.