Friday, April 4, 2014 - 09:57

[Updated with further information and suggestions provided by CSIRTs: thanks!]

One incident response tool that seems to be growing in value is passive DNS monitoring, described in Florian Weimer's original paper. As described in the references at the bottom of this post, patterns of activity in the Domain Name System – when names change, move or are looked up – can be used to give early warning of phishing campaigns, botnets, malware, and more. And this is achieved with a negligible impact on the privacy of Internet users.

DNS is sometimes described as the phone book of the Internet: it's the distributed public database that lets us humans type in www.bbc.co.uk and our computers know that they need to contact the much less memorable 212.58.244.71. If DNS is all the phone books in the world, then passive DNS monitoring is a bit like a traditional reference library where you're asked not to put books back on the shelf, so that staff know which books have been consulted but not who read them or which other books that individual read. Similarly, passive DNS records only what questions were asked of the Domain Name System and what the answers were at the time (in my example the pDNS record would be "6/3/14 15:30:21 www.bbc.co.uk A 212.58.244.71"). Any information about the source of the question, or anything that could link it to other DNS questions I might ask, is either not collected at all or immediately discarded.
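To make the shape of such a record concrete, here is a minimal Python sketch. It assumes the sensor receives each observation as a dictionary; the field names (including `client_ip`) and the `to_passive_record` helper are hypothetical, and the date in the example is read as 6 March 2014. The point is only that the stored record has no slot for the source of the query.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PassiveDnsRecord:
    """What a passive DNS sensor keeps: time, question and answer only."""
    timestamp: datetime
    name: str
    rtype: str
    answer: str


def to_passive_record(observation: dict) -> PassiveDnsRecord:
    """Copy the question, answer and time; the client address is never copied."""
    return PassiveDnsRecord(
        timestamp=observation["timestamp"],
        name=observation["name"].lower().rstrip("."),
        rtype=observation["rtype"].upper(),
        answer=observation["answer"],
    )


# Hypothetical observation as a resolver might see it.
observation = {
    "timestamp": datetime(2014, 3, 6, 15, 30, 21),
    "client_ip": "192.0.2.10",  # assumed field name; deliberately discarded
    "name": "www.bbc.co.uk.",
    "rtype": "a",
    "answer": "212.58.244.71",
}
record = to_passive_record(observation)
```

Because the record type has no field for the client, nothing downstream can accidentally log or share it.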

A couple of examples show why even that information can be very useful. If the answers to "where is www.bbc.co.uk?" suddenly changed country then we might wonder whether something had gone wrong. If a previously unknown IP address is suddenly of interest to many different computers and its domain name looks like my-online-banking-service.we4se8934ds.com then we might suspect that a new round of phishing e-mails has just been sent. Although there may be legitimate explanations for these kinds of unusual patterns, they definitely highlight things worth further investigation.
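The second example might be approximated along the following lines. This is a rough sketch, not a description of any real detection system: the `newly_popular` function, the `known_answers` set and the threshold are all assumptions made for illustration, and the number of observations is used as a crude stand-in for "many different computers", since the records carry no client identity. The addresses other than the BBC one are documentation-range placeholders.

```python
from collections import Counter


def newly_popular(records, known_answers, threshold):
    """Return (name, answer) pairs whose answer was not previously known
    and which were observed at least `threshold` times in this batch."""
    counts = Counter(
        (name, answer) for name, answer in records if answer not in known_answers
    )
    return sorted(pair for pair, seen in counts.items() if seen >= threshold)


# Answers already seen in earlier data (hypothetical history).
known_answers = {"212.58.244.71"}

# One batch of (name, answer) observations from a sensor.
window = (
    [("www.bbc.co.uk", "212.58.244.71")] * 5
    + [("my-online-banking-service.we4se8934ds.com", "203.0.113.9")] * 4
    + [("example.org", "198.51.100.7")]
)

suspects = newly_popular(window, known_answers, threshold=3)
```

Anything flagged this way is only a candidate for investigation; as noted above, there may be legitimate explanations.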

Logging IP addresses may seem to raise issues around protecting personal data but the kinds of information collected by passive DNS should not involve privacy or compliance risks:

No information is logged about either the individual who made the query or their computer so there are no identifiers or patterns that could be used to link queries to those who made them;
The great majority of DNS queries, and therefore of the answers logged by passive DNS, are for the IP addresses of multi-user servers (web, e-mail, etc.) that are not associated with an individual person;
The proportion of responses relating to client machines can be further reduced if organisations exclude queries for local names and addresses from their passive DNS logs. If one of the organisation’s own addresses is behaving strangely then it should fix the problem, not just log it!
Everything logged by passive DNS is extracted from public databases maintained by those responsible for the addresses and names.

These precautions cannot completely eliminate the possibility of processing personal data. For example, if an organisation names computers with public addresses after their individual owners then these are likely to appear in the public DNS database and may be captured by a passive DNS sensor. Such "user-assisted" privacy infringements are, however, a necessary consequence of a technique that is very effective in helping incident response teams detect and mitigate the much greater privacy breaches that result from the phishing of bank account credentials. Both the e-Privacy Directive (Recital 53) and draft Data Protection Regulation recognise incident detection and response, protecting both systems and data, as a legitimate reason for processing personal data; the minimal processing of personal data involved in passive DNS should be considered clearly proportionate to that aim.

It may be possible to further tune passive DNS systems to exclude records that are unlikely to be relevant to incident response or carry a higher privacy risk. For example:

not recording error responses, which may include typing errors that expose private data (see Merike Kaeo's paper for how this can happen);
not recording responses from DNS servers known to return wildcard responses to requests containing typos;
not recording (or at least not sharing) responses involving private IP addresses or other DNS data not intended to be available on the public Internet;
recording only specific record types known to include indicators of common problems (e.g. A, AAAA, MX, NS).

This does, however, mean that there will be no historic data available if the excluded records do subsequently turn out to be needed for a specific investigation.
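Some of the exclusions listed above could be expressed as a simple filter. The sketch below is one possible reading, under stated assumptions: the `should_record` function and its arguments are hypothetical, response codes are assumed to arrive as strings such as "NOERROR", and "private" is taken to mean whatever Python's standard `ipaddress` module classifies as private (RFC 1918 space plus other special-purpose ranges). Excluding known wildcard-returning servers is not covered.

```python
import ipaddress

# Record types the sensor keeps (taken from the examples in the text).
ALLOWED_TYPES = {"A", "AAAA", "MX", "NS"}


def should_record(rcode: str, rtype: str, answer: str) -> bool:
    """Decide whether a response is kept in the passive DNS log."""
    # Drop error responses, which may contain mistyped private data.
    if rcode != "NOERROR":
        return False
    # Keep only the chosen record types.
    if rtype not in ALLOWED_TYPES:
        return False
    # For address records, drop private or unparseable addresses.
    if rtype in {"A", "AAAA"}:
        try:
            if ipaddress.ip_address(answer).is_private:
                return False
        except ValueError:
            return False
    return True
```

For example, `should_record("NOERROR", "A", "192.168.1.5")` returns `False`, while the same call with 212.58.244.71 returns `True`.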

Any organisation or network that (unlike Janet) runs its own DNS resolver can collect passive DNS data from it. However, the value of passive DNS data increases if it is shared, since patterns such as the start of a phishing campaign are easier to detect with data from a large range of Internet locations. Aggregating records also further reduces privacy risk, as runs of duplicate records can be reduced to counts: "between times X and Y there were N queries for www.bbc.co.uk that returned 212.58.244.71". Passive DNS sharing is an excellent example of a technique that improves both security and privacy.
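The reduction of duplicate records to counts could look something like this. It is a minimal sketch assuming records are available as (timestamp, name, type, answer) tuples; the `aggregate` function is hypothetical, and the timestamps are invented for illustration.

```python
from datetime import datetime


def aggregate(records):
    """Collapse duplicate records into (first_seen, last_seen, count),
    keyed by (name, record type, answer)."""
    summary = {}
    for ts, name, rtype, answer in records:
        key = (name, rtype, answer)
        if key in summary:
            first, last, count = summary[key]
            summary[key] = (min(first, ts), max(last, ts), count + 1)
        else:
            summary[key] = (ts, ts, 1)
    return summary


# Three observations of the same question and answer.
records = [
    (datetime(2014, 3, 6, 15, 30, 21), "www.bbc.co.uk", "A", "212.58.244.71"),
    (datetime(2014, 3, 6, 15, 31, 0), "www.bbc.co.uk", "A", "212.58.244.71"),
    (datetime(2014, 3, 6, 15, 45, 10), "www.bbc.co.uk", "A", "212.58.244.71"),
]
summary = aggregate(records)
```

Each entry then reads as "between the first and last time there were N queries for this name that returned this answer", and the individual query times need not be shared at all.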