DNS is fundamental to how the web works, and for most of the population it’s completely transparent. Everything on the web is accessed by a DNS name.
DNS is an old protocol (November 1987, in fact), built at a time when encryption was not viewed as necessary. That lack of encryption should not be confused with what DNSSEC addresses: DNSSEC's purpose is to validate that a DNS response has not been tampered with, not to hide it.
Because DNS queries are not encrypted, they are visible to every network device between you and the resolver. Even if they were encrypted, the resolver itself would still see them, and in almost all cases people do not run their own: they use either a local one in their network (normally a DNS query cache that passes misses on to an ISP DNS resolver) or an external one (see 8.8.8.8 / 1.1.1.1 / 9.9.9.9 and other catchy IP addresses).
A lot of “threat intelligence” relies on DNS data for research, so security teams inside of companies often log DNS queries so they can track down incidents later on.
However, to make this data meaningful, researchers need data from real systems in the wild. This is where they turn to larger companies that have data feeds from large resolvers silently logging the queries that pass through them.
This means that downstream customers’ data is sent away to 3rd party sources, who often charge to have researchers access the data.
So which ISPs are doing this? How can we find that out?
You just send every open DNS resolver a query that is tied to the ISP itself! After all, the IPv4 internet is only ~3.5 billion routable addresses!
So over ~18 hours my server sent 50,000 DNS packets per second to random internet hosts.
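As a back-of-the-envelope check on those numbers, the short sketch below multiplies the send rate by the scan duration. Only the two constants come from the post; the function name is made up for illustration.

```python
# Back-of-the-envelope check of the scan size.
PACKETS_PER_SECOND = 50_000   # send rate quoted in the post
SCAN_HOURS = 18               # approximate duration quoted in the post


def packets_sent(rate_pps: int, hours: float) -> int:
    """Total packets sent at a constant rate over the given number of hours."""
    return int(rate_pps * hours * 3600)


total = packets_sent(PACKETS_PER_SECOND, SCAN_HOURS)
print(f"{total:,} packets")  # 3,240,000,000 packets
```

So a scan at that rate for roughly 18 hours sends a little over 3.2 billion packets.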
To make this setup work, I bought a new domain name for this experiment and generated a DNS query name specific to every single ASN (and so to each ISP). On the sending side I discarded any responses and only listened for the queries that made it back to my domain's DNS nameservers.
The idea is that a lot of the resolvers I am hitting are actually just caches that forward to their local ISP resolver. From the source of each inbound query I can see which ISPs rely on which other networks to perform their lookups. A query sent to a home ISP that results in a nameserver query arriving from Google would suggest that 8.8.8.8 is in use, a query arriving from Cloudflare would suggest 1.1.1.1, and a query arriving from the ISP itself would suggest it's the ISP's local infrastructure looking up the query.
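To make the per-ASN query idea concrete, here is a minimal sketch in Python. The `a-<ASN>.<domain>` naming is an assumption inferred from the capture shown later in this post (`a-6848.4uqu.party`), and the function names are hypothetical; the actual tools linked at the end may well do this differently.

```python
import random
import struct

DOMAIN = "4uqu.party"  # experiment domain, as seen in the capture later in the post


def query_name(asn: int, domain: str = DOMAIN) -> str:
    """Build a query name that encodes the target ASN (assumed naming scheme)."""
    return f"a-{asn}.{domain}"


def build_query(name: str, qtype: int = 1) -> bytes:
    """Encode a minimal DNS query (RFC 1035 wire format); qtype 1 is an A record."""
    txid = random.randint(0, 0xFFFF)
    # Header: ID, flags (recursion desired), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN


packet = build_query(query_name(6848))
print(query_name(6848), len(packet), "bytes")
```

Each packet would then go out over UDP to port 53 of a target address inside that ASN. Because the ASN is baked into the name, any query that later reaches the domain's nameservers says which network the original probe was sent to.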
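The attribution logic described above can be sketched roughly as follows. It assumes the source IP of each inbound query has already been mapped to an AS number by some other means (for example a BGP table dump), which is not shown here. The function and label names are hypothetical; AS15169 and AS13335 are the AS numbers of Google and Cloudflare respectively.

```python
from collections import Counter

GOOGLE_ASN = 15169      # Google, operator of 8.8.8.8
CLOUDFLARE_ASN = 13335  # Cloudflare, operator of 1.1.1.1


def classify(target_asn: int, source_asn: int) -> str:
    """Guess which resolver handled a probe sent into target_asn,
    given the ASN the resulting nameserver query arrived from."""
    if source_asn == target_asn:
        return "isp-own-resolver"
    if source_asn == GOOGLE_ASN:
        return "google-public-dns"
    if source_asn == CLOUDFLARE_ASN:
        return "cloudflare"
    return "other-third-party"


# Made-up observations: (ASN the probe was sent to, ASN the query came back from)
observations = [(6848, 6848), (6848, 15169), (6848, 13335), (6848, 64500)]
tally = Counter(classify(target, source) for target, source in observations)
print(tally)
```

The "other-third-party" bucket is the interesting one, since it covers queries arriving from networks that have no obvious reason to be resolving that ISP's lookups.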
In addition, we can look up on the open platforms to see what subdomains were harvested:
There doesn’t seem to be a direct link between these ASNs and their resolvers (other than Google, which is unlikely to be selling only a very small percentage of its queries).
There is another interesting phenomenon. Even after scanning had finished, I was still getting queries as if I were still sending packets to some ISPs. I’ve come to call these “zombie queries”.
Here is an example:
I sent 2 million queries to Telenet in Belgium, and got 416 queries back at my name servers. However, after the scan had finished, I got 4 more queries, this time from a South African ISP:
2018-06-20 05:27:15 IP 126.96.36.199.51587 > 188.8.131.52.53: 32906 [1au] A? a-6848.4uqu.party. (46)
2018-06-20 05:28:51 IP 184.108.40.206.39401 > 220.127.116.11.53: 20872 [1au] ANY? a-6848.4uqu.party. (46)
2018-06-20 05:28:51 IP 18.104.22.168.35844 > 22.214.171.124.53: 2062 [1au] ANY? a-6848.4uqu.party. (46)
2018-06-20 05:28:52 IP 126.96.36.199.60279 > 188.8.131.52.53: 14686 [1au] ANY? a-6848.4uqu.party. (46)
This is especially curious, since the last 3 of those queries are for a different DNS query type (ANY instead of A). So something has to have processed the query sent to Telenet in Belgium and decided to poke at my subdomain some more.
This does not 100% mean that Telenet has sold data to a 3rd party that then re-queried me from South Africa. It could have been a Telenet customer with a config that pointed to another upstream DNS provider, and that provider sold it.
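Spotting these zombie queries in a capture can be sketched as below: parse each tcpdump-style line, pull the ASN back out of the query name, and flag anything that arrives after the scan of that ASN ended. The regular expression is written against the line format shown above, and the scan end time in the example is a made-up placeholder, since the real one is not stated in this post.

```python
import re
from datetime import datetime

LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) IP "
    r"(?P<src>\S+)\.(?P<sport>\d+) > (?P<dst>\S+)\.(?P<dport>\d+): "
    r"(?P<id>\d+) (?:\[\w+\] )?(?P<qtype>\w+)\? (?P<qname>\S+) \(\d+\)$"
)
ASN_RE = re.compile(r"^a-(\d+)\.")


def parse_line(line: str):
    """Parse one capture line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    asn = ASN_RE.match(m["qname"])
    return {
        "time": datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S"),
        "src": m["src"],
        "qtype": m["qtype"],
        "qname": m["qname"],
        "asn": int(asn.group(1)) if asn else None,
    }


def is_zombie(query: dict, scan_end: datetime) -> bool:
    """A query is a 'zombie' if it arrives after scanning of its ASN has ended."""
    return query["time"] > scan_end


line = ("2018-06-20 05:28:51 IP 184.108.40.206.39401 > 220.127.116.11.53: "
        "20872 [1au] ANY? a-6848.4uqu.party. (46)")
scan_end = datetime(2018, 6, 20, 0, 0, 0)  # hypothetical end of the scan
query = parse_line(line)
print(query["asn"], query["qtype"], is_zombie(query, scan_end))
```

Comparing the query type and source network of a zombie query against the original probe (an A query sent into the ASN in the name) is what makes cases like the one above stand out.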
You can explore the data I have gathered below. You can enter ISP names or the AS number:
You can find the tools I wrote for this over at: https://github.com/benjojo/dns-spies
Until next time!