Python DNS Resolver — Deep Dive
System-level framing
DNS is the foundation of internet addressing, yet most Python applications treat it as a black box delegated to the OS. Taking control of DNS resolution lets you build monitoring tools, security scanners, load balancer verifiers, and migration validators that would otherwise require shell scripts piping through dig.
The two main approaches in Python are the built-in socket module (which delegates to the OS resolver via getaddrinfo) and the dnspython library (which implements the DNS protocol natively). This deep dive focuses primarily on dnspython because it gives full protocol access.
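For contrast, the OS-resolver route can be sketched with the standard library alone. The helper name os_resolve is an assumption made for illustration; it only wraps socket.getaddrinfo and returns the unique addresses the operating system reports, honoring /etc/hosts and any local cache.

```python
import socket


def os_resolve(host: str) -> list[str]:
    """Return the unique IP addresses the OS resolver reports for host."""
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the address for both IPv4 and IPv6.
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    # dict.fromkeys removes duplicates while keeping the original order.
    return list(dict.fromkeys(info[4][0] for info in infos))
```

Calling os_resolve("example.com") returns whatever A/AAAA addresses the system resolver provides, but gives no access to TTLs or other record types, which is the gap dnspython fills.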
Installation and basic queries
pip install dnspython
import dns.resolver

# Simple A record lookup
answers = dns.resolver.resolve("example.com", "A")
for rdata in answers:
    print(f"IP: {rdata.address}")

# Check the TTL
print(f"TTL: {answers.rrset.ttl} seconds")
The resolve() function returns an Answer object containing an rrset (resource record set). Each record type has specific attributes — address for A/AAAA, exchange and preference for MX, target for CNAME and SRV.
Querying specific record types
# MX records with priority
mx_answers = dns.resolver.resolve("gmail.com", "MX")
for rdata in sorted(mx_answers, key=lambda r: r.preference):
    print(f"Priority {rdata.preference}: {rdata.exchange}")

# TXT records (SPF, DKIM, domain verification)
txt_answers = dns.resolver.resolve("example.com", "TXT")
for rdata in txt_answers:
    print(rdata.to_text())

# SRV records (service discovery)
srv_answers = dns.resolver.resolve("_sip._tcp.example.com", "SRV")
for rdata in srv_answers:
    print(f"{rdata.target}:{rdata.port} (priority={rdata.priority}, weight={rdata.weight})")
Custom resolver configuration
By default, dnspython uses the system resolver settings (/etc/resolv.conf on Unix-like systems, the registry on Windows). You can override this:
import dns.resolver
resolver = dns.resolver.Resolver()
resolver.nameservers = ["8.8.8.8", "1.1.1.1"]
resolver.timeout = 5.0
resolver.lifetime = 10.0 # total time for all retries
answers = resolver.resolve("example.com", "A")
The timeout controls per-query wait time; lifetime caps the total time across all retry attempts. In production monitoring, set both explicitly to avoid hanging threads.
Async resolution for bulk lookups
When resolving hundreds or thousands of domains, serial queries are too slow. dnspython provides an async resolver:
import asyncio

import dns.asyncresolver
import dns.exception
import dns.resolver


async def resolve_domain(domain: str) -> dict:
    resolver = dns.asyncresolver.Resolver()
    resolver.nameservers = ["8.8.8.8"]
    try:
        answers = await resolver.resolve(domain, "A")
        return {"domain": domain, "ips": [r.address for r in answers]}
    except dns.resolver.NXDOMAIN:
        return {"domain": domain, "error": "NXDOMAIN"}
    except dns.resolver.NoAnswer:
        return {"domain": domain, "error": "NoAnswer"}
    except dns.resolver.NoNameservers:
        # Without this, one failing domain would abort the whole gather()
        return {"domain": domain, "error": "NoNameservers"}
    except dns.exception.Timeout:
        return {"domain": domain, "error": "Timeout"}


async def bulk_resolve(domains: list[str], concurrency: int = 50) -> list[dict]:
    semaphore = asyncio.Semaphore(concurrency)

    async def limited(domain):
        async with semaphore:
            return await resolve_domain(domain)

    return await asyncio.gather(*[limited(d) for d in domains])


# Usage
domains = ["google.com", "github.com", "nonexistent.invalid"]
results = asyncio.run(bulk_resolve(domains))
for r in results:
    print(r)
Using a semaphore prevents overwhelming the resolver. In practice, 50-100 concurrent queries is a reasonable starting point.
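The effect of the semaphore can be checked without sending any DNS traffic. In this sketch, fake_lookup is a stand-in for a real query (it only sleeps), and bounded_gather and demo are hypothetical names; the point is to show that the number of in-flight tasks never exceeds the limit.

```python
import asyncio


async def bounded_gather(items, worker, limit):
    """Run worker(item) for every item with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def run(item):
        async with semaphore:
            return await worker(item)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(run(item) for item in items))


async def demo(count=20, limit=5):
    in_flight = 0
    peak = 0

    async def fake_lookup(domain):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stands in for network latency
        in_flight -= 1
        return domain

    names = [f"host{i}.example" for i in range(count)]
    results = await bounded_gather(names, fake_lookup, limit)
    return results, peak
```

Running asyncio.run(demo()) returns the names in input order together with the highest concurrency observed, which stays at the configured limit.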
Reverse DNS lookups
Convert an IP address back to a hostname:
import dns.reversename
import dns.resolver

addr = dns.reversename.from_address("8.8.8.8")
answers = dns.resolver.resolve(addr, "PTR")
for rdata in answers:
    print(f"8.8.8.8 → {rdata.target}")
# Output: 8.8.8.8 → dns.google.
Reverse DNS is essential for log analysis, email server verification (matching PTR records to HELO domains), and network auditing.
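It can help to see what name a PTR query actually asks for. The following sketch uses only the standard library's ipaddress module to build it; reverse_name is a hypothetical helper, and the addresses come from documentation ranges.

```python
import ipaddress


def reverse_name(ip: str) -> str:
    """Return the fully qualified reverse-lookup name for an IPv4/IPv6 address."""
    # reverse_pointer reverses the octets (IPv4) or nibbles (IPv6) and
    # appends in-addr.arpa or ip6.arpa; the trailing dot makes it absolute.
    return ipaddress.ip_address(ip).reverse_pointer + "."


print(reverse_name("192.0.2.10"))   # 10.2.0.192.in-addr.arpa.
print(reverse_name("2001:db8::1"))  # 32 reversed nibbles under ip6.arpa.
```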
DNSSEC validation
DNSSEC adds cryptographic signatures to DNS responses, preventing spoofing. dnspython can request and validate these:
import dns.resolver
import dns.flags

resolver = dns.resolver.Resolver()
resolver.use_edns(0, dns.flags.DO, 4096)  # Request DNSSEC data
try:
    answers = resolver.resolve("example.com", "A")
    if answers.response.flags & dns.flags.AD:
        print("DNSSEC validated (Authenticated Data flag set)")
    else:
        print("Response not DNSSEC validated")
except dns.resolver.NXDOMAIN:
    print("Domain does not exist")
The AD (Authenticated Data) flag indicates that the recursive resolver validated the chain of trust, so it is only as trustworthy as that resolver and the network path to it. For full local validation, you need to verify signatures against the root trust anchor yourself; dnspython provides the primitives, but the implementation is non-trivial.
Zone transfers (AXFR)
If you manage DNS infrastructure, zone transfers let you fetch all records for a zone:
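import dns.exception
import dns.query
import dns.resolver
import dns.zone

# dns.query.xfr() expects the server's IP address, so resolve the name first
ns_ip = dns.resolver.resolve("ns1.example.com", "A")[0].address

try:
    zone = dns.zone.from_xfr(dns.query.xfr(ns_ip, "example.com", timeout=30))
    for name, node in zone.nodes.items():
        print(node.to_text(name))
except (dns.query.TransferError, dns.exception.FormError, OSError):
    # Refusals surface as a transfer error, a malformed reply, or a closed connection
    print("Zone transfer refused (as expected for most public servers)")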
Most authoritative servers restrict AXFR to authorized IPs. This is useful for internal DNS auditing and migration verification.
Real-world patterns
Domain migration validator
When migrating domains, verify that all records match expectations:
import dns.resolver


def validate_migration(domain: str, expected: dict) -> list[str]:
    errors = []
    resolver = dns.resolver.Resolver()
    resolver.nameservers = ["8.8.8.8"]  # Use public resolver for external view
    for record_type, expected_values in expected.items():
        try:
            answers = resolver.resolve(domain, record_type)
            actual = {rdata.to_text() for rdata in answers}
            missing = set(expected_values) - actual
            if missing:
                errors.append(f"{record_type}: missing {missing}")
        except dns.resolver.NoAnswer:
            errors.append(f"{record_type}: no answer")
        except dns.resolver.NXDOMAIN:
            errors.append(f"{record_type}: domain does not exist")
    return errors


# Usage
issues = validate_migration("example.com", {
    "A": {"93.184.216.34"},
    "MX": {"10 mail.example.com."},
})
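Exact string comparison can report false mismatches when the expected values differ from the answers only in letter case or a trailing dot. A possible refinement, sketched here with hypothetical normalize and diff_records helpers, normalizes both sides and also reports unexpected leftovers. It assumes host-name style records (A, AAAA, MX, CNAME, NS); TXT data is case-sensitive and should not be lower-cased.

```python
def normalize(value: str) -> str:
    """Lower-case a record's text form and drop a trailing dot."""
    return value.strip().lower().rstrip(".")


def diff_records(expected: set[str], actual: set[str]) -> dict[str, set[str]]:
    """Compare two record sets after normalization."""
    exp = {normalize(v) for v in expected}
    act = {normalize(v) for v in actual}
    return {
        "missing": exp - act,     # expected but not served
        "unexpected": act - exp,  # served but not expected (stale records)
    }


report = diff_records(
    {"10 mail.example.com."},
    {"10 Mail.Example.COM.", "20 old.example.com."},
)
print(report)
```

Here the MX target matches despite the case difference, and the leftover record for old.example.com is flagged as unexpected.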
TTL monitoring for cache warming
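import dns.resolver


def check_ttl_health(domain: str, min_ttl: int = 300) -> dict:
    resolver = dns.resolver.Resolver()
    answers = resolver.resolve(domain, "A")
    # A caching recursive resolver reports the remaining TTL, which counts down
    # between refreshes; query an authoritative server for the configured value.
    ttl = answers.rrset.ttl
    return {
        "domain": domain,
        "ttl": ttl,
        "healthy": ttl >= min_ttl,
        "warning": "TTL too low: frequent re-resolution" if ttl < min_ttl else None,
    }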
Email deliverability checker
import dns.resolver


def check_email_dns(domain: str) -> dict:
    results = {"domain": domain, "issues": []}
    resolver = dns.resolver.Resolver()

    # MX records
    try:
        mx = resolver.resolve(domain, "MX")
        results["mx"] = [f"{r.preference} {r.exchange}" for r in mx]
    except dns.resolver.NoAnswer:
        # Without MX, senders fall back to the domain's A/AAAA record (RFC 5321)
        results["issues"].append("No MX records: senders will fall back to A/AAAA")

    # SPF record: a TXT record whose joined strings start with "v=spf1"
    try:
        txt = resolver.resolve(domain, "TXT")
        values = [b"".join(r.strings).decode("ascii", "replace") for r in txt]
        spf = [v for v in values if v.lower().startswith("v=spf1")]
        results["spf"] = spf or None
        if not spf:
            results["issues"].append("No SPF record: emails may be marked as spam")
    except dns.resolver.NoAnswer:
        results["issues"].append("No TXT records at all")
    return results
Error handling taxonomy
import dns.resolver
import dns.exception

try:
    answers = dns.resolver.resolve("example.com", "A")
except dns.resolver.NXDOMAIN:
    # Domain does not exist
    pass
except dns.resolver.NoAnswer:
    # Domain exists but has no records of the requested type
    pass
except dns.resolver.NoNameservers:
    # Every nameserver failed to answer (for example SERVFAIL or REFUSED)
    pass
except dns.exception.Timeout:
    # No response arrived within the resolver's lifetime
    pass
except dns.resolver.NoRootSOA:
    # Raised by dns.resolver.zone_for_name(), not resolve(): no SOA at the root
    pass
Each exception type maps to a specific failure mode. Production code should handle all of them, not just catch a broad Exception.
Performance considerations
- Connection reuse: dnspython uses UDP by default (no connection state). For TCP queries (large responses, zone transfers), connections are per-query.
- Caching: dnspython does not cache by default. For repeated lookups, implement your own cache keyed by (name, type) with TTL-based expiration, or use dns.resolver.Cache.
- Threading: The synchronous resolver is safe to call from multiple threads. For high-throughput scenarios, the async resolver with asyncio is more efficient.
- System resolver comparison: socket.getaddrinfo() benefits from the OS DNS cache (nscd, systemd-resolved). dnspython bypasses this cache, which gives fresher results but higher latency for repeated queries.
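The hand-rolled cache mentioned in the caching bullet can be quite small. This is a minimal sketch under stated assumptions: TTLCache is a hypothetical class, entries are keyed by (name, type), and the clock is injectable so expiry can be tested without waiting. It is not thread-safe and never evicts entries that are not looked up again.

```python
import time


class TTLCache:
    """Tiny DNS answer cache keyed by (name, record type)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    @staticmethod
    def _key(name: str, rdtype: str):
        # DNS names are case-insensitive; a trailing dot does not change identity.
        return (name.lower().rstrip("."), rdtype.upper())

    def put(self, name: str, rdtype: str, value, ttl: float) -> None:
        self._entries[self._key(name, rdtype)] = (self._clock() + ttl, value)

    def get(self, name: str, rdtype: str):
        key = self._key(name, rdtype)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value
```

In use, the ttl argument would come from answers.rrset.ttl, so cached entries expire no later than the record they were built from.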
Tradeoffs
| Approach | Pros | Cons |
|---|---|---|
| socket.getaddrinfo | Zero dependencies, OS cache, handles /etc/hosts | Limited to A/AAAA, no record type control |
| dnspython sync | Full DNS protocol, any record type | No OS cache, blocking I/O |
| dnspython async | High throughput for bulk queries | Requires asyncio event loop |
| subprocess + dig | Quick one-offs in scripts | Fragile parsing, slow for bulk |
One thing to remember: DNS is a distributed, cached system with eventual consistency. Your Python code can query it with precision, but the answers you get depend on cache state, propagation delays, and resolver configuration. Build tools that account for this uncertainty.