Parallelizing Your Farsight DNSDB Queries
Ever wish you could get a bunch of Farsight DNSDB queries completed more quickly? Often you can!
DNSDB API subscribers can run up to ten parallel streams of DNSDB queries, even though most users only run their queries sequentially. The easiest way to explain the difference between serial and parallel execution paradigms may be with an illustration:
Serial execution is what you get by default when running DNSDB passive DNS queries with:
- dnsdbq (see https://github.com/dnsdb/dnsdbq), the company’s command line DNSDB client (at least as normally used)
- Farsight DNSDB Scout (see https://www.farsightsecurity.com/tools/dnsdb-scout/), the company’s web interface to DNSDB, or
- Most 3rd-party DNSDB integrations.
With simple serial queries, a DNSDB query gets made, its results get found and returned, and only after that query is done can the next one run. This is very straightforward, but not particularly fast.
Parallel execution runs multiple streams of jobs concurrently. Running multiple jobs in parallel potentially delivers dramatically improved throughput: more work completed in less time. In fact, in this case, we have what’s referred to as an “embarrassingly parallel” problem that we can easily “divide up” and run chunk-by-chunk with no required interaction between/across the chunks — perfect for parallel execution paradigms.
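To make the contrast concrete, here is a minimal sketch in Python of the same batch of lookups run serially and then in parallel. It is an illustration only, not the method from the whitepaper: `fake_dnsdb_query` is a hypothetical stand-in that sleeps briefly instead of contacting DNSDB, and the names `run_serial`, `run_parallel`, and `MAX_STREAMS` are made up for this example. The one figure taken from the text above is the ceiling of ten parallel streams.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_STREAMS = 10  # ceiling on parallel query streams, per the limit described above


def fake_dnsdb_query(name):
    """Hypothetical stand-in for one DNSDB lookup: waits, then returns a dummy result."""
    time.sleep(0.05)  # pretend this is the time spent waiting on the API
    return name, f"results for {name}"


def run_serial(names, query=fake_dnsdb_query):
    """Run the queries one after another; each waits for the previous one to finish."""
    return [query(name) for name in names]


def run_parallel(names, query=fake_dnsdb_query, streams=MAX_STREAMS):
    """Run the queries in up to `streams` concurrent threads, keeping input order."""
    if not 1 <= streams <= MAX_STREAMS:
        raise ValueError(f"streams must be between 1 and {MAX_STREAMS}")
    with ThreadPoolExecutor(max_workers=streams) as pool:
        # map() hands each name to a worker thread and yields results in input order
        return list(pool.map(query, names))


if __name__ == "__main__":
    names = [f"example{i}.com" for i in range(20)]

    start = time.perf_counter()
    serial_results = run_serial(names)
    serial_seconds = time.perf_counter() - start

    start = time.perf_counter()
    parallel_results = run_parallel(names)
    parallel_seconds = time.perf_counter() - start

    print(f"serial:   {serial_seconds:.2f}s for {len(serial_results)} queries")
    print(f"parallel: {parallel_seconds:.2f}s for {len(parallel_results)} queries")
```

Because each query is independent of the others, the chunks need no coordination, which is what makes the problem "embarrassingly parallel." Threads suit this kind of work because the time goes to waiting on a remote service rather than to computation. To try this against the real service, you would replace `fake_dnsdb_query` with a function that performs an actual DNSDB lookup using your own API key.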
The “devil is in the details,” though. Parallelizing your DNSDB queries is conceptually simple, but how do you actually do it, particularly if you’re not a programmer?
In a new whitepaper, “Speeding Up Farsight DNSDB Queries via Parallelization,” DomainTools Distinguished Scientist Joe St Sauver starts with what you already know, and then gradually makes small but highly impactful improvements, providing a “path to going parallel” even for DNSDB users who aren’t developers.
Isn’t it time you learned how to increase the speed with which you run large numbers of DNSDB API jobs?