Limiting DNSDB Results: dnsdbq little ell vs big ell

I. Introduction

dnsdbq is Farsight’s popular command line client for Farsight’s DNSDB and other passive DNS systems. It is available as easy-to-build source code from https://github.com/dnsdb/dnsdbq.

Some of dnsdbq’s features may initially seem complex or confusing. For example, why does dnsdbq have both a little ell option (-l) and a big ell (-L) option for limiting responses? The manual page for dnsdbq describes both:

-l query_limit
     query for that limit's number of responses. If specified as 0 then
     the DNSDB API server will return the maximum limit of results
     allowed.  If -l, is not specified, then the query will not specify a
     limit, and the DNSDB API server may use its default limit.

[...]

-L output_limit
     clamps the number of objects per response (under -[R|r|N|n|i|f]) or
     for all responses (under -[fm|ff|ffm]) output to output_limit.  If
     unset, and if batch and merge modes have not been selected with the
     -f and -m options, then the -L output limit defaults to the -l
     limit's value. Otherwise the default is no output limit.

Those may superficially seem quite similar (they’re both limiting what we end up getting, right?), but, in fact, there are important differences.

II. Experimenting With Little Ell and Big Ell

For example, let’s assume we want to look at ALL the results for an RRname query for www.uoregon.edu/CNAME, sorted in descending order by last-seen time (-S -k last). We see results that look like:

$ dnsdbq -r www.uoregon.edu/CNAME -S -k last
;; record times: 2019-02-22T04:08:40Z .. 2021-03-22T14:30:51Z (~2y ~29d)
;; count: 847635; bailiwick: uoregon.edu.
www.uoregon.edu.  CNAME  drupal-hosting-web-cluster5-prod.uoregon.edu.

;; record times: 2019-02-22T01:10:03Z .. 2019-02-22T04:06:06Z (2h 56m 4s)
;; count: 146; bailiwick: uoregon.edu.
www.uoregon.edu.  CNAME  drupal-hosting-web-cluster5.uoregon.edu.

;; record times: 2014-12-29T16:09:30Z .. 2019-02-22T01:01:40Z (~4y ~55d)
;; count: 1525954; bailiwick: uoregon.edu.
www.uoregon.edu.  CNAME  drupal-cluster5.uoregon.edu.

;; record times: 2013-09-12T14:44:09Z .. 2014-12-29T16:14:57Z (~1y ~108d)
;; count: 1002955; bailiwick: uoregon.edu.
www.uoregon.edu.  CNAME  wc-www.uoregon.edu.

;; record times: 2010-10-19T12:12:39Z .. 2013-09-12T14:43:56Z (~2y ~329d)
;; count: 1924809; bailiwick: uoregon.edu.
www.uoregon.edu.  CNAME  uowc-www.uoregon.edu.

Now assume that we want to keep just the most recent result, perhaps for use in an example in some documentation. We MISTAKENLY attempt to get that result by adding dash little ell one:

$ dnsdbq -r www.uoregon.edu/CNAME -S -k last -l1
Query limited: Result limit reached
;; record times: 2013-09-12T14:44:09Z .. 2014-12-29T16:14:57Z (~1y ~108d)
;; count: 1002955; bailiwick: uoregon.edu.
www.uoregon.edu.  CNAME  wc-www.uoregon.edu.

Hmm. That’s not the result we expected! We wanted the MOST RECENT result, but actually got the second-oldest result instead.

So what do we see if we use big ell instead of little ell?

$ dnsdbq -r www.uoregon.edu/CNAME -S -k last -L1
;; record times: 2019-02-22T04:08:40Z .. 2021-03-22T14:30:51Z (~2y ~29d)
;; count: 847635; bailiwick: uoregon.edu.
www.uoregon.edu.  CNAME  drupal-hosting-web-cluster5-prod.uoregon.edu.

There we go! That’s what we wanted! So what was the difference? Simple:

  • The little ell option (-l) limits the number of results the DNSDB API server returns. That limit is applied FIRST; anything you asked to have done “client side” (such as sorting) then operates only on the results that were actually returned. (Little ell avoids retrieving “unwanted” results “up front.”)
  • The big ell option (-L), on the other hand, applies its limit as the LAST thing dnsdbq does (after all sorting or other “client side” “magic” is done). This option merely controls what gets output; it does NOT attempt to prevent “unwanted” data from being retrieved in the first place.
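
To make the ordering difference concrete, here is a minimal Python sketch that simulates both behaviors. It does not call dnsdbq or the DNSDB API. The five records are the last-seen times and CNAME targets from the output above, and the order of the RECORDS list is an assumption: it stands in for whatever unsorted order the server happens to return, chosen so that the first record matches what -l1 returned in our example. The function names little_ell and big_ell are hypothetical.

```python
# Records from the www.uoregon.edu/CNAME example: (last-seen time, CNAME target).
# The list order is an ASSUMED server return order, not something DNSDB documents.
# ISO 8601 UTC timestamps sort correctly as plain strings.
RECORDS = [
    ("2014-12-29T16:14:57Z", "wc-www.uoregon.edu."),
    ("2021-03-22T14:30:51Z", "drupal-hosting-web-cluster5-prod.uoregon.edu."),
    ("2019-02-22T04:06:06Z", "drupal-hosting-web-cluster5.uoregon.edu."),
    ("2019-02-22T01:01:40Z", "drupal-cluster5.uoregon.edu."),
    ("2013-09-12T14:43:56Z", "uowc-www.uoregon.edu."),
]


def little_ell(records, query_limit):
    """Model of -l: truncate "server side" first, then sort what is left."""
    fetched = records[:query_limit]
    return sorted(fetched, key=lambda r: r[0], reverse=True)


def big_ell(records, output_limit):
    """Model of -L: fetch everything, sort, then truncate the output."""
    ordered = sorted(records, key=lambda r: r[0], reverse=True)
    return ordered[:output_limit]


print(little_ell(RECORDS, 1)[0][1])  # wc-www.uoregon.edu.
print(big_ell(RECORDS, 1)[0][1])     # drupal-hosting-web-cluster5-prod.uoregon.edu.
```

With a limit of 1, the little ell model sorts a list that contains only one record, so sorting cannot bring the most recent record to the top. The big ell model sorts all five records before cutting the list down.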

At this point you might be tempted to (wrongly) say, “Well, I guess all I ever need is ‘big ell’ then, eh?” No. In point of fact, BOTH little ell and big ell play important roles.

For example, by default the DNSDB API returns at most 10,000 results per query. If you want or need more than that, you need little ell to ask for 100,000 or 500,000 or the maximum (a million) results instead.
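
As a rough illustration of how the little ell value maps to the number of results requested, the sketch below encodes the rules from the manual page excerpt and the figures quoted in this article. The constants and the function name effective_query_limit are hypothetical, and the actual default and maximum are set by the DNSDB API server, so treat the numbers as assumptions.

```python
# Figures quoted in this article; the real values are decided by the server.
SERVER_DEFAULT_LIMIT = 10_000
SERVER_MAX_LIMIT = 1_000_000


def effective_query_limit(little_ell=None):
    """Return the result limit implied by an -l value (None means -l not given)."""
    if little_ell is None:
        # No -l: the query carries no limit, so the server default applies.
        return SERVER_DEFAULT_LIMIT
    if little_ell == 0:
        # -l0: ask the server for the maximum it allows.
        return SERVER_MAX_LIMIT
    return little_ell


print(effective_query_limit())         # 10000
print(effective_query_limit(0))        # 1000000
print(effective_query_limit(500_000))  # 500000
```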

Big ell also plays a particularly important role when it comes to batched queries.

III. Little Ell, Big Ell, And Batched Queries

Normally, dnsdbq runs one query at a time. However, dnsdbq can also process a “batch” of queries using the -f option. In fact, we’ll often use -f with -m to run up to 10 queries in parallel.

For example, perhaps you have a file of queries called ous.txt that you want to run, containing the lines:

$OPTIONS -l0 -L5000000
rrset/name/*.uoregon.edu
rrset/name/*.oregonstate.edu
rrset/name/*.pdx.edu
rrset/name/*.eou.edu
rrset/name/*.oit.edu
rrset/name/*.sou.edu
rrset/name/*.wou.edu

You could run those in “batch mode” by saying:

$ dnsdbq -fm < ous.txt > ous.output

Note that because we’re using -fm mode, the queries will be run concurrently (“in parallel”) with the output from all the queries interleaved. We can set our options on the command line, or in the batch file itself, as we’ve done in this example. In “batch mode:”

  • Little ell establishes limits that you want to apply to EACH query in that batch.

  • Big ell establishes limits that pertain to the COMBINED OUTPUT from all the queries in the batch run.

Because batch mode is so powerful, big ell can serve as a nice “safety switch” protecting you against accidentally doing something crazy, like asking for queries that can return (literally!) billions of results in aggregate!
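
The per-query versus combined-output distinction can be sketched in a few lines of Python. This is a simplified model, not dnsdbq’s implementation: the function run_batch and the hostnames are hypothetical, results are concatenated in query order rather than interleaved as real -fm output would be, and a limit of None means “no limit” (the special meaning of -l0 is not modeled).

```python
from itertools import chain, islice


def run_batch(results_per_query, query_limit=None, output_limit=None):
    """Apply a per-query limit (like -l), then a combined output limit (like -L)."""
    # Little ell: each query's answer is cut down independently.
    limited = [
        rows if query_limit is None else rows[:query_limit]
        for rows in results_per_query
    ]
    # Big ell: one cap on everything that is finally emitted.
    combined = chain.from_iterable(limited)
    if output_limit is None:
        return list(combined)
    return list(islice(combined, output_limit))


# Hypothetical answers for three queries in a batch.
q1 = ["a.uoregon.edu.", "b.uoregon.edu.", "c.uoregon.edu.", "d.uoregon.edu."]
q2 = ["a.pdx.edu.", "b.pdx.edu.", "c.pdx.edu."]
q3 = ["a.eou.edu.", "b.eou.edu.", "c.eou.edu.", "d.eou.edu.", "e.eou.edu."]

print(len(run_batch([q1, q2, q3], query_limit=3)))                  # 9
print(len(run_batch([q1, q2, q3], query_limit=3, output_limit=5)))  # 5
```

With only the per-query limit, the total output still grows with the number of queries in the batch (3 results times 3 queries here). The combined limit is what caps the total regardless of how many queries the batch contains.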

IV. Summary

A nice summary of little ell vs. big ell was provided by one of the authors of these features:

-l will be applied to each query in a batch. -L will be applied to the combined output of all those queries.

-l is processed server side, -L is processed client side.

-l happens before sorting, -L happens after sorting.

-l limits the size of (each) answer and prevents useless data from being sent. -L doesn’t stop the full answer from being transmitted; dnsdbq just stops printing after the limit is reached.

We hope you find these features a useful addition to your dnsdbq analytic “arsenal.”

Joe St Sauver is a Distinguished Scientist and Director of Research for Farsight Security, Inc.