Farsight's Network Message, Volume 2: Introduction to nmsgtool
Abstract
This article is the second in a multi-part blog series intended to introduce
and acquaint the user with Farsight Security’s NMSG suite. This article is an
introduction to nmsgtool and provides several useful recipes and examples.
Before reading this article, it is recommended that you read
Farsight’s Network Message, Volume 1: Introduction to NMSG and have a local
installation of nmsgtool. To get the most from this article and be able to run
all of the examples, an account on Farsight Security’s Security Information
Exchange (SIE) is recommended. If you don’t have SIE, you can order it here.
This article covers nmsgtool version 0.9.1.
What is nmsgtool?
To paraphrase the Unix manpage, nmsgtool is the command line interface to
libnmsg and is a thin wrapper around libnmsg’s I/O engine. It controls the
transmission, storage, creation, and conversion of NMSG payloads.
NMSG Inputs and Outputs
The nmsgtool program is a single tool for taking data from a variety of
inputs, such as data streams from the network, packets captured from network
interfaces, files, or even standard input, and making NMSG payloads available
to one or more outputs. The outputs are files in binary or human-readable
(ASCII presentation) form, or binary payloads sent to network sockets for
transport. Without requiring a separate program for each function, nmsgtool
handles all sorts of data processing including serialization, fragmentation,
compression, striping or mirroring, rolling file outputs, and executing data
processing programs on file outputs.
nmsgtool inputs can take the following forms:
- A file containing binary NMSG data (i.e., the output of a previously
  instantiated nmsgtool command)
- A socket that is plumbed to contain binary NMSG data
- Reassembled IP datagrams from a pcap file
- Reassembled IP datagrams from a network interface
- A file containing ASCII presentation data
nmsgtool outputs can take the following forms:
- Binary NMSG data to a file
- Binary NMSG data to a network socket
- ASCII presentation form data to a file (including stdout)
You can specify more than one of each.
nmsgtool Recipes
The following are a handful of useful nmsgtool recipes intended to showcase
its functionality and demonstrate different ways to use the tool.
Read Data from SIE
A common use case for SIE customers is to use nmsgtool to read live SIE data
directly from the wire and write the output to the screen.
$ nmsgtool -C ch212 -c 1 -o -
[72] [2015-02-03 13:35:29.678474903] [2:5 SIE newdomain] [a1ba02cf] [] []
domain: s47rbh.xyz.
time_seen: 2015-02-03 13:33:20
rrname: s47rbh.xyz.
rrclass: IN (1)
rrtype: NS (2)
rdata: ns1.51dns.com.
rdata: ns2.51dns.com.
The above invocation reads a single NMSG payload [-c 1] from SIE Channel 212
(Newly Observed Domains) [-C ch212] and emits it to stdout as ASCII
presentation data [-o -].
Note that if no outputs are specified, ASCII presentation to stdout [-o -]
is the default behavior of nmsgtool. In future examples, it will be omitted.
Behind the scenes, nmsgtool uses a configuration file called nmsgtool.chalias
that contains channel number to IP address/UDP port mappings. When a channel
number is specified on the command line, nmsgtool looks it up in the
nmsgtool.chalias file and listens on the specified network socket.
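The lookup described above can be pictured with a short Python sketch. The article does not show the file's layout, so this assumes one alias per line: the alias name followed by one or more whitespace-separated address/port entries. The sample contents, the documentation-range addresses, and the function name parse_chalias are all hypothetical, and this is not nmsgtool's own parser.

```python
# Sketch of a channel-alias lookup. ASSUMPTION: each line holds an alias name
# followed by whitespace-separated "address/port" entries; the real
# nmsgtool.chalias format may differ.

def parse_chalias(text):
    """Return {alias: [(address, port), ...]} for the assumed layout."""
    aliases = {}
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        name, entries = fields[0], fields[1:]
        sockets = []
        for entry in entries:
            address, _, port = entry.rpartition("/")
            # Ports are kept as strings because the real file may allow
            # forms this sketch does not model.
            sockets.append((address, port))
        aliases[name] = sockets
    return aliases


# Hypothetical file contents using documentation-range addresses.
SAMPLE = """\
ch212 192.0.2.255/8430

ch208 192.0.2.255/8431 192.0.2.255/8432
"""

aliases = parse_chalias(SAMPLE)
print(aliases["ch212"])
```

With a table like this in hand, resolving a channel name is a single dictionary lookup.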
The captured NMSG datagram header is emitted first. Breaking this down, we have the following individual fields:
- [72]: The message size in bytes
- [2015-02-03 13:35:29.678474903]: A UTC timestamp with nanosecond resolution
- [2:5 SIE newdomain]: Vendor ID and message type ID, followed by the vendor name and message type name
- [a1ba02cf]: The source identifier (optional)
- []: The operator code (optional)
- []: The group code (optional)
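For readers who post-process presentation output, the header fields listed above can be pulled apart with a regular expression. The following Python sketch is based only on the sample header shown in this article; the names HEADER_RE, NmsgHeader, and parse_header are hypothetical and not part of the NMSG suite.

```python
import re
from typing import NamedTuple

# Bracketed header fields, in the order they appear in the sample output.
HEADER_RE = re.compile(
    r"\[(?P<size>\d+)\] "
    r"\[(?P<timestamp>[^\]]+)\] "
    r"\[(?P<vid>\d+):(?P<msgtype_id>\d+) (?P<vendor>\S+) (?P<msgtype>\S+)\] "
    r"\[(?P<source>[0-9a-f]*)\] "
    r"\[(?P<operator>[^\]]*)\] "
    r"\[(?P<group>[^\]]*)\]"
)


class NmsgHeader(NamedTuple):
    size: int          # message size in bytes
    timestamp: str     # UTC timestamp, kept as text to preserve nanoseconds
    vendor_id: int
    msgtype_id: int
    vendor: str
    msgtype: str
    source: str        # empty string when absent
    operator: str      # empty string when absent
    group: str         # empty string when absent


def parse_header(line: str) -> NmsgHeader:
    """Parse one presentation-format header line."""
    m = HEADER_RE.match(line)
    if m is None:
        raise ValueError("not an NMSG presentation header: %r" % line)
    return NmsgHeader(
        int(m["size"]), m["timestamp"], int(m["vid"]), int(m["msgtype_id"]),
        m["vendor"], m["msgtype"], m["source"], m["operator"], m["group"],
    )


sample = "[72] [2015-02-03 13:35:29.678474903] [2:5 SIE newdomain] [a1ba02cf] [] []"
header = parse_header(sample)
print(header.vendor, header.msgtype, header.size)
```

The timestamp is left as a string because Python's datetime type stores microseconds, which would silently drop the last three digits.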
The message payload is a combination of key-value pairs. They follow a
schema defined by the vendor and message type (in the above example, the
vendor is SIE and the message type is newdomain). nmsgtool includes
dynamically loadable modules that enable it to present this data as you see
above and also enable NMSG-based programs or scripts to load the key-value
pairs into structures. These concepts will be more rigorously explained in
future NMSG articles.
Read and Write Binary NMSG Files
Another common use case for SIE customers is to read messages from SIE into a local binary file for later analysis.
$ nmsgtool -C ch208 -c 100000 -w ch208.nmsg
$ stat -c "%n %s" ch208.nmsg
ch208.nmsg 15246581
$ nmsgtool -r ch208.nmsg -c 1
[72] [2015-02-01 00:07:53.596907788] [2:1 SIE dnsdedupe] [a1ba02cf] [] []
type: EXPIRATION
count: 2
time_first: 2015-01-31 07:29:37
time_last: 2015-01-31 07:29:37
bailiwick: <redacted>
rrname: <redacted>
rrclass: IN (1)
rrtype: A (1)
rrttl: 43200
rdata: <redacted>
The above invocation reads 100,000 NMSG payloads [-c 100000] from
SIE Channel 208 (Passive DNS, deduplicated, verified, in-bailiwick)
[-C ch208] and emits them as binary NMSGs to a file [-w ch208.nmsg].
The result is 15 megabytes.
That binary file is then read [-r ch208.nmsg] and a single NMSG payload
[-c 1] containing a dnsdedupe message is emitted to stdout as ASCII
presentation data.
Output Compression
nmsgtool can compress binary payload output (to a file or for emission
across the network) using zlib compression (the same algorithm used by the
ubiquitous gzip tool). To see an example of the on-disk storage benefit
compression can offer, we compress the data captured in the previous recipe.
$ nmsgtool -r ch208.nmsg -w ch208z.nmsg -z
$ stat -c "%n %s" ch208z.nmsg
ch208z.nmsg 6428829
The above invocation reads the binary file from the previous example
[-r ch208.nmsg] and writes a new file [-w ch208z.nmsg], compressing each
payload [-z]. The resultant file is just over six megabytes, a 58%
decrease in file size. It is important to note that the compression is
performed per payload, not across the entire file.
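To make the size arithmetic and the per-payload point concrete, here is a minimal Python sketch. It uses Python's standard zlib module on made-up text payloads, not real NMSG data or nmsgtool itself, and the helper names are hypothetical. It only illustrates the general trade-off: compressing each payload separately cannot exploit redundancy between payloads, so it usually yields a larger total than compressing one continuous stream.

```python
import zlib


def percent_decrease(before: int, after: int) -> float:
    """Size reduction as a percentage of the original size."""
    return 100.0 * (before - after) / before


def compress_each(payloads):
    """Compress every payload on its own (per-payload compression)."""
    return [zlib.compress(p) for p in payloads]


def compress_whole(payloads):
    """Compress all payloads as one continuous stream, for comparison."""
    return zlib.compress(b"".join(payloads))


# File sizes taken from the two stat outputs in this article.
print(round(percent_decrease(15246581, 6428829)))  # prints 58

# Synthetic, repetitive payloads (hypothetical data, not NMSG encoding).
payloads = [
    f"rrname: host{i}.example.com. rrtype: A rdata: 192.0.2.{i % 256}".encode()
    for i in range(200)
]
per_payload = compress_each(payloads)
per_payload_total = sum(len(c) for c in per_payload)
whole_total = len(compress_whole(payloads))
print(per_payload_total, whole_total)
```

The benefit of the per-payload approach is that each payload can still be decompressed independently of the others.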
Kicker Scripts and Output File Rolling
Another useful feature nmsgtool offers is the ability to perform automatic
file rolling (rotation) based on timer expiry or payload count. Additionally,
the user can specify a kicker command to run on output files after rotation.
Consider the following simple shell script:
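#!/bin/sh
# Count dnsdedupe payloads in the binary NMSG file given as $1.
# grep -F matches the header text literally (brackets are not a character class); -c counts matching lines.
echo "$1: $(nmsgtool -r "$1" | grep -cF '[2:1 SIE dnsdedupe]')"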
The above script, count.sh, counts the number of dnsdedupe payloads in a
binary NMSG file. The script is invoked in the following example:
$ nmsgtool -C ch202 -w ch202 -t 2 -z -k count.sh
./ch202.20150202.0110.1422839406.364843292.nmsg: 3099450
./ch202.20150202.0110.1422839408.013136741.nmsg: 3384114
./ch202.20150202.0110.1422839410.024261700.nmsg: 3090827
./ch202.20150202.0110.1422839412.024284315.nmsg: 3100505
./ch202.20150202.0110.1422839414.033887391.nmsg: 3026627
./ch202.20150202.0110.1422839416.014162500.nmsg: 3208181
The above invocation reads payloads from SIE Channel 202 (Raw Passive DNS)
[-C ch202] and writes compressed payloads [-z] to a binary file [-w ch202].
Every two seconds [-t 2] the file is closed, rotated, and the kicker script
is run on the output file [-k count.sh]. The output from each count.sh
invocation is the filename followed by the number of NMSG payloads.
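A kicker script often needs to know when a rolled file was written. Judging only from the filenames in the output above, the name appears to be the prefix given to -w, followed by a date, an hour and minute, a Unix timestamp in seconds, a nanosecond part, and the .nmsg extension. That layout is an inference from this one example rather than documented behavior, and the names RolledFile and parse_rolled_name in this Python sketch are hypothetical.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class RolledFile(NamedTuple):
    prefix: str        # the name given to -w
    stamp: datetime    # seconds field interpreted as a UTC Unix timestamp
    nanoseconds: int   # sub-second part of the timestamp


def parse_rolled_name(path: str) -> RolledFile:
    """Split a rolled filename, assuming the layout seen in the example:
    <prefix>.<YYYYMMDD>.<HHMM>.<seconds>.<nanoseconds>.nmsg
    """
    name = path.rsplit("/", 1)[-1]  # drop any directory part
    prefix, _date, _hhmm, seconds, nanos, _ext = name.rsplit(".", 5)
    stamp = datetime.fromtimestamp(int(seconds), tz=timezone.utc)
    return RolledFile(prefix, stamp, int(nanos))


info = parse_rolled_name("./ch202.20150202.0110.1422839406.364843292.nmsg")
print(info.prefix, info.stamp.isoformat())
```

For the first filename in the example, the seconds field decodes to 01:10 UTC on 2015-02-02, which agrees with the date and time fields embedded in the same name.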
Transfer NMSGs across the network
nmsgtool can be used to transfer NMSGs across an IPv4 or IPv6 network to
either a unicast or broadcast address.
For this example, we instantiate two nmsgtool sessions on separate hosts. On
the receiving host, we run nmsgtool as follows:
$ nmsgtool -l 10.0.1.52/9430
The above invocation listens for NMSGs on a network socket bound to
10.0.1.52 on UDP port 9430 [-l 10.0.1.52/9430]. When NMSGs appear, they will
be emitted as ASCII presentation data to stdout.
On the sending host, we run nmsgtool as follows:
$ nmsgtool -r ch202.20150202.0110.1422839406.364843292.nmsg -c 2 -s 10.0.1.52/9430
The above invocation reads two payloads [-c 2] from the binary NMSG file
created in the previous example [-r ch202...]. They are written to the
network destined for 10.0.1.52 on UDP port 9430 [-s 10.0.1.52/9430].
On the receiving host we see the following output:
[293] [2015-02-02 01:08:21.902736000] [1:9 base dnsqr] [e9b019b8] [] []
type: UDP_QUERY_RESPONSE
query_ip: <redacted>
response_ip: <redacted>
proto: UDP (17)
query_port: 31211
response_port: 53
id: 7644
qname: <redacted>
qclass: IN (1)
qtype: AAAA (28)
rcode: NOERROR (0)
delay: 0.182413
udp_checksum: ABSENT
[...]
[352] [2015-02-02 01:08:22.095911000] [1:9 base dnsqr] [e9b019b8] [] []
[...]
Two presentation format dnsqr NMSGs (redacted and cropped for publication)
are emitted to stdout.
As a side note, the sender has options for tuning network performance, including the NMSG container maximum transmission unit size (which is distinct from the IP MTU), buffering, and rate limiting.
Payload Striping vs Mirroring
When multiple outputs are specified, nmsgtool defaults to striping payloads
across each output. However, nmsgtool can also be configured to mirror
payloads to each output.
$ nmsgtool -C ch211 -c 100 -o - -s 10.0.1.52/9430 --mirror
[94] [2015-02-03 08:50:19.277158975] [2:5 SIE newdomain] [a1ba02cf] [] []
[...]
The above invocation reads 100 payloads [-c 100] from SIE Channel 211 (Newly
Active Domains) and mirrors [--mirror] across two outputs, one ASCII
presentation to stdout [-o -] and one network socket destined for
10.0.1.52 on UDP port 9430 [-s 10.0.1.52/9430].
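The difference between the two modes can be sketched in a few lines of Python. This is a conceptual model only: it assumes striping hands payloads to the outputs in round-robin order, which the article does not state, and the function names stripe and mirror are hypothetical rather than libnmsg APIs.

```python
from itertools import cycle


def stripe(payloads, n_outputs):
    """Give each payload to exactly one output (round-robin order assumed)."""
    outputs = [[] for _ in range(n_outputs)]
    for payload, out in zip(payloads, cycle(outputs)):
        out.append(payload)
    return outputs


def mirror(payloads, n_outputs):
    """Give every payload to every output."""
    outputs = [[] for _ in range(n_outputs)]
    for payload in payloads:
        for out in outputs:
            out.append(payload)
    return outputs


payloads = ["p0", "p1", "p2", "p3"]
print(stripe(payloads, 2))  # [['p0', 'p2'], ['p1', 'p3']]
print(mirror(payloads, 2))  # [['p0', 'p1', 'p2', 'p3'], ['p0', 'p1', 'p2', 'p3']]
```

With striping, each output sees only a share of the stream, which suits load distribution; with mirroring, every output sees the full stream, which suits cases like the one above where the same data is both displayed and forwarded.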
Input from a Network Interface or Pcap File with BPF filtering
Perhaps you’d like to create your own NMSG stream sourced from live network
traffic. More specifically, you want only DNS traffic. To this end, you can
configure nmsgtool to read IP datagrams directly from a network interface or
a pcap file. Additionally, a BPF filter can be specified to winnow packets.
When receiving input from a network interface or a pcap file, nmsgtool
requires the user to set the vendor and message type so it knows how to
encode each payload.
$ nmsgtool -i eth1 -V base -T dnsqr -b "udp port 53"
[220] [2010-05-09 05:08:54.951124000] [1:9 base dnsqr] [00000000] [] []
[...]
The above invocation reads data from a network interface [-i eth1], winnows
those packets to just UDP port 53 [-b "udp port 53"], encodes this data as
base/dnsqr [-V base] and [-T dnsqr], and emits it as ASCII presentation data
to stdout.
Input from a pcap file is syntactically similar; just substitute
[-p example.pcap] for [-i eth1].
Coming up
The next article in the NMSG series will examine low-level NMSG implementation details such as header composition and data encoding. Future articles will introduce the programming APIs.
Mike Schiffman is a Protocol Legerdemainist for Farsight Security, Inc.
Read the next part in this series: Network Message, Volume 3: Headers and Encoding