Conficker and a Brief Tutorial on Working With SIE Channel 80 Sinkhole Data
What Is Conficker?
Quoting the Conficker Working Group (CWG):
Conficker, also known as Downup, Downandup, Conflicker, and Kido, is a computer worm that surfaced November 21st, 2008 […] When executed on a computer, Conficker disables a number of system services such as Windows Automatic Update, Windows Security Center, Windows Defender and Windows Error Reporting. It receives further instructions by connecting to a server or peer and receiving a binary update. The instructions it receives may include to propagate, gather personal information and to download and install additional malware onto the victim’s computer.
Surprisingly, despite heroic efforts on the part of the Conficker Working Group * and others in the Internet security community, Conficker is still going strong nearly seven years after it was first seen.
Over 700,000 unique infected IP addresses are routinely observed reaching out to command and control addresses, “phoning home” to see if there’s any malicious work that needs to be done. The owners of most of these Conficker-infected systems have no idea that their systems are even infected.
Fortunately, the command and control (C&C) hosts where those infected systems are attempting to “check-in” have all been sinkholed. Sinkholed C&C hosts are addresses that have been taken over by the good guys. Infected hosts are still allowed to connect, but only to log the IP addresses of the infected hosts; no evil work gets doled out to those Conficker-infected hosts when they connect to the sinkhole systems.
Because the Conficker C&C hosts have been sinkholed, Conficker can be thought of as being in a strange sort of “undead” limbo state, neither fully operational, nor totally squashed…
- On the one hand, there are over 700,000 end-user computers that ARE still infected with Conficker,
- On the other hand, because the Conficker command and control servers have been sinkholed, those infected systems aren’t actively doing anything malicious (other than repeatedly checking in).
Nonetheless, those 700,000 hosts can’t apply periodic patches and are insecure, so these are systems we’d really like to see cleaned up, once and for all. This is a goal that requires your help. Please check any Windows systems you use, AND any Windows systems used by friends or family members, to see if any of those systems are infected with Conficker. If it turns out that they are infected, please work to get them cleaned up.
Checking For Conficker If You’re An End User
If you’re an end user and you wonder if your system might be infected with Conficker, check by visiting the Conficker “Eye Chart” page.
If it turns out that one or more of your systems is infected, clean that system up using the repair tools listed on the Conficker Working Group’s “Repair Tool” page.
Internet Service Providers and Enterprises
ISPs and enterprise network operators should also endeavor to identify Conficker-infected customer systems.
If you have an intrusion detection system (such as Snort) deployed, you can get rules from this link that will help you spot Conficker network traffic from any local infected systems. We’d urge you to run those rules, and to chase down any hits that they catch.
If your network doesn’t have an intrusion detection system deployed, your security team or network engineers can still request periodic reports of Conficker-infected hosts as described at:
- http://www.confickerworkinggroup.org/wiki/pmwiki.php/SP/ServiceProviders
- http://www.confickerworkinggroup.org/wiki/pmwiki.php/ENT/Enterprise
If you operate a network, PLEASE do whatever you can to help find Conficker-infected users and get them cleaned up!
Conficker and the Farsight Security, Inc., Security Information Exchange (SIE)
The SIE Channel Guide mentions Channel 80, containing “Sinkhole Data for the Conficker Working Group (CWG).” That channel consists of “requests to sinkhole web servers that capture web requests from infected clients.” The most important part of that data is typically the source IP of the infected host.
If you’ve leased a blade server at SIE and have subscribed to the SIE Base Channel Package, you can get a list of a dozen infected client source IPs by simply saying:
$ nmsgtool -c 12 -C ch80 | grep srcip | awk '{print $2}'
123.176.36.xxx
41.67.59.xxx
36.76.155.xxx
189.71.158.xxx
188.24.32.xxx
41.69.49.xxx
113.22.113.xxx
182.52.111.xxx
202.77.106.xxx
66.30.245.xxx
66.30.245.xxx
85.57.239.xxx
For the purpose of this article, we’ve anonymized the actual IP addresses of infected hosts by replacing the last octet of those IP addresses with “xxx”, but just to avoid any possible confusion, the actual output from channel 80 as received from SIE includes the entire non-anonymized address of infected hosts.
These addresses, and hundreds of thousands of other addresses, are broadcast on SIE Channel 80. The traffic on that channel represents Conficker-infected systems (or sometimes proxy servers proxying requests on behalf of infected systems) reaching out, asking for malicious work to do. This Conficker check-in traffic is continual, normally running around 7Mbps, plus or minus a megabit or two over the course of the day. We’d sure like to see that average traffic level drop over time. Sadly, check-in traffic volumes appear to have largely plateaued, dropping little (if at all) in recent times.
If a security researcher with access to the Security Information Exchange wanted to pull a sample of a million observations to study, it would only take them five minutes or so to get them:
$ time nmsgtool -C ch80 -c 1000000 | grep srcip | awk '{print $2}' > ch80.txt
real    5m5.322s    <== just 5 minutes and 5.3 seconds of wall time
user    0m17.137s
sys     0m1.284s
A few IP addresses, most likely proxy servers in front of large networks with numerous infected systems, may show up with a high number of hits. Other IPs may only show up a few times, perhaps phoning home to multiple sinkholed C&C servers.
You can process the SIE data snapshot you saved as ch80.txt in order to get a list of the “chattiest” IP addresses, sorted in descending order by frequency, by using the command:
$ sort ch80.txt | uniq -c | sort -nr > ch80-sorted.txt
The records in that file will have two values: a frequency count, and an associated IP address.
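Before moving on, it can be worth a quick sanity check of that file. The following is only a minimal sketch: the function name summarize_counts is our own invention for illustration, and it assumes the two-column count-then-IP layout that uniq -c produces.

```shell
# summarize_counts FILE
# FILE is assumed to hold "count ip" pairs, one per line, as written by `uniq -c`.
# Prints the number of distinct IPs, the total number of observations,
# and the largest hit count seen for any single IP.
summarize_counts() {
    awk '{ total += $1; if ($1 > max) max = $1 }
         END { printf "distinct_ips=%d total_hits=%d max_hits=%d\n", NR, total, max }' "$1"
}
```

For example, running summarize_counts ch80-sorted.txt would report how many distinct IPs appeared in the sample and how many observations they account for in total.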
Unless you look at network addresses continually and see some that are particularly “familiar” (uh oh!), you’ll probably find yourself scratching your head over what to do next. A few options might include:
- You could try using the dig -x command to look for PTR records associated with individual IPs, but many IP addresses won’t have an inverse address defined.
- You could try looking those IPs up in DNSDB, Farsight’s Passive DNS system, but DNSDB may give you “too many” host names, old host names (unless you time fence your queries), or host names that have no readily accessible point of contact.
- You could also try using whois to figure out who controls those addresses, but often whois is frustratingly slow, or intentionally rate limited, or you may find yourself blocked altogether if you ask for too many records (the whois operator may assume you’re harvesting whois information for spamming or other nefarious purposes).**
The best option is often to just map those IP addresses to ASNs.
ASNs, or Autonomous System Numbers, represent networks routing a particular set of IP addresses. ASNs may be assigned to ISPs, large corporations, universities, government agencies, etc. They’re normally used for BGP, the protocol that glues local networks together into the Internet, but ASNs are also a convenient way of mapping IP addresses to responsible parties (as the saying goes, “If you route it, you’re responsible for it.”).
One tool for doing bulk mapping of IP addresses to ASNs is the free IP-to-ASN mapping facility generously provided by Team Cymru. [Thank you, Team Cymru!] In order to use it, we need to swap the columns in our file so that the IP address field comes first, and the associated count comes second. We’ll do that by saying:
$ awk '{print $2 " " $1}' < ch80-sorted.txt > ch80-sorted2.txt
As described on the Team Cymru IP-to-ASN page, we then need to edit the file of IP addresses by inserting a line literally reading BEGIN at the top of the file, and a line literally reading END at the bottom of the file. You can do that with any editor of your choice; we show how to do this with vim:
$ vim ch80-sorted2.txt
:1
O
BEGIN
<ESC>
:$
o
END
<ESC>
:wq
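If you’d rather not open an editor at all, the same BEGIN and END lines can be added from the shell. This is just a sketch under our own assumptions: the function name wrap_for_cymru and the output file name used in the note below are hypothetical, and the wrapped copy is written to a new file rather than modifying the original in place.

```shell
# wrap_for_cymru INFILE OUTFILE
# Copies INFILE to OUTFILE with a literal BEGIN line added at the top
# and a literal END line added at the bottom.
wrap_for_cymru() {
    {
        echo "BEGIN"
        cat "$1"
        echo "END"
    } > "$2"
}
```

For example, wrap_for_cymru ch80-sorted2.txt ch80-bulk.txt would leave ch80-sorted2.txt untouched and write the wrapped copy to ch80-bulk.txt, which you would then send to the whois server in place of the hand-edited file.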
You’re then ready to run that file through the Team Cymru whois server using netcat:
$ netcat whois.cymru.com 43 < ch80-sorted2.txt | sort -n > ch80-asn.txt
The resulting file, ch80-asn.txt, will then contain a list of records that look something like this:
[...]
4134 | 110.82.72.xxx | 54 | CHINANET-BACKBONE No.31,Jin-rong Street, CN
4134 | 110.83.255.xxx | 7 | CHINANET-BACKBONE No.31,Jin-rong Street, CN
4134 | 110.83.51.xxx | 11 | CHINANET-BACKBONE No.31,Jin-rong Street, CN
4134 | 110.83.87.xxx | 31 | CHINANET-BACKBONE No.31,Jin-rong Street, CN
[...]
The first field, 4134 in this case, is the ASN. The second vertical-bar-delimited field is the IP address (once again, lightly anonymized for the purpose of this article). The third column of numbers represents the number of “hits” or observations, passed along from our raw data file. The final column is a text description of the ASN appearing in the first column.
This format makes it easy to scan for IP addresses belonging to a particular ASN. You can also use the grep command with a suitable regular expression to extract just the records belonging to a particular ASN of interest.
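One way that grep step might look is sketched below. The function name records_for_asn is ours, not part of any tool, and the pattern assumes the ASN is the first thing on each line, followed by optional padding and a vertical bar, as in the sample records above. Anchoring the pattern that way keeps a search for 4134 from also matching a longer ASN such as 41340, or digits that happen to appear inside an IP address.

```shell
# records_for_asn ASN FILE
# Prints only the records in FILE whose leading ASN field is exactly ASN.
# Assumes the "ASN | IP | count | description" layout shown above.
records_for_asn() {
    asn=$1
    file=$2
    # ^ anchors at line start; [|] matches the literal field delimiter
    grep -E "^${asn}[[:space:]]*[|]" "$file"
}
```

For example, records_for_asn 4134 ch80-asn.txt would list just the CHINANET-BACKBONE records.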
Using PSPP To Statistically Analyze The Team Cymru-Format ASN-Augmented Data File
PSPP is part of the GNU System of Free/Open Source Software (F/OSS). It is a freely available statistical package for analyzing sampled data that shares the syntax of the proprietary statistical package SPSS.
In order to more readily read the Team Cymru ASN-augmented data into PSPP, we begin by subsetting and reformatting the data slightly by saying:
$ awk '{print "\"" $1 "\"," "\"" $3 "\"," $5}' < ch80-asn.txt > ch80-pspp.txt
That somewhat arcane-looking statement simply takes the first field from the Team Cymru output, puts it in quotes followed by a comma, and then repeats that for field 3, with field 5 being written as a simple integer. (Fields 2 and 4 are the vertical bars themselves, which awk counts as fields of their own when it splits each line on whitespace.) For example, the four previously mentioned records would now look like:
[...]
"4134","110.82.72.xxx",54
"4134","110.83.255.xxx",7
"4134","110.83.51.xxx",11
"4134","110.83.87.xxx",31
[...]
We can then read that data into PSPP by creating a small command file using an editor of our choice. Let’s call that file read-ch80-data.psp:
GET DATA /TYPE=TXT /FILE='ch80-pspp.txt' /DELIMITERS=","
  /VARIABLES=ASN A10 IP A20 COUNT F10.0.
WEIGHT BY COUNT.
SORT CASES BY ASN.
AGGREGATE OUTFILE=* MODE=REPLACE /PRESORTED /BREAK=ASN /TOTALHITS=N.
SORT CASES BY TOTALHITS (D).
PRINT OUTFILE='summary-by-asn.txt'/ TOTALHITS (F6.0,2X) ASN.
EXECUTE.
We can then run that command file by saying:
$ pspp --batch read-ch80-data.psp
When that finishes running, assuming there were no errors, we’ll have a new file called summary-by-asn.txt that lists the top ASNs by number of Conficker sinkhole hits. For the sample we pulled, the top ASNs in that file were:
$ head -6 summary-by-asn.txt
79422 "17974" <== TELKOMNET-AS2-AP, PT Telekomunikasi Indonesia
79311 "45899" <== VNPT-AS-VN, Vietnam Posts and Telecommunications (VNPT)
61492 "4134" <== CHINANET-BACKBONE, No.31,Jin-rong Street, Beijing CN
38408 "3356" <== LEVEL3, Level 3 Communications, Inc., Broomfield Colo.
26545 "7552" <== VIETEL-AS-AP, Viettel Corporation, Hanoi VN
19053 "18403" <== FPT-AS-AP, Corp for Financing and Promoting Technology VN
[The ASN descriptions aren’t included in the PSPP output; you may need to look those up in whois by checking the various regional registries, e.g., www.arin.net, www.apnic.net, www.ripe.net, www.lacnic.net, or www.afrinic.net. Note that the ASN is the quoted second value on each line, not the hit count in the first column. For example:
$ whois -h whois.apnic.net AS17974
]
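If PSPP isn’t handy, or you’d just like a cross-check on its totals, roughly the same per-ASN summary can be sketched with awk working directly on ch80-asn.txt. Treat this as an illustration rather than a replacement for the PSPP run: the function name hits_by_asn is ours, it assumes the “ASN | IP | count | description” layout shown earlier, and the extra column counting records per ASN is our own addition. Because each IP appears only once in the file, that record count is also the number of distinct IPs seen for the ASN.

```shell
# hits_by_asn FILE
# FILE is assumed to be in the "ASN | IP | count | description" layout.
# Prints "total_hits distinct_ips ASN", sorted by total hits, highest first.
hits_by_asn() {
    awk -F'|' '
        $1 + 0 > 0 {              # skip header or "NA" lines that have no numeric ASN
            asn = $1 + 0          # numeric conversion also strips the padding
            hits[asn] += $3       # sum the per-IP hit counts
            ips[asn]++            # one record per IP, so this counts distinct IPs
        }
        END { for (a in hits) print hits[a], ips[a], a }
    ' "$1" | sort -k1,1nr -k3,3n
}
```

For example, hits_by_asn ch80-asn.txt | head -6 should list the same top ASNs as the PSPP summary. If a record ever lists more than one ASN in its first field, this sketch only credits the first one.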
Obviously the top ASNs will vary from sample to sample, but at least based on the summary of the data we pulled, a substantial number of Conficker-infected hosts appear to be systems connected by ISPs in the APNIC (Asia Pacific) region. That said, there are also Conficker hosts in other parts of the world, and the specific distribution of IPs seen may depend on time of day as well as other factors.
When interpreting these values, it’s also important to realize that these values are not weighted according to the size of each network. For example, Level3 is among the largest of all networks, so it is probably not surprising that they have a substantial number of Conficker-infected customer systems given their overall size.
Conclusion
We hope that you will all take the time to check and make sure that your systems, and your family’s and friends’ systems, are not infected with Conficker. If you run an ISP security team or enterprise network, we also hope that you’ll do what you can to help stomp out Conficker on your users’ systems, too.
If you’re a security researcher, we hope you now see how easy it can be to work with Channel 80 data from the Farsight Security, Inc., Security Information Exchange. For more information about obtaining access to SIE or any other Farsight products, please contact the Farsight Security Sales department.
Notes
* These efforts of the CWG have been broadly acknowledged. One recent noteworthy example of this was M3AAWG’s decision to award the 2016 Mary Litynski Award for a lifetime of work fighting text spam, malware and DDoS attacks to Rodney Joffe, chairman of the Conficker Working Group. See this link for more detail.
** Another possibility for doing whois queries at scale would be to use the CyberToolBelt whois service. Farsight is a reseller of that product; please contact Farsight Sales for more information about it.
Joe St Sauver, Ph.D. is a Scientist with Farsight Security, Inc.