Audit TLS Version of Devices Connecting to Your Gear with a PCAP
I had a request come across my desk one day: we want to disable SSLv3 and TLS 1.0 on a set of 20+ servers, but would rather not rely on a scream test to see who will be affected. Can you tell us who is connecting to our gear with unsafe versions of this protocol?
I did some research to see if that information might be buried in the Windows Event Log somewhere, but it isn’t. I asked if it might be in an application log somewhere, but it isn’t. I took a quick look through our monitoring systems to see if there would be an easy answer, but there isn’t. Unfortunately, the best way to perform this audit, at least with my skillset, is to perform a packet capture on each server and extract the negotiated SSL/TLS version from each session.
So, I set up a packet capture (with the capture filter set to specific TCP ports and headers only) on each one of these servers and hoped that I could come up with an analysis script by the time I stopped those captures. I didn’t quite get there.
Main Points of This Post
These are the main things I wanted to do when writing this script, and the main points I want to make in this post:
- Using enums to avoid the dreaded “magic number” confusion
- Working with binary files in Python
- Being bold in the interests of high performance
Primary Structure of the Script
To help orient you (and myself), this is a brief description of how I structured the script.
First, I take in the pcap or directory of pcaps, and pass one of them into the meat of the script.
Next, I look into the global header of the file to determine the endianness of the file (thus ensuring my binary interpretations are accurate later) and to ensure the file contains Ethernet frames (to make sure my offsets work right).
I then pull a packet from the file into a byte array. I discovered that it is much faster to work with a byte array in memory than a file pointer in a multi-gigabyte capture file.
At this point, most packet dissection tools would extract all the header information from the packet and store it in some struct format. I can save a little bit of compute time by first checking to see if the packet is a TLS handshake, specifically a Client or Server Hello.
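As a rough illustration of that early check (not the script's actual code), here is a minimal sketch that decides whether a packet carries a Client or Server Hello before anything else is interpreted. The function name `tls_hello_offset` and the constant names are hypothetical, and the sketch assumes untagged Ethernet II frames carrying IPv4 and TCP.

```python
ETH_HDR_LEN = 14             # Ethernet II header without a VLAN tag (assumption)
IPV4_ETHERTYPE = b"\x08\x00"
TCP_PROTOCOL = 6
TLS_HANDSHAKE = 0x16         # TLS record content type: handshake
HELLO_TYPES = (0x01, 0x02)   # handshake types: ClientHello, ServerHello


def tls_hello_offset(packet):
    """Return the offset of the TLS record if the packet carries a
    Client or Server Hello over IPv4/TCP, otherwise None."""
    if len(packet) < ETH_HDR_LEN + 20 or packet[12:14] != IPV4_ETHERTYPE:
        return None
    if packet[ETH_HDR_LEN + 9] != TCP_PROTOCOL:
        return None
    # Only the two header-length nibbles are read; nothing else is decoded yet.
    ip_len = (packet[ETH_HDR_LEN] & 0x0F) * 4
    tcp_start = ETH_HDR_LEN + ip_len
    if len(packet) <= tcp_start + 12:
        return None
    tcp_len = (packet[tcp_start + 12] >> 4) * 4
    tls_start = tcp_start + tcp_len
    if len(packet) < tls_start + 6:
        return None
    if packet[tls_start] != TLS_HANDSHAKE:
        return None
    if packet[tls_start + 5] not in HELLO_TYPES:
        return None
    return tls_start


# Hand-built sample: Ethernet + 20-byte IPv4 + 20-byte TCP + start of a ClientHello.
hello = (bytes(12) + IPV4_ETHERTYPE
         + b"\x45" + bytes(8) + b"\x06" + bytes(10)
         + bytes(12) + b"\x50" + bytes(7)
         + b"\x16\x03\x01\x00\x2f\x01\x00\x00\x2b\x03\x01")
# Same packet with the content type changed to application data (0x17).
app_data = hello[:54] + b"\x17" + hello[55:]
```

Packets that fail any of these checks can be dropped immediately, so the address and port fields are only parsed for the small fraction of packets that matter.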
If I find out that I do care about the packet, then I'll grab the IP and TCP headers, extract the relevant details (5-tuple and TLS version), and save that in a list, which gets written to a CSV file after I've gone through the entire file.
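The extraction and CSV step could look something like the sketch below. It is an assumption-laden illustration rather than the script itself: `extract_row` and the column names are hypothetical, the header lengths are passed in as already computed, and the offsets (12 and 16 for the IP addresses, 9 for the handshake version) follow the constants listed later in this post.

```python
import csv
import io
import ipaddress

ETH_HDR = 14  # Ethernet II header length (assumes no VLAN tag)


def extract_row(packet, ip_len, tcp_len):
    """Return [src_ip, src_port, dst_ip, dst_port, protocol, tls_version_hex]."""
    ip_start = ETH_HDR
    tcp_start = ETH_HDR + ip_len
    tls_start = tcp_start + tcp_len
    # bytes(...) so this also works when the packet is a bytearray
    src_ip = str(ipaddress.IPv4Address(bytes(packet[ip_start + 12:ip_start + 16])))
    dst_ip = str(ipaddress.IPv4Address(bytes(packet[ip_start + 16:ip_start + 20])))
    src_port = int.from_bytes(packet[tcp_start:tcp_start + 2], byteorder="big")
    dst_port = int.from_bytes(packet[tcp_start + 2:tcp_start + 4], byteorder="big")
    version = bytes(packet[tls_start + 9:tls_start + 11]).hex()
    return [src_ip, src_port, dst_ip, dst_port, "tcp", version]


# Hand-built sample: 10.0.0.1:49152 -> 10.0.0.2:443, ClientHello with version 0x0301.
sample = (bytes(12) + b"\x08\x00"
          + b"\x45" + bytes(8) + b"\x06" + bytes(2)
          + bytes([10, 0, 0, 1]) + bytes([10, 0, 0, 2])
          + (49152).to_bytes(2, "big") + (443).to_bytes(2, "big")
          + bytes(8) + b"\x50" + bytes(7)
          + b"\x16\x03\x01\x00\x2f\x01\x00\x00\x2b\x03\x01")

rows = [extract_row(sample, 20, 20)]

# An in-memory buffer stands in for the real output file here.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["src_ip", "src_port", "dst_ip", "dst_port", "protocol", "tls_version"])
writer.writerows(rows)
```

Collecting rows in a list and writing them once at the end keeps file output out of the per-packet loop.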
Using Enums in Python
I had an embedded systems professor in college who always stressed "NO MAGIC NUMBERS IN YOUR CODE!" If he or a TA spotted a number in your code anywhere, it was points off. This sort of script, where I'm working at a relatively low level with binary files, seemed like a good application to try avoiding hard-to-read numbers.
The structure in Python to do this seemed to be the Enum. My list for this script is as follows:
from enum import IntEnum

class Constant(IntEnum):
    ETH_HDR = 14
    IHL_MASK = int('00001111',2)
    TCP_DATA_OFFSET_MASK = int('11110000',2)
    WORDS_TO_BYTES = 4
    DATA_OFFSET_BITWISE_SHIFT = 4
    TLS_RCD_LYR_LEN_OFFSET = 3
    TLS_RCD_LYR_LEN = 5
    IP_SRC_OFFSET = 12
    IP_DST_OFFSET = 16
    IP_ADDR_LEN = 4
    TCP_DST_OFFSET = 2
    TCP_PORT_LEN = 2
    TLS_HANDSHAKE_VER_OFFSET = 9
    TLS_VER_LEN = 2
To use these, I just had to reference Constant.ETH_HDR.
It was a good idea, but there were also lines in this script where it didn't contribute that much to readability. Below is one excellent example of that, where I used a ton of Enums to perform array slicing, bitwise comparisons, bitwise shifts, and mathematical operations in a single line. And there are somehow still magic numbers in there.
tcp_len = ((int.from_bytes(packet[Constant.ETH_HDR + ip_len + 12: Constant.ETH_HDR + ip_len + 12 + 1], byteorder = 'big') &
Constant.TCP_DATA_OFFSET_MASK) >> Constant.DATA_OFFSET_BITWISE_SHIFT) * Constant.WORDS_TO_BYTES
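For comparison, one possible way to split that line into named steps is sketched below. This is not how the script does it: the helper names `ip_header_len` and `tcp_header_len` and the constant `TCP_DATA_OFFSET_BYTE` (standing in for the literal 12) are hypothetical. It relies on the fact that indexing a bytes object or bytearray already yields an int, so `int.from_bytes` is not needed for a single byte, and shifting a single byte right by 4 makes the upper-nibble mask redundant.

```python
from enum import IntEnum


class Constant(IntEnum):
    ETH_HDR = 14
    IHL_MASK = 0b00001111
    TCP_DATA_OFFSET_BYTE = 12      # hypothetical name for the literal 12
    DATA_OFFSET_BITWISE_SHIFT = 4
    WORDS_TO_BYTES = 4


def ip_header_len(packet):
    """IPv4 header length in bytes, from the IHL nibble."""
    return (packet[Constant.ETH_HDR] & Constant.IHL_MASK) * Constant.WORDS_TO_BYTES


def tcp_header_len(packet, ip_len):
    """TCP header length in bytes, from the data offset nibble."""
    offset_byte = packet[Constant.ETH_HDR + ip_len + Constant.TCP_DATA_OFFSET_BYTE]
    return (offset_byte >> Constant.DATA_OFFSET_BITWISE_SHIFT) * Constant.WORDS_TO_BYTES


# Sample: IHL of 5 words (20 bytes), TCP data offset of 8 words (32 bytes).
sample = bytearray(60)
sample[14] = 0x45
sample[14 + 20 + 12] = 0x80
```

The trade-off is two extra function calls per packet, which may or may not matter at the capture sizes discussed later in this post.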
Binary Files In Python
Python is surprisingly good at dealing with binary files. All you need to do is include b in the mode string, the second argument of the open() function, and things are really nice. Example below:
file_handle = open(<file_name>,'rb+')
Once you do this, you have a file pointer that you can move around however you want using the seek() method. The first argument of that method is the desired offset, and the second argument is the place you want to offset from (0 to reference the beginning of the file, 1 to reference the current location, and 2 to reference the end of the file).
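A small sketch of how read() and seek() move the pointer follows. It uses an in-memory io.BytesIO buffer as a stand-in for a file opened in binary mode (an assumption made only so the example runs without a capture file on disk); the same calls work on a real binary file object.

```python
import io

# 32 bytes whose values equal their offsets: 0, 1, 2, ... 31
buf = io.BytesIO(bytes(range(32)))

first = buf.read(4)            # bytes 0..3; the pointer is now at offset 4
buf.seek(16, 1)                # 16 bytes forward from the current position
after_skip = buf.read(4)       # bytes 20..23; the pointer is now at offset 24
position_after_read = buf.tell()

buf.seek(0, 0)                 # back to the beginning of the file
position_after_rewind = buf.tell()
```

The tell() method reports the current offset, which is handy for checking that a sequence of reads and seeks landed where expected.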
A perfect example of how to use this is interpreting the headers in the PCAP file format. The global header (I used this site as a reference) has the following format:
typedef struct pcap_hdr_s {
    guint32 magic_number;   /* magic number */
    guint16 version_major;  /* major version number */
    guint16 version_minor;  /* minor version number */
    gint32  thiszone;       /* GMT to local correction */
    guint32 sigfigs;        /* accuracy of timestamps */
    guint32 snaplen;        /* max length of captured packets, in octets */
    guint32 network;        /* data link type */
} pcap_hdr_t;
From here, I care about the magic number and the data link type. The magic number tells me the endianness of the system, which is required to interpret any further data, and the data link type makes sure I'm dealing with Ethernet frames. If I'm not, all of my offsets are screwed up and I might as well abort now.
To read this information, I use the following code:
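def report_global_header(file_ptr):
    # Assumes file_ptr is at the head of the pcap file.
    # Returns (order, is_ethernet):
    #   order       - 'big' or 'little', the byte order of the file
    #   is_ethernet - True if the data link type is Ethernet (1)
    magic_num = file_ptr.read(4)
    magic_test = b'\xa1\xb2\xc3\xd4'
    if magic_num == magic_test:
        order = 'big'
    else:
        # any other magic number is treated as little-endian
        order = 'little'
    # skip version_major, version_minor, thiszone, sigfigs and snaplen
    # (2 + 2 + 4 + 4 + 4 = 16 bytes) to land on the network field
    file_ptr.seek(16,1)
    link_layer_type = file_ptr.read(4)
    ethernet = int.from_bytes(link_layer_type, byteorder = order)
    is_ethernet = ethernet == 1
    return order, is_ethernet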
The really important bits here are the read() and seek() methods, because these move the file pointer.
I start by reading the magic number, which moves the pointer 4 bytes forward. Then I seek() 16 bytes ahead of the current pointer location, bringing me to the network field.
I perform similar operations to interpret the packet header and read the packet to a byte array.
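A minimal sketch of that per-packet step is shown below, under the assumption of the classic pcap layout, where each packet is preceded by a 16-byte record header of four 4-byte fields (ts_sec, ts_usec, incl_len, orig_len). The generator name `read_packets` and the helper `_record` are hypothetical, and an in-memory buffer stands in for a real capture file.

```python
import io

PCAP_GLOBAL_HDR_LEN = 24
PCAP_RECORD_HDR_LEN = 16   # ts_sec, ts_usec, incl_len, orig_len (4 bytes each)


def read_packets(file_ptr, order):
    """Yield each captured packet as bytes.
    Assumes file_ptr is positioned just past the global header."""
    while True:
        header = file_ptr.read(PCAP_RECORD_HDR_LEN)
        if len(header) < PCAP_RECORD_HDR_LEN:
            return                      # end of file
        # incl_len is the number of packet bytes actually saved in the file
        incl_len = int.from_bytes(header[8:12], byteorder=order)
        packet = file_ptr.read(incl_len)
        if len(packet) < incl_len:
            return                      # truncated final record
        yield packet


def _record(data):
    """Build a little-endian record (zero timestamps) for the demo below."""
    return bytes(8) + len(data).to_bytes(4, "little") * 2 + data


demo = io.BytesIO(bytes(PCAP_GLOBAL_HDR_LEN) + _record(b"abc") + _record(b"hello"))
demo.seek(PCAP_GLOBAL_HDR_LEN)
packets = list(read_packets(demo, "little"))
```

Each read() call pulls a whole packet into memory in one go, so all later slicing happens on the in-memory bytes rather than through the file pointer.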
Being Bold for High Performance
I'll go ahead and qualify this entire section by pointing out that if I were truly trying to be bold for high performance, I would have written this in C.
For this script, I chose to not use an existing Python packet dissection tool like Scapy because of performance. Scapy, in particular, is slow and uses a ton of memory. And I wasn't able to find another pre-built tool that seemed like a significant increase in speed. I implemented my own packet dissection because I could have more control over what data I was pulling and when. I make a relatively small number of operations on packets that aren't relevant to my investigation, and I don't interpret a single piece of information that isn't necessary for the goal of the script.
By making small performance improvements over the course of development, I was able to process over 50GB of packet capture data in about 10 minutes with negligible impact on my laptop's CPU or RAM. Here are a few of the tactics I used to increase performance:
- Pulling the entire packet into a byte array instead of working with it in the file.
- Reorganizing the dissection to skip over the IP and TCP headers and pull the TLS content type first.
- Pushing interpretation as far back in the script as possible. If I can make my comparison against the binary representation of my desired value, I save cycles.
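The last tactic can be illustrated with a short sketch: the two version bytes are compared in their raw form inside the hot loop and only turned into a readable name when a report is produced. The names `WEAK_VERSIONS`, `is_weak`, and `version_name` are hypothetical; the byte values are the standard SSL/TLS protocol version numbers.

```python
# Raw two-byte version values as they appear on the wire.
TLS_VERSION_NAMES = {
    b"\x03\x00": "SSLv3",
    b"\x03\x01": "TLS 1.0",
    b"\x03\x02": "TLS 1.1",
    b"\x03\x03": "TLS 1.2",   # TLS 1.3 hellos also carry 0x0303 in this field
}
WEAK_VERSIONS = {b"\x03\x00", b"\x03\x01"}


def is_weak(version_bytes):
    """Cheap per-packet check: a set lookup on the raw bytes, no conversion."""
    return bytes(version_bytes) in WEAK_VERSIONS


def version_name(version_bytes):
    """Readable name, only needed once per reported row."""
    raw = bytes(version_bytes)
    return TLS_VERSION_NAMES.get(raw, "unknown (0x" + raw.hex() + ")")
```

The bytes() call is there so the lookup also works on a bytearray slice, which is not hashable on its own.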
Conclusion
I was able to develop a script (available here) that could interpret a pcap-based packet capture to audit the SSL/TLS version that other systems use to connect to it. This gave my colleague a nice list of application owners to talk into upgrading TLS before turning off support on his side. He successfully disabled support for the unsafe protocol versions on all of his servers without impacting any applications that relied on them.
In developing this script, I was able to mess with binary files and work with binary representations of data in Python, discovering that Python makes it surprisingly easy. I was also able to implement the wise advice of an old college professor and (mostly) avoid magic numbers in my code. It was a fun project!