WireShark - 9.4. How to reassemble split packets

9.4. How to reassemble split packets
Prev	Chapter 9. Packet dissection	Next

9.4. How to reassemble split packets

Some protocols have times when they have to split a large packet across multiple other packets. In this case the dissection can't be carried out correctly until you have all the data. The first packet doesn't have enough data, and the subsequent packets don't have the expect format. To dissect these packets you need to wait until all the parts have arrived and then start the dissection.

9.4.1. How to reassemble split UDP packets

As an example, let's examine a protocol that is layered on top of UDP that splits up its own data stream. If a packet is bigger than some given size, it will be split into chunks, and somehow signaled within its protocol.

To deal with such streams, we need several things to trigger from. We need to know that this packet is part of a multi-packet sequence. We need to know how many packets are in the sequence. We also need to know when we have all the packets.

For this example we'll assume there is a simple in-protocol signaling mechanism to give details. A flag byte that signals the presence of a multi-packet sequence and also the last packet, followed by an ID of the sequence and a packet sequence number.

msg_pkt ::= SEQUENCE {
	.....
	flags ::= SEQUENCE {
		fragment	BOOLEAN,
		last_fragment	BOOLEAN,
	.....
	}
	msg_id	INTEGER(0..65535),
	frag_id	INTEGER(0..65535),
	.....
}

Example 9.14. Reassembling fragments - Part 1

#include <epan/reassemble.h>
   ...
save_fragmented = pinfo->fragmented;
flags = tvb_get_guint8(tvb, offset); offset++;
if (flags & FL_FRAGMENT) { /* fragmented */
	tvbuff_t* new_tvb = NULL;
	fragment_data *frag_msg = NULL;
	guint16 msg_seqid = tvb_get_ntohs(tvb, offset); offset += 2;
	guint16 msg_num = tvb_get_ntohs(tvb, offset); offset += 2;

	pinfo->fragmented = TRUE;
	frag_msg = fragment_add_seq_check(tvb, offset, pinfo,
		msg_seqid, /* ID for fragments belonging together */
		msg_fragment_table, /* list of message fragments */
		msg_reassembled_table, /* list of reassembled messages */
		msg_num, /* fragment sequence number */
		tvb_length_remaining(tvb, offset), /* fragment length - to the end */
		flags & FL_FRAG_LAST); /* More fragments? */

We start by saving the fragmented state of this packet, so we can restore it later. Next comes some protocol specific stuff, to dig the fragment data out of the stream if it's present. Having decided it is present, we let the function fragment_add_seq_check do its work. We need to provide this with a certain amount of data.

The tvb buffer we are dissecting.
The offset where the partial packet starts.
The provided packet info.
The sequence number of the fragment stream. There may be several streams of fragments in flight, and this is used to key the relevant one to be used for reassembly.
The msg_fragment_table and the msg_reassembled_table are variables we need to declare. We'll consider these in detail later.
msg_num is the packet number within the sequence.
The length here is specified as the rest of the tvb as we want the rest of the packet data.
Finally a parameter that signals if this is the last fragment or not. This might be a flag as in this case, or there may be a counter in the protocol.

Example 9.15. Reassembling fragments part 2

	   
	new_tvb = process_reassembled_data(tvb, offset, pinfo,
		"Reassembled Message", frag_msg, &msg_frag_items,
		NULL, msg_tree);

	if (frag_msg) { /* Reassembled */
		if (check_col(pinfo->cinfo, COL_INFO))
			col_append_str(pinfo->cinfo, COL_INFO,
			" (Message Reassembled)");
	} else { /* Not last packet of reassembled Short Message */
		if (check_col(pinfo->cinfo, COL_INFO))
			col_append_fstr(pinfo->cinfo, COL_INFO,
			" (Message fragment %u)", msg_num);
	}

	if (new_tvb) { /* take it all */
		next_tvb = new_tvb;
	} else { /* make a new subset */
	 	next_tvb = tvb_new_subset(tvb, offset, -1, -1);
	}
}
else { /* Not fragmented */
	next_tvb = tvb_new_subset(tvb, offset, -1, -1);
}

.....
pinfo->fragmented = save_fragmented;

Having passed the fragment data to the reassembly handler, we can now check if we have the whole message. If there is enough information, this routine will return the newly reassembled data buffer.

After that, we add a couple of informative messages to the display to show that this is part of a sequence. Then a bit of manipulation of the buffers and the dissection can proceed. Normally you will probably not bother dissecting further unless the fragments have been reassembled as there won't be much to find. Sometimes the first packet in the sequence can be partially decoded though if you wish.

Now the mysterious data we passed into the fragment_add_seq_check.

Example 9.16. Reassembling fragments - Initialisation

static GHashTable *msg_fragment_table = NULL;
static GHashTable *msg_reassembled_table = NULL;

static void
msg_init_protocol(void)
{
	fragment_table_init(&msg_fragment_table);
	reassembled_table_init(&msg_reassembled_table);
}

First a couple of hash tables are declared, and these are initialised in the protocol initialisation routine. Following that, a fragment_items structure is allocated and filled in with a series of ett items, hf data items, and a string tag. The ett and hf values should be included in the relevant tables like all the other variables your protocol may use. The hf variables need to be placed in the structure something like the following. Of course the names may need to be adjusted.

Example 9.17. Reassembling fragments - Data

...
static int hf_msg_fragments = -1;
static int hf_msg_fragment = -1;
static int hf_msg_fragment_overlap = -1;
static int hf_msg_fragment_overlap_conflicts = -1;
static int hf_msg_fragment_multiple_tails = -1;
static int hf_msg_fragment_too_long_fragment = -1;
static int hf_msg_fragment_error = -1;
static int hf_msg_reassembled_in = -1;
...
static gint ett_msg_fragment = -1;
static gint ett_msg_fragments = -1;
...
static const fragment_items msg_frag_items = {
	/* Fragment subtrees */
	&ett_msg_fragment,
	&ett_msg_fragments,
	/* Fragment fields */
	&hf_msg_fragments,
	&hf_msg_fragment,
	&hf_msg_fragment_overlap,
	&hf_msg_fragment_overlap_conflicts,
	&hf_msg_fragment_multiple_tails,
	&hf_msg_fragment_too_long_fragment,
	&hf_msg_fragment_error,
	/* Reassembled in field */
	&hf_msg_reassembled_in,
	/* Tag */
	"Message fragments"
};
...
static hf_register_info hf[] =
{
...
{&hf_msg_fragments,
	{"Message fragments", "msg.fragments",
	FT_NONE, BASE_NONE, NULL, 0x00,	NULL, HFILL } },
{&hf_msg_fragment,
	{"Message fragment", "msg.fragment",
	FT_FRAMENUM, BASE_NONE, NULL, 0x00, NULL, HFILL } },
{&hf_msg_fragment_overlap,
	{"Message fragment overlap", "msg.fragment.overlap",
	FT_BOOLEAN, BASE_NONE, NULL, 0x00, NULL, HFILL } },
{&hf_msg_fragment_overlap_conflicts,
	{"Message fragment overlapping with conflicting data",
	"msg.fragment.overlap.conflicts",
	FT_BOOLEAN, BASE_NONE, NULL, 0x00, NULL, HFILL } },
{&hf_msg_fragment_multiple_tails,
	{"Message has multiple tail fragments",
	"msg.fragment.multiple_tails",
	FT_BOOLEAN, BASE_NONE, NULL, 0x00, NULL, HFILL } },
{&hf_msg_fragment_too_long_fragment,
	{"Message fragment too long", "msg.fragment.too_long_fragment",
	FT_BOOLEAN, BASE_NONE, NULL, 0x00, NULL, HFILL } },
{&hf_msg_fragment_error,
	{"Message defragmentation error", "msg.fragment.error",
	FT_FRAMENUM, BASE_NONE, NULL, 0x00, NULL, HFILL } },
{&hf_msg_reassembled_in,
	{"Reassembled in", "msg.reassembled.in",
	FT_FRAMENUM, BASE_NONE, NULL, 0x00, NULL, HFILL } },
...
static gint *ett[] =
{
...
&ett_msg_fragment,
&ett_msg_fragments
...

These hf variables are used internally within the reassembly routines to make useful links, and to add data to the dissection. It produces links from one packet to another - such as a partial packet having a link to the fully reassembled packet. Likewise there are back pointers to the individual packets from the reassembled one. The other variables are used for flagging up errors.

9.4.2. How to reassemble split TCP Packets

A dissector gets a tvbuff_t pointer which holds the payload of a TCP packet. This payload contains the header and data of your application layer protocol.

When dissecting an application layer protocol you cannot assume that each TCP packet contains exactly one application layer message. One application layer message can be split into several TCP packets.

You also cannot assume that a TCP packet contains only one application layer message and that the message header is at the start of your TCP payload. More than one messages can be transmitted in one TCP packet, so that a message can start at an arbitrary position.

This sounds complicated, but there is a simple solution. tcp_dissect_pdus() does all this tcp packet reassembling for you. This function is implemented in epan/dissectors/packet-tcp.h.

Example 9.18. Reassembling TCP fragments

#ifdef HAVE_CONFIG_H
# include "config.h"
#endif

#include <epan/packet.h>
#include <epan/emem.h>
#include <epan/dissectors/packet-tcp.h>
#include <epan/prefs.h>

...

#define FRAME_HEADER_LEN 8

/* The main dissecting routine */
static void dissect_foo(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree)
{
    tcp_dissect_pdus(tvb, pinfo, tree, TRUE, FRAME_HEADER_LEN,
                     get_foo_message_len, dissect_foo_message);
}

/* This method dissects fully reassembled messages */
static void dissect_foo_message(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree)
{
    /* TODO: implement your dissecting code */
}

/* determine PDU length of protocol foo */
static guint get_foo_message_len(packet_info *pinfo, tvbuff_t *tvb, int offset)
{
    /* TODO: change this to your needs */
    return (guint)tvb_get_ntohl(tvb, offset+4); /* e.g. length is at offset 4 */
}

...

As you can see this is really simple. Just call tcp_dissect_pdus() in your main dissection routine and move you message parsing code into another function. This function gets called whenever a message has been reassembled.

The parameters tvb , pinfo and tree are just handed over to tcp_dissect_pdus(). The 4th parameter is a flag to indicate if the data should be reassembled or not. This could be set according to a dissector preference as well. Parameter 5 indicates how much data has at least to be available to be able to determine the length of the foo message. Parameter 6 is a function pointer to a method that returns this length. It gets called when at least the number of bytes given in the previous parameter is available. Parameter 7 is a function pointer to your real message dissector.

Prev	Up	Next
9.3. How to handle transformed data	Home	9.5. How to tap protocols