In this section and the upcoming ones, we will take a closer look at the
states and how they are handled for each of the three basic protocols:
TCP, UDP and ICMP. We will also look at how connections are handled by
default if they cannot be classified as any of these three protocols. We
have chosen to start with the TCP protocol, since it is a stateful protocol
in itself and has a lot of interesting details with regard to the state
machine in iptables.

A TCP connection is always initiated with the 3-way handshake, which
establishes and negotiates the actual connection over which data will be
sent. The session begins with a SYN packet, followed by a SYN/ACK packet and
finally an ACK packet that acknowledges the whole session establishment. At
this point the connection is established and able to start sending data. The
big question is how connection tracking hooks into this. Quite simply,
really.

As far as the user is concerned, connection tracking works basically the
same for all connection types. Have a look at the picture below to see
exactly what state the stream enters during the different stages of the
connection. As you can see, from the user's viewpoint the connection
tracking code does not really follow the flow of the TCP connection. Once it
has seen one packet (the SYN), it considers the connection NEW. Once it sees
the return packet (the SYN/ACK), it considers the connection ESTABLISHED. If
you think about this for a second, you will understand why. With this
particular implementation, you can allow NEW and ESTABLISHED packets to
leave your local network, allow only ESTABLISHED connections back in, and
that will work perfectly. Conversely, if the connection tracking machine
were to consider the whole connection establishment as NEW, we would never
really be able to stop outside connections to our local network, since we
would have to allow NEW packets back in again. To make things more
complicated, there are a number of other internal states that are used for
TCP connections inside the kernel but are not available to us in user-land.
Roughly, they follow the state standards specified in RFC 793 - Transmission
Control Protocol, on pages 21-23. We will consider these in more detail
further along.
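
The user-land view described above can be sketched in a few lines of Python.
This is only a toy model under stated assumptions, not how the kernel
implements it: the names Flow, observe and allowed are hypothetical, and
"lan" and "wan" simply label the two sides of the firewall.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """One tracked connection, identified by who sent the first packet."""
    originator: str
    replied: bool = False

    def observe(self, sender: str) -> str:
        """Return the user-land state of a packet sent by `sender`."""
        if sender != self.originator:
            self.replied = True  # the first return packet flips the state
        return "ESTABLISHED" if self.replied else "NEW"


def allowed(direction: str, state: str) -> bool:
    """Policy from the text: NEW and ESTABLISHED out, only ESTABLISHED in."""
    if direction == "out":
        return state in ("NEW", "ESTABLISHED")
    return state == "ESTABLISHED"


# A handshake started from the local network passes in both directions.
flow = Flow(originator="lan")
handshake = [("lan", "out", "SYN"), ("wan", "in", "SYN/ACK"), ("lan", "out", "ACK")]
for sender, direction, packet in handshake:
    state = flow.observe(sender)
    print(packet, state, allowed(direction, state))

# A connection attempt from the outside is NEW on its way in and is refused.
outside = Flow(originator="wan")
print(allowed("in", outside.observe("wan")))  # prints False
```

Because the SYN/ACK is already counted as ESTABLISHED, the inbound rule never
has to accept NEW packets, which is the point made in the paragraph above.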
As you can see, it is really quite simple, seen from the user's point of view.
However, looking at the whole construction from the kernel's point of view,
it's a little more difficult. Let's look at an example. Consider exactly how
the connection states change in the
/proc/net/ip_conntrack table. The first state is reported
upon receipt of the first SYN packet in a connection.
tcp 6 117 SYN_SENT src=192.168.1.5 dst=192.168.1.35 sport=1031 \
dport=23 [UNREPLIED] src=192.168.1.35 dst=192.168.1.5 sport=23 \
As you can see from the above entry, we have a precise state in which a SYN
packet has been sent (the SYN_SENT state is reported), and to which no reply
has been seen as yet (witness the [UNREPLIED] flag). The next internal state
will be reached when we see another packet in the other direction.
tcp 6 57 SYN_RECV src=192.168.1.5 dst=192.168.1.35 sport=1031 \
dport=23 src=192.168.1.35 dst=192.168.1.5 sport=23 dport=1031 \
Now we have received a corresponding SYN/ACK in return. As soon as this
packet has been received, the state changes once again, this time to
SYN_RECV. SYN_RECV tells us that the original SYN was delivered correctly
and that the SYN/ACK return packet also got through the firewall properly.
Moreover, this connection tracking entry has now seen traffic in both
directions and is hence considered as having been replied to. This is not
shown explicitly, as the [UNREPLIED] flag was above; it is implied by the
absence of that flag. The final step will be reached once we have seen the
final ACK in the 3-way handshake.
tcp 6 431999 ESTABLISHED src=192.168.1.5 dst=192.168.1.35 \
sport=1031 dport=23 src=192.168.1.35 dst=192.168.1.5 \
sport=23 dport=1031 [ASSURED] use=1
In the last example, we have received the final ACK in the 3-way handshake
and the connection has entered the ESTABLISHED state, as far as the internal
mechanisms of iptables are aware. Normally, the stream will be [ASSURED] by
now.
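
The three entries above can be summarized as a small transition table. The
sketch below is illustrative only: HANDSHAKE and track are hypothetical
names, "orig" and "reply" label the two directions, and the markers simply
mirror what the example entries show rather than the kernel's actual rules.

```python
HANDSHAKE = {
    # (current internal state, sender, TCP flags) -> next internal state
    (None, "orig", "SYN"): "SYN_SENT",
    ("SYN_SENT", "reply", "SYN/ACK"): "SYN_RECV",
    ("SYN_RECV", "orig", "ACK"): "ESTABLISHED",
}


def track(packets):
    """Yield (internal state, marker) after each handshake packet.

    Any packet that does not fit the table raises KeyError, since this
    sketch only covers the normal 3-way handshake.
    """
    state = None
    for sender, flags in packets:
        state = HANDSHAKE[(state, sender, flags)]
        if state == "SYN_SENT":
            marker = "[UNREPLIED]"  # nothing seen in the other direction yet
        elif state == "ESTABLISHED":
            marker = "[ASSURED]"    # the normal case, as in the last entry
        else:
            marker = ""             # replied, but nothing is printed for it
        yield state, marker


for state, marker in track([("orig", "SYN"), ("reply", "SYN/ACK"), ("orig", "ACK")]):
    print(state, marker)
```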

A connection may also enter the ESTABLISHED state without being [ASSURED].
This happens if we have connection pickup turned on (this requires the
tcp-window-tracking patch, and ip_conntrack_tcp_loose set to 1 or higher).
Without the tcp-window-tracking patch, this behaviour is the default and
cannot be changed.
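
To check such details in practice, it helps to split an entry into fields.
The following is a minimal sketch that assumes the layout of the TCP entries
shown above; parse_entry and its dictionary keys are hypothetical names, and
the third field is taken to be the remaining lifetime of the entry in
seconds.

```python
def parse_entry(line: str) -> dict:
    """Split one TCP conntrack entry into its visible fields."""
    # Entries are sometimes wrapped with a trailing backslash; treat it as space.
    tokens = line.replace("\\", " ").split()
    entry = {
        "proto": tokens[0],
        "proto_num": int(tokens[1]),
        "timeout": int(tokens[2]),  # assumed: seconds until the entry expires
        "state": tokens[3],         # internal TCP state, e.g. SYN_SENT
        "flags": [],
        "original": {},
        "reply": {},
        "other": {},
    }
    for tok in tokens[4:]:
        if tok.startswith("[") and tok.endswith("]"):
            entry["flags"].append(tok[1:-1])
            continue
        key, _, value = tok.partition("=")
        if key in ("src", "dst", "sport", "dport"):
            # The first occurrence of a key describes the original direction,
            # the second one the expected reply.
            side = "reply" if key in entry["original"] else "original"
            entry[side][key] = value
        else:
            entry["other"][key] = value
    return entry


sample = (
    "tcp 6 431999 ESTABLISHED src=192.168.1.5 dst=192.168.1.35 "
    "sport=1031 dport=23 src=192.168.1.35 dst=192.168.1.5 "
    "sport=23 dport=1031 [ASSURED] use=1"
)
entry = parse_entry(sample)
print(entry["state"], "ASSURED" in entry["flags"])  # prints: ESTABLISHED True
```

An entry that is ESTABLISHED but lacks ASSURED in its flags would be the
connection pickup case described above.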
When a TCP connection is closed down, it is done in
the following way and takes the following states.
As you can see, the connection is never really closed until the last
ACK is sent. Do note that this picture only describes
how it is closed down under normal circumstances. A connection may also, for
example, be closed by sending a RST (reset) if the connection is refused. In
this case, the connection would be closed down immediately.

When the TCP connection has been closed down, the connection enters the
TIME_WAIT state, which by default lasts 2 minutes. This allows packets that
have gotten out of order to still get through our rule-set, even after the
connection has already closed. It serves as a kind of buffer time, so that
packets that have gotten stuck in one or another congested router can still
get to the firewall, or to the other end of the connection.

If the connection is reset by a RST packet, the state is changed to CLOSE.
This means that the connection by default has 10 seconds before the whole
connection is definitely closed down. RST packets are not acknowledged in
any sense, and will break the connection directly. There are also other
states than the ones we have told you about so far. Here is the complete
list of possible states that a TCP stream may take, and their timeout
values.
Table 7-2. Internal states
These values are most definitely not absolute. They may change with kernel
revisions, and they may also be changed via the proc file-system in the
/proc/sys/net/ipv4/netfilter/ip_ct_tcp_* variables. The
default values should, however, be fairly well established in practice. These
values are set in seconds. Early versions of the patch used jiffies (which
was a bug).
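
To see which values are in effect on a given machine, the variables can be
read straight from the proc file-system. The sketch below assumes only the
directory and the ip_ct_tcp_* prefix mentioned above; read_tcp_variables is a
hypothetical helper, and on kernels that do not expose this interface it
simply returns an empty dictionary.

```python
import glob
import os

# Directory taken from the text; it may not exist on every kernel.
PROC_DIR = "/proc/sys/net/ipv4/netfilter"


def read_tcp_variables(directory: str = PROC_DIR) -> dict:
    """Map each ip_ct_tcp_* file name to the integer it contains."""
    values = {}
    for path in sorted(glob.glob(os.path.join(directory, "ip_ct_tcp_*"))):
        try:
            with open(path) as handle:
                values[os.path.basename(path)] = int(handle.read().strip())
        except (OSError, ValueError):
            continue  # unreadable or non-numeric file: skip it
    return values


if __name__ == "__main__":
    for name, value in read_tcp_variables().items():
        print(f"{name} = {value}")
```

Changing a value means writing a new number to the corresponding file as
root, which this read-only sketch deliberately does not do.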

Also note that the user-land side of the state machine does not look at the
TCP flags (i.e., RST, ACK, and SYN) set in the TCP packets. This is
generally bad, since you may want to allow packets in the NEW state to get
through the firewall, but when you specify the NEW state, you will in most
cases mean SYN packets. This is not what happens with the current state
implementation; instead, even a packet with no bit set, or with only the ACK
flag set, will count as NEW. This can be used for redundant firewalling and
so on, but it is generally extremely bad on your home network, where you
only have a single firewall. To get around this behavior, you could use the
command explained in the State NEW packets but no SYN bit set section of the
Common problems and questions appendix.
Another way is to install the tcp-window-tracking extension
from patch-o-matic, and set the
zero, which will make the firewall drop all NEW packets with anything but the
SYN flag set.
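
The check discussed above, letting only real connection requests count as
valid NEW packets, can be sketched as follows. This is a model of the idea
and not the actual rule or patch: is_syn_only and verdict are hypothetical
names, and "SYN only" is taken to mean SYN set with ACK, RST and FIN clear,
which is how the iptables --syn match is defined.

```python
def is_syn_only(flags: set) -> bool:
    """True when SYN is set and ACK, RST and FIN are all clear."""
    return "SYN" in flags and not flags & {"ACK", "RST", "FIN"}


def verdict(state: str, flags: set) -> str:
    """Drop NEW TCP packets that do not look like a connection request."""
    if state == "NEW" and not is_syn_only(flags):
        return "DROP"
    return "CONTINUE"  # hand the packet over to the remaining rules


print(verdict("NEW", {"SYN"}))          # prints CONTINUE
print(verdict("NEW", {"ACK"}))          # prints DROP
print(verdict("NEW", set()))            # prints DROP
print(verdict("ESTABLISHED", {"ACK"}))  # prints CONTINUE
```

Packets in any other state are left alone, so established traffic is not
affected by the extra check.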