The Internet Protocol (IP) is a data-oriented protocol that allows multiple hosts to talk to each other across network connections. Data in an IP network are sent in blocks referred to as packets or datagrams. Each packet carries a source and a destination address; the source and destination ports associated with a communication belong to the protocols layered on top of IP. These are typically transport protocols, and two of them are heavily used: the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP). TCP is a stateful, connection-oriented delivery mechanism that adds reliable delivery on top of IP’s best-effort service. If a packet is lost or arrives corrupted, it is retransmitted, and the receiver can handle out-of-order delivery of packets. This makes TCP very good for data that cannot tolerate corruption or loss, in applications like patch delivery, email, network file shares, and web pages. It is not very good for streaming video or voice over IP applications.
A packet filter is a piece of software which looks at the header of packets as they pass through, and decides the fate of the entire packet. It might decide to deny the packet (i.e. discard the packet as if it had never received it), accept the packet (i.e. let the packet go through), or reject the packet (like deny, but tell the source of the packet that it has done so).
Under Linux, packet filtering is built into the kernel (as a kernel module at the moment), and there are a few trickier things we can do with packets, but the general principle of looking at the headers and deciding the fate of the packet is still there.
Why Would I Want to Packet Filter?
Control: when you are using a Linux box to connect your internal network to another network (say, the Internet) you have an opportunity to allow certain types of traffic and disallow others.
Security: simply don’t let anyone connect in, by having the packet filter reject incoming packets used to set up connections.
Watchfulness: Sometimes a badly configured machine on the local network will decide to spew packets to the outside world. It’s nice to tell the packet filter to let you know if anything abnormal occurs; maybe you can do something about it, or maybe you’re just curious by nature.
Iptables is used to set up, maintain, and inspect the tables of IP packet filter rules in the Linux kernel. Several different tables may be defined. Each table contains a number of built-in chains and may also contain user-defined chains. Each chain is a list of rules which can match a set of packets. Each rule specifies what to do with a packet that matches. This is called a target, which may be a jump to a user-defined chain in the same table.
Iptables defines five hook points in the kernel’s packet processing pathways: PREROUTING, INPUT, FORWARD, POSTROUTING and OUTPUT. Built-in chains are attached to these hook points, and you can add a sequence of rules for each hook point. Each rule represents an opportunity to affect or monitor packet flow. Each hook point lets you process packets at a particular stage:
* FORWARD – packets that flow through a gateway computer, coming in one interface and going right back out another.
* INPUT – packets just before they are delivered to a local process.
* OUTPUT – packets just after they are generated by a local process.
* POSTROUTING – packets just before they leave a network interface.
* PREROUTING – packets just as they arrive from a network interface (after dropping any packets resulting from the interface being in promiscuous mode, and after checksum validation).
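As a rough illustration of how the hook points line up, the sketch below prints the order in which a packet visits the built-in chains for three common cases. The function name and the three path labels are hypothetical, chosen only for this example.

```shell
#!/bin/sh
# Print the built-in chains (hook points) a packet visits, in order.
# The path names "incoming", "forwarded" and "outgoing" are labels
# invented for this sketch; they are not iptables keywords.
chain_path() {
    case "$1" in
        incoming)  echo "PREROUTING INPUT" ;;                # addressed to a local process
        forwarded) echo "PREROUTING FORWARD POSTROUTING" ;;  # routed through this host
        outgoing)  echo "OUTPUT POSTROUTING" ;;              # generated by a local process
        *) echo "unknown path: $1" >&2; return 1 ;;
    esac
}

chain_path forwarded
```

Which tables actually have a chain at each hook point depends on the table.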
Options used in Iptables
There are currently three independent tables (which tables are present at any time depends on the kernel configuration options and which modules are present). The -t (--table) option specifies the packet matching table which the command should operate on. If the kernel is configured with automatic module loading, an attempt will be made to load the appropriate module for that table if it is not already there.
The tables are as follows:
* filter – This is the default table (if no -t option is passed). It contains the built-in chains INPUT (for packets coming into the box itself), FORWARD (for packets being routed through the box), and OUTPUT (for locally-generated packets).
* nat – This table is consulted when a packet that creates a new connection is encountered. It consists of three built-ins: PREROUTING (for altering packets as soon as they come in), OUTPUT (for altering locally-generated packets before routing), and POSTROUTING (for altering packets as they are about to go out).
* mangle – This table is used for specialized packet alteration. Until kernel 2.4.17 it had two built-in chains: PREROUTING (for altering incoming packets before routing) and OUTPUT (for altering locally-generated packets before routing). Since kernel 2.4.18, three other built-in chains are also supported: INPUT (for packets coming into the box itself), FORWARD (for altering packets being routed through the box), and POSTROUTING (for altering packets as they are about to go out).
By default, each table has chains, which are initially empty, for some or all of the hook points. In addition, you can create your own custom chains to organize your rules. A chain’s policy determines the fate of packets that reach the end of the chain without otherwise being sent to a specific target. Only the built-in targets ACCEPT and DROP can be used as the policy for a built-in chain, and the default is ACCEPT. All user-defined chains have an implicit policy of RETURN that cannot be changed. If you want a more complicated policy for a built-in chain, or a policy other than RETURN for a user-defined chain, you can add a rule to the end of the chain that matches all packets, with any target you like. You can then set the chain’s policy to DROP in case you make a mistake in your catch-all rule, or wish to filter out traffic while you modify the catch-all rule (by deleting it and re-adding it with changes).
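A minimal sketch of the policy-plus-catch-all pattern described above might look like this. The ipt wrapper and the function name are hypothetical helpers that only print the commands; the choice of REJECT for the catch-all rule and of the lo interface is an assumption made for illustration.

```shell
#!/bin/sh
# Dry-run wrapper: prints each command instead of running it.
# To apply the rules for real (as root), change the body to: iptables "$@"
ipt() { echo "iptables $*"; }

default_deny_input() {
    ipt -P INPUT DROP              # policy: safety net, only ACCEPT or DROP allowed
    ipt -A INPUT -i lo -j ACCEPT   # ordinary rule: allow loopback traffic
    ipt -A INPUT -j REJECT         # catch-all: no match criteria, so it matches everything
}

default_deny_input
```

Because the catch-all rule must stay last, any later rule would have to be inserted before it (with -I) rather than appended.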
An iptables rule consists of zero or more match criteria that determine which network packets it affects (all match options must be satisfied for the rule to match a packet) and a target specification that determines how the network packets will be affected. The system maintains packet and byte counters for every rule. Every time a packet reaches a rule and matches the rule criteria, the packet counter is incremented, and the byte counter is increased by the size of the matching packet. Both the match and the target portions of a rule are optional. If there is no target specification, nothing is done to the packets (processing proceeds as if the rule did not exist, except that the packet and byte counters are updated). You can add such a null rule to the FORWARD chain of the filter table with the command: iptables -t filter -A FORWARD
There are a variety of matches available for use with iptables, although some are available only for kernels with certain features enabled. Generic Internet Protocol (IP) matches (such as protocol, source or destination address) are applicable to any IP packet. In addition to the generic matches, iptables includes many specialized matches available through dynamically loaded extensions (use the iptables -m or --match option to tell iptables you want to use one of these extensions). There is one match extension for dealing with a networking layer below the IP layer: the mac match extension, which matches based on the Ethernet MAC address.
Targets are used to specify the action to take when a rule matches a packet and also to specify chain policies. Four targets are built into iptables, and extension modules provide others.
ACCEPT: Let the packet through to the next stage of processing. Stop traversing the current chain and start at the next stage.
DROP: Discontinue processing the packet completely. Do not check it against any other rules, chains, or tables. If you want to provide some feedback to the sender, use the REJECT target extension.
QUEUE: Send the packet to userspace (i.e. code not in the kernel).
RETURN: From a rule in a user-defined chain, discontinue processing this chain, and resume traversing the calling chain at the rule following the one that had this chain as its target. From a rule in a built-in chain, discontinue processing the packet and apply the chain’s policy to it.
Main Files used in IPTables
* /etc/init.d/iptables – init script to start|stop|restart the service (and save rulesets).
* /etc/sysconfig/iptables – Red Hat’s file for the iptables-save output (i.e. the saved rulesets).
* /sbin/iptables – The administration utility/binary.
# iptables --list
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
This is what you will see when there are no rule sets in place. Looking at this we see 3 Chains.
* INPUT – Holds rules for traffic directed at this server.
* FORWARD – Holds rules for traffic that will be forwarded on to an IP behind this server (i.e. if this box serves as a firewall for other servers).
* OUTPUT – Holds rules for traffic that is coming from this server out to the internet.
Mainly we will be dealing with traffic directed at this server and will be issuing rules for the INPUT Chain. When traffic passes through the kernel, it determines a TARGET based on whether the packet matches a rule or not. General targets are:
* ACCEPT – Traffic is accepted for delivery.
* REJECT – Traffic is rejected, sending a packet back to the sending host.
* DROP – The traffic is dropped. Nothing is sent back to the sending host.
Configuring Rule Sets
This is the bread-and-butter of packet filtering: manipulating rules. Most commonly, you will probably use the append (-A) and delete (-D) commands. The others (-I for insert and -R for replace) are simple extensions of these concepts. Each rule specifies a set of conditions the packet must meet, and what to do if it meets them (a 'target'). For example, you might want to drop all ICMP packets coming from the IP address 127.0.0.1. So, in this case, our conditions are that the protocol must be ICMP and that the source address must be 127.0.0.1. Our target is DROP.
127.0.0.1 is the 'loopback' interface, which you will have even if you have no real network connection. You can use the ping program to generate such packets (it simply sends an ICMP type 8 (echo request), which all cooperative hosts should obligingly respond to with an ICMP type 0 (echo reply) packet). This makes it useful for testing.
# ping -c 1 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.2 ms
--- 127.0.0.1 ping statistics ---
1 packet transmitted, 1 packet received, 0% packet loss
round-trip min/avg/max = 0.2/0.2/0.2 ms
# iptables -A INPUT -s 127.0.0.1 -p icmp -j DROP
# ping -c 1 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes
--- 127.0.0.1 ping statistics ---
1 packet transmitted, 0 packets received, 100% packet loss
You can see here that the first ping succeeds (the -c 1 tells ping to send only a single packet). Then we append (-A) to the INPUT chain a rule specifying that for packets from 127.0.0.1 (-s 127.0.0.1) with protocol ICMP (-p icmp) we should jump to DROP (-j DROP). Then we test our rule, using the second ping. There will be a pause before the program gives up waiting for a response that will never come.
We can delete the rule in one of two ways. Firstly, since we know that it is the only rule in the input chain, we can use a numbered delete, as in:
# iptables -D INPUT 1
To delete rule number 1 in the INPUT chain.
The second way is to mirror the -A command, but replacing the -A with -D. This is useful when you have a complex chain of rules and you don’t want to have to count them to figure out that it’s rule 37 that you want to get rid of. In this case, we would use:
# iptables -D INPUT -s 127.0.0.1 -p icmp -j DROP
The syntax of -D must have exactly the same options as the -A (or -I or -R) command. If there are multiple identical rules in the same chain, only the first will be deleted.
We have seen the use of -p to specify the protocol, and -s to specify a source address, but there are other options we can use to specify packet characteristics. What follows is an exhaustive compendium.
Specifying Source and Destination IP Addresses
Source (-s, --source or --src) and destination (-d, --destination or --dst) IP addresses can be specified in four ways. The most common way is to use the full name, such as localhost or www.linuxhq.com. The second way is to specify the IP address, such as 127.0.0.1.
The third and fourth ways allow the specification of a group of IP addresses, such as 199.95.207.0/24 or 199.95.207.0/255.255.255.0. These both specify any IP address from 199.95.207.0 to 199.95.207.255 inclusive; the digits after the '/' tell which parts of the IP address are significant. /32 or /255.255.255.255 is the default (match all of the IP address). To specify any IP address at all, /0 can be used, like so: [ NOTE: -s 0/0 is redundant here. ]
# iptables -A INPUT -s 0/0 -j DROP
This is rarely used, as the effect above is the same as not specifying the -s option at all.
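To make the meaning of the digits after the '/' concrete, here is a small sketch of prefix matching using shell arithmetic. The function names are hypothetical, the example network 192.168.1.0/24 is chosen only for illustration, and the sketch assumes the shell uses 64-bit arithmetic and that octets are written without leading zeros.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The body runs in a subshell so the IFS change stays local.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed (exit status 0) if address $1 lies inside network $2
# with prefix length $3, e.g. in_network 192.168.1.17 192.168.1.0 24
in_network() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    bits=$3
    # Keep only the top $bits bits; /0 keeps none, so everything matches.
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

if in_network 192.168.1.17 192.168.1.0 24; then
    echo "192.168.1.17 is inside 192.168.1.0/24"
fi
```

With a prefix length of 32 the mask covers the whole address, which corresponds to the default of matching a single host.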
Many flags, including the -s (or --source) and -d (--destination) flags, can have their arguments preceded by '!' (pronounced 'not') to match addresses NOT equal to the ones given. For example, -s ! localhost matches any packet not coming from localhost.
The protocol can be specified with the -p (or --protocol) flag. Protocol can be a number (if you know the numeric protocol values for IP) or a name for the special cases of TCP, UDP or ICMP. Case doesn’t matter, so tcp works as well as TCP.
The protocol name can be prefixed by a '!' to invert it, such as -p ! TCP to specify packets which are not TCP.
Specifying an Interface
The -i (or --in-interface) and -o (or --out-interface) options specify the name of an interface to match. An interface is a physical device the packet came in on (-i) or is going out on (-o). You can use the ifconfig command to list the interfaces which are 'up' (i.e., working at the moment).
Packets traversing the INPUT chain don’t have an output interface, so any rule using -o in this chain will never match. Similarly, packets traversing the OUTPUT chain don’t have an input interface, so any rule using -i in this chain will never match.
Only packets traversing the FORWARD chain have both an input and output interface.
It is perfectly legal to specify an interface that currently does not exist; the rule will not match anything until the interface comes up. This is extremely useful for dial-up PPP links (usually interface ppp0) and the like.
As a special case, an interface name ending with a '+' will match all interfaces (whether they currently exist or not) which begin with that string. For example, to specify a rule which matches all PPP interfaces, the -i ppp+ option would be used.
The interface name can be preceded by a '!' with spaces around it, to match a packet which does not match the specified interface(s), e.g. -i ! ppp+.
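The trailing '+' rule can be mimicked with an ordinary shell pattern. The following is only a sketch of the matching logic as described above, not the kernel's implementation, and the function name is hypothetical.

```shell
#!/bin/sh
# Succeed if interface name $2 matches the iptables-style pattern $1.
# A trailing '+' means "any interface whose name begins with this prefix".
iface_matches() {
    pattern=$1
    name=$2
    case "$pattern" in
        *+) prefix=${pattern%+}          # strip the trailing '+'
            case "$name" in
                "$prefix"*) return 0 ;;  # name starts with the prefix
                *) return 1 ;;
            esac ;;
        *)  [ "$pattern" = "$name" ] ;;  # no wildcard: exact match only
    esac
}

for ifname in ppp0 ppp1 eth0; do
    if iface_matches ppp+ "$ifname"; then
        echo "$ifname matches ppp+"
    fi
done
```

An inverted rule (with '!') would simply negate the result of this test.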
Sometimes a packet is too large to fit down a wire all at once. When this happens, the packet is divided into fragments and sent as multiple packets. The other end reassembles these fragments to reconstruct the whole packet.
The problem with fragments is that the initial fragment has the complete header fields (IP + TCP, UDP and ICMP) to examine, but subsequent packets only have a subset of the headers (IP without the additional protocol fields). Thus looking inside subsequent fragments for protocol headers (such as is done by the TCP, UDP and ICMP extensions) is not possible.
If you are doing connection tracking or NAT, then all fragments will get merged back together before they reach the packet filtering code, so you need never worry about fragments.
Please also note that the INPUT chain of the filter table (or any other table hooking into the NF_IP_LOCAL_IN hook) is traversed after defragmentation by the core IP stack.
Otherwise, it is important to understand how fragments get treated by the filtering rules. Any filtering rule that asks for information we don’t have will not match. This means that the first fragment is treated like any other packet. Second and further fragments won’t be. Thus a rule -p TCP --sport www (specifying a source port of www) will never match a fragment (other than the first fragment). Neither will the opposite rule -p TCP --sport ! www.
However, you can specify a rule specifically for second and further fragments, using the -f (or --fragment) flag. It is also legal to specify that a rule does not apply to second and further fragments, by preceding the -f with '!'.
Usually it is regarded as safe to let second and further fragments through since filtering will affect the first fragment, and thus prevent reassembly on the target host; however, bugs have been known to allow crashing of machines simply by sending fragments. Your call.
Note for network-heads: malformed packets (TCP, UDP and ICMP packets too short for the firewalling code to read the ports or ICMP code and type) are dropped when such examinations are attempted. So are TCP fragments starting at position 8.
As an example, the following rule will drop any fragments going to 192.168.1.1:
# iptables -A OUTPUT -f -d 192.168.1.1 -j DROP
Extensions to iptables: New Matches
iptables is extensible, meaning that both the kernel and the iptables tool can be extended to provide new features.
Some of these extensions are standard, and others are more exotic. Extensions can be made by other people and distributed separately for niche users.
Kernel extensions normally live in the kernel module subdirectory, such as /lib/modules/2.4.0-test10/kernel/net/ipv4/netfilter. They are demand loaded if your kernel was compiled with CONFIG_KMOD set, so you should not need to manually insert them.
Extensions to the iptables program are shared libraries which usually live in /usr/local/lib/iptables/, although a distribution would put them in /lib/iptables or /usr/lib/iptables.
Extensions come in two types: new targets, and new matches (we’ll talk about new targets a little later). Some protocols automatically offer new tests: currently, these are TCP, UDP and ICMP as shown below.
For these, you will be able to specify the new tests on the command line after the -p option, which will load the extension. For explicit new tests, use the -m option to load the extension, after which the extended options will be available.
To get help on an extension, use the option to load it (-p, -j or -m) followed by -h or --help, e.g.:
# iptables -p tcp --help
The TCP extensions are automatically loaded if -p tcp is specified. They provide the following options (none of which match fragments).
--tcp-flags: Followed by an optional '!', then two strings of flags, allows you to filter on specific TCP flags. The first string of flags is the mask: a list of flags you want to examine. The second string of flags tells which one(s) should be set. For example,
# iptables -A INPUT --protocol tcp --tcp-flags ALL SYN,ACK -j DROP
This indicates that all flags should be examined (ALL is synonymous with SYN,ACK,FIN,RST,URG,PSH), but only SYN and ACK should be set. There is also an argument NONE meaning no flags.
--syn: Optionally preceded by a '!', this is shorthand for --tcp-flags SYN,RST,ACK SYN.
--source-port: Followed by an optional '!', then either a single TCP port, or a range of ports. Ports can be port names, as listed in /etc/services, or numeric. Ranges are either two port names separated by a ':', or (to specify greater than or equal to a given port) a port with a ':' appended, or (to specify less than or equal to a given port) a port preceded by a ':'.
--sport: Synonymous with --source-port.
--destination-port and --dport: The same as above, only they specify the destination, rather than source, port to match.
--tcp-option: Followed by an optional '!' and a number, matches a packet with a TCP option equaling that number. A packet which does not have a complete TCP header is dropped automatically if an attempt is made to examine its TCP options.
An Explanation of TCP Flags
It is sometimes useful to allow TCP connections in one direction, but not the other. For example, you might want to allow connections to an external WWW server, but not connections from that server.
The naive approach would be to block TCP packets coming from the server. Unfortunately, TCP connections require packets going in both directions to work at all.
The solution is to block only the packets used to request a connection. These packets are called SYN packets (ok, technically they’re packets with the SYN flag set, and the RST and ACK flags cleared, but we call them SYN packets for short). By disallowing only these packets, we can stop attempted connections in their tracks.
The --syn flag is used for this: it is only valid for rules which specify TCP as their protocol. For example, to specify TCP connection attempts from 192.168.1.1:
-p TCP -s 192.168.1.1 --syn
This flag can be inverted by preceding it with a '!', which means every packet other than the connection initiation.
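The mask-and-compare logic behind the TCP flag tests (where --syn is shorthand for --tcp-flags SYN,RST,ACK SYN) can be sketched with bit arithmetic. The function names are hypothetical; the bit values are those of the flag bits in the TCP header.

```shell
#!/bin/sh
# Bit values of the six TCP flags, as laid out in the TCP header.
FIN=1; SYN=2; RST=4; PSH=8; ACK=16; URG=32

# Succeed if, among the flags selected by mask $2, exactly the flags
# in $3 are set in the packet's flag bits $1 (all three are integers).
tcp_flags_match() {
    [ $(( $1 & $2 )) -eq "$3" ]
}

# --syn examines SYN, RST and ACK, and requires only SYN to be set.
is_syn() {
    tcp_flags_match "$1" $(( SYN | RST | ACK )) "$SYN"
}

if is_syn "$SYN"; then
    echo "plain SYN: connection request"
fi
if ! is_syn $(( SYN | ACK )); then
    echo "SYN+ACK: reply, not a connection request"
fi
```

Flags outside the mask are ignored, which is why a packet with SYN and PSH set would still pass the is_syn test in this sketch.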
The UDP extensions are automatically loaded if -p udp is specified. They provide the options --source-port, --sport, --destination-port and --dport as detailed for TCP above.
The ICMP extension is automatically loaded if -p icmp is specified. It provides only one new option:
--icmp-type: Followed by an optional '!', then either an ICMP type name (e.g. host-unreachable), or a numeric type (e.g. 3), or a numeric type and code separated by a '/' (e.g. 3/3). A list of available ICMP type names is given using -p icmp --help.
Other Match Extensions
The other extensions in the netfilter package are demonstration extensions, which (if installed) can be invoked with the -m option.
mac: This module must be explicitly specified with -m mac or --match mac. It is used for matching incoming packets’ source Ethernet (MAC) address, and thus only useful for packets traversing the PREROUTING and INPUT chains. It provides only one option:
--mac-source: Followed by an optional '!', then an Ethernet address in colon-separated hex byte notation, e.g. --mac-source 00:60:08:91:CC:B7.
limit: This module must be explicitly specified with -m limit or --match limit. It is used to restrict the rate of matches, such as for suppressing log messages. It will only match a given number of times per second (by default 3 matches per hour, with a burst of 5). It takes two optional arguments:
--limit: Followed by a number; specifies the maximum average number of matches to allow per second. The number can specify units explicitly, using /second, /minute, /hour or /day, or parts of them (so 5/second is the same as 5/s).
--limit-burst: Followed by a number, indicating the maximum burst before the above limit kicks in.
This match can often be used with the LOG target to do rate-limited logging. To understand how it works, let’s look at the following rule, which logs packets with the default limit parameters:
# iptables -A FORWARD -m limit -j LOG
The first time this rule is reached, the packet will be logged; in fact, since the default burst is 5, the first five packets will be logged. After this, it will be twenty minutes before a packet will be logged from this rule, regardless of how many packets reach it. Also, every twenty minutes which passes without matching a packet, one of the burst will be regained; if no packets hit the rule for 100 minutes, the burst will be fully recharged; back where we started.
Note: you cannot currently create a rule with a recharge time greater than about 59 hours, so if you set an average rate of one per day, then your burst rate must be less than 3.
You can also use this module to avoid various denial of service attacks (DoS) with a faster rate to increase responsiveness.
Syn-flood protection:
# iptables -A FORWARD -p tcp --syn -m limit --limit 1/s -j ACCEPT
Furtive port scanner:
# iptables -A FORWARD -p tcp --tcp-flags SYN,ACK,FIN,RST RST -m limit --limit 1/s -j ACCEPT
Ping of death:
# iptables -A FORWARD -p icmp --icmp-type echo-request -m limit --limit 1/s -j ACCEPT
This module works like a hysteresis door.
[Graph: packet rate against time (s). The rule matches while the rate is below the edge of DoS (= limit * limit-burst), stops matching during the DoS, and matches again once the rate has fallen back to the end of DoS (= limit). LOGIC => Match | Didn’t Match | Match]
Say we match one packet per second with a five packet burst, but packets start coming in at four per second, for three seconds, then start again in another three seconds.
[Graph: total packets against time (0 to 12 seconds) for Flood 1 and Flood 2, plotted against the maximum rate line. Key: Y -> Matched Rule, N -> Didn’t Match Rule]
You can see that the first five packets are allowed to exceed the one packet per second, then the limiting kicks in. If there is a pause, another burst is allowed but not past the maximum rate set by the rule (1 packet per second after the burst is used).
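The scenario above can be replayed with a small token-bucket simulation. This is a simplified model written for illustration, not the kernel's code: time advances in quarter-second ticks, credit is counted in quarter-tokens, and the variable and function names are invented for the sketch.

```shell
#!/bin/sh
# Token-bucket sketch of a limit of 1/s with a burst of 5.
# One tick (a quarter second) refills 1 unit of credit;
# one matched packet costs 4 units (a whole token).
COST=4                  # units spent per matching packet
CAP=$(( 5 * COST ))     # bucket size: a burst of 5 packets

simulate() {
    credit=$CAP         # the bucket starts full
    tick=0
    out=""
    while [ "$tick" -lt 36 ]; do
        # Refill one unit per tick, never beyond the cap.
        if [ "$tick" -gt 0 ] && [ "$credit" -lt "$CAP" ]; then
            credit=$(( credit + 1 ))
        fi
        # Floods: ticks 0-11 and 24-35 (4 packets/s for 3 s each).
        if [ "$tick" -lt 12 ] || [ "$tick" -ge 24 ]; then
            if [ "$credit" -ge "$COST" ]; then
                credit=$(( credit - COST ))
                out="${out}Y"       # packet matched the rule
            else
                out="${out}N"       # not enough credit: no match
            fi
        fi
        # Put a space between the two floods in the output.
        if [ "$tick" -eq 11 ]; then out="${out} "; fi
        tick=$(( tick + 1 ))
    done
    echo "$out"
}

simulate    # prints: YYYYYYNNYNNN YYYYYNNNYNNN
```

In this model the first flood gets one extra early match because credit keeps trickling in while the burst is being spent, and the second flood starts with a bucket that has only partly recharged during the three-second pause.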
owner: This module attempts to match various characteristics of the packet creator, for locally-generated packets. It is only valid in the OUTPUT chain, and even then some packets (such as ICMP ping responses) may have no owner, and hence never match.
--uid-owner: Matches if the packet was created by a process with the given effective (numerical) user id.
--gid-owner: Matches if the packet was created by a process with the given effective (numerical) group id.
--pid-owner: Matches if the packet was created by a process with the given process id.
--sid-owner: Matches if the packet was created by a process in the given session group.
unclean: This experimental module must be explicitly specified with -m unclean or --match unclean. It does various random sanity checks on packets. This module has not been audited, and should not be used as a security device (it probably makes things worse, since it may well have bugs itself). It provides no options.
The State Match
The most useful match criterion is supplied by the state extension, which interprets the connection-tracking analysis of the ip_conntrack module. This is highly recommended.
Specifying -m state allows an additional --state option, which is a comma-separated list of states to match (the '!' flag indicates not to match those states). These states are:
NEW: A packet which creates a new connection.
ESTABLISHED: A packet which belongs to an existing connection (i.e., a reply packet, or outgoing packet on a connection which has seen replies).
RELATED: A packet which is related to, but not part of, an existing connection, such as an ICMP error, or (with the FTP module inserted) a packet establishing an FTP data connection.
INVALID: A packet which could not be identified for some reason: this includes running out of memory and ICMP errors which don’t correspond to any known connection. Generally, these packets should be dropped.
An example of this powerful match extension would be:
# iptables -A FORWARD -i ppp0 -m state ! --state NEW -j DROP
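As a further sketch, the states are often combined into a small stateful ruleset. The rule selection below is an assumption made for illustration (it is not taken from the example above), ppp0 stands for the outside interface, and the ipt wrapper is a hypothetical helper that only prints the commands.

```shell
#!/bin/sh
# Dry-run wrapper: prints each command instead of running it.
# To apply the rules for real (as root), change the body to: iptables "$@"
ipt() { echo "iptables $*"; }

stateful_forward() {
    # Let replies and related traffic (e.g. ICMP errors) through.
    ipt -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT
    # Packets that connection tracking could not identify.
    ipt -A FORWARD -m state --state INVALID -j DROP
    # Refuse new connections arriving from the outside interface.
    ipt -A FORWARD -i ppp0 -m state --state NEW -j DROP
}

stateful_forward
```

Rule order matters here: the ESTABLISHED,RELATED rule comes first so that the bulk of the traffic is decided by a single test.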
Now we know what examinations we can do on a packet, we need a way of saying what to do to the packets which match our tests. This is called a rule’s target.
There are two very simple built-in targets: DROP and ACCEPT. We’ve already met them. If a rule matches a packet and its target is one of these two, no further rules are consulted: the packet’s fate has been decided.
There are two types of targets other than the built-in ones: extensions and user-defined chains.
One powerful feature which iptables inherits from ipchains is the ability for the user to create new chains, in addition to the three built-in ones (INPUT, FORWARD and OUTPUT). By convention, user-defined chains are lower-case to distinguish them (we’ll describe how to create new user-defined chains below in Operations on an Entire Chain).
When a packet matches a rule whose target is a user-defined chain, the packet begins traversing the rules in that user-defined chain. If that chain doesn’t decide the fate of the packet, then once traversal on that chain has finished, traversal resumes on the next rule in the current chain.
Time for more ASCII art. Consider two (silly) chains: INPUT (the built-in chain) and test (a user-defined chain).
INPUT                          test
| Rule1: -p ICMP -j DROP |     | Rule1: -s 192.168.1.1 |
| Rule2: -p TCP -j test  |     | Rule2: -d 192.168.1.1 |
| Rule3: -p UDP -j DROP  |
Consider a TCP packet coming from 192.168.1.1, going to 22.214.171.124. It enters the INPUT chain and gets tested against Rule1: no match. Rule2 matches, and its target is test, so the next rule examined is the start of test. Rule1 in test matches, but doesn’t specify a target, so the next rule is examined, Rule2. This doesn’t match, so we have reached the end of the chain. We return to the INPUT chain, where we had just examined Rule2, so we now examine Rule3, which doesn’t match either.
So the packet path is:
INPUT Rule1 -> INPUT Rule2 -> test Rule1 -> test Rule2 -> INPUT Rule3
User-defined chains can jump to other user-defined chains (but don’t make loops: your packets will be dropped if they’re found to be in a loop).
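The two example chains (INPUT and test) could be built with commands along the following lines. This is a sketch: the ipt wrapper and the function name are hypothetical, and the wrapper only prints the commands so that nothing is changed on the running system.

```shell
#!/bin/sh
# Dry-run wrapper: prints each command instead of running it.
# To apply the rules for real (as root), change the body to: iptables "$@"
ipt() { echo "iptables $*"; }

build_example_chains() {
    ipt -N test                     # create the user-defined chain
    ipt -A test -s 192.168.1.1      # test Rule1: no target, only counts packets
    ipt -A test -d 192.168.1.1      # test Rule2: no target, only counts packets
    ipt -A INPUT -p icmp -j DROP    # INPUT Rule1
    ipt -A INPUT -p tcp -j test     # INPUT Rule2: jump to the test chain
    ipt -A INPUT -p udp -j DROP     # INPUT Rule3
}

build_example_chains
```

The test chain is created before the INPUT rule that jumps to it, since a rule cannot name a chain that does not exist yet.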
Extensions to iptables: New Targets
The other type of extension is a target. A target extension consists of a kernel module, and an optional extension to iptables to provide new command-line options. There are several extensions in the default Netfilter distribution:
LOG: This module provides kernel logging of matching packets. It provides these additional options:
--log-level: Followed by a level number or name. Valid names are (case-insensitive) debug, info, notice, warning, err, crit, alert and emerg, corresponding to numbers 7 through 0. See the man page for syslog.conf for an explanation of these levels. The default is warning.
--log-prefix: Followed by a string of up to 29 characters, this message is sent at the start of the log message, to allow it to be uniquely identified.
This module is most useful after a limit match, so you don’t flood your logs.
REJECT: This module has the same effect as DROP, except that the sender is sent an ICMP 'port unreachable' error message. Note that the ICMP error message is not sent if (see RFC 1122):
* The packet being filtered was an ICMP error message in the first place, or some unknown ICMP type.
* The packet being filtered was a non-head fragment.
* We’ve sent too many ICMP error messages to that destination recently (see /proc/sys/net/ipv4/icmp_ratelimit).
REJECT also takes a --reject-with optional argument which alters the reply packet used: see the manual page.
Special Built-In Targets
There are two special built-in targets: RETURN and QUEUE.
RETURN has the same effect as falling off the end of a chain: for a rule in a built-in chain, the policy of the chain is executed. For a rule in a user-defined chain, the traversal continues at the previous chain, just after the rule which jumped to this chain.
QUEUE is a special target, which queues the packet for userspace processing. For this to be useful, two further components are required:
* a queue handler, which deals with the actual mechanics of passing packets between the kernel and userspace; and
* a userspace application to receive, possibly manipulate and issue verdicts on packets.
The standard queue handler for IPv4 iptables is the ip_queue module, which is distributed with the kernel and marked as experimental.
The following is a quick example of how to use iptables to queue packets for userspace processing:
# modprobe iptable_filter
# modprobe ip_queue
# iptables -A OUTPUT -p icmp -j QUEUE
With this rule, locally generated outgoing ICMP packets (as created with, say, ping) are passed to the ip_queue module, which then attempts to deliver the packets to a userspace application. If no userspace application is waiting, the packets are dropped.
To write a userspace application, use the libipq API. This is distributed with iptables. Example code may be found in the testsuite tools (e.g. redirect.c) in CVS.
The status of ip_queue may be checked via:
The maximum length of the queue (i.e. the number of packets delivered to userspace with no verdict issued back) may be controlled via:
The default value for the maximum queue length is 1024. Once this limit is reached, new packets will be dropped until the length of the queue falls below the limit again. Nice protocols such as TCP interpret dropped packets as congestion, and will hopefully back off when the queue fills up. However, it may take some experimenting to determine an ideal maximum queue length for a given situation if the default value is too small.
Operations on an Entire Chain
A very useful feature of iptables is the ability to group related rules into chains. You can call the chains whatever you want, but I recommend using lower-case letters to avoid confusion with the built-in chains and targets. Chain names can be up to 31 letters long.
Creating a New Chain
Let us create a new chain. Because I am such an imaginative fellow, I’ll call it test. We use the -N (or --new-chain) option:
# iptables -N test
It is that simple. Now you can put rules in it as detailed above.
Deleting a Chain
Deleting a chain is simple as well, using the -X (or --delete-chain) option. Why -X? Well, all the good letters were taken.
# iptables -X test
There are a couple of restrictions to deleting chains: they must be empty (see Flushing a Chain below) and they must not be the target of any rule. You can’t delete any of the three built-in chains.
If you don’t specify a chain, then all user-defined chains will be deleted, if possible.
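The deletion restrictions can be written down as a short check. The Python sketch below models a ruleset as a dictionary from chain name to the list of targets used by its rules; that representation and the function name are my own assumptions for illustration, not anything iptables exposes.

```python
BUILTIN_CHAINS = ("INPUT", "OUTPUT", "FORWARD")

def deletion_error(chain, targets_by_chain):
    """Return why `chain` cannot be deleted, or None if it can.

    targets_by_chain maps each chain name to the targets of its rules.
    """
    if chain in BUILTIN_CHAINS:
        return "built-in chain"
    if targets_by_chain[chain]:
        return "chain is not empty"
    # Count the rules anywhere in the ruleset that jump to this chain.
    references = sum(
        target == chain
        for targets in targets_by_chain.values()
        for target in targets
    )
    if references:
        return f"referenced by {references} rule(s)"
    return None

ruleset = {
    "INPUT": ["test", "ACCEPT"],  # first INPUT rule jumps to "test"
    "OUTPUT": [],
    "FORWARD": [],
    "test": [],
    "unused": [],
}
print(deletion_error("test", ruleset))
print(deletion_error("unused", ruleset))
print(deletion_error("INPUT", ruleset))
```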
Flushing a Chain
There is a simple way of emptying all rules out of a chain, using the -F (or --flush) command.
# iptables -F FORWARD
If you don’t specify a chain, then all chains will be flushed.
Listing a Chain
You can list all the rules in a chain by using the -L (or --list) command.
The refcnt listed for each user-defined chain is the number of rules which have that chain as their target. This must be zero (and the chain must be empty) before this chain can be deleted.
If the chain name is omitted, all chains are listed, even empty ones.
There are three options which can accompany -L. The -n (numeric) option is very useful as it prevents iptables from trying to look up the IP addresses, which (if you are using DNS like most people) will cause large delays if your DNS is not set up properly, or you have filtered out DNS requests. It also causes TCP and UDP ports to be printed out as numbers rather than names.
The -v option shows you all the details of the rules, such as the packet and byte counters, the TOS comparisons, and the interfaces. Otherwise, these values are omitted.
Note that the packet and byte counters are printed out using the suffixes K, M or G for 1000, 1,000,000 and 1,000,000,000 respectively. Using the -x (expand numbers) flag as well prints the full numbers, no matter how large they are.
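As a rough illustration of the suffixes, the following Python function abbreviates a counter with K, M or G, and prints it in full when asked to, in the spirit of -x. It is a simplified sketch: iptables applies its own thresholds and rounding when deciding how to print a counter, which this function does not try to reproduce, and the function name is hypothetical.

```python
def abbreviate(count, exact=False):
    """Abbreviate a counter with K/M/G; exact=True mimics the idea of -x."""
    if exact:
        return str(count)
    for suffix, unit in (("G", 10**9), ("M", 10**6), ("K", 10**3)):
        if count >= unit:
            # Truncating division: a simplification of iptables' rounding.
            return f"{count // unit}{suffix}"
    return str(count)

for value in (999, 1234, 5_300_000, 2_500_000_000):
    print(value, abbreviate(value), abbreviate(value, exact=True))
```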
Resetting (Zeroing) Counters
It is useful to be able to reset the counters. This can be done with the -Z (or --zero) option.
Consider the following:
# iptables -L FORWARD
# iptables -Z FORWARD
In the above example, some packets could pass through between the -L and -Z commands. For this reason, you can use -L and -Z together, to reset the counters while reading them.
Other Configuration Files
/etc/sysctl.conf: Contains settings for the files under the /proc/sys directory that are applied at boot time. For example, /proc/sys/net/ipv4/ip_forward can be set to 1 at boot time by adding the entry net.ipv4.ip_forward=1 to this file.
/proc/net/ip_conntrack: Dumps the contents of the connection tracking structures when you read it.
/proc/sys/net/ipv4/ip_conntrack_max: Controls the size of the connection tracking table in the kernel. The default value is calculated from the amount of RAM in your computer. You may need to increase it if you are getting "ip_conntrack: table full, dropping packet" errors in your log files. See also the entry for /etc/sysctl.conf in this list.
/proc/sys/net/ipv4/ip_forward: You need to set this to 1 for the host to act as a gateway (forwarding packets among the networks connected to its interfaces). See also the entry for /etc/sysctl.conf.
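The relationship between an /etc/sysctl.conf entry such as net.ipv4.ip_forward=1 and the file /proc/sys/net/ipv4/ip_forward is a simple name mapping: the dots become slashes under /proc/sys. The Python sketch below parses sysctl.conf-style text and performs that mapping. The function names are hypothetical, the sample text is invented, and the parser deliberately handles only plain key = value lines.

```python
def sysctl_to_proc_path(key):
    """Map a sysctl key such as net.ipv4.ip_forward to its /proc/sys path."""
    return "/proc/sys/" + key.replace(".", "/")

def parse_sysctl_conf(text):
    """Parse 'key = value' lines, skipping blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings

sample = """
# Enable packet forwarding between interfaces
net.ipv4.ip_forward = 1
"""
settings = parse_sysctl_conf(sample)
paths = {sysctl_to_proc_path(k): v for k, v in settings.items()}
print(paths)
```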
7.0 Compile your own kernel
On Red Hat machines you can determine the kernel you are currently running from the output of the uname -r command, which prints a string such as 2.4.20-20.9.
Using your kernel version and your machine type (which can be determined from the output of uname -a), you can find the most appropriate configuration file for building your new kernel. It has a name such as /usr/src/linux-<version>/configs/kernel-2.4.20-i686.config.
The following configuration options must be selected at a minimum:
1. CONFIG_PACKET (direct communication with network interfaces)
2. CONFIG_NETFILTER (the basic kernel support required by the iptables)
3. CONFIG_IP_NF_CONNTRACK (required for NAT and masquerading)
4. CONFIG_IP_NF_FILTER (adds the filter table)
5. CONFIG_IP_NF_IPTABLES (the basic support for user space iptables utility)
6. CONFIG_IP_NF_MANGLE (adds the mangle table)
7. CONFIG_IP_NF_NAT (adds the nat table)
The iptables configuration settings are found in entries with names like CONFIG_IP_NF_*.
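Checking a kernel configuration file for these options is easy to automate. The Python sketch below reads the text of a config file and reports which of the seven options are neither built in (=y) nor built as a module (=m). It uses the kernel's spelling CONFIG_IP_NF_CONNTRACK; the function names and the sample text are made up for illustration.

```python
REQUIRED = [
    "CONFIG_PACKET",
    "CONFIG_NETFILTER",
    "CONFIG_IP_NF_CONNTRACK",
    "CONFIG_IP_NF_FILTER",
    "CONFIG_IP_NF_IPTABLES",
    "CONFIG_IP_NF_MANGLE",
    "CONFIG_IP_NF_NAT",
]

def enabled_options(config_text):
    """Return the option names set to y (built in) or m (module)."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_X is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(name)
    return enabled

def missing_options(config_text, required=REQUIRED):
    have = enabled_options(config_text)
    return [name for name in required if name not in have]

sample_config = """\
CONFIG_PACKET=y
CONFIG_NETFILTER=y
CONFIG_IP_NF_CONNTRACK=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_FILTER=m
# CONFIG_IP_NF_NAT is not set
"""
missing = missing_options(sample_config)
print(missing)
```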
8.0 Connection Tracking
iptables associates packets with the logical connections they belong to (it even treats certain UDP communication patterns as connections, even though UDP is a connectionless protocol). In order to do this, it tracks the progress of connections through their lifecycle, and this tracking information is made available through the conntrack match extension.
Although the underlying TCP connection state model is more complicated, the connection tracking logic assigns one of the states listed below to each connection at any point in time.
Connection tracking states
ESTABLISHED: The connection has already seen packets going in both directions.
INVALID: The packet doesn’t belong to any tracked connection.
NEW: The packet is starting a new connection or is part of a connection that hasn’t yet seen packets in both directions.
RELATED: The packet is starting a new connection, but the new connection is related to an existing connection.
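The difference between NEW and ESTABLISHED can be shown with a very small tracker. This Python sketch keys each flow on its pair of endpoints and switches to ESTABLISHED once a packet has been seen in the reply direction. It is a simplification under stated assumptions: it ignores protocols, timeouts, TCP flags, and the RELATED and INVALID states entirely, and the class name is hypothetical.

```python
class Tracker:
    """Toy connection tracker that distinguishes NEW from ESTABLISHED."""

    def __init__(self):
        # flow key -> {"origin": (src, dst), "seen_reply": bool}
        self.flows = {}

    def classify(self, src, dst):
        """src and dst are (address, port) tuples for one packet."""
        key = frozenset((src, dst))  # same key for both directions
        flow = self.flows.get(key)
        if flow is None:
            self.flows[key] = {"origin": (src, dst), "seen_reply": False}
            return "NEW"
        if (src, dst) != flow["origin"]:
            flow["seen_reply"] = True  # traffic in the reply direction
        return "ESTABLISHED" if flow["seen_reply"] else "NEW"

client = ("10.0.0.5", 51000)
server = ("198.51.100.7", 80)
tracker = Tracker()
states = [
    tracker.classify(client, server),  # first packet
    tracker.classify(client, server),  # still no reply seen
    tracker.classify(server, client),  # reply arrives
    tracker.classify(client, server),  # both directions seen
]
print(states)
```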
The connection tracking logic maintains three bits of status information associated with each connection. The list below gives these status codes as they are named in the conntrack match extension.
ASSURED: For TCP connections, indicates that the TCP connection setup has been completed. For UDP connections, indicates that the traffic looks like a UDP stream to the kernel.
EXPECTED: Indicates that the connection was expected.
SEEN_REPLY: Indicates that packets have been seen in both directions.
The iptables connection tracking logic allows plug-in modules to help identify new connections that are related to existing connections. You need to use these plug-ins if you want multi-connection protocols to work properly across your gateway/firewall. The list below gives the main connection tracking helper modules. To use one, run the modprobe command to load the kernel module.
Helper match modules
ip_conntrack_amanda: Amanda backup protocol (requires the CONFIG_IP_NF_AMANDA kernel config option)
ip_conntrack_ftp: File Transfer Protocol (requires the CONFIG_IP_NF_FTP kernel config option)
ip_conntrack_irc: Internet Relay Chat (requires the CONFIG_IP_NF_IRC kernel config option)
ip_conntrack_tftp: Trivial File Transfer Protocol (requires the CONFIG_IP_NF_TFTP kernel config option)
The kernel automatically tracks packet and byte counts for each rule. This information can be used to do accounting on network usage.
For example, if you add the following four rules to a machine serving as an Internet gateway (assuming two network interfaces: eth0 for the internal network and eth1 for the Internet connection), the kernel tracks the number of packets and bytes exchanged with the outside world.
iptables -A FORWARD -i eth1
iptables -A FORWARD -o eth1
iptables -A INPUT -i eth1
iptables -A OUTPUT -o eth1
After running these commands, iptables -L -v shows the counters for each rule. Note the counts for INPUT and OUTPUT: the nonzero counts indicate that some traffic had already traversed the chains by the time we displayed the counts.
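The accounting idea is that a rule with no target decides nothing and simply counts what it matches. The Python sketch below models four such counting rules, one per direction on eth1, and feeds them some invented traffic. The class, the packet sizes and the traffic list are all hypothetical; this only illustrates how the per-rule packet and byte counters add up.

```python
class CountingRule:
    """A rule with no target: it only counts the packets it matches."""

    def __init__(self, chain, in_if=None, out_if=None):
        self.chain, self.in_if, self.out_if = chain, in_if, out_if
        self.packets = 0
        self.bytes = 0

    def observe(self, chain, in_if, out_if, length):
        if chain != self.chain:
            return
        if self.in_if is not None and in_if != self.in_if:
            return
        if self.out_if is not None and out_if != self.out_if:
            return
        self.packets += 1
        self.bytes += length

rules = [
    CountingRule("FORWARD", in_if="eth1"),   # forwarded, from the Internet
    CountingRule("FORWARD", out_if="eth1"),  # forwarded, to the Internet
    CountingRule("INPUT", in_if="eth1"),     # to the gateway itself
    CountingRule("OUTPUT", out_if="eth1"),   # from the gateway itself
]

# (chain, incoming interface, outgoing interface, length in bytes)
traffic = [
    ("FORWARD", "eth0", "eth1", 1500),
    ("FORWARD", "eth1", "eth0", 40),
    ("INPUT", "eth1", None, 60),
    ("OUTPUT", None, "eth1", 60),
    ("FORWARD", "eth0", "eth1", 500),
]
for packet in traffic:
    for rule in rules:
        rule.observe(*packet)

totals = [(r.packets, r.bytes) for r in rules]
print(totals)
```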
Basic NAT is the process of translating, or mapping, one set of IP addresses (usually private) to another (usually public). It was originally developed to keep private addresses from being seen on public networks. It so happens that the largest public network, the Internet, virtually assured that NAT would be around for a long time to come.
Think of NAT as acting like a postal interpreter. It waits for a message to be started from someone it knows. It then translates the conversation to come from a different address (usually itself), then sends the message on its way. As the message reply comes back, NAT remembers the conversation, and applies the destination address of the person it knows, sending it back inside to that person. It can remember and process multiple messages at once. In real use, NAT translates all types of IP based traffic between two different address ranges.
There are many reasons why NAT is used. Some of those include the dwindling amount of IPv4 Internet addresses, corporate network security, misused IP addresses on corporate networks, and finally, cost savings on Internet connectivity. It is not my intention to explain all of these, nor is it my intention to give a dissertation on the subject. Rather, I will give a breakdown of how NAT works, and when you will most likely use it.
How does NAT work?
In its simplest form, NAT will translate a range of addresses to another range of addresses. Usually, one range is inside a corporate network and uses the private range approved for that application. Private address ranges are not normally routable over the internet and are usually filtered from ever leaving an internet border router. These ranges include:
* 10.X.X.X with any mask 255.0.0.0 or greater.
* 172.16.X.X through 172.31.X.X with any mask 255.240.0.0 or greater.
* 192.168.X.X with any mask 255.255.0.0 or greater.
Notice that these ranges include all three traditional classes of addresses, A, B, and C. Also note that you do not have to use them by their normal classes as long as your network equipment, and routing protocols will support that.
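Whether an address falls inside one of these private ranges can be checked with Python's standard ipaddress module. The sketch below spells the three ranges out in prefix notation (/8, /12 and /16 correspond to the masks 255.0.0.0, 255.240.0.0 and 255.255.0.0); the function name is my own.

```python
import ipaddress

# The three private ranges, written in prefix (CIDR) notation.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),      # mask 255.0.0.0
    ipaddress.ip_network("172.16.0.0/12"),   # mask 255.240.0.0
    ipaddress.ip_network("192.168.0.0/16"),  # mask 255.255.0.0
]

def in_private_range(address):
    """True if the IPv4 address lies in one of the three ranges above."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)

for address in ("10.1.2.3", "172.31.255.1", "172.32.0.1", "192.168.1.10", "8.8.8.8"):
    print(address, in_private_range(address))
```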
What do I mean by a greater mask? Let us take the case of 10.X.X.X. That’s Microsoft’s default network on most NT setups. There is absolutely no need to use the standard class A mask for that network. Doing so would limit you to one network number, in this case, 10, but allow you to have over 16.7 million hosts. That’s a little large for most networks. Instead, it is much more common to have something like 10.0.0.X (255.255.255.0) at your home office, and maybe 10.0.1.X (255.255.255.0) at your remote office. By doing this, you have plenty of room for future networks, with 254 usable host entries per network. So my explanation of a greater mask would be anything greater than the value I listed. For more information on IP addressing, see the IP addressing and Subnets article.
Splitting up the addresses without heed to class is known as classless addressing. It is quite common and most modern networking equipment and OSs will support it.
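The host counts quoted above are easy to verify with Python's ipaddress module. This sketch subtracts the network and broadcast addresses from the size of each network, which is the usual convention for ordinary subnets (it would not be right for very small networks such as a /31 or /32). The function name is hypothetical.

```python
import ipaddress

def usable_hosts(network):
    """Addresses in the network, minus the network and broadcast addresses."""
    return ipaddress.ip_network(network).num_addresses - 2

class_a_hosts = usable_hosts("10.0.0.0/255.0.0.0")     # the whole class A
office_hosts = usable_hosts("10.0.0.0/255.255.255.0")  # one office subnet

# Both office subnets sit inside the same private class A range.
parent = ipaddress.ip_network("10.0.0.0/8")
home_office = ipaddress.ip_network("10.0.0.0/24")
remote_office = ipaddress.ip_network("10.0.1.0/24")
both_inside = home_office.subnet_of(parent) and remote_office.subnet_of(parent)

print(class_a_hosts, office_hosts, both_inside)
```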
So, NAT takes care of the translation from private addresses to public addresses in order for the traffic to be routable on the Internet. As the reply comes back, it is matched against the translation table held by the router, and the information is passed inside to the machine that originally requested it. NAT can translate any two address ranges back and forth, but from private to public is the most common.
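The table the router keeps can be modelled in a few lines of Python. The sketch below translates many private hosts onto one public address by giving each private (address, port) pair its own public port, then uses the same table to send replies back inside. It is a simplification under stated assumptions: a real NAT also keys on the protocol and the remote endpoint and expires old entries. The class name, the starting port and the addresses (taken from the documentation ranges) are all made up.

```python
class Translator:
    """Toy many-to-one NAT table (hypothetical names and port numbers)."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map = {}  # (private_ip, private_port) -> public_port
        self.in_map = {}   # public_port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the source of a packet leaving the private network."""
        key = (src_ip, src_port)
        if key not in self.out_map:
            self.out_map[key] = self.next_port
            self.in_map[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.out_map[key], dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_ip, dst_port):
        """Restore the private destination of a reply, or None if unknown."""
        original = self.in_map.get(dst_port)
        if dst_ip != self.public_ip or original is None:
            return None
        return (src_ip, src_port, original[0], original[1])

nat = Translator("203.0.113.1")
sent = nat.outbound("10.0.0.5", 51000, "198.51.100.7", 80)
reply = nat.inbound("198.51.100.7", 80, "203.0.113.1", 40000)
stray = nat.inbound("198.51.100.7", 80, "203.0.113.1", 40001)
print(sent, reply, stray)
```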
Source NAT and Masquerading
IP masquerading is a form of network address translation (NAT) which allows internal computers with no known address outside their network, to communicate to the outside. It allows one machine to act on behalf of other machines. It is similar to someone buying stocks through a broker (without considering the monetary transaction). The person buying stocks, tells the broker to buy the stocks, the broker gets the stocks and passes them to the person who made the purchase. The broker acts on behalf of the stock purchaser as though he was the one buying the stock. No one who sold the stock knew or cared about whether the broker was buying for himself or someone else.
Please DO NOT confuse routers with firewalls and the performance of IP masquerading. The commands that allow IP masquerading are a simple form of a firewall, however routing is a completely different function, as described previously. Setting a computer up to act as a router is completely different than setting up a computer to act as a firewall. Although the two functions are similar in that the router or firewall will act as a communication mechanism between two networks or subnets, the similarity ends there. A computer can be either a router or a firewall, but not both. If you set up a computer to act as both a router and a firewall, you have defeated the purpose of your firewall!
NAT is divided into two types: Source NAT (SNAT) and Destination NAT (DNAT).
Source NAT is when you alter the source address of the first packet: i.e. you are changing where the connection is coming from. Source NAT is always done post-routing, just before the packet goes out onto the wire. Masquerading is a specialized form of SNAT.
Destination NAT is when you alter the destination address of the first packet: i.e. you are changing where the connection is going to. Destination NAT is always done before routing when the packet first comes off the wire. Port forwarding, load sharing, and transparent proxying are all forms of DNAT.
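Port forwarding, the simplest form of DNAT, amounts to rewriting the destination of a packet before the routing decision is made. The Python sketch below shows that rewrite with a small forwarding table. The table contents, the packet fields and the function name are hypothetical, and only the first packet's rewrite is modelled, not the tracking of the rest of the connection.

```python
# Public port on the gateway -> (internal address, internal port).
# These entries are invented for the example.
FORWARDS = {
    80: ("10.0.0.10", 8080),
    25: ("10.0.0.20", 25),
}

def dnat(packet, forwards=FORWARDS):
    """Return a copy of the packet with its destination rewritten, if forwarded."""
    target = forwards.get(packet["dport"])
    if target is None:
        return packet  # no forwarding entry: leave the packet alone
    rewritten = dict(packet)
    rewritten["dst"], rewritten["dport"] = target
    return rewritten

web_request = {"src": "198.51.100.7", "sport": 40123, "dst": "203.0.113.1", "dport": 80}
ssh_request = {"src": "198.51.100.7", "sport": 40124, "dst": "203.0.113.1", "dport": 22}

forwarded = dnat(web_request)
untouched = dnat(ssh_request)
print(forwarded)
print(untouched)
```

Because the rewrite happens before routing, the routing decision sees the internal address and sends the packet on to the internal host.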
Article Authored by Anoopkiran
Author, Anoopkiran, is a Sr.Systems Engineer with SupportPRO. Anoopkiran specializes in Linux servers Administration. SupportPRO offers 24X7 technical support services to Web hosting companies and service providers.