
Netstat: network analysis and troubleshooting, explained

The netstat command gives you a set of tools to answer the question “What in blazes is going on on my network?” when things go wrong. To use it effectively when that happens, however, it pays to learn how it works right now, so you’ll be prepared. Besides, it never hurts to understand your network just a little better. Read on to find out exactly what netstat is, what you can use it for and how it can help you solve problems and understand your network.

A bundle of network tools
The netstat command doesn’t really do unique things. It can print network statistics, but ifconfig can do so, too. It can print routing tables, but route can do that, too. It can print open connections, but lsof does that, and more. So why use netstat at all? There are two main reasons:

  • netstat bundles a few often-used network analysis actions in a single command and
  • netstat is multi-platform.

That’s right, netstat is there on Windows and Mac, too, and with more or less the same syntax. Learn once, use anywhere. That comes in pretty handy when you’re troubleshooting a network with machines running different operating systems. So, without further ado, let’s dive into netstat’s main functionalities.

Printing network connections
Using netstat, you can list the network connections that currently exist between your machine and other machines, as well as sockets LISTENing for connections from other machines. It can show you which programs are active on your network right now. Have an example:

> sudo netstat -apA inet
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 localhost:46178         *:*                     LISTEN      2484/GoogleTalkPlug
tcp        0      0 localhost:41093         *:*                     LISTEN      2484/GoogleTalkPlug
tcp        0      0 *:ssh                   *:*                     LISTEN      1894/sshd
tcp        0      0 localhost:ipp           *:*                     LISTEN      1226/cupsd
tcp        0      0 *:17500                 *:*                     LISTEN      1366/dropbox
tcp        0      0 Trafalgar.local:54744   wi-in-f17.1e100.n:https ESTABLISHED 1792/firefox
tcp        0      0 localhost:41093         localhost:45741         ESTABLISHED 2484/GoogleTalkPlug
tcp       38      0 Trafalgar.local:32808   v-client-5b.sjc.d:https CLOSE_WAIT  1366/dropbox
tcp        0      0 Trafalgar.local:40998   sjc-not6.sjc.dropbo:www ESTABLISHED 1366/dropbox
tcp        0      0 Trafalgar.local:34354   192.168.1.200:2022      ESTABLISHED 1499/ssh
tcp        0      0 localhost:45741         localhost:41093         ESTABLISHED 2481/plugin-contain
tcp        0      0 Trafalgar.local:34351   192.168.1.200:2022      ESTABLISHED 1349/ssh
tcp       38      0 Trafalgar.local:39336   v-d-2b.sjc.dropbo:https CLOSE_WAIT  1366/dropbox
udp        0      0 *:49678                 *:*                                 642/avahi-daemon: r
udp        0      0 *:bootpc                *:*                                 932/dhclient3
udp        0      0 *:bootpc                *:*                                 1854/dhclient3
udp        0      0 *:17500                 *:*                                 1366/dropbox
udp        0      0 *:mdns                  *:*                                 642/avahi-daemon: r

Now that’s a lot of information! Let me explain some of that, starting with what the columns stand for:

  • The “Proto” column tells us if the socket listed is TCP or UDP. Those are network protocols. TCP makes reliable connections but slows down dramatically if the network quality is bad. UDP stays fast but may lose a few packets or deliver them in the wrong order. TCP connections are used for browsing the web and downloading files. UDP connections are used by certain fast-paced computer games and sometimes by live streams.
  • The “Recv-Q” and “Send-Q” columns tell us how much data is in the queue for that socket, waiting to be read (Recv-Q) or sent (Send-Q). In short: if this is 0, everything’s ok, if there are non-zero values anywhere, there may be trouble. If you look closely at the example, you’ll see that two sockets have a Recv-Q with 38 unread bytes in them. We’ll look into those connections once we know what the other columns mean.
  • The “Local Address” and “Foreign Address” columns tell us to which hosts and ports the listed sockets are connected. The local end is always on the computer on which you’re running netstat (in the example, the computer is called “Trafalgar”), and the foreign end is on the other computer (which could be somewhere in the local network or somewhere on the internet). If you look closely at the example, you’ll see that two sockets have localhost as the Foreign Address. Strange, right? It means the computer is talking to itself over the network, so to speak. We’ll look into the meaning of that once we know what all the columns mean.
  • The “State” column tells us which state the listed sockets are in. The TCP protocol defines states, including “LISTEN” (wait for some external computer to contact us) and “ESTABLISHED” (ready for communication). The stranger among these is the “CLOSE_WAIT” state shown by two sockets. This means that the foreign or remote machine has already closed the connection, but that the local program somehow hasn’t followed suit. Note that the two “CLOSE_WAIT” sockets are also the ones with 38 unread bytes in the Recv-Q. Strange states and non-empty queues often go together.
  • The “PID/Program name” column tells us which pid owns the listed socket and the name of the program running in the process with that pid. So you can see which programs are using the network and to whom they are connecting.

So how to interpret a line of this output? Let’s have a look at the line ending in “firefox”. Firefox is connected to Foreign Address wi-in-f17.1e100.n[something] on the port reserved for secure HTTP connections (which is 443, by the way). Using the -W option (sudo netstat -apWA inet), the full Foreign Address is shown to be wi-in-f17.1e100.net, which happens to belong to Google. This connection is probably there because Gmail was open in a tab when I ran this command. No problems here.

We encountered a few strange connections while looking at the column meanings. First, there are two connections with a Recv-Q of 38 (which should really be 0) and a “CLOSE_WAIT” state. The last column tells us that these sockets both belong to Dropbox, which has a few other connections and a LISTENing socket that seem in better shape. Somehow, it seems that Dropbox is “leaking” sockets, or at least letting them dangle. The foreign machine has closed both connections, but the local Dropbox process is not closing them, meaning the resources the sockets take are not freed. This is probably some sloppy programming on Dropbox’s part, but as long as it only leaves two sockets hanging, it’s not much of a problem. If you have a program that generates a lot of sockets like this, you might want to report it to the creator(s) of the program, and perhaps periodically restart the program to get rid of the mess.
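If you’d rather not spot dangling sockets by eye, a few lines of Python can do it for you. This is only a sketch: the function name dangling_sockets is made up for this example, the sample rows are copied from the output above, and it assumes the Linux column layout you get with the -p option.

```python
# Four rows copied from the example output above.
SAMPLE = """\
tcp        0      0 Trafalgar.local:54744   wi-in-f17.1e100.n:https ESTABLISHED 1792/firefox
tcp       38      0 Trafalgar.local:32808   v-client-5b.sjc.d:https CLOSE_WAIT  1366/dropbox
tcp        0      0 Trafalgar.local:40998   sjc-not6.sjc.dropbo:www ESTABLISHED 1366/dropbox
tcp       38      0 Trafalgar.local:39336   v-d-2b.sjc.dropbo:https CLOSE_WAIT  1366/dropbox
"""


def dangling_sockets(netstat_output):
    """Return (program, foreign address, Recv-Q) for each TCP socket in CLOSE_WAIT.

    Hypothetical helper; assumes the 7-column layout of 'netstat -apA inet'.
    """
    found = []
    for line in netstat_output.splitlines():
        # Proto, Recv-Q, Send-Q, Local, Foreign, State, PID/Program name
        fields = line.split(None, 6)
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue  # header lines, UDP rows (no State column), blank lines
        _proto, recv_q, _send_q, _local, foreign, state, owner = fields
        if state == "CLOSE_WAIT":
            program = owner.split("/", 1)[-1].strip()
            found.append((program, foreign, int(recv_q)))
    return found


for program, foreign, recv_q in dangling_sockets(SAMPLE):
    print(f"{program}: {recv_q} unread bytes, peer {foreign} already closed")
```

On a real machine you would feed it the text netstat prints instead of SAMPLE.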

The other strange thing we encountered is that two sockets have localhost as Foreign Address, meaning that there is a network connection between this computer and itself. The PID/Program name column, however, tells us that there are two distinct programs communicating in this way, namely the Google Talk plug-in and plugin-container, which is Firefox’s container program for running plug-ins. In other words, Google Talk plug-in uses TCP networking to communicate between the standalone part and the Firefox plug-in part running on the same computer (note how the local and foreign port numbers match). This way of using network capability is not used very often, but it makes sense in this case. Because networking is unified across platforms (Linux, Windows and Mac all do it in the same way), using it here gives Google Talk plug-in a portable way to communicate between its parts.
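To see for yourself how a connection to localhost ends up as two mirrored lines, here is a small Python sketch. It is not how the Google Talk plug-in is actually written (I don’t know its code); it just opens a TCP connection from the computer to itself on a port picked by the kernel and prints both ends.

```python
import socket

# Listen on the loopback interface; port 0 lets the kernel pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)

# Connect to that listener from the same machine.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server.getsockname())
accepted, _ = server.accept()

# Each row is (local address, foreign address), like the two netstat columns.
client_row = (client.getsockname(), client.getpeername())
accepted_row = (accepted.getsockname(), accepted.getpeername())
print("connecting end:", client_row)
print("accepting end :", accepted_row)

for sock in (client, accepted, server):
    sock.close()
```

The local address of one end is the foreign address of the other, which is exactly the pattern of the two localhost lines in the example output.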

Now, let me tell you that it takes some thorough knowledge of networking and what programs should and shouldn’t be using your network to take full advantage of this output of netstat. If you don’t have this knowledge, however, you can still at least get an idea of what might be wrong by searching the internet for programs you don’t know and reading about them to decide if they should be there. To help you understand a bit more what’s going on, here are some tips:

  • If the Foreign Address is *:* (and, with TCP sockets, the state is LISTEN), a socket is usually waiting for some remote host to send the first data. Typical examples: sshd (waits for somebody to open an ssh connection), apache (waits for somebody to request a web page), cupsd (waits for somebody to send a print job), and dhclient (waits for the DHCP server to send, for example, a lease renewal).
  • When connecting to a foreign host, a program on your computer usually doesn’t care which local port is used for the connection. That’s why the port on the local side isn’t usually recognized and translated to a protocol like “https” or “www”; it is actually picked from a range of unreserved ports to avoid confusion with other protocols. Examples of such port numbers (from the example output above): 54744, 32808, and 34354.

Before moving on to the next output type of netstat, let me explain the options used in the example here. First, -a tells netstat to show all sockets, both LISTENing and non-LISTENing. Next, -p tells netstat to show the PID/Program name column, which helps a lot in judging if a socket should be there at all. Finally, “-A inet” tells netstat to show only TCP/UDP sockets. Without this option, the output is often flooded with Unix sockets, which are less interesting from a networking perspective. Note that, on Windows, the “-A inet” can just be omitted, and -p should be replaced with -o. On Mac, there is no equivalent of -p, and “-A inet” becomes “-f inet”. If you need to know program names/pids on Mac, use “lsof -i”.
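If you hop between platforms a lot, you could wrap these differences in a tiny helper. The sketch below only encodes the substitutions described in the previous paragraph; the function name is made up, and the exact options may differ between netstat versions, so check “man netstat” (or “netstat /?” on Windows) before relying on it.

```python
import platform


def connection_listing_command(system=None):
    """Return the netstat command line for listing connections (hypothetical helper)."""
    system = system or platform.system()
    if system == "Windows":
        # "-A inet" is omitted and -o takes the place of -p.
        return ["netstat", "-a", "-o"]
    if system == "Darwin":
        # Mac: "-A inet" becomes "-f inet"; there is no -p equivalent.
        return ["netstat", "-a", "-f", "inet"]
    # Linux, as used in the example above.
    return ["sudo", "netstat", "-a", "-p", "-A", "inet"]


print(" ".join(connection_listing_command()))
```

The returned list can be passed straight to subprocess.run if you want to capture the output.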

Printing routing tables
Besides active sockets, netstat can also list the current entries in the routing table of your computer. Routing in the world of networking means deciding where to send a packet with a certain destination. Have another example:

> netstat -r
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.1.0     *               255.255.255.0   U         0 0          0 eth0
link-local      *               255.255.0.0     U         0 0          0 eth0
default         smoothwall      0.0.0.0         UG        0 0          0 eth0

As you may have guessed, the -r option tells netstat to display your computer’s routing table. To help you interpret the output correctly, let me explain the meaning of the columns:

  • The “Destination” column indicates the pattern that the destination of a packet is compared to. When a packet has to be sent over the network, its destination is compared against the lines of this table, and a matching line determines where to send the packet. You can read the table top to bottom and take the first match, as we’ll do in the walk-through below; strictly speaking, the kernel picks the most specific matching line (the one whose Genmask covers the most bits), which gives the same result for this table. The zero in 192.168.1.0, together with the Genmask on the same line, means “match anything at this position”, so 192.168.1.53 matches, and 192.168.1.254 also matches, but 192.168.2.254 doesn’t match. The “link-local” label stands for 169.254.0.0, which is a special range of ip addresses to be used when there is no other way to determine which ip address the computer should have (no DHCP or statically configured address). The “default” label stands for 0.0.0.0 and obviously matches any destination; this last line is kind of a catch-all for packets.
  • The “Gateway” column tells the computer where to send a packet that matches the destination of the same line. An asterisk ( * ) here means “send locally”, because the destination is supposed to be on the same network. The “smoothwall” gateway is actually a computer in the example computer’s network that filters web traffic and has access to the internet, so it makes sense that packets to any non-local destination are sent there so it can forward them to the internet.
  • The “Genmask” column is somewhat advanced (it tells how many bits from the start of the ip address are used to identify the subnet, if that means anything to you), but, as a rule of thumb, it is 255 for any non-zero part of the destination and 0 for parts of the destination that are 0.
  • The “Flags” column shows which flags apply to the current table line. “U” means Up, indicating that this is an active line. “G” means this line uses a Gateway.
  • The “MSS” column lists the value of the Maximum Segment Size for this line. The MSS is a TCP parameter and is used to split packets when the destination has indicated that it somehow can’t handle larger ones. Nowadays, most computers have no problems with the most commonly used maximum packet sizes, so this column usually has the value of 0, meaning “no changes”.
  • The “Window” column is like the MSS column in that it gives the option of altering a TCP parameter. In this case that parameter is the default window size, which indicates how much data can be sent before some of it has to be ACKnowledged. If you don’t know what this means, don’t worry. Like the MSS, this field is usually 0, meaning “no changes”.
  • The “irtt” column stands for Initial Round Trip Time and may be used by the kernel to guess about the best TCP parameters without waiting for slow replies. In practice, it’s not used much, so you’ll probably never see anything other than 0 here.
  • The “Iface” column tells which network interface should be used for sending packets that match the destination. If your computer is connected to multiple subnets on multiple network cards, you may find that some lines have an Iface of eth0 and others have one of eth1. Heck, even if the second network card isn’t connected but just available, there may be some routing rules for it in the table.
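If the Genmask column still feels fuzzy, it may help to see it turned into a plain number of bits. The sketch below uses Python’s standard ipaddress module on the three lines of the example table, with “link-local” and “default” written out as the addresses they stand for; prefix_length is just a name I picked.

```python
import ipaddress


def prefix_length(destination, genmask):
    """Number of leading address bits that identify the subnet."""
    return ipaddress.ip_network(f"{destination}/{genmask}").prefixlen


# Destination and Genmask pairs from the example routing table.
for destination, genmask in [
    ("192.168.1.0", "255.255.255.0"),
    ("169.254.0.0", "255.255.0.0"),
    ("0.0.0.0", "0.0.0.0"),
]:
    print(f"{destination:<12} {genmask:<14} -> first {prefix_length(destination, genmask)} bits must match")
```

A Genmask of 0.0.0.0 leaves zero bits to compare, which is why the default line matches everything.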

So, when your computer is about to send a packet, it looks at the destination of this packet and then starts comparing it to the routing destinations line by line. Suppose the computer wants to send a packet with destination 192.168.1.31. This ip address matches 192.168.1.0, because 0 matches anything in that position, so the packet is sent to the local network (because the gateway is *) of the eth0 interface, without changing the MSS, Window or irtt values. Suppose the computer wants to send a packet to 208.67.222.222 (primary OpenDNS server). This ip address does not match 192.168.1.0, so the first line is skipped. It also doesn’t match link-local (169.254.0.0), so the second line is skipped as well. The ip address does match 0.0.0.0 (any ip address does), so the packet is sent to smoothwall (192.168.1.1) to be forwarded to the internet, where it may reach the OpenDNS server after a few more hops.
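The lookup described above can be sketched in a few lines of Python. This is a simplified model, not the kernel’s implementation: the ROUTES list is the example table typed in by hand, the lookup function is made up for illustration, and among matching lines it picks the one with the longest Genmask, which for this table gives the same answers as reading top to bottom.

```python
import ipaddress

# (destination, genmask, gateway, iface) rows from the example table;
# "link-local" and "default" are written as the addresses they stand for.
ROUTES = [
    ("192.168.1.0", "255.255.255.0", "*", "eth0"),
    ("169.254.0.0", "255.255.0.0", "*", "eth0"),
    ("0.0.0.0", "0.0.0.0", "smoothwall", "eth0"),
]


def lookup(address, routes=ROUTES):
    """Return (gateway, iface) for an address, or None if no line matches."""
    ip = ipaddress.ip_address(address)
    matches = []
    for destination, genmask, gateway, iface in routes:
        network = ipaddress.ip_network(f"{destination}/{genmask}")
        if ip in network:
            matches.append((network.prefixlen, gateway, iface))
    if not matches:
        return None
    # Prefer the most specific line (longest prefix).
    _, gateway, iface = max(matches, key=lambda match: match[0])
    return gateway, iface


print(lookup("192.168.1.31"))     # local delivery on eth0
print(lookup("208.67.222.222"))   # falls through to the default line
```

Removing the last row from ROUTES makes the second lookup return None, which is the situation behind a “No route to host” error.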

Usually, the default lines in your routing table are correctly set, and you won’t have to worry about it. A single wrong line in the routing table can, however, block some or all internet traffic from reaching its destination. The error you often get when this happens is “No route to host”. Sadly, this error also occurs in many other cases, but if you encounter it, have a look at your routing table anyway. If it does contain a bad line or lacks an important one, you can use the route command to change/add it (run “man route” in your command line to see how it works).

Showing interfaces and statistics
Using netstat, you can list the available interfaces on your machine and read some statistics on how they’re doing. An example:

> netstat -i
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500 0     25055      0      0 0         14239      0      0      0 BMRU
lo        16436 0        16      0      0 0            16      0      0      0 LRU

As you may notice, the columns are messed up a bit. If in doubt to which column a value belongs, count! There are just as many fields as columns in every row, so things do match. Speaking of columns, let me explain their meanings to you:

  • The “Iface” column contains the name of the interface for which statistics are shown. The primary network card is usually called “eth0”. The loopback interface (abbreviated to “lo”) is a virtual network card that allows the computer to make networking connections to itself without bothering a hardware device (remember the Google Talk plug-in from the network connections part?), thus giving better performance.
  • The “MTU” column lists the Maximum Transmission Unit this interface can send at one time. It is a number of bytes measured at a pretty low level, meaning that the actual maximum size of a TCP packet that can be sent without splitting up is some tens of bytes smaller.
  • The “RX-OK/ERR/DRP/OVR” columns give statistics about the packets that have been received by the interface so far. “OK” stands for “correctly received”, “ERR” for “received but with incorrect checksum” (happens when the connection is bad), “DRP” for “dropped because my receive buffer was too full” (happens when too many packets are received in a very short interval), and “OVR” for “dropped because the kernel couldn’t get to it in time” (if this happens, your computer was really busy).
  • The “TX-OK/ERR/DRP/OVR” columns are mostly similar to the RX columns, except that they are about packets that have been sent by the interface so far.
  • The “Flg” column contains the flags that are active for this interface. “B” means “broadcast capability”, meaning that this interface can broadcast a packet to everyone on the same subnet. “M” means “multicast capability”, meaning that this interface can send packets with multiple destinations. “L” means “loopback interface”, meaning that this is an interface that puts everything sent with it immediately in its own receive queue. “U” and “R” mean “up” and “running”, respectively. I guess I don’t have to explain those ;).

In this example, notice the pretty zeroes in the “ERR”, “DRP” and “OVR” columns for both “RX” and “TX”. Apparently, the network over here is in stellar condition :). Also note the MTU value of the “lo” interface. 16436 bytes is more than any normal real network interface can provide, and should therefore be able to send any packet without needing to split up. This further improves the performance of the loopback interface.
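To put a number on the “some tens of bytes” that go missing between the MTU and the data TCP can actually carry, here is the arithmetic as a short Python sketch. It assumes IPv4 and headers without any options (20 bytes each for IP and TCP); with options such as TCP timestamps the usable size shrinks a little further.

```python
IPV4_HEADER_BYTES = 20  # minimum IPv4 header, no options (assumption)
TCP_HEADER_BYTES = 20   # minimum TCP header, no options (assumption)


def max_tcp_payload(mtu):
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


# MTU values taken from the example output above.
for iface, mtu in [("eth0", 1500), ("lo", 16436)]:
    print(f"{iface}: MTU {mtu} -> at most {max_tcp_payload(mtu)} bytes of TCP data per packet")
```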

Here’s a tip for you: if your network seems far slower than it should be, you might want to run “netstat -ci” to see updates on your network statistics every second. If the “ERR”, “DRP” and/or “OVR” values keep growing, something fishy is going on on your network. Check for interference and/or bad switches or routers. On Windows and Mac, you need to use a slightly different command to do the same. On Windows, you should run “netstat -e 1” (-e replaces -i). On Mac, “netstat -iw 1”.
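If staring at scrolling numbers isn’t your thing, you can let a script compare two snapshots of “netstat -i” for you. The Python sketch below assumes the Linux column layout shown earlier; the function names are made up, BEFORE is the example output from this article, and AFTER is an invented second snapshot purely for illustration. It splits rows on whitespace and counts fields, just like the “count!” advice above.

```python
ERROR_COLUMNS = ("RX-ERR", "RX-DRP", "RX-OVR", "TX-ERR", "TX-DRP", "TX-OVR")

BEFORE = """\
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500 0     25055      0      0 0         14239      0      0      0 BMRU
lo        16436 0        16      0      0 0            16      0      0      0 LRU
"""

# Invented second snapshot (not real data): three receive errors on eth0.
AFTER = """\
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500 0     25110      3      0 0         14260      0      0      0 BMRU
lo        16436 0        16      0      0 0            16      0      0      0 LRU
"""


def parse_interface_table(text):
    """Map interface name -> {column name: value}, matching fields by position."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = next(i for i, line in enumerate(lines) if line.split()[0] == "Iface")
    columns = lines[header].split()
    table = {}
    for line in lines[header + 1:]:
        fields = line.split()
        if len(fields) == len(columns):  # skip rows that do not line up
            table[fields[0]] = dict(zip(columns, fields))
    return table


def growing_counters(before, after):
    """Return {(iface, column): increase} for error counters that went up."""
    old, new = parse_interface_table(before), parse_interface_table(after)
    growth = {}
    for iface, row in new.items():
        for column in ERROR_COLUMNS:
            if iface in old:
                delta = int(row[column]) - int(old[iface][column])
                if delta > 0:
                    growth[(iface, column)] = delta
    return growth


print(growing_counters(BEFORE, AFTER))
```

An empty result means none of the error counters moved between the two snapshots.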

Final remarks

Those were the three netstat functionalities the three platforms have in common, but there is more. Linux netstat can show masqueraded connections using “-M”, while Mac uses the lowercase “-m” to show memory management statistics. Both Linux and Mac can show group membership using the -g option. These extra functionalities aren’t used often, but if you’d like to know more about them, run “man netstat” in your command line for some extra information. And don’t be afraid to experiment with it yourself; netstat only displays information, so you won’t break anything.

As remarked earlier, you can only use netstat truly effectively if you know much about your network and your Linux system. If you’d like to improve your Linux knowledge with video tutorials, study guides and a live server to practice on, try a month of the Linux Academy! It’s a learning platform specifically created to help you improve your Linux knowledge, and recommended by many people who tried it before you. Stay tuned, and happy learning!


6 thoughts on “Netstat: network analysis and troubleshooting, explained”

  1. Thank you, great article. I have some data lines with some 600+ items to be either read or sent, and the connection is established. An example is below; what could this represent?

    tcp 677 0 192.168.1.1:80 192.168.1.172:59507 ESTABLISHED

    I cant see any mention of the term SYN_RECV ? what does this represent? and also I have 90+ TCP ports open with the TIME_WAIT, it doesn’t sound good from what I am reading? any thoughts on where I can get advice on the report?
