109.3 Basic network troubleshooting
Candidates should be able to troubleshoot networking issues on client hosts.
Key Knowledge Areas
- Manually and automatically configure network interfaces and routing tables to include adding, starting, stopping, restarting, deleting or reconfiguring network interfaces.
- Change, view, or configure the routing table and correct an improperly set default route manually.
- Debug problems associated with the network configuration.
Terms and Utilities
Troubleshooting network problems
When a network related problem is reported to you, you have to take a lot of steps to find out where the root cause of the problem is. For example if the report says "I can not open webpages", you have to start from checking if the network interface has an ip, it it is up, it the routing is OK and if the DNS works fine and if everything is OK, you have to try reaching a server on the Internet via
ping command and if you see any problems, you have to use
traceroute to see where your traffic is going wrong. In this lesson, we will review these basic steps.
ifconfig & ip
As you already know,
ip commands can be used to check the IP address of your interfaces. If a network card is going to work, it needs an correct IP address and netmask. Let me check my own computer to see it has an IP address:
[[email protected] ~]$ ip addr show 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: wlp3s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000 link/ether f0:de:f1:62:c5:73 brd ff:ff:ff:ff:ff:ff 3: enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 8c:a9:82:7b:89:06 brd ff:ff:ff:ff:ff:ff inet 192.168.1.35/24 brd 192.168.1.255 scope global dynamic wlp3s0 valid_lft 254836sec preferred_lft 254836sec inet6 fe80::8ea9:82ff:fe7b:8906/64 scope link valid_lft forever preferred_lft forever [[email protected] ~]$ ifconfig enp0s25 Link encap:Ethernet HWaddr f0:de:f1:62:c5:73 inet addr:192.168.1.35 Bcast:192.168.1.255 Mask:255.255.255.0 inet6 addr: fe80::8ea9:82ff:fe7b:8906/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:231586 errors:0 dropped:0 overruns:0 frame:0 TX packets:200220 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:198053888 (198.0 MB) TX bytes:51583154 (51.5 MB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:65536 Metric:1 RX packets:221752 errors:0 dropped:0 overruns:0 frame:0 TX packets:221752 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:150859909 (150.8 MB) TX bytes:150859909 (150.8 MB) [[email protected] ~]$
Both commands show that my IP address is OK. This is assigned to my be the WiFi modem and 192.168.1.35 as IP and 255.255.255.0 looks reasonable.
This is the most common tool when troubleshooting a network problem. Lets see if I can see my network gateway first.
[[email protected] ~]$ route Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface default 192.168.1.1 0.0.0.0 UG 600 0 0 wlp3s0 link-local * 255.255.0.0 U 1000 0 0 wlp3s0 192.168.1.0 * 255.255.255.0 U 600 0 0 wlp3s0 [[email protected] ~]$ ping 192.168.1.1 PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data. 64 bytes from 192.168.1.1: icmp_seq=1 ttl=254 time=3.02 ms 64 bytes from 192.168.1.1: icmp_seq=2 ttl=254 time=3.08 ms ^C --- 192.168.1.1 ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 1001ms rtt min/avg/max/mdev = 3.023/3.052/3.082/0.062 ms
I can ping my gateway but is it possible to reach a server on the Internet using its IP address. Lets ping 220.127.116.11; it is very well known server on the Internet and many people use it to check their connectivity.
[[email protected] ~]$ ping 18.104.22.168 PING 22.214.171.124 (126.96.36.199) 56(84) bytes of data. 64 bytes from 188.8.131.52: icmp_seq=1 ttl=50 time=108 ms 64 bytes from 184.108.40.206: icmp_seq=2 ttl=50 time=111 ms 64 bytes from 220.127.116.11: icmp_seq=3 ttl=50 time=113 ms ^C --- 18.104.22.168 ping statistics --- 3 packets transmitted, 3 received, 0% packet loss, time 2003ms rtt min/avg/max/mdev = 108.160/111.233/113.717/2.338 ms
This is working fine. Bud can I ping google.com too?
[[email protected] ~]$ ping google.com ping: unknown host google.com
Aha! We found the problem. In this case I have a correct IP address, I can ping 22.214.171.124 but I can not ping google.com with the error message "unknown host". This means my computer can not translate google.com to its IP address; this is a DNS issue:
[[email protected] ~]$ cat /etc/resolv.conf # Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8) # DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
No wonder we had problems with surfing the WWW. There is no active DNS in my computer so no one will be able to reach sites by their domain names! This should be fixed by adding a DNS to my configuration files and then use
ifdown eth0 and then
ifup eth0 to load it.
In some situations you can not reach the Internet (say 126.96.36.199 or 188.8.131.52) but you can ping your gateway. Have a look:
[[email protected] ~]$ ping 184.108.40.206 connect: Network is unreachable [[email protected] ~]$ ping 192.168.1.1 PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data. 64 bytes from 192.168.1.1: icmp_seq=1 ttl=254 time=3.03 ms 64 bytes from 192.168.1.1: icmp_seq=2 ttl=254 time=3.31 ms ^C --- 192.168.1.1 ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 1001ms rtt min/avg/max/mdev = 3.039/3.174/3.310/0.146 ms
What is going on here? Lets see what clues do I have: 1) I can reach the gateway 2) when asking for the Internet, my computer does not know what to do. In this case the default gateway is missing: the computer does not know what to do if a packet is outside its network mask. You know that we can set the default gateway using the /etc/network/interfaces config file but there is also a
route command to show and change the routing configurations on the fly.
routes added or changed via
routecommand will be lost after a reboot! Permanent configurations should come from configuration files.
Lets check our current routing state using
route command as root.
[[email protected] ~]$ sudo route [sudo] password for jadi: Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface link-local * 255.255.0.0 U 1000 0 0 wlp3s0 192.168.1.0 * 255.255.255.0 U 600 0 0 wlp3s0
It is stated that "there is no gateway for the network 192.168.1.0 netmask 255.255.255.0". This means that if you ping a server in this network (say 192.168.1.100), it does not needs any routing. But what happens if you ping 220.127.116.11? Nothing stated in our routing table above! No wonder that we are getting
Network is unreachable when
ping 18.104.22.168. Lets add a default route:
[[email protected] ~]$ sudo route add default gw 192.168.1.1 [[email protected] ~]$ ping 22.214.171.124 PING 126.96.36.199 (188.8.131.52) 56(84) bytes of data. 64 bytes from 184.108.40.206: icmp_seq=1 ttl=50 time=109 ms 64 bytes from 220.127.116.11: icmp_seq=2 ttl=50 time=111 ms 64 bytes from 18.104.22.168: icmp_seq=3 ttl=49 time=199 ms ^C --- 22.214.171.124 ping statistics --- 3 packets transmitted, 3 received, 0% packet loss, time 2002ms rtt min/avg/max/mdev = 109.493/140.344/199.612/41.922 ms [[email protected] ~]$ sudo route Kernel IP routing table Destination Gateway Genmask Flags Metric Ref Use Iface default 192.168.1.1 0.0.0.0 UG 0 0 0 wlp3s0 link-local * 255.255.0.0 U 1000 0 0 wlp3s0 192.168.1.0 * 255.255.255.0 U 600 0 0 wlp3s0
We were able to ping the Internet after adding a default gateway using
route command. It is important to know that we can define even more routes and control exactly where our packets should go based on their destinations.
This command can show many different information about our network. The two main functionalities are showing the routing table and checking the listening ports. Lets check our routing table using it.
[[email protected] ~]$ netstat -nr Kernel IP routing table Destination Gateway Genmask Flags MSS Window irtt Iface 0.0.0.0 192.168.1.1 0.0.0.0 UG 0 0 0 wlp3s0 169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 wlp3s0 192.168.1.0 0.0.0.0 255.255.255.0 U 0 0 0 wlp3s0
It is also very useful to the see what ports are in listening state in our server:
[[email protected] ~]$ netstat -na | grep 80 tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN
-na switch will show all the open ports and here I'm checking if port 80 (web) is active on my server. It is and listening to any connection from
0.0.0.0 which means "anywhere": Any one can connect to my computers web server and request pages.
in these switches,
-nstands for numeric,
-astands for all ports and
-rstands for routes.
This is a more advanced troubleshooting tool. It is like pinging all the servers between you and your destination one by one and see where our packets go wrong. Lets check how I reach google.com.
[[email protected] ~]$ traceroute google.com traceroute to google.com (126.96.36.199), 30 hops max, 60 byte packets 1 192.168.1.1 (192.168.1.1) 3.090 ms 3.113 ms 3.115 ms 2 85-15-16-103.shatel.ir (188.8.131.52) 29.904 ms 34.851 ms 34.885 ms 3 85-15-0-193.shatel.ir (184.108.40.206) 34.880 ms 39.454 ms 39.459 ms 4 85-15-0-1.shatel.ir (220.127.116.11) 39.451 ms 39.444 ms 43.865 ms 5 172.18.196.17 (172.18.196.17) 43.887 ms 46.506 ms 51.990 ms 6 172.18.196.22 (172.18.196.22) 51.943 ms 29.418 ms 29.430 ms 7 10.10.53.229 (10.10.53.229) 32.421 ms 32.344 ms 34.705 ms 8 10.10.53.222 (10.10.53.222) 39.814 ms 39.773 ms 39.808 ms 9 xe-0-3-3-xcr2.fri.cw.net (18.104.22.168) 126.654 ms 131.411 ms 131.363 ms 10 ae29-xcr1.fri.cw.net (22.214.171.124) 126.570 ms 129.053 ms 129.073 ms 11 126.96.36.199 (188.8.131.52) 173.273 ms 173.284 ms 178.795 ms 12 184.108.40.206 (220.127.116.11) 143.786 ms 145.279 ms 18.104.22.168 (22.214.171.124) 115.669 ms 13 126.96.36.199 (188.8.131.52) 111.427 ms 113.970 ms 113.989 ms 14 184.108.40.206 (220.127.116.11) 127.402 ms 125.199 ms 18.104.22.168 (22.214.171.124) 122.175 ms 15 126.96.36.199 (188.8.131.52) 137.551 ms 137.564 ms 140.535 ms 16 184.108.40.206 (220.127.116.11) 161.583 ms 170.447 ms 170.407 ms 17 18.104.22.168 (22.214.171.124) 170.375 ms 126.96.36.199 (188.8.131.52) 159.352 ms 161.666 ms
On the first step (1) I reach my own router. On the 2nd step, I'm at my ISP which is shutel. I'm still on shatel on 3rd and 4th steps and so on. After a long path, on the 17th step, I reach my destination. This is used to troubleshoot routing problems.
In some routers, the ping trafic is blocked and you will see
* * *at some steps because those servers are blocking ICMP traffic.
There is another command called
tracepath which is very similar to
traceroute. For the LPIC1 level, they are essentially the same!
If you are on an IPv6 network you have to use
ping6 instead of
ping. As easy as that.
dig command is a DNS lookup tool. If you are having problem with a domain name, you can check how it is being resolved to IPs; and by whom.
[[email protected] ~]$ dig google.com ; <<>> DiG 9.9.5-11ubuntu1.3-Ubuntu <<>> google.com ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50032 ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0 ;; QUESTION SECTION: ;google.com. IN A ;; ANSWER SECTION: google.com. 293 IN A 184.108.40.206 ;; Query time: 120 msec ;; SERVER: 220.127.116.11#53(18.104.22.168) ;; WHEN: Fri Apr 01 22:00:05 IRDT 2016 ;; MSG SIZE rcvd: 44
You can see that SERVER 22.214.171.124 is resolving google.com to 126.96.36.199.
The nc (or netcat) utility is used for just about anything under the sun involving TCP, UDP, or UNIX-domain sockets. It can open TCP connections, send UDP packets, listen on arbitrary TCP and UDP ports, do port scanning, and deal with both IPv4 and IPv6. Unlike telnet, nc scripts nicely, and separates error messages onto standard error instead of sending them to standard output, as telnet does with some. This is a very capable command and it is enough for you to be familier with its general concept.