UniFi - Debugging Intermittent Connectivity Issues on your UAP

Overview


This article offers tips and ideas for resolving intermittent connectivity issues with UAPs (UniFi Access Points). A typical symptom is a laptop or mobile phone that shows full WiFi signal while pages either fail to load or appear to load indefinitely. Read on for other symptoms.

Intermittent Connectivity Symptoms


The most common intermittent connectivity symptom is a laptop or mobile phone that shows full WiFi signal, yet no sites or pages load when using browsers or other network services. Browsers sometimes report no Internet connectivity, and sometimes spin forever while loading pages. Sometimes an exclamation point appears next to the WiFi bars, and sometimes the client device selects an auto-assigned IP address (such as 169.254.x.x) on the LAN.

Many of these types of issues have been fixed in recent UAP firmware releases, so please ensure that you are using the latest firmware release for your devices before investigating further.

This article offers suggestions for debugging and fixing these types of issues for a smooth Internet/network experience.

Wired or Wireless?


First and foremost, it is important to determine whether the problem lies in the wired or the wireless infrastructure of your network.

Try continuously pinging Google’s public DNS (8.8.8.8) and your router simultaneously from two terminals on a laptop. If there is packet loss to both IPs, then you likely have a wireless issue. If there is only packet loss on 8.8.8.8, then you likely have a wired/Internet issue.
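
The comparison above can be scripted. The following is a minimal sketch, assuming a Linux or macOS laptop whose ping prints a summary line containing "% packet loss". The helper names loss_pct and classify_loss are hypothetical, and 192.168.1.1 stands in for your router's IP.

```shell
# Print the packet-loss percentage found in ping output read from stdin.
loss_pct() {
  sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# classify_loss ROUTER_LOSS INTERNET_LOSS
# Applies the rule of thumb from this section to two loss percentages.
classify_loss() {
  awk -v r="$1" -v i="$2" 'BEGIN {
    if (r > 0 && i > 0)  print "likely wireless"
    else if (i > 0)      print "likely wired/Internet"
    else                 print "no clear pattern"
  }'
}

# Usage (run both pings at the same time, then compare):
#   ping -c 100 192.168.1.1 > /tmp/router.txt &
#   ping -c 100 8.8.8.8 > /tmp/internet.txt
#   wait
#   classify_loss "$(loss_pct < /tmp/router.txt)" "$(loss_pct < /tmp/internet.txt)"
```

Treat the result as a hint only; a short run can miss intermittent loss, so a longer ping count may be needed.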

This article focuses on resolving wireless issues only, not wired issues.

Please note that many laptops enable WiFi power-saving mode on their WiFi interface whether or not the laptop is charging, so you may see ping responses arrive up to 1.25 seconds late, especially when the laptop is otherwise idle on the network. Laptop manufacturers design this behavior to conserve power. This article focuses on resolving packet loss, not packet latency.

Check your Configuration


The following AP network configuration has at least 3 errors in it. Can you spot them?

  1. The subnet mask is too narrow to provide a route to the gateway.
  2. The Gateway itself is not on the same subnet as the static IP address.
  3. There is no Preferred DNS configured.

Misconfiguration can lead to an inability to upgrade UAPs and to NTP sync failures, among many other intermittent issues that are difficult to diagnose.

For the best experience, it is important to ensure that all UAPs have full citizenship on the network, including DNS access and Internet-routability.
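
Two of the errors listed above are subnet mistakes that can be checked mechanically. The sketch below is one way to test whether a static IP and its gateway share a subnet. The helper names ip_to_int and same_subnet and the addresses in the usage lines are hypothetical and are not taken from the configuration discussed above.

```shell
# Convert a dotted-quad IPv4 address to an integer.
# The body runs in a subshell so the IFS change stays local.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# same_subnet IP GATEWAY NETMASK
# Succeeds when IP and GATEWAY fall in the same subnet under NETMASK.
same_subnet() {
  ip=$(ip_to_int "$1"); gw=$(ip_to_int "$2"); mask=$(ip_to_int "$3")
  [ $(( ip & mask )) -eq $(( gw & mask )) ]
}

# Usage:
#   same_subnet 192.168.1.20 192.168.1.1 255.255.255.0 && echo "gateway reachable"
#   same_subnet 192.168.1.20 192.168.2.1 255.255.255.0 || echo "gateway off-subnet"
```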

AP Proximity


Try moving closer to the UAP that your client is associated with to see whether the problem goes away. If it does, you may need to reassess the locations and number of UAPs in your deployment.

Network Loops


Network loops can be detected by running tcpdump on the affected UAP and/or UniFi Switches and viewing the output in Wireshark. SSH into the affected UAP (replace 192.168.1.X with the UAP’s IP address) and start a capture:

ssh admin@192.168.1.X
tcpdump -i br0 -n -v -s 0 -w /tmp/capture.pcap

Press Ctrl+C to stop the capture, then copy the resulting pcap file to your laptop for viewing in Wireshark.

scp admin@192.168.1.X:/tmp/capture.pcap /tmp

This copies capture.pcap to /tmp on your computer. On Windows, WinSCP can be used in the same way.

Now open the file in Wireshark.

If there is a network loop somewhere, you will see a large amount of multicast or broadcast traffic in the capture file. Typical networks will have less than 100 kbps of multicast/broadcast traffic, totaling only dozens of packets per second. If there are thousands of multicast/broadcast packets per second, then you likely have a network loop somewhere that needs to be resolved. Keep disconnecting infrastructure devices until the number of multicast/broadcast packets goes down to a reasonable number.
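
As a rough aid for applying the numbers above, the sketch below turns a packet count and a capture duration into a packets-per-second figure. The helper name loop_hint and the 1000 pps cutoff are assumptions made for illustration (the text only says "thousands"), and the capture duration must be supplied by you.

```shell
# loop_hint PACKET_COUNT CAPTURE_SECONDS
# Prints the broadcast/multicast rate and a hint. CAPTURE_SECONDS must be > 0.
loop_hint() {
  pps=$(( $1 / $2 ))
  if [ "$pps" -ge 1000 ]; then
    echo "$pps pps: possible loop"
  else
    echo "$pps pps: looks normal"
  fi
}

# Usage: count broadcast/multicast packets in the capture, then pass the
# count together with the capture length in seconds:
#   count=$(tcpdump -n -r /tmp/capture.pcap 'ether broadcast or ether multicast' | wc -l)
#   loop_hint "$count" 30
```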

Link Budget


The high transmit power (TX power) of UniFi APs is great for single-AP installations, but can be problematic in enterprise/multi-AP deployments. The high TX power will extend the range for slower TX rates only, as faster rates are transmitted at a lower TX power, which is normal for ALL APs and devices. This eats up air-time for faster rates in multi-AP deployments, slowing down the entire network and potentially causing packet loss.

High TX power also causes an imbalance in the WiFi link budget between the mobile client and the UAP, because most mobile clients have a TX power between 14 and 18 dBm (and a hand around the antenna, etc.!). Mobile clients will stay connected (and show full WiFi bars) to an AP with a strong signal from the AP to the mobile client, even if the signal from the mobile client to the AP is not sufficiently strong.

Lowering the TX power on the UAPs to 18 dBm or so will establish a more symmetrical link-budget for most devices and deployments. This can easily be done in the controller UI.
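
To make the imbalance concrete, here is a rough worked example. It assumes that the path loss L and the antenna gains are the same in both directions, and the 24 dBm AP power is a hypothetical value chosen for illustration; 16 dBm sits in the client range quoted above.

```latex
\begin{align*}
\text{RSSI}_{\text{client}} &= P_{\text{AP}} + G_{\text{AP}} + G_{\text{client}} - L \\
\text{RSSI}_{\text{AP}} &= P_{\text{client}} + G_{\text{AP}} + G_{\text{client}} - L \\
\text{RSSI}_{\text{client}} - \text{RSSI}_{\text{AP}} &= P_{\text{AP}} - P_{\text{client}} = 24 - 16 = 8\ \text{dB}
\end{align*}
```

Under the same assumptions, lowering the AP to 18 dBm would shrink the difference to 2 dB for this client.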

Disabling Features


Some features, such as band-steering, minimum RSSI, and connection monitor, can cause adverse effects if misconfigured or implemented with an insufficient number of APs. It is good to disable all of these features when debugging connectivity problems so that base functionality can be verified without any extra variables. Newer features, such as ATF (Air-Time-Fairness), can also be disabled to reduce the configuration to the bare minimum.

DHCP Configuration


Some older/misconfigured routers and DHCP servers transmit the DHCP offer/ack messages as broadcast packets, which are much more likely to be dropped. This can lead to slow connection times and intermittent connectivity. Please ensure that your DHCP offers and ack messages are unicast packets, not broadcast (the discover packet from the client can still be broadcast).
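
One way to check this is to watch DHCP traffic with link-layer headers and look at the destination MAC of each offer and ack. The sketch below is an illustration only: the helper names dhcp_watch and is_broadcast_mac are hypothetical, and br0 is assumed as the default interface because it is the bridge used elsewhere in this article.

```shell
# Show DHCP traffic with Ethernet headers (-e) so the destination MAC of
# each offer/ack is visible. Defaults to br0 when no interface is given.
dhcp_watch() {
  tcpdump -i "${1:-br0}" -n -e -v 'udp port 67 or udp port 68'
}

# is_broadcast_mac MAC
# Succeeds when MAC is the Ethernet broadcast address.
is_broadcast_mac() {
  case $(printf '%s' "$1" | tr 'A-F' 'a-f') in
    ff:ff:ff:ff:ff:ff) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: run dhcp_watch, reconnect the client, then test the destination
# MAC printed for the offer/ack lines:
#   is_broadcast_mac ff:ff:ff:ff:ff:ff && echo "sent as broadcast"
```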

Modifying the DTIM Period


Some older/buggy WiFi radios do not handle the DTIM period of 3 properly, causing broadcast/multicast packets for these devices to be dropped. To decrease the DTIM period (which also decreases battery life for WiFi-connected mobile devices), change the DTIM period in the Wireless Network's "Advanced Options" under "802.11 Rate and Beacon Controls".

Wireshark - Packet Capture


Wireshark can be downloaded for most platforms at www.wireshark.org. Modern (2013+) MacBooks are recommended as they 1) have full driver-support for monitor mode, and 2) have premium 3x3 radios that are capable of hearing 3 NSS traffic (up to 1300 Mbps physical rate). Linux can also be used with some laptops, but most laptops only have 2x2 radios, so they are less useful.

  1. Download/install Wireshark
  2. Open Wireshark
  3. Click on the gear icon at the top
  4. Ensure that monitor mode is enabled for the en0 interface.
  5. Click Close, and restart Wireshark.
  6. Start a capture on en0. You should see beacon, control, and management frames interspersed with data frames.

    Note: As of the time of this writing, there is a bug in Wireshark where capturing in monitor mode will fail the first time it is enabled, unless Wireshark is completely restarted first.

    You can upload this capture to the community or to UBNT support. Be sure to include the MAC address of the laptop or mobile device that is having issues.

Trace Packets Through the Network


In some cases, multicast/broadcast packets get through whereas unicast packets do not. It is important to understand how far each type of packet gets on the network.

First, you’ll need to determine which “VAP” interface your wireless client is connecting to. SSH into the problematic AP and issue:

iwconfig

In this article's example, ath6 is the VAP for the ubnt-ut-AP-LR network on the 5 GHz radio. Substitute your own VAP interface name wherever athX appears in the commands that follow.

The easiest way to send packets is with ping.

To see if broadcast packets are making it to your UAP, run tcpdump on the athX interface on the UAP (SSH on UAP):

tcpdump -i athX -n -v -s 0 -w /tmp/broadcast.pcap 

and then send some broadcast packets using ping from your laptop (terminal on laptop):

ping 192.168.1.255

Stop the capture, and start another capture named /tmp/unicast.pcap (ssh on UAP):

tcpdump -i athX -n -v -s 0 -w /tmp/unicast.pcap

Next, try to send unicast packets to your router (terminal on laptop):

ping 192.168.1.1 (replace with your router’s IP)

If broadcast packets aren’t being transmitted or received, then the unicast packets won’t go out either, because the OS has no ARP entry for the destination. In that case you’ll need to force a static ARP entry into your laptop (terminal on laptop):

sudo arp -s 192.168.1.1 00:00:00:00:00:01 ifscope en0 (Mac OS X)
arp -s 192.168.1.1 00-00-00-00-00-01 (from an Administrator command prompt in Windows)

Try the ping again, and see if the 00:00:00:00:00:01 unicast packets arrive at the athX interface on the UAP.

Once you’ve determined whether there is packet loss from your client to the UAP, the next step is to determine whether there is packet loss from the UAP to your client. First, start Wireshark or tcpdump on your laptop to validate whether packets are reaching it. Then start a broadcast ping from your UAP to the network (ssh on UAP):

ping 192.168.1.255

Capture the results in Wireshark/tcpdump, then start a ping to your laptop (ssh on UAP):

ping 192.168.1.X

At the time of writing, UAPs do not have a way to set a static ARP entry, so if unicast traffic can’t be produced from the UAP, you can try producing the packets by setting a static ARP entry on a wired desktop/laptop, then sending the packets from that separate device.

You can give the capture results to the community and/or a UBNT employee to help diagnose where packet loss is occurring.

Lastly, it is good to double-check that the bridge is configured correctly (ssh on UAP):

brctl show

The output should look similar to this:

Bridge Name    Bridge ID            STP Enabled    Interfaces
br0            ffff.44d9e7f9876a    no             ath0
                                                   ath1
                                                   ath2
                                                   ath3
                                                   ath4
                                                   ath5
                                                   ath6
                                                   ath7
                                                   eth0

Are the APs Rebooting?


Check the uptime of the APs to make sure they aren't rebooting. If the uptime keeps getting reset, and coincides with network downtime, then you may have uncovered a bug, and we’d love to know how we can reproduce the problem in our labs. Let us know via our Community.
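
A simple way to spot a reset is to record the uptime, record it again later, and compare the two samples. The sketch below assumes the UAP firmware exposes /proc/uptime, as Linux systems generally do; the helper names uptime_secs and was_reset are hypothetical.

```shell
# Print whole seconds since boot (run over SSH on the UAP).
# Assumes /proc/uptime is available on the device.
uptime_secs() {
  cut -d. -f1 /proc/uptime
}

# was_reset EARLIER_UPTIME LATER_UPTIME
# Succeeds when the later sample is smaller, meaning the AP rebooted
# between the two readings.
was_reset() {
  [ "$2" -lt "$1" ]
}

# Usage:
#   before=$(uptime_secs)      # note this value
#   ...wait for the next outage...
#   after=$(uptime_secs)
#   was_reset "$before" "$after" && echo "AP rebooted between samples"
```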

Check Your Hardware


There is always a small chance that the hardware was damaged, since your UAP passes through many hands between our factory and your desk. If major packet loss cannot be resolved regardless of which firmware you try, please continue reading.

It is best to set your 2 GHz and 5 GHz radios up on separate SSIDs, and even better to set both of them to different SSIDs from all other UAPs, for testing purposes. For example, if your network name is “HomeNetwork”, set the 2 GHz SSID to “HomeNetwork-test2” and 5 GHz SSID to “HomeNetwork-test5” so that they don’t conflict with each other or any other SSID. If you have more than one SSID per radio, you only need to test one SSID per radio, so you only need to modify one of the SSIDs per band, not all of them.

After getting your SSIDs in order, SSH into your UAP and issue the following command:

iwpriv wifi0 get_txchainmask

This returns the chain mask for radio 0, from which you can work out the number of chains available. Each bit that is set in the mask is one chain, so 3 (binary 011) means you have 2 chains, 5 (binary 101) also means 2 chains, 7 (binary 111) means 3 chains, 15 (binary 1111) means 4 chains, and so on. Run the command again for the second radio on your UAP (if your UAP is dual-band):

iwpriv wifi1 get_txchainmask
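
Reading the mask is a matter of counting its set bits. The following sketch shows the arithmetic; count_chains is a hypothetical helper name, and you pass it the number reported by get_txchainmask.

```shell
# count_chains MASK
# Prints how many bits are set in MASK, i.e. the number of chains.
count_chains() {
  mask=$1
  n=0
  while [ "$mask" -gt 0 ]; do
    n=$(( n + (mask & 1) ))   # add the lowest bit
    mask=$(( mask >> 1 ))     # shift to the next bit
  done
  echo "$n"
}

# Usage:
#   count_chains 7    # a mask of 7 is binary 111
```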

Now we will test each of the chains on your UAP. This will test chain 0:

cm=1; for a in 0 1; do for b in tx rx; do iwpriv wifi$a ${b}chainmask $cm; done; done; killall hostapd

Use “WiFi Analyzer” or some other app to view the signal strength of the UAP’s beacons on chain 0 for both 2 GHz and 5 GHz. You should be standing within 10 feet of the UAP, and if the signal strength is lower than -60 dBm, you may have an issue with that chain. Try the other chains as well:

cm=2; for a in 0 1; do for b in tx rx; do iwpriv wifi$a ${b}chainmask $cm; done; done; killall hostapd

View the signal strength on chain 1. Then:

cm=4; for a in 0 1; do for b in tx rx; do iwpriv wifi$a ${b}chainmask $cm; done; done; killall hostapd

Keep in mind that cm=1, 2, and 4 correspond to chains 0, 1, and 2. If your UAP does not have chain 2, you will not see any signal on that chain, and this is normal.

Next, reboot the UAP to get the chains back to normal. Connect a known-good laptop within 10 feet of your UAP. Run iperf3 on a wired server (Linux, Mac, Windows OK) with:

iperf3 -s

Then run iperf3 from your WiFi-connected laptop/mobile device. The -R flag reverses the test so that the server sends and the client receives:

iperf3 -u -c SERVER_IP -b 50M -R

Then run the test in the opposite direction, with the client sending to the server:

iperf3 -u -c SERVER_IP -b 50M

Broken hardware will typically get far less throughput in one direction than the other (e.g. 50 Mbit in one direction and 0 in the other). If this is the case, please @mention a UBNT employee in a forum topic for confirmation and next steps.
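
The two iperf3 results can be compared with a small helper like the one sketched below. The name direction_check and the 10x ratio used as the cutoff are assumptions made for illustration, not a threshold from UBNT; the inputs are the Mbit/s figures you read from the two runs.

```shell
# direction_check DOWNLINK_MBPS UPLINK_MBPS
# Flags a large gap between the two directions (assumed cutoff: 10x).
direction_check() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    lo = (a < b) ? a : b
    hi = (a < b) ? b : a
    if (hi == 0)           print "no throughput"
    else if (lo < hi / 10) print "asymmetric"
    else                   print "balanced"
  }'
}

# Usage:
#   direction_check 50 0
#   direction_check 48.5 47.9
```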

Ask the Community


If all else fails, feel free to post your symptoms and configuration for the community and/or UBNT engineers. Please be sure that you've followed relevant steps in this guide, and also be sure to include details such as whether band-steering, minRSSI, ATF, VLANs, etc. are enabled. And if your problem gets resolved, make sure you mark the topic as solved!

Related Articles


How to Establish a Connection Using SSH

UniFi - Methods for Capturing Useful Debug Information