So you have just deployed your shiny new Cisco WLCs. You have waited weeks for the cablers to install your APs as per your design, and you are sitting there all excited because you can finally enable the switch ports the APs are connected to, but… oh no, no APs are joining the WLC.
I have certainly been in that situation, and I am going to share with you my usual things to check and troubleshoot.
Firstly, we need to understand the process and priority for Cisco APs to discover and join the WLC.
AP Discovery Process:
- WLC Discovery
- DTLS/Join
- Image Download
- Configuration Check
- Registered
Now we know the process, we need to understand the different methods that APs can use to discover WLCs.
AP-to-WLC DISCOVERY Algorithm (find as many WLCs as you can):
- AP goes through the following to compile a list of WLCs
  - CAPWAP discovery broadcast on local subnet
    - AP broadcasts CAPWAP discovery message on UDP 5246
    - WLC responds back with unicast to the AP
  - Over the Air Provisioning (OTAP)
    - deprecated
  - Locally stored controller IP addr
    - remembers up to 8 previously used controllers and tries to come back to them
  - DHCP vendor specific option 43
    - IP addr should be the ‘mgmt int IP’
    - DHCP has option 43 configured, and inside it the AP looks for sub-option code ‘241’, hex f1 (like police code 10-4), which carries the WLC IP
    - Option 43 format
      - Windows Server: can be standard IP
      - IOS: hex (‘f1040a0f64fd’), i.e. type f1, length 04, then the WLC IP 10.15.100.253 in hex
  - DNS resolution of ‘CISCO-CAPWAP-CONTROLLER.localdomain’
    - should resolve to the ‘mgmt int IP’
  - Manually set:
    - via CLI:
      - capwap ap ip address 10.10.113.5 255.255.255.0 (Example IP & subnet)
      - capwap ap ip default-gateway 10.10.113.1 (Example Gateway)
      - capwap ap controller ip address 10.10.111.10 (Example WLC IP)
      - Alternatively can do the following command:
        - capwap ap primary-base “WLCname” “WLCip”
    - via GUI:
      - High Availability
- If no controller is found, start over
AP goes through ALL discovery methods to see how many WLCs it can find before moving to the join phase.
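To make the option 43 hex format a bit less mysterious, here is a minimal Python sketch that builds the string from one or more WLC management IPs. The function name `build_option43_hex` is my own invention for illustration; the layout it assumes is the one used in the example above: type `f1`, a length byte (4 bytes per controller), then each IP in hex.

```python
import ipaddress

SUBOPTION_TYPE = 0xF1  # sub-option code 241


def build_option43_hex(controller_ips):
    """Return the option 43 hex string for a list of WLC management IPs.

    Hypothetical helper: layout is type (f1) + length + IPv4 addresses.
    """
    addrs = [ipaddress.IPv4Address(ip) for ip in controller_ips]
    if not addrs:
        raise ValueError("at least one controller IP is required")
    # Each IPv4 address contributes 4 bytes to the value field.
    value = b"".join(addr.packed for addr in addrs)
    header = bytes([SUBOPTION_TYPE, len(value)])
    return (header + value).hex()


print(build_option43_hex(["10.15.100.253"]))  # f1040a0f64fd
```

With two controllers the length byte becomes `08` and the second IP is simply appended after the first.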
Now we understand the DISCOVERY phase we can move on to the JOIN phase.
- Can be hierarchical, hardcoded
  - Primary, Secondary, Tertiary
  - Tries the secondary etc. if the primary has no space or has not answered the join request
- With many controllers, one might be configured as a master controller
  - CONTROLLER > Advanced > Master Controller Mode
  - Master controller is preferred to join, if no other controllers are hardcoded
- If there are no hardcoded controllers and there is no master configured, AP will join the least loaded WLC
  - Lowest ratio (%) is preferred
  - If the load is identical, secure DTLS tunnel is preferred over the 5246 UDP port
- AP sends a join request message to every WLC, which contains
  - AP Hardware Version
  - AP Software Version
  - AP name
  - Number and type of radios
  - Certificate payload
  - Session payload; test payload
- Responding to a Controller Request
  - WLC responds with
    - Controller name
    - Controller type
    - AP capacity
    - Current AP load
    - ‘Master Controller’ status
    - AP-Manager IP address
    - Certificate payload
- AP waits for its ‘Discovery Interval’ to expire, then selects a controller and sends a CAPWAP Join Request to that controller
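The selection priority above can be sketched in a few lines of Python. This is only my reading of the order described here (hardcoded controllers first, then a master controller, then the least loaded WLC), not Cisco's actual implementation. The `DiscoveryResponse` class and `select_controller` function are hypothetical names, and skipping a full master controller is an assumption on my part.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DiscoveryResponse:
    """Subset of what a WLC reports back to the AP (hypothetical model)."""
    name: str
    ap_capacity: int
    current_ap_load: int
    is_master: bool = False

    @property
    def has_space(self) -> bool:
        return self.current_ap_load < self.ap_capacity

    @property
    def load_ratio(self) -> float:
        return self.current_ap_load / self.ap_capacity


def select_controller(
    responses: Sequence[DiscoveryResponse],
    hardcoded: Sequence[str] = (),
) -> Optional[DiscoveryResponse]:
    """Pick a WLC: primary/secondary/tertiary, then master, then least loaded."""
    # Only controllers that answered and still have room are candidates.
    candidates = {r.name: r for r in responses if r.has_space}
    # 1. Hardcoded controllers, in primary -> secondary -> tertiary order.
    for name in hardcoded:
        if name in candidates:
            return candidates[name]
    # 2. A master controller, if one is configured (assumed to need space too).
    for response in candidates.values():
        if response.is_master:
            return response
    # 3. Otherwise the WLC with the lowest load ratio.
    return min(candidates.values(), key=lambda r: r.load_ratio, default=None)
```

For example, with a full primary, `select_controller(responses, hardcoded=["wlc-a", "wlc-b"])` falls through to `wlc-b` as long as `wlc-b` responded and has space.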
Now we understand the JOIN phase we can move on to the CONFIGURATION phase.
- AP moves to an image data phase
- Controller upgrades or downgrades the AP
- Code is sent in CAPWAP messages
- Config then sent to AP
- AP applies config to RAM
- AP clears all its parameters upon joining the controller
- Controller sends everything over: SSIDs, channels, powers etc.
- Controller checks/updates the AP’s config frequently
Once AP has successfully completed the configuration phase it should join the WLC – but, what if it doesn’t? Where do I start?…
(I am going to make the assumption that you have recorded all of your AP MAC addresses correlating to a hostname in a nice Excel sheet.)
OK, let’s start with the Olivia Newton-John (Physical):
- If you are physically on site and able to check your AP – does it have any lights? If so, what are they doing? Different light statuses can be a good indicator of what is going on with the AP.
- If you are remote, can you access the switch the APs are connected to?
  - Check the following commands:
    - “show cdp neighbors” – Can you see the AP (by default its device ID typically contains the AP MAC address)? Is it connected to the port you are expecting?
    - “show power inline” – Can you see the AP drawing the correct amount of power from the switch?
OK, so we have verified that Olivia (physical – come on, keep up kids) is OK and the AP has power and is most likely flashing red, blue & green on repeat. So what’s next? There is no right or wrong order for the following steps, but it is important that we verify all of them.
If possible, console on to the AP and view the output messages – there could be some information here that makes it very obvious what the issue is. Whilst consoled on to the AP, verify that the AP has the correct FW code on it. If the AP is to be controlled by a WLC it should be in “lightweight” mode with “K9W8” in the FW version – if the AP is not to be controlled by a WLC it should be in “autonomous” mode with “K9W7” in the FW version.
Log on to the CLI of the WLC and do some debugs and look at AP join stats to see if we can get any indicators of what the issue may be.
“show ap join stats detailed [AP MAC Address]”
Now let’s enable some debugs with the following commands:
- “debug mac addr [AP MAC Address]”
- “debug capwap events enable”
- “debug capwap errors enable”
With these debugs enabled and filtered on the AP MAC address, pay attention to the output, as there may be some key indicators in there of why the AP is not joining the WLC. In one example I came across, the debugs showed a country code mismatch.
Which leads me on nicely to the things to check on the WLC that, if misconfigured, will stop APs from joining:
- Regulatory domain
  - If the WLC and AP regulatory domains do not match, the AP will not join the WLC
- Time & Date
  - If the time and date are significantly out on either the WLC or the AP, the AP will not join the WLC
- Licenses
  - If the WLC is not licensed or does not have enough licenses, the AP will not join the WLC – check that the number available covers how many APs should be on the WLC
- Add AP to security policy on WLC
  - Recently I had this issue where a 1562i would not connect to the WLC even though everything was configured the same as a 3802i connecting on a different port. I added the MAC address to the security policy on the WLC and it successfully joined!
OK, so let’s move on to DHCP.
- Check to make sure that the DHCP scope has not run out of leases.
  - I have had this issue before when upgrading code on the WLC: APs reboot or go off to join the secondary WLC whilst the primary is rebooting, and when they come back there are not enough DHCP leases left.
  - In this situation what I do now is shut down the AP VLANs on the switches to contain the APs during the upgrade.
- Is your option 43 HEX correct?
  - I would definitely recommend checking this, especially if you did not actually enter it yourself – in the past I sent a customer a correct option 43 HEX string and they somehow managed to change it, so the AP was translating it to a different IP address for the WLC!
- Slight curve ball here but still kind of relates to DHCP opt 43.
  - Recently I was upgrading a customer’s WLAN from a single WLC2504 and old 1142 APs to new WLC5520s and 2802 APs. Option 43 was configured correctly (I double checked) – turned the AP on and could not see it trying to join the new WLC. What happened was DNS was still configured to point to the old WLC2504, so the AP was joining that one instead of the new WLC5520! Something for you guys to bear in mind if you are ever in a similar scenario 🙂
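If you want to double check an option 43 string that somebody else entered, decoding it back to IP addresses is a quick sanity test. Below is a rough Python sketch; `parse_option43_hex` is a hypothetical helper name, and it assumes the same layout as earlier (type `f1`, one length byte, then 4 bytes per WLC IP). It also accepts the dotted grouping often seen in IOS configs.

```python
import ipaddress


def parse_option43_hex(hex_string):
    """Decode an option 43 hex string into a list of WLC IPs (sketch only)."""
    # Strip the separators commonly used when the string is pasted around.
    cleaned = hex_string.replace(".", "").replace(":", "").replace(" ", "")
    raw = bytes.fromhex(cleaned)  # raises ValueError on non-hex input
    if len(raw) < 2 or raw[0] != 0xF1:
        raise ValueError("expected sub-option type f1 (241)")
    length, value = raw[1], raw[2:]
    if length == 0 or length % 4 != 0 or length != len(value):
        raise ValueError("length byte does not match the IP data that follows")
    # Every 4 bytes of the value field is one IPv4 address.
    return [str(ipaddress.IPv4Address(value[i:i + 4])) for i in range(0, length, 4)]


print(parse_option43_hex("f104.0a0f.64fd"))  # ['10.15.100.253']
```

If the decoded address is not the management interface IP of the WLC you expect, you have found your problem.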
Last but not least – the firewall:
- Make sure CAPWAP ports are allowed through the firewall
  - CAPWAP ports = 5246-5247 and the protocol is UDP
So that is it – my checklist of what I do if APs are not joining the WLC – I hope this post is useful for you guys!
3 Responses
Here’s a Ninja tip I used all the time when testing problems like this:
– Ensure SSH is enabled for your APs.
– SSH into your problem AP (in between boot sequences)
– Set up your capwap debugs / packet captures etc
– From the AP CLI, use the hidden command “test capwap restart” which will restart the CAPWAP process
Saves you waiting around for AP reboots when you are troubleshooting such problems.
Enjoy!
Thanks Ash – can always rely on you for some amazing tips! 🙂
You also need to enable Dynamic AP Manager on the interface through which the APs would join.