JunOS : fpc CMLC: Going disconnected; Routing engine chassis socket closed abruptly

August 31, 2019

Getting the following error on your Juniper EX/MX/QFX virtual chassis?

fpc1 CMLC: Going disconnected; Routing engine chassis socket closed abruptly

This message is informational and does not necessarily indicate a serious issue. It is expected, for example, during a graceful Routing Engine switchover (GRES) or if you switched the routing engine (RE) on purpose. However, if this message is printed repeatedly without any virtual chassis (VC) topology change or manual intervention, it may indicate a more serious issue that is worth investigating.
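To get a feel for "repeatedly", you can count how often each member logs the message in a saved copy of your logs. Here is a minimal Python sketch, assuming the log lines contain the message text exactly as quoted above; the sample lines, timestamps and hostname are made up for illustration, and `count_disconnects` is just a hypothetical helper name:

```python
import re
from collections import Counter

# Matches the message quoted in this post and captures the member name (fpc0, fpc1, ...).
PATTERN = re.compile(
    r"\b(fpc\d+) CMLC: Going disconnected; "
    r"Routing engine chassis socket closed abruptly"
)

def count_disconnects(lines):
    """Return a Counter mapping each FPC name to how many times it logged the message."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical sample lines (timestamps and hostname are invented for illustration).
SAMPLE_LOG = [
    "Aug 31 10:00:01 switch1 fpc1 CMLC: Going disconnected; Routing engine chassis socket closed abruptly",
    "Aug 31 10:00:05 switch1 some unrelated message",
    "Aug 31 10:07:42 switch1 fpc1 CMLC: Going disconnected; Routing engine chassis socket closed abruptly",
    "Aug 31 10:09:13 switch1 fpc2 CMLC: Going disconnected; Routing engine chassis socket closed abruptly",
]

if __name__ == "__main__":
    for fpc, n in sorted(count_disconnects(SAMPLE_LOG).items()):
        print(f"{fpc}: {n}")
```

A single hit right after a planned switchover is probably nothing to worry about; the same member showing up again and again with no intervention is the pattern worth chasing.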

The most common cause is the cabling between your VC members. It is most likely to happen when the members are distant from each other and connected with SFP+/QSFP+ optics and fiber rather than short DAC cables. This does not mean that DAC cables are immune to problems, only that there are fewer elements and less distance in the chain.

Here are the common symptoms and possible causes on the vc-ports:

  • Defective SFP+/QSFP+ optic (dying optic, losing power transmission capability)
  • Fiber length at or beyond the limit of what the optic can handle (check laser transmit/receive power)
  • Damaged fiber/connector, bad fusion point, dirty connector/optic (investigate with light testers, loopback, OTDR, clean the tips…)
  • Port flapping (can be caused by all the above)
  • If you are using DAC cables and observe CRC errors, just swap the cable for another one and monitor whether the situation changes.
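When monitoring after a swap or a cleaning, what matters is whether the CRC counters keep increasing, not their absolute value, since error counters are cumulative. A minimal Python sketch of that comparison, assuming you note the counters by hand at two points in time; the port names and values below are invented for illustration, and `crc_deltas` is a hypothetical helper:

```python
def crc_deltas(before, after):
    """Return {port: increase} for ports whose CRC counter grew between two samples."""
    deltas = {}
    for port, new_value in after.items():
        old_value = before.get(port, 0)
        if new_value > old_value:
            deltas[port] = new_value - old_value
    return deltas

# Hypothetical readings, typed in by hand from two checks some minutes apart.
before = {"vcp-255/1/0": 1200, "vcp-255/1/1": 0}
after = {"vcp-255/1/0": 1875, "vcp-255/1/1": 0}

if __name__ == "__main__":
    for port, delta in crc_deltas(before, after).items():
        print(f"{port}: +{delta} CRC errors")
```

A port that no longer appears in the result after your fix has stopped accumulating errors over that interval; one that still appears needs more investigation.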

OBSERVATION: When such an issue occurs and there are a lot of flaps/errors on the VC ports between members, you may observe a higher load than usual on the CPU/RE and degraded functionality such as ICMP drops and SNMP polling issues. If your device is usually very busy and near capacity, more serious service-impacting issues may also occur on layer 3 services such as BGP and OSPF (especially on EX series devices).

In most cases, you will observe Cyclic Redundancy Check (CRC) errors on the vc-port(s). You should check for them using the following command:

Then investigate accordingly, based on the tips provided above.

Here is an example output from a VC with this behavior, showing CRC errors (yeah… I knew the fiber delivered to me by the cabling technician was bad just by looking at it, but I was told it had been tested – trust no one!):
