VMware : PCPU 3 locked up. Failed to ack TLB invalidate.

December 20, 2015

This error usually shows up as a “purple screen of death” (PSOD) with output like the following:

PCPU 0 locked up. Failed to ack TLB invalidate.
@BlueScreen: PCPU 0 locked up. Failed to ack TLB invalidate.

NMI IPI received. Was eip(base):ebp:cs [0x76a42(0x41802f800000):0x412520x4010](Src 0x1, CPU2
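If you have collected several of these screens (or their text from logs), a few lines of Python can pull out which PCPU was reported as stuck. This is only a minimal sketch: the function name `stuck_pcpus` and the regular expression are my own and are based solely on the message format shown above, so other ESXi builds may word the message differently.

```python
import re

# Pattern derived from the PSOD text quoted above (an assumption, not an
# official VMware log format specification).
PSOD_RE = re.compile(r"PCPU (\d+) locked up\. Failed to ack TLB invalidate\.")


def stuck_pcpus(text):
    """Return the sorted, de-duplicated PCPU numbers reported as locked up."""
    return sorted({int(m.group(1)) for m in PSOD_RE.finditer(text)})


sample = (
    "PCPU 0 locked up. Failed to ack TLB invalidate.\n"
    "@BlueScreen: PCPU 0 locked up. Failed to ack TLB invalidate.\n"
)
print(stuck_pcpus(sample))
```

Run against output from many hosts, this makes it easier to see whether the same PCPU keeps showing up or whether the lockups are spread around.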

The cause is that when an interrupt occurs, a context switch must be performed, and the CPU has to flush its TLB (translation lookaside buffer). If a physical CPU (PCPU) does not perform the flush, or takes too long to acknowledge it, this purple screen occurs.
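The "request a flush, wait for an acknowledgement, give up after a timeout" idea can be illustrated with a toy model. This is not how ESXi is implemented; threads stand in for PCPUs, and the names `unacked_pcpus`, `stuck` and `timeout_s` are hypothetical, chosen only for this sketch.

```python
import threading


def unacked_pcpus(num_pcpus, stuck, timeout_s=0.2):
    """Toy model: return the PCPUs that did not acknowledge in time."""
    acks = [threading.Event() for _ in range(num_pcpus)]

    def handle_invalidate(cpu):
        if cpu in stuck:
            return  # simulates a PCPU that never services the request
        acks[cpu].set()  # simulates "TLB flushed, acknowledge"

    workers = [
        threading.Thread(target=handle_invalidate, args=(cpu,), daemon=True)
        for cpu in range(num_pcpus)
    ]
    for worker in workers:
        worker.start()

    # Event.wait() returns False when the timeout expires without an ack.
    return [cpu for cpu in range(num_pcpus) if not acks[cpu].wait(timeout_s)]


print(unacked_pcpus(4, stuck={3}))
```

In this model the simulated PCPU 3 never acknowledges, so it is the one reported once the timeout expires, which is the situation the PSOD message describes.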

The root cause can be either hardware or software.

The known hardware-related issues involve the Broadcom (bnx) driver back in ESX 3.5 and the HP G7/G8 hpsa/scsi driver on ESXi 5.x.

The known software issue affects ESXi 5.5, for which the esx-base patch ESXi550-201410401-BG was released.

NOTE : Across a very large number of ESXi hosts running the same hardware, software version and patch level, I couldn’t find a clear pattern, and the error did not occur often. In my case the incident was isolated to very few hypervisors.