Problem Definition
In certain situations, customers may find that LUNs are "lost" and cannot be recovered unless the ESXi host is reset. Customers have reported losing iSER or Fibre Channel connectivity to their VPSA from VMware ESXi hosts (note: neither iSER nor FC was a contributing factor).
The VPSA logs show the LUN disconnections. The network switches show no issues.
The common pattern on the ESXi hosts is a log reference to "Admission failure in path:", e.g.
2019-11-16T08:53:18.454Z cpu09:1061573)MemSchedAdmit: 471: Admission failure in path: sioc/storageRM.9061573/uw.9061573
2019-11-16T08:53:18.454Z cpu09:1061573)MemSchedAdmit: 471: Admission failure in path: sioc/storageRM.9061573/uw.9061573
2019-11-16T08:53:18.454Z cpu09:1061573)MemSchedAdmit: 471: Admission failure in path: sioc/storageRM.9061573/uw.9061573
2019-11-16T08:53:18.454Z cpu09:1061573)MemSchedAdmit: 471: Admission failure in path: sioc/storageRM.9061573/uw.9061573
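To check whether a host is showing this pattern, the log can be scanned for the message quoted above. The following is a minimal Python sketch, not a VMware or VPSA tool: the function names are hypothetical, and the regular expression is an assumption derived only from the sample lines shown here, so it may need adjusting for other ESXi builds.

```python
import re
from collections import Counter

# Assumed line layout, taken from the sample log lines above:
# <timestamp> cpuNN:<id>)MemSchedAdmit: <n>: Admission failure in path: <path>
ADMISSION_RE = re.compile(
    r"^(?P<ts>\S+)\s+cpu\d+:\d+\)MemSchedAdmit:\s*\d+:\s*"
    r"Admission failure in path:\s*(?P<path>\S+)"
)


def count_admission_failures(lines):
    """Count admission-failure messages per resource path."""
    counts = Counter()
    for line in lines:
        match = ADMISSION_RE.match(line.strip())
        if match:
            counts[match.group("path")] += 1
    return counts


def sioc_failures(counts):
    """Keep only the paths under sioc/, the component named in this issue."""
    return {path: n for path, n in counts.items() if path.startswith("sioc/")}
```

A typical use would be to open a copy of the host's vmkernel log (assumed here to be the file containing these messages), pass the file object to count_admission_failures, and then inspect the result of sioc_failures. A non-empty result suggests that the host is hitting the admission failure described in this article.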
VMware has a Knowledge Base article for the issue, "VMware Admission Path Failure". An unpatched ESXi host can experience this SIOC-related issue when it tries to restore LUN access. Instead of being restored, LUN access remains lost; the ESXi hosts then become unresponsive and may report "All Paths Down", and many customers have to forcibly reset the ESXi host.
Triggers
In VPSA terms, the trigger is typically a standard VPSA failover event, a VPSA upgrade or a VPSA IO engine change: any occasion where the LUNs are temporarily "stunned" during a controller failover period. Alternatively, the issue may be triggered by an unexpected hardware failure that impacts the VPSA and, by association, the ESXi hosts connected to that VPSA. (The VPSA then recovers as designed, whilst the ESXi hosts require a forced reset.)
The VMware Knowledge Base article above provides both a workaround (restarting SIOC on the ESXi host) and a remedy, in this case an upgrade to ESXi 6.7 Update 3 with the latest patches.
We recommend that all ESXi hosts run a currently supported release and are patched to the latest update, especially when VMware terms the patches 'Critical'.