Disclaimer: All the postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.
In the 2013 releases of AIX Version 6.1 TL 9 and AIX Version 7.1 TL 3, IBM announced support for a new “virtual SCSI read/write command timeout” attribute for the virtual SCSI (vSCSI) client adapter. This is in addition to the existing “virtual SCSI path timeout” attribute.
These are the attributes of a vSCSI client adapter as seen on an AIX client:
# lsattr -El vscsi0
rw_timeout 0 Virtual SCSI Read/Write Command Timeout True
vscsi_err_recov delayed_fail N/A True
vscsi_path_to 0 Virtual SCSI Path Timeout True
The IBM documentation at http://pic.dhe.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=%2Fp7hb1%2Fiphb1_vios_disks.htm describes these attributes in detail. With the increased use of Shared Storage Pools (which use AIX vSCSI), understanding and setting these attributes correctly becomes important, depending on customer requirements. In this article I’ll try to demystify the two timeout attributes of the vSCSI client adapter, rw_timeout and vscsi_path_to.
The figure below depicts a typical redundant configuration of an AIX client served by two VIOSes. The AIX client has access to the same vSCSI storage disks via two different VIOSes; these could be the same disks assigned as classical vSCSI or as Shared Storage Pool logical units. Here I show the classical vSCSI case, where the same disks are mapped to the AIX client via two different VIOSes.
VIOS A is marked in green.
VIOS B is marked in blue.
FC_A and FC_B are the Fibre Channel adapters owned by VIOS A and VIOS B respectively. The VIOSes are connected to the Storage Area Network (SAN) via these FC adapters.
VSA_A and VSA_B are the vSCSI server adapters on the VIOSes. hdisk_C is mapped to the AIX client using these adapters.
The AIX client has a pair of vSCSI client adapters, VCA_A and VCA_B, which connect to the VSA on VIOS A and VIOS B respectively.
hdisk_A and hdisk_B are used by the VIOSes themselves.
hdisk_C is mapped to the client from both VIOSes.
That completes the picture of a typical redundant vSCSI setup.
vscsi_path_to and rw_timeout are attributes of the VCA on AIX clients. By default, both attributes are disabled, with the timeout value set to zero.
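As a quick check of the defaults just described, here is a minimal sketch that reads lsattr -El output on standard input and reports whether each timeout is enabled. The function name vscsi_timeout_status is my own and is not part of AIX; the sample input is the lsattr listing shown earlier in this article.

```shell
# Report whether each vSCSI timeout attribute is enabled, given the
# output of "lsattr -El <adapter>" on standard input.
# Hypothetical helper: the name is illustrative, not an AIX command.
vscsi_timeout_status() {
    awk '$1 == "vscsi_path_to" || $1 == "rw_timeout" {
        # Column 2 is the current value; 0 means the timeout is disabled.
        if ($2 == 0) printf "%s disabled\n", $1
        else         printf "%s enabled (%s seconds)\n", $1, $2
    }'
}

# Sample input copied from the lsattr listing in this article.
vscsi_timeout_status <<'EOF'
rw_timeout 0 Virtual SCSI Read/Write Command Timeout True
vscsi_err_recov delayed_fail N/A True
vscsi_path_to 0 Virtual SCSI Path Timeout True
EOF
```

On an AIX client the same function could be fed live data, for example: lsattr -El vscsi0 | vscsi_timeout_status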
-> The vscsi_path_to ODM attribute of the vSCSI client adapter sets the client-to-VIOS path timeout value.
It helps to detect scenarios where the VIOS stops responding to client I/O requests. This timeout should be enabled ONLY in configurations where the same devices are available to the AIX client from multiple VIOSes.
The minimum value required to enable the path timeout is 30 seconds. It addresses the case where the vSCSI server adapter (VSA) on the VIOS becomes unresponsive or fails, possibly because of a crashed or hung VIOS. If the VSA has not responded to any I/O request for the time set in vscsi_path_to, the VCA on the client waits another 60 seconds for the VSA to respond to its ping message. If the VCA still receives no response, it fails all outstanding I/O commands and closes the connection with the VSA before attempting a new connection. The MPIO Path Control Module on the client then retries the failed commands via the other path (VIOS).
Referring to the figure above, vscsi_path_to covers any connectivity issue between VCA_A and VSA_A (shown as a solid green line) and between VCA_B and VSA_B (shown as a solid blue line). It helps to detect a hung or crashed VIOS and to switch to the alternative path.
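To make the timing concrete, here is a rough sketch of how long outstanding I/O can wait before the VCA fails it: the vscsi_path_to value plus the 60-second ping wait described above. The names PING_WAIT and path_fail_delay are my own, and the result is only an approximation; it ignores the time spent reconnecting and the time MPIO needs to retry on the other path.

```shell
# Ping wait (seconds) the VCA allows after vscsi_path_to expires,
# taken from the description in this article.
PING_WAIT=60

# Print the approximate number of seconds before the VCA fails
# outstanding I/O for a given vscsi_path_to value.
# Hypothetical helper for illustration only.
path_fail_delay() {
    echo $(( $1 + PING_WAIT ))
}

for t in 30 60 120; do
    echo "vscsi_path_to=$t: I/O failed after about $(path_fail_delay "$t") seconds"
done
```

With the minimum setting of 30 seconds, this gives roughly 90 seconds before the failed commands are handed back for retry.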
-> The rw_timeout ODM attribute (newly introduced in the 2013 release of AIX) of the vSCSI client adapter sets the read/write command timeout value.
This attribute aims to improve the resiliency of the vSCSI client adapter by enabling detection of, and recovery from, hung I/O commands. This timeout can also be enabled in a single-VIOS configuration, but a dual-VIOS configuration is always recommended.
The minimum value required to enable the read/write command timeout is 120 seconds. An I/O command is considered hung if it has received no response for longer than the time (in seconds) set in the “rw_timeout” attribute of the vSCSI client adapter. On detecting hung read/write commands, the adapter fails the pending/active commands, closes its connection with the VIOS and tries to establish a new connection. This lets the AIX client recover from scenarios where a particular I/O command is hung in some layer of the VIOS.
Referring to the figure above, rw_timeout handles recovery when a particular read/write command is hung in some layer of the VIOS between VSA_A and FC_A (shown as a dashed green line) or between VSA_B and FC_B (shown as a dashed blue line).
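Putting the two attributes together, the sketch below builds a chdev command for either timeout after checking the minimum values given in this article (30 seconds for vscsi_path_to, 120 seconds for rw_timeout, 0 to disable). The function name timeout_cmd is my own. It only prints the command and does not run it, upper limits are not checked, and the -P flag (record the change in the ODM so it applies when the adapter is next configured) is my assumption about how you would want to apply it on a busy adapter.

```shell
# Print (do not run) a chdev command that sets a vSCSI client adapter
# timeout. Usage: timeout_cmd <adapter> <attribute> <seconds>
# Hypothetical helper; minimums are the ones stated in this article.
timeout_cmd() {
    adapter=$1 attr=$2 value=$3
    case $attr in
        vscsi_path_to) min=30 ;;
        rw_timeout)    min=120 ;;
        *) echo "unknown attribute: $attr" >&2; return 1 ;;
    esac
    case $value in
        ''|*[!0-9]*)
            echo "value must be a whole number of seconds" >&2
            return 1 ;;
    esac
    # 0 disables the timeout; anything else must reach the minimum.
    if [ "$value" -ne 0 ] && [ "$value" -lt "$min" ]; then
        echo "$attr must be 0 (disabled) or at least $min seconds" >&2
        return 1
    fi
    echo "chdev -l $adapter -a $attr=$value -P"
}

timeout_cmd vscsi0 vscsi_path_to 30
timeout_cmd vscsi0 rw_timeout 120
```

In a dual-VIOS setup like the one in the figure, you would repeat this for each client adapter (for example vscsi0 and vscsi1), review the printed commands, and then run them as root.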