Disclaimer : All the postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.

 

Virtual Fibre Channel (vFC) is PowerVM’s flagship storage virtualization solution, based on the N_Port ID Virtualization (NPIV) standard for Fibre Channel networks.
If you are new to this storage virtualization technology, the IBM Knowledge Center article on Virtual FC is located here: https://www.ibm.com/support/knowledgecenter/8247-21L/p8hat/p8hat_vfc.htm

This article provides more details about the enhanced num_cmd_elems attribute for virtual FC (NPIV) AIX partitions.
The “num_cmd_elems” attribute of a virtual FC (vFC) adapter on an AIX partition limits the maximum number of command elements that the adapter can actively serve.
As part of the AIX releases last year (December 2015), the vFC adapter has been enhanced to support a maximum of 2048 active command elements. [ Sidenote: the vFC adapter did support 2048 num_cmd_elems earlier, but the limit was restricted some time back because of certain issues reported by customers. This latest change ensures that those past issues will not re-occur with the higher num_cmd_elems support. ]

You can see this change to “num_cmd_elems” on a vFC adapter with this command:
# lsattr -Rl fcs0 -a num_cmd_elems
20...2048 (+1)
As before, 200 remains the default value for the num_cmd_elems attribute of a vFC adapter in AIX.
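
If you decide to raise the value, the sketch below shows one way to do it on the client partition; the adapter name fcs0 and the target value 2048 are assumptions. Because a vFC adapter that is in use typically cannot be changed on the fly, chdev -P stages the change in the ODM so that it takes effect when the adapter is reconfigured or the partition is rebooted.

# Show the current value on the client partition (adapter name fcs0 assumed)
lsattr -El fcs0 -a num_cmd_elems

# Stage the new value in the ODM; -P defers the change until the adapter
# is reconfigured or the partition is rebooted
chdev -l fcs0 -a num_cmd_elems=2048 -P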

 

Should you care about this change for increased “num_cmd_elems” support in the vFC adapter?

The answer “depends” on the requirements of your application/workload!
All applications have some I/O requirement. Certain applications perform a small number of large-sized I/O operations, others drive a large number of small-sized I/Os (e.g. Online Transaction Processing (OLTP) workloads), and some have a mix of both.
OLTP-style workloads push the disk driver on the operating system to generate a large number of I/O commands, and the performance of such workloads largely depends upon the number of Input/Output Operations Per Second (IOPS) the system can drive.
I’ll use the figure below to explain this further:

[ Figure: Impact of num_cmd_elems on an AIX virtual FC partition ]

For the sample OLTP application in the above illustration, performance is largely determined by the system’s ability to drive more throughput (IOPS). Even if the disks have been tuned with a higher “queue_depth” attribute, throughput still depends on whether the “num_cmd_elems” setting on the vFC adapter allows all of those requests to be sent across. Note that in a virtualized environment (such as this one with vFC), the throughput and other I/O characteristics of a partition (LPAR) are also significantly affected by the load on the Virtual I/O Server (and on the physical FC adapter mapped to the client vFC adapter).
If you are interested in a more in-depth treatment of AIX and VIOS disk and Fibre Channel adapter queue tuning, I suggest this document from the IBM Advanced Technical Sales team: https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD105745
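
As a quick starting point, the sketch below lists the two queue tunables discussed here; the device names hdisk0 and fcs0 are assumptions, so substitute your own disks and vFC adapters.

# Per-disk queue depth: how many commands each hdisk may have in flight
lsattr -El hdisk0 -a queue_depth

# Per-adapter command elements: the ceiling shared by all disks behind fcs0
lsattr -El fcs0 -a num_cmd_elems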

 

How do you know if the “num_cmd_elems” value set on the vFC adapter of the partition is restricting the throughput of your application?

Assuming that your disks’ queue_depth attribute is not the bottleneck (which you can verify using the iostat command), you can find out whether the vFC adapter is the bottleneck by using the fcstat command.
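
That iostat check is sketched below, assuming the disk of interest is hdisk0; in the queue section of the extended (-D) output, a "sqfull" value that keeps growing suggests the disk’s queue_depth is the limiting factor, while a flat value points you further along to the adapter.

# Extended disk statistics for hdisk0: 5-second interval, 3 samples;
# watch the sqfull counter in the queue section of the report
iostat -D hdisk0 5 3
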
Below is a sample output from the “fcstat -D fcsX” command (with num_cmd_elems of the vFC adapter set to the default value, i.e. 200, and a sample workload driving high IOPS):

# fcstat -D fcs0

FIBRE CHANNEL STATISTICS REPORT: fcs0

Device Type: Virtual Fibre Channel Client Adapter (adapter/vdevice/IBM,vfc-client)
Serial Number: UNKNOWN
Option ROM Version: UNKNOWN
ZA: UNKNOWN
World Wide Node Name: 0xC05076069D61000A
World Wide Port Name: 0xC05076069D61000A

< .. .. output truncated .. .. >

Driver Statistics
  Number of interrupts:   0               
  Number of spurious interrupts:   0               
  Long term DMA pool size:   0
  I/O DMA pool size:  0

  FC SCSI Adapter Driver Queue Statistics
    Number of active commands:   0
    High water mark  of active commands:   180
    Number of pending commands:   0
    High water mark of pending commands:   1
    Number of commands in the Adapter Driver Held off queue:  0
    High water mark of number of commands in the Adapter Driver Held off queue:  0

  FC SCSI Protocol Driver Queue Statistics
    Number of active commands:   0
    High water mark  of active commands:   180
    Number of pending commands:   0
    High water mark of pending commands:   2022

FC SCSI Adapter Driver Information
  No DMA Resource Count: 0               
  No Adapter Elements Count: 0               
  No Command Resource Count: 4419300         

FC SCSI Traffic Statistics
  Input Requests:   5662994         
  Output Requests:  120070          
  Control Requests: 141655          
  Input Bytes:  23424304056     
  Output Bytes: 603911680 

Information derived from the above fcstat output that is relevant to our “num_cmd_elems” discussion:

  •  A high value in the “No Command Resource Count” field (as above), particularly one that increases steadily, indicates that the disk driver (on behalf of the application) is sending in more commands than the vFC adapter can handle concurrently (see the monitoring sketch after this list).
  •  The “High water mark of active commands” field indicates the maximum number of active commands the virtual FC stack has handled at any point in time on the running system. This will never exceed the num_cmd_elems value set on the vFC adapter.
  •  The “High water mark of pending commands” field indicates the maximum number of commands that were pending in the virtual FC stack at any point in time on the running system. If this is high while num_cmd_elems on the vFC adapter is set to a low value, it is a clear indication that you should increase it. [ NOTE: as mentioned earlier, consider the impact, if any, on the other stakeholders in your virtualized environment before making the change. ]
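
A minimal monitoring sketch along those lines is shown below; the client adapter name fcs0 is an assumption. It simply samples the starvation counter once a minute while the workload runs, so a value that keeps climbing between samples points at num_cmd_elems.

# Print a timestamped sample of the command-resource starvation counter
# every 60 seconds; stop with Ctrl-C
while true
do
    date
    fcstat -D fcs0 | grep "No Command Resource Count"
    sleep 60
done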

 

How do you know if the increased num_cmd_elems on the vFC adapter has improved the throughput of your application?

The fcstat command run on the system with the increased num_cmd_elems attribute on the vFC adapter should show the difference.
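
Since the fcstat counters are cumulative from the time the adapter was configured, one simple approach, sketched below with assumed file names and fcs0 as the adapter, is to snapshot the statistics before and after a representative workload run and compare the two.

# Snapshot the counters around a representative workload run
fcstat -D fcs0 > /tmp/fcstat.before
# ... run the workload ...
fcstat -D fcs0 > /tmp/fcstat.after

# Show how the counters of interest moved between the two snapshots
diff /tmp/fcstat.before /tmp/fcstat.after | grep -E "No Command Resource|High water mark"
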
Below is a sample output from the fcstat command (with num_cmd_elems of the vFC adapter set to 2048 and the same workload run again):

# fcstat -D fcs0

FIBRE CHANNEL STATISTICS REPORT: fcs0

Device Type: Virtual Fibre Channel Client Adapter (adapter/vdevice/IBM,vfc-client)
Serial Number: UNKNOWN
Option ROM Version: UNKNOWN
ZA: UNKNOWN
World Wide Node Name: 0xC05076069D61000A
World Wide Port Name: 0xC05076069D61000A

< .. .. output truncated .. .. >

Driver Statistics
  Number of interrupts:   0               
  Number of spurious interrupts:   0               
  Long term DMA pool size:   0
  I/O DMA pool size:  0

  FC SCSI Adapter Driver Queue Statistics
    Number of active commands:   0
    High water mark  of active commands:   1894
    Number of pending commands:   0
    High water mark of pending commands:   1
    Number of commands in the Adapter Driver Held off queue:  0
    High water mark of number of commands in the Adapter Driver Held off queue:  0

  FC SCSI Protocol Driver Queue Statistics
    Number of active commands:   0
    High water mark  of active commands:   1895
    Number of pending commands:   0
    High water mark of pending commands:   1

FC SCSI Adapter Driver Information
  No DMA Resource Count: 0               
  No Adapter Elements Count: 0               
  No Command Resource Count: 0               

FC SCSI Traffic Statistics
  Input Requests:   4945548         
  Output Requests:  3464            
  Control Requests: 311             
  Input Bytes:  20391759980     
  Output Bytes: 41137280        

Adapter Effective max transfer value:   0x100000

 

The “No Command Resource Count” field of the above fcstat output is 0, indicating that the vFC adapter is no longer holding incoming I/O commands pending for lack of free command elements. Also, the “High water mark of active commands” has increased significantly, indicating a positive impact on the overall throughput of the system. Similarly, the low value of “High water mark of pending commands” is a good sign.
Changing “num_cmd_elems” to a higher value did wonders for the throughput achieved by my workload!

 

Which AIX releases is this change available in?

6100 TL 09 SP 06 ( APAR IV76258 )
7100 TL 04 SP 00 ( APAR IV76270 )
7200 TL 00 SP 00
7100 TL 03 SP 07 ( APAR IV76968 )
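
To confirm whether a given partition already carries the change, a small sketch is shown below; it uses IV76258 (the AIX 6.1 APAR from the list above) as an example, so substitute the APAR that matches your release.

# Report the exact technology level and service pack of this partition
oslevel -s

# Ask the installp database whether the APAR is present
# (IV76258 is the 6100 TL09 APAR; use the one for your release)
instfix -ik IV76258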

Tuning the complete I/O stack optimally in a virtualized environment gives you the best of both worlds: virtualization and performance.
What do you think?

 

P.S. Keep up to date and get help with IBM PowerVM by joining the IBM PowerVM LinkedIn Group. There is also an IBM AIX Technology Forum on LinkedIn where queries and enhancements in AIX are shared and discussed.

 

3 Responses to Enhanced num_cmd_elems attribute for virtual FC ( NPIV ) AIX

  1. What are the negative impacts of increasing the num_cmd_elems number?

    • sangeek says:

      As I tried to highlight in the blog, it would be unwise to blindly increase num_cmd_elems without considering the requirements of the workload. It is also important to consider the impact on the other LPARs of the machine that share the same physical FC adapter on the VIOS. Complicating things further, there could be an overall impact on SAN performance (affecting other workloads/servers connected to the shared SAN).
      As an example of a negative impact of increasing num_cmd_elems, the latency of individual I/Os could increase significantly, thereby offsetting the increase in throughput (IOPS).
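
      One practical check before raising the client value is sketched below; it assumes the backing physical port on the VIOS is fcs0. The usual guidance is to keep the client vFC adapter's num_cmd_elems at or below the num_cmd_elems of the physical FC adapter it is mapped to on the VIOS.

      # On the VIOS (padmin shell): see which physical FC port backs the
      # client's vFC adapter, then check that port's num_cmd_elems
      lsmap -all -npiv
      lsdev -dev fcs0 -attr num_cmd_elems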

  2. Laurent Oliva says:

    hello,

    in the following tech doc, it is written:

    https://www-03.ibm.com/support/techdocs/atsmastr.nsf/5cb5ed706d254a8186256c71006d2e0a/d1f54f4cd1431d5a8625785000529663/$FILE/AIX-VIOS_DiskAndAdapterQueueTuningV1.2.pdf

    For the AIX PCM:

    new num_cmd_elems = high water mark of active commands + high water mark of pending commands

    Another doc gives the same advice:

    http://www.circle4.com/forsythe/aixperf-ionetwork.pdf

    Per Dan Braden:

    Set num_cmd_elems to at least

    high active + high pending or 512+104=626

    If I apply this analysis method to a workload I am studying (which is not behind a VIOS but uses a physical HBA):

    High water mark of active commands: 614
    High water mark of pending commands: 993

    ==> num_cmd_elems is configured at 1024.

    But “No Adapter Elements Count: 4128362”

    So the minimum value for num_cmd_elems I should set is: 614 + 993 = 1607
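
    ( For reference, a quick way to pull these counters and do the same arithmetic, assuming the adapter is fcs0 and the field names exactly as fcstat -D prints them : )

    # Add the largest "active" high-water mark to the largest "pending" one
    fcstat -D fcs0 | awk -F: '
      /High water mark .*active/   { if ($2+0 > a) a = $2+0 }
      /High water mark of pending/ { if ($2+0 > p) p = $2+0 }
      END { print "suggested minimum num_cmd_elems:", a + p }'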

    If I applied the method you describe in your article, my num_cmd_elems would appear to be correct, but that is not the case.

    Another thing that can be confusing in your article: your VIO client is surely not the only client of the VIOS.
    So how are you able to reproduce the same workload I/O context in that client with such fluctuating activity from the other VIO clients?

    Who is right?

    Thanks for sharing
    Keep sharing

