“call graph” generation using Doxygen and Graphviz

I always wished for an easy way to generate a “call graph” ( a graph depicting the calling relationships between the subroutines of a source code base; in my case, code in the C language ). Now that I finally found a method ( and, more importantly, the time to experiment with it ), I am documenting the steps in this blog. I used Doxygen coupled with Graphviz to get this done.

Firstly, a big thanks to the contributors of the Wikipedia page ( https://en.wikipedia.org/wiki/Call_graph ), where I got to see the multiple options for getting this done. And of course I went through the tips from this Stack Overflow post : https://stackoverflow.com/questions/517589/tools-to-get-a-pictorial-function-call-graph-of-code.

Of the various options, I decided to go the “Doxygen” (http://www.doxygen.org) way because I had good vibes about this tool! Also, a couple of projects I worked with earlier had used Doxygen, so I had some understanding of how it worked. And most importantly, your code need “NOT” have Doxygen-style comments for the call graph feature to work. I am sure many of you will heave a sigh of relief; as did I ;-).

Now the natural companion for Doxygen to get this work done is Graphviz ( Graph Visualization Software : http://www.graphviz.org/ ).

The reason these two tools are related is that Graphviz provides the “dot” tool, which Doxygen uses to generate the call graphs.

Depending upon your requirement you could generate :

  • class diagrams
  • class inheritance graphs
  • direct and indirect include dependencies of files
  • caller and call graph for subroutines

Now with this background, let me jump into how to set this up and get it running :
Please note :

  1. This is what I did on macOS 10.12.5. Apart from the installation steps, I think the other steps would work the same way on Linux.
  2. I rely heavily on brew for getting developer packages onto the Mac, and I see it does a great job. If you have not used it yet, you can get more details here : https://brew.sh/

1. Install Graphviz

==> brew install graphviz

2. Install Doxygen

==> brew install doxygen
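
To confirm that both tools are installed and on your PATH before moving on, each of these should print a version string :

==> doxygen --version
==> dot -V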

3. Go to the parent directory of your code-base and generate a Doxygen configuration file. By default the configuration file gets the name Doxyfile.

==> doxygen -g

Configuration file `Doxyfile’ created.
Now edit the configuration file and enter
  doxygen Doxyfile
to generate the documentation for your project

4. Now customize the configuration file for graph generation and other personalized needs :

The configuration file is a plain text file with “TAG”s to tailor the settings of Doxygen for a specific project. All the TAGs are pre-filled with default values; you just have to edit the value of any TAG whose behavior you want to change from the default.

I made changes to these TAGs, for the reasons mentioned in the comments beginning with hashes (#) :
( Listed below in what I thought was their order of importance )

# Only if this option is set, Doxygen uses the “dot” tool
# of Graphviz to generate graphs
HAVE_DOT               = YES

# To make Doxygen generate a call dependency graph
# for every global function or class method
CALL_GRAPH             = YES

# To make Doxygen generate a caller dependency graph
# for every global function or class method
CALLER_GRAPH           = YES

# Since you’ll most likely have a newer version of dot ( >1.8.10 ) ( in my case it was 2.38.0 ),
# enabling this option will make dot run faster
DOT_MULTI_TARGETS      = YES

# Since all of my code was in C, I enabled this option,
# and I noticed that the Doxygen run completed sooner in my case after it was enabled.
OPTIMIZE_OUTPUT_FOR_C  = YES

# Since my code did not have any Doxygen-style comments,
# I enabled this tag
EXTRACT_ALL            = YES

# If your code is spread across sub-directories,
# enabling this option will make Doxygen walk through the code in all of them
RECURSIVE              = YES

# If you don’t need the LaTeX output,
# better to disable this option, as it saves a lot of time during graph generation
GENERATE_LATEX         = NO

# Modified the graph image file format to be svg ( default is png )
DOT_IMAGE_FORMAT       = svg

# These other tags I enabled; they don’t matter much for C code, I guess
EXTRACT_PACKAGE        = YES
EXTRACT_STATIC         = YES
EXTRACT_LOCAL_CLASSES  = YES
STRICT_PROTO_MATCHING  = YES
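
A quick sanity check ( plain grep, nothing Doxygen-specific ) to confirm the edited values before kicking off the run; it should echo back the TAGs with the values set above :

==> grep -E '^(HAVE_DOT|CALL_GRAPH|CALLER_GRAPH|DOT_IMAGE_FORMAT)' Doxyfile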

5. Now that the configuration options have been set up as per your needs, let me show you a sample execution and output

==> doxygen

Searching for include files…
Searching for example files…
Searching for images…

.. <output truncated> ..

Patching output file 12/13
Patching output file 13/13
lookup cache used 21/65536 hits=107 misses=21
finished…

It will generate an “html” directory in the same path where it was executed.
Open the “index.html” file located inside the “html” sub-folder in a web browser of your choice.
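
On macOS you can open it straight from the terminal ( on Linux, xdg-open would do the same job ) :

==> open html/index.html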

6. You’ll find the file dependency graph and the call graph ( and caller graph too ) by clicking on a filename under the Files tab of the output.

Sample screenshots from my example run :

A. File dependency view of file “tree_b.c” under the Files tab.

[ Screenshot : file dependency view under the “Files” tab ]

B. Call graph of the “main” routine of this example program.

[ Screenshot : call graph for the main routine ]

C. Call graph and caller graph of the “insert” sub-routine of this example program.

[ Screenshot : call and caller graphs for the insert sub-routine ]

In the end, I’ll highlight the fact that these call graphs are static and are derived from the source code.
This is suitable for most cases where you want to understand the overall code flow and layout of some new project you have been thrown in to work on :-).
However, if you are debugging some complex issue and want to generate a runtime call graph based on the actual execution path of a program, you can take the help of runtime profilers with call graph functionality.
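
For instance, here is a minimal sketch of the classic gprof flow ( gprof is just one such profiler; the compiler and file names are illustrative and assume a gprof-capable toolchain like gcc ) :

==> gcc -pg -o tree_b tree_b.c           # build with profiling instrumentation
==> ./tree_b                             # run the program; this writes gmon.out
==> gprof tree_b gmon.out > report.txt   # the report includes a runtime call graph section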

CMVC user guide for transitioning to Git

This blog is written with the intention of acting as a transition guide for existing users of CMVC ( or similar SCM tools ) who want to start using Git.
For folks who are new to either CMVC or Git : both are Source Code Management (SCM) tools, like CVS, SVN (Subversion) and ClearCase. If you have been into software programming, there is a very good chance you have used one of these ( or other similar SCM tools ).
There are very good reasons why you should consider using Git as your SCM, and there are multiple articles out there on the “Internet (www.google.com 😃)” which already give out the details. In short, unlike the earlier generation of client-server model SCMs, Git was designed to be distributed in nature.
This blog is not intended to teach Git or to help you get started with it. It is intended for an audience that has already grasped the basics of Git but is unsure how to relate their existing activities on other SCMs ( like CMVC ) to the Git philosophy.
If you have not yet done a “Git 101” kind of course, I suggest you do that first.
You could use one of these links, or search in Google for the learning resource which suits your taste :
  1. http://assets.en.oreilly.com/1/event/45/Git%20101%20Tutorial%20Presentation.pdf
  2. https://git-scm.com/book/en/v2
  3. https://www.atlassian.com/git/tutorials

Is there a centralized server acting as the source code repository in Git ?

Git provides all of the well-renowned logic to ease source code version controlling, but you still need a place ( read : storage ) where you can keep all of your code “safely”. Although you could use your local computer as the repository, that is obviously the least safe place! Apart from that, you need a place where other members/collaborators can “share/contribute” to the same project.
For that you need to use a source code hosting facility provided by the likes of GitHub, GitLab, Bitbucket, Launchpad etc.
This Wikipedia page has a very comprehensive list of such hosting facilities : https://en.wikipedia.org/wiki/Comparison_of_source_code_hosting_facilities.
The most popular of the lot ( at the time of writing this blog ) is GitHub, and I’ll refer to some of the terms used in GitHub wherever required for the rest of this post.
To get familiar with GitHub you could get started here : https://guides.github.com/activities/hello-world/

Can an entire project life-cycle be managed with GitHub ?

With GitHub alone, it might not be easy to manage a project, but there is a useful add-on called ZenHub.
ZenHub provides useful extensions on top of GitHub for better management of projects, and is one of the popular choices for “Agile project management with GitHub” ( as ZenHub claims )!

How do I relate my work in CMVC ( or other similar SCMs ) to this new way of working with Git ?

Let’s start with a one-to-one mapping between the commands/philosophy of CMVC and “Git / GitHub / ZenHub” put together.
Repository
A repository contains all of the code ( and documentation etc. ) required for a project. Just as you would have used a centralized server for CMVC, you can consider GitHub as the alternative here. Though Git has the ability to make a directory on your local computer a self-sufficient, full-fledged source code repository, relying on that alone would be a risk for bigger projects. Also, you would not get the other project management facilities that come with “GitHub + ZenHub”.
Defect
A “Defect” of CMVC is known as an “Issue” in GitHub parlance.
Once you are in the “repository” view of GitHub you’ll see multiple tabs, the 2nd one being “Issues”.
Just like a CMVC defect_number, GitHub assigns a number to each issue. You can assign and work on an issue just as you would a defect.
Use #<issueID> in your commit messages ( e.g. “Fix #10” ) and the commits automatically get associated with that particular issue ( e.g. #10 ).
Feature
With the ZenHub extension to GitHub you can create an “Epic”, which is much like a “Feature” in CMVC.
In Agile terminology, an Epic is a large piece of work, which could be broken down into user stories.
And here each user story can be tracked as an “Issue”.
Working on a Defect ( aka Issue )
Let’s say that you are going to start work on a project for which a repository has already been created in GitHub ( i.e. with all the existing source code, “or” maybe a new empty repository ).
  • The first thing you need to do is “clone” ( git clone ) the repository locally; if you don’t specify a branch, the “master” branch gets cloned.
  • In most cases this single local copy ( aka clone ) will be sufficient for most of your work. ( This is similar to a “File -extract” of the entire code-base in CMVC. )
  • Then you “create a branch” ( git branch, aka context ) with an appropriate name, maybe including the issue # you are working on, e.g. issue_123. You could, though, give the branch any name.
  • Then you “checkout” ( git checkout ) that branch ( i.e. you move into the context of that branch ).
You might find it amusing initially that there are 2 steps ( clone + branch ) to start working on a branch ( defect/issue ), but you’ll realize the need when you find out that Git gives you the flexibility to work on multiple branches ( read : issues ) from a single code copy/location.
You could do some partial work on one branch and move to a new branch for some other work ( e.g. a separate defect ) just by changing the context ( aka checking out another branch ), and come back to the earlier defect by switching back to the earlier context ( branch ).
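
For instance, a rough sketch of such context switching ( issue_456 is a made-up second branch; git stash is optional, for shelving uncommitted work ) :

$ git checkout issue_123        # context : defect/issue 123
$ git stash                     # shelve partial, uncommitted work ( optional )
$ git checkout -b issue_456     # switch context to a different issue
$ git checkout issue_123        # come back to the earlier defect
$ git stash pop                 # resume the shelved partial work
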
File operations
  • Always check the context ( branch ) before starting to make changes to a file. If you are not on the right branch, checkout the required branch first.
  • Once on the appropriate branch, modify the files as required by editing them directly. ( Nothing like the “File -checkout” of CMVC is required. )
  • When you are done with all the changes, commit ( i.e. git add and then git commit ) the changes to the branch. This is similar to “File -checkin” in CMVC. Remember, all of this is happening in your local computer’s repository. When you are done with all of your changes, you need to push ( git push ) the locally created/committed branch to the remote server ( aka origin in GitHub terms ). ( A command sketch of this whole flow follows the notes below. )
  • Once your branch is pushed to the remote GitHub repository ( aka origin ), you can raise a review request by creating a “pull request” in GitHub.
  • At this stage, the marked reviewers will be able to review your changes.
  • You can address the review comments by making the changes on the same branch and then redoing : git add, git commit, git push.
  • Once the reviewers are satisfied with your changes, you ( or the repository owner ) will “merge” your changes into the higher-level branch or master ( read more here for one preferred method : https://guides.github.com/introduction/flow/ ).
Note : 
  1. In Git there is no explicit “File -checkout” required before editing a file; each contributor could be editing the same file at the same time. There is no explicit “write lock” placed on the file, as is the case with CMVC. I see this as complementing the distributed, flexible way of working that Git enables.
  2. Unlike CMVC and other traditional version control systems, which are centralized, Git is a distributed revision control system. What this means is that you can work from your local copy of the repository even when you are not connected to the network.
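
Putting the file-operation steps above together, a typical pass over one issue might look roughly like this ( the repository URL, branch name, file names and commit message are placeholders ) :

$ git clone https://github.com/<your-org>/<your-repo>.git    # local copy ( like a CMVC “File -extract” )
$ cd <your-repo>
$ git branch issue_123                              # create a branch for the issue
$ git checkout issue_123                            # move into that branch's context
  ... edit the files directly ...
$ git add <changed-files>                           # stage the changes
$ git commit -m "Fix #123: <short description>"     # local commit ( like “File -checkin” )
$ git push origin issue_123                         # publish the branch to GitHub ( aka origin )
  ... then open a pull request in GitHub for review and merge ...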

Wow, I am overwhelmed, what next ?

Start using Git; then go back and revisit some of the training articles you referred to earlier when getting started with Git ( maybe go to the advanced sections this time ). I am recommending a revision because there are many other “useful” features in Git + GitHub + ZenHub which you will only appreciate after you are comfortable with the basics.
If you have any questions specific to this article, I’ll be happy to help in the comments section.

pretty print symbols in AIX with kdb, KDBSYM, pr

Disclaimer : All the postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.

 

Some time back, I had written the blog AIX ‘bosdebug’ to debug Kernel extensions on the usefulness of pretty printing a data structure in the AIX Kernel Debugger “KDB” by providing it the symbol information using the ‘bosdebug’ command. That pretty much serves the purpose on a live system during kernel extension / device driver development. It is, however, not very useful while working on system dumps, where you need to use “command kdb”. Also, for several reasons, even during development it is often preferable to observe data structures from “command kdb” rather than by halting a system and using KDB ( the system kernel debugger ).

This article covers the details of enabling pretty printing of symbols in ‘command kdb’. For that, you need to use the “KDBSYM environment variable”; I stumbled across this information while reading through the help of the ‘pr’ command in ‘command kdb’.
The help message of the ‘pr’ command pretty much covers it all :

(0)> pr -?
print <type> <address>
    Formatted dump of memory at <address> as if it were of type <type>.
    <type> must be a type recognized by the debug object file kdb
    draws its symbols from.  This file can either be generated
    automatically when crash is run via -i flags, or by setting
    the KDBSYM environment variable to be the name of a file
    containing debug symbols with the structure types you want to
    print. “address” can be an address or a kernel global variable.

    For example, to print the struct vnode at 12345,
    kdb -i /usr/include/sys/vnode.h
    (0)> print vnode 012345

    To create a symbols file ahead of time for faster invocation
    $ echo '#include <sys/vnode.h>' > symbols.c
    $ echo 'main() { ; }' >> symbols.c
    $ cc -g -o symbols symbols.c -qdbxextra /* for 32 bit kernel */
    $ cc -g -q64 -o symbols symbols.c -qdbxextra /* for 64 bit kernel */
    $ KDBSYM=`/bin/pwd`/symbols ; export KDBSYM
    $ kdb dump unix
    (0)> print vnode 012345

    Kernel global variable can be used instead of absolute address.
    For example,
    (0)> print Simple_lock suspending_q_lock

 

Using a system dump of a sample kernel extension as an example, I’ll demonstrate this functionality.
Below is the ‘stat’ command output from this sample simulated system crash :

[ Screenshot : kdb ‘stat’ output showing the crash stack ]

The stack above is due to an “Illegal Trap Instruction Interrupt in Kernel” caused by a failed assert() in the function read_contents(), seen at the top of the stack.

From the dummy kernel extension code of read_contents(), I know that the input to this function was a pointer of type ‘struct info’ and that the crash was caused by a failed assert ( as seen in the code below ) :

struct prim
{
    unsigned short type;
    unsigned int   id;
    unsigned long  priority;
};

struct info
{
    struct prim primary;
    unsigned int flags;
    unsigned int state;
    unsigned long size;
    int error;
    char buffer[1024];
};

int read_contents(struct info *infop)
{
    assert(infop->primary.type == 0x99);
    .. ..

To investigate further, I need to look at the values of the members of ‘struct info’ and deduce the possible code flow.
The stock method is to use the “display double word data” command dd, dump the contents of the structure in the terminal, and mark out the structure member values based on the individual data-type sizes while accounting for structure padding. Phew!

[ Screenshot : ‘dd’ output showing the raw memory contents of the structure ]

This method of inferring structure member values, though sufficient in some cases, is very tedious and often error-prone for complex structure types.
This is where the wonderful kdb command called “pr” comes in handy 🙂

pr       print             print a formatted structure at an address

As mentioned in the ‘pr’ command usage message, it requires the KDBSYM environment variable to point to the name of the object file which contains the debug symbols for the structure types you want to print in a formatted way.

In my case, I set KDBSYM to point to the debug symbol object file I had generated :
export KDBSYM=/home/sangeek/sKE/symbols
and initiated command kdb on the system dump file.
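
Putting the pieces from the ‘pr’ help text together for my ‘struct info’ case, the full sequence looks roughly like this ( the header name ke_info.h, the dump/unix file names and the structure address are hypothetical placeholders ) :

$ echo '#include "ke_info.h"' > symbols.c        /* header declaring struct info; name is hypothetical */
$ echo 'main() { ; }' >> symbols.c
$ cc -g -q64 -o symbols symbols.c -qdbxextra     /* 64-bit kernel */
$ export KDBSYM=/home/sangeek/sKE/symbols
$ kdb <dump file> <unix file>                    /* start command kdb on the system dump */
(0)> print info <address of the struct info>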

And this is what I see using the ‘pr’ command in kdb :
[ Screenshot : formatted ‘pr’ output of ‘struct info’ ]
The above formatted structure output makes things pretty easy to view and understand.
I now save a lot of time debugging kernel extension issues and can really spend my time where it should be spent 😉

A very useful piece of functionality that lacks adoption; I attribute some of the blame to the lack of sufficient documentation.

In short, command kdb, pr command and KDBSYM rocks \o/

Configure SAS controller/disk for use in AIX/VIOS partition

Disclaimer : All the postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.

 

This article is a step-by-step guide to configuring a SAS controller/disk for use in an AIX or VIOS logical partition.
I had to configure SAS disks for my work on an AIX partition and found it difficult to locate any “how-to” article on the Internet.
I did figure out that there was a command, “sissasraidmgr”, to get this done.
I was sort of convinced that I would have to use this complicated command ( yuck ) to get the job done and decided to go through the man page for the command-line options. To my surprise ( voila ), this is what I found in the Description section of the command’s man page ( important sections highlighted for your convenience ) :

       The sissasraidmgr command is used to create, delete, and maintain RAID
       arrays on a Peripheral Component Interconnect-X (PCI-X) or PCI Express
       (PCIe) SAS RAID controller. Attention: See the Power Systems SAS RAID       <<<<<<
       Controllers for AIX reference guide and become familiar with the
       storage management concepts before you run the sissasraidmgr command.
       Attention: The System Management Interface Tool (SMIT) smit sasdam fast     <<<<<<
       path is the preferred method to manage a SAS RAID controller.
       Attention: Service tasks require special training and must not be
       performed by nonservice personnel.

This man page description solved half of my problems :
 A  : It indicated that there is a “Power Systems SAS RAID Controllers” reference guide for AIX. After a quick search in Google I got it from this link : PDF ( SAS RAID Controllers for AIX )
 B  : There is an easier and more intuitive SMIT interface to manage the SAS devices 🙂

I went through the required information in the “reference guide” and configured the SAS disks for my use. I highly recommend reading through the guide for all your work with SAS controllers and SAS disks. I later figured out that there is also an IBM Knowledge Center article on it ( though not as comprehensive as the reference guide ); here’s the link : Preparing disks for use in SAS disk arrays

 

The rest of this article is my version of a step-by-step ( get-it-done ) guide to configure your SAS controller and get your SAS disks ready for use.
[ NOTE : I assume that your logical partition is installed with AIX or VIOS and is not already using these SAS disks ]

 

 1 . To start, get the list of adapters available on the partition ( and grep for the SAS adapter ). Here I see that my partition has a single SAS adapter.

# lsdev -Cc adapter | grep sas
sissas0 Available 01-00 PCIe3 12GB Cache RAID SAS Adapter Quad-port 6Gb x8   <<<   Here I have a PCI Express 3.0 SAS RAID card with a write cache size of up to 12GB

You can find more details about the SAS adapter assigned to your partition by referring to the “SAS RAID Controllers” reference guide and searching for the “Customer Card ID Number” of your adapter, found using the “lscfg -vpl sissas0” command.

 

 2 . Next, see the list of devices which are children of the SAS adapter ( and, further down, of the SAS Protocol device ) :

# lsdev -p sissas0
sas0  Available 01-00-00 Controller SAS Protocol
sata0 Available 01-00-00 Controller SATA Protocol

# lsdev -p sas0
pdisk1   Available 01-00-00    Physical SAS Solid State Drive
pdisk2   Available 01-00-00    Physical SAS Solid State Drive
ses0     Available 01-00-00    SAS Enclosure Services Device
ses1     Available 01-00-00    SAS Enclosure Services Device

From the above output you can see that I have two Physical SAS Solid State Drives ( SSDs ).
pdisks represent physical disks formatted to 528 bytes per sector. To be available for use by the AIX operating system or any application, these pdisks need to be made part of a disk array, which is then presented to AIX as an hdisk with 512-byte sectors.
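
If you want a closer look at an individual pdisk before formatting it, the same lscfg command mentioned earlier for the adapter works for disks too ( pdisk1 is from my setup ) :

# lscfg -vpl pdisk1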

 

 3 . To configure the SAS disks we’ll use SMIT ( easier than the raw command, huh! ). The “ smit sasdam ” fast path takes you directly to the main menu of the “IBM SAS Disk Array Manager”.
To make a pdisk available for use in a SAS disk array, select “Create an Array Candidate pdisk and Format to RAID block size”. Next, select the SAS controller “sissas0” and press Enter.

Create_an_Array_Candidate_pdisk_and_Format_to_RAID_block_size

 4 . In the next screen, select the disks from the list of pdisks assigned to the SAS adapter. Here I have selected both the available pdisks, pdisk1 and pdisk2.

Create an Array Candidate pdisk and Format to RAID block size -- Select pdisks


 5 . On submitting the previous operation, the selected pdisks are formatted and the completion message is displayed once the operation is complete.

Create an Array Candidate pdisk and Format to RAID block size -- Format complete


 6 . Now the pdisks are formatted and ready to be used in a disk array. The next step is to create a disk array using these pdisks.
To get that done, we return to the main menu of the “IBM SAS Disk Array Manager” in SMIT and select “Create a SAS Disk Array”.
Here again, we have to select the same SAS controller.

Create a SAS Disk Array


 7 . The next step requires the desired “RAID Level” for the disk array to be selected.
You should select the appropriate RAID level based on your requirements. If you are not already familiar with the use of RAID in data storage, you can read more about it at this Wikipedia link : https://en.wikipedia.org/wiki/RAID. Here I have selected RAID 0 ( to get a single hdisk striped across the two physical disks, i.e. pdisks ).
This means that the size of the created hdisk will be equal to the sum of both the pdisks. Also, striping helps improve the throughput achieved by the hdisk, by distributing the read and write operations across both the pdisks in units of the stripe size.

Create a SAS Disk Array- Select a RAID Level


 8 . The next screen asks for the “Stripe Size” to be used for striping.
Here I have selected the recommended value of 256KB.

Create a SAS Disk Array-Select a Stripe Size


 9 . The pdisks to be used in the disk array need to be selected in the next screen. Here I have selected both the pdisks.
After the completion of this step, a new hdisk is created in my LPAR.

Create a SAS Disk Array - Select Disks to Use in the Array


 10 . You can use this command to view the new SAS hdisk in your partition. In my case it was hdisk5 :

# lsdev -Cc disk | grep SAS
hdisk5 Available 01-00-00 SAS RAID 0 SSD Array


Voila, you have your SAS hdisk for use.
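
As a quick, hypothetical next step ( the volume group name, filesystem size and mount point below are just examples ), you could put the new hdisk to work right away :

# mkvg -y datavg hdisk5                               <<< create a volume group on the new hdisk
# crfs -v jfs2 -g datavg -a size=10G -m /data -A yes  <<< create a JFS2 filesystem in it
# mount /data                                         <<< mount it and start using it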

The disks that I had here were Solid State Drives ( SSDs ); if you have SSDs in your machines, you could try out the dazzling and charismatic Flash caching feature in AIX, which requires SSD disks.
You could not only increase the storage I/O throughput of your applications multifold, you would also reduce a lot of I/O congestion in your Storage Area Network (SAN).

A detailed article on the Flash caching feature of AIX is available in this IBM developerWorks article : Integrated Server Based Caching of SAN Based Data.

Hope you found this article useful, and please feel free to leave any comments/suggestions you might have !

 

Enhanced num_cmd_elems attribute for virtual FC ( NPIV ) AIX

Disclaimer : All the postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.

 

Virtual Fibre Channel (VFC) is PowerVM’s flagship storage virtualization solution, based on the N_Port ID Virtualization (NPIV) standard technology for Fibre Channel networks.
If you are new to this storage virtualization technology, the IBM Knowledge Center article on Virtual FC is located here : https://www.ibm.com/support/knowledgecenter/8247-21L/p8hat/p8hat_vfc.htm

This article provides more details about the enhanced num_cmd_elems attribute for virtual FC ( NPIV ) AIX partitions.
The “num_cmd_elems” attribute of a virtual FC (vFC) adapter on an AIX partition imposes a limit on the maximum number of command elements that can be actively served by the adapter.
As part of the AIX releases of December 2015, the vFC adapter was enhanced to support a maximum of 2048 active command elements. [ Side note : the vFC adapter did support 2048 num_cmd_elems earlier, but this was restricted some time back because of certain issues reported by customers. The latest change ensures that those past issues in the vFC adapter will not re-occur with the higher num_cmd_elems support. ]

You can see this change to “num_cmd_elems” on a vFC adapter with this command :
# lsattr -Rl fcs0 -a num_cmd_elems
20...2048 (+1)
As before, 200 is still the default value for the num_cmd_elems attribute of a vFC adapter in AIX.
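
If you decide to raise it, here is a minimal sketch of how you might do that ( fcs0 and the value 2048 are examples; with -P the change is only staged in the ODM and takes effect after the next reboot, which avoids having to unconfigure a busy adapter ) :

# lsattr -El fcs0 -a num_cmd_elems         <<< check the current value
# chdev -l fcs0 -a num_cmd_elems=2048 -P   <<< stage the new value; effective after reboot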

 

Should you care about this change for increased “num_cmd_elems” support in vFC adapter ?

The answer “depends” upon the requirements of your application / workload !
All applications have some I/O requirement. Certain applications perform a small number of large-sized I/O operations, some drive a large number of small-sized I/Os ( e.g. Online Transaction Processing (OLTP) workloads ), and there are others with a mix of both.
OLTP-kind workloads propel the disk driver in the operating system to generate a large number of I/O commands. The performance of these OLTP workloads largely depends upon the number of Input/Output Operations Per Second (IOPS) the system can drive.
I’ll use the figure below to explain more about this :

[ Figure : Impact of num_cmd_elems on an AIX virtual FC partition ]

For the sample OLTP application in the above illustration, performance is largely driven by the ability of the system to drive more throughput (IOPS). Even if the disks have been tuned with a higher “queue_depth” attribute, throughput still depends on whether the “num_cmd_elems” setting on the vFC adapter allows all of those requests to be sent across. Also, in a virtualized environment ( like this one with vFC ), the throughput and other I/O characteristics of a partition (LPAR) are significantly impacted by the load on the Virtual I/O Server ( and on the physical FC adapter mapped to the client vFC adapter ).
If you want to go deeper into “AIX and VIOS Disk And Fibre Channel Adapter Queue Tuning”, I suggest referring to this document from the IBM Advanced Technical Sales team : https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD105745

 

How do you know if the “num_cmd_elems” set on the vFC adapter of the partition is restricting the throughput of your application ?

Assuming that your disks’ queue_depth attribute is not serving as a bottleneck ( which you can verify using the iostat command ), you can find out whether the vFC adapter is the bottleneck by using the fcstat command.
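
( A quick aside on that iostat check : the extended drive report shows per-disk queue statistics; the disk name and interval below are just examples. A steadily increasing “sqfull” value in the queue section suggests the disk’s own queue_depth is filling up. )

# iostat -D hdisk4 5 3
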
Below is a sample output from the “fcstat -D fcsX” command ( with num_cmd_elems of the vFC adapter set to the default value, i.e. 200, and a sample workload driving high IOPS ) :

# fcstat -D fcs0

FIBRE CHANNEL STATISTICS REPORT: fcs0

Device Type: Virtual Fibre Channel Client Adapter (adapter/vdevice/IBM,vfc-client)
Serial Number: UNKNOWN
Option ROM Version: UNKNOWN
ZA: UNKNOWN
World Wide Node Name: 0xC05076069D61000A
World Wide Port Name: 0xC05076069D61000A

< .. .. output truncated .. .. >

Driver Statistics
  Number of interrupts:   0               
  Number of spurious interrupts:   0               
  Long term DMA pool size:   0
  I/O DMA pool size:  0

  FC SCSI Adapter Driver Queue Statistics
    Number of active commands:   0
    High water mark  of active commands:   180
    Number of pending commands:   0
    High water mark of pending commands:   1
    Number of commands in the Adapter Driver Held off queue:  0
    High water mark of number of commands in the Adapter Driver Held off queue:  0

  FC SCSI Protocol Driver Queue Statistics
    Number of active commands:   0
    High water mark  of active commands:   180
    Number of pending commands:   0
    High water mark of pending commands:   2022

FC SCSI Adapter Driver Information
  No DMA Resource Count: 0               
  No Adapter Elements Count: 0               
  No Command Resource Count: 4419300         

FC SCSI Traffic Statistics
  Input Requests:   5662994         
  Output Requests:  120070          
  Control Requests: 141655          
  Input Bytes:  23424304056     
  Output Bytes: 603911680 

Information derived from the above fcstat output that is relevant to our “num_cmd_elems” discussion :

  •  The “No Command Resource Count” field being a high value ( as above ), and increasing steadily, indicates that the disk driver ( on behalf of the application ) is sending in more commands than the vFC adapter can handle concurrently.
  •  The “High water mark of active commands” field indicates the maximum number of active commands the virtual FC stack has handled at any point in time on the running system. This will obviously be lower than the num_cmd_elems value set on the vFC adapter.
  •  The “High water mark of pending commands” field of the fcstat output indicates the maximum number of commands that were pending in the virtual FC stack at any point in time on the running system. If num_cmd_elems on the vFC adapter is set to a lower value, this is a clear indication that you should increase it. [ NOTE : As mentioned earlier, it is better to consider the impact, if any, on the other stakeholders in your virtualized environment before making the change. ]

 

How do you know if the increased num_cmd_elems on vFC has improved the throughput of your application ?

The fcstat command run on the system with the increased num_cmd_elems attribute on the vFC adapter should show the difference.
Below is a sample output from the fcstat command ( with num_cmd_elems of the vFC adapter set to 2048 and the same workload run again ) :

# fcstat -D fcs0

FIBRE CHANNEL STATISTICS REPORT: fcs0

Device Type: Virtual Fibre Channel Client Adapter (adapter/vdevice/IBM,vfc-client)
Serial Number: UNKNOWN
Option ROM Version: UNKNOWN
ZA: UNKNOWN
World Wide Node Name: 0xC05076069D61000A
World Wide Port Name: 0xC05076069D61000A

< .. .. output truncated .. .. >

Driver Statistics
  Number of interrupts:   0               
  Number of spurious interrupts:   0               
  Long term DMA pool size:   0
  I/O DMA pool size:  0

  FC SCSI Adapter Driver Queue Statistics
    Number of active commands:   0
    High water mark  of active commands:   1894
    Number of pending commands:   0
    High water mark of pending commands:   1
    Number of commands in the Adapter Driver Held off queue:  0
    High water mark of number of commands in the Adapter Driver Held off queue:  0

  FC SCSI Protocol Driver Queue Statistics
    Number of active commands:   0
    High water mark  of active commands:   1895
    Number of pending commands:   0
    High water mark of pending commands:   1

FC SCSI Adapter Driver Information
  No DMA Resource Count: 0               
  No Adapter Elements Count: 0               
  No Command Resource Count: 0               

FC SCSI Traffic Statistics
  Input Requests:   4945548         
  Output Requests:  3464            
  Control Requests: 311             
  Input Bytes:  20391759980     
  Output Bytes: 41137280        

Adapter Effective max transfer value:   0x100000

 

The “No Command Resource Count” field in the above fcstat output is 0, indicating that the vFC adapter is no longer keeping any incoming I/O commands pending for lack of free command elements. Also, “High water mark of active commands” has increased significantly, indicating a positive impact on the overall throughput of the system. Similarly, a low value of “High water mark of pending commands” is also a good sign.
Changing “num_cmd_elems” to a higher value did wonders for the throughput achieved by my workload !

 

Which AIX releases is this change available in ?

6100 TL 09 SP 06 ( APAR IV76258 )
7100 TL 04 SP 00 ( APAR IV76270 )
7200 TL 00 SP 00
7100 TL 03 SP 07 ( APAR IV76968 )

Tuning the complete I/O stack optimally in a virtualized environment gives you the best benefits of both worlds : Virtualization and Performance.
What do you think ?

 

P.S. Keep up to date and get help with IBM PowerVM by joining the IBM PowerVM LinkedIn Group. Also, there is an IBM AIX Technology Forum on LinkedIn where queries and enhancements in AIX are shared and discussed.