Detail Data

CPU Data, collectl -sC

# SINGLE CPU STATISTICS
#   CPU  USER NICE  SYS WAIT IRQ  SOFT STEAL IDLE INTRPT
These are the same fields as reported for verbose CPU Summary, each preceded by the CPU number. If running collectl V2.5.0 or greater AND you request interrupt summary data, the INTRPT field will also be included.

CPU The CPU number which the stats are associated with
User Time spent in User mode, not including time spent in "nice" mode.
Nice Time spent in Nice mode, that is, running processes whose priority has been lowered with the nice command and which have the "N" status flag set when examined with "ps".
Sys This is time spent in "pure" system time.
Wait Also known as "iowait", this is the time the CPU was idle during an outstanding disk I/O request. This is not considered to be part of the total or system times reported in brief mode.
Irq Time spent processing interrupts and also considered to be part of the summary system time reported in "brief" mode.
Soft Time spent processing soft interrupts and also considered to be part of the summary system time reported in "brief" mode.
Steal Time spent in an involuntary wait state while the hypervisor was servicing another virtual processor.
Intrpt The aggregate number of interrupts for each CPU. This is only included if the interrupt summary stats were requested at the same time.
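As a rough illustration of how the detail fields relate to the brief-mode system time described above, the following Python sketch adds Irq and Soft to the "pure" Sys value and leaves Wait out. The function name and its arguments are hypothetical and only restate the field descriptions; this is not collectl's own code.

```python
def brief_system_time(sys, irq, soft, wait=0):
    """Return a brief-style system time from detail CPU fields.

    Per the field descriptions, Irq and Soft are counted as part of the
    summary system time, while Wait is not. The wait argument is accepted
    only to make that exclusion explicit; it is deliberately ignored.
    """
    return sys + irq + soft


# Hypothetical values for one CPU: Sys=5, Irq=1, Soft=2, Wait=40
print(brief_system_time(sys=5, irq=1, soft=2, wait=40))
```

With these sample numbers the large Wait value has no effect on the result.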

Disk Data, collectl -sD

# DISK STATISTICS (/sec)
#          <---------reads---------><---------writes---------><--------averages--------> Pct
#Name       KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
Name Name of the disk the statistics are being reported for.
KBytes KB read/sec
Merged Read requests merged per second when being dequeued. These statistics are not available in older kernels which only record disk statistics in /proc/stat.
IOs Number of reads/sec
Size Average read I/O size in KBytes
KBytes KB written/sec
Merged Write requests merged per second when being dequeued.
IOs Number of writes/sec
Size Average write I/O size in KBytes
RWSize Average combined read and write I/O size in KBytes. This is not the average of the read and write sizes but rather the total KBytes read and written divided by the total number of I/Os.
QLen Average number of requests queued
Wait Average time in msec that a request has been waiting in the queue
SvcTim Average time in msec for a request to be serviced by the device
Util Percentage of CPU time during which I/O requests were issued
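The distinction made for RWSize can be seen with a small Python sketch. The function name and the sample numbers are made up for illustration; the only thing taken from the description above is that the combined size is total KBytes divided by total I/Os.

```python
def rw_size(read_kb, read_ios, write_kb, write_ios):
    """Combined average I/O size in KB: total KB moved / total I/O count."""
    total_ios = read_ios + write_ios
    if total_ios == 0:
        return 0.0  # no I/O in this interval, so there is no meaningful size
    return (read_kb + write_kb) / total_ios


# Hypothetical interval: 10 large reads and 90 small writes
read_kb, read_ios = 1000, 10     # average read size is 100 KB
write_kb, write_ios = 100, 90    # average write size is roughly 1.1 KB

combined = rw_size(read_kb, read_ios, write_kb, write_ios)
naive = (read_kb / read_ios + write_kb / write_ios) / 2

print(combined)  # weighted by the number of I/Os
print(naive)     # plain average of the two Size columns, much larger here
```

Because the writes far outnumber the reads, the combined size stays close to the write size, whereas averaging the two Size columns would suggest a much larger typical request.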

Infiniband, collectl -sX

# INFINIBAND STATISTICS (/sec)
#HCA    KBIn   PktIn  SizeIn   KBOut  PktOut SizeOut  Errors
HCA HCA instance name
KBIn KB received/sec.
PktIn Received packets/sec.
SizeIn Average incoming packet size in KB
KBOut KB transmitted/sec.
PktOut Transmitted packets/sec.
SizeOut Average outgoing packet size in KB
Errors Count of current errors. Since these are typically infrequent, reporting them as a rate would result in either not seeing them OR round-off hiding their values, so they are reported as a count instead.

Interrupts, collectl -sJ

# INTERRUPT DETAILS
# Int    Cpu0   [Cpu...]   Type            Device(s)

Int Interrupt number within the range 0-255. Note that only those interrupts that have had any activity since the last monitoring interval will be reported
Cpun... The CPU for which the interrupt count is being reported. There will be one column per CPU
Type Interrupt type
Device(s) The names of the devices which are generating this interrupt

Lustre Data, collectl -sL

There are several formats the Lustre detail data can take, based on whether you're looking at a client or an OSS (there is no MDS-specific detail data, though it does share the same disk-level buffer size data as the OSS). Furthermore, if one specifies the -sLL form of the detail switch, OST-level details will be reported where appropriate.

Lustre Client, collectl -sL

# LUSTRE CLIENT DETAIL (/sec)
#Fils  KBRead  Reads SizeKB KBWrite Writes SizeKB
Filsys Name of the filesystem these stats apply to
KBRead KBs read/sec
Reads Reads/sec
SizeKB Average read size
KBWrite KBs written/sec
Writes Writes/sec
SizeKB Average write size

Lustre Client, collectl --lustopts O

# LUSTRE CLIENT DETAIL (/sec)
#Fils  Ost     KBRead  Reads SizeKB KBWrite Writes SizeKB
The data here is the same as that reported for the standard client side lustre data except now it is broken down by OST within the file system.

Filsys Name of the filesystem these stats apply to
Ost OST name within the filesystem
KBRead KBs read/sec
Reads Reads/sec
SizeKB Average read size
KBWrite KBs written/sec
Writes Writes/sec
SizeKB Average write size

Lustre Client RPC-Buffer Stats, collectl --lustopts B

# LUSTRE CLIENT DETAIL: RPC-BUFFERS (pages)
#Filsys  Ost   RdK  Rds   1K   2K   ...   WrtK Wrts   1K   2K   ...
This form also includes the reads/writes within the filesystem, but adds the sizes of the RPC buffers. Since these numbers always apply to OSTs, you need to use the -sLL form of the subsystem switch.

Filsys Name of the filesystem these stats apply to
Ost OST name within the filesystem
RdK KBs read/sec
Rds Reads/sec
nK Number of pages of this size read
WrtK KBs written/sec
Wrts Writes/sec
nK Number of pages of this size written

Lustre Client Metadata, collectl -sL --lustopts M

# LUSTRE CLIENT DETAIL: METADATA
#Filsys   KBRead  Reads KBWrite  Writes  Open Close GAttr SAttr  Seek Fsync DrtHit DrtMis

Filsys Name of the filesystem these stats apply to
KBRead KBs read/sec
Reads Reads/sec
KBWrite KBs written/sec
Writes Writes/sec
Open Opens/sec
Close Closes/sec
GAttr Get Attributes/sec
SAttr Set Attributes/sec
Seek Seeks/sec
Fsync FSyncs/sec
DrtHit Dirty Hits/sec
DrtMis Dirty Misses/sec

Lustre Client Readahead, collectl -sL --lustopts R

# LUSTRE CLIENT DETAIL: READAHEAD
#Filsys   KBRead Reads  KBWrite Writes  Pend  Hits Misses NotCon MisWin LckFal  Discrd ZFile ZerWin RA2Eof HitMax

Filsys Name of the filesystem these stats apply to
KBRead KBs read/sec
Reads Reads/sec
KBWrite KBs written/sec
Writes Writes/sec
Pend Pending issued pages
Hits Hits
Misses Misses
NotCon Readpage not consecutive
MisWin Miss inside window
LckFal Failed lock match
Discrd Read but discarded
ZFile Zero length file
ZerWin Zero size window
RA2Eof Read-ahead to EOF
HitMax Hit max r-a issue

Lustre OSS, collectl -sL

# LUSTRE FILESYSTEM SINGLE OST STATISTICS (/sec)
#Ost            KBRead   Reads  SizeKB    KBWrite  Writes  SizeKB
Ost OST name
KBRead KBs read/sec
Reads Reads/sec
SizeKB Average read size
KBWrite KBs written/sec
Writes Writes/sec
SizeKB Average write size

Lustre OSS RPC Buffers, collectl -sL --lustopts B

# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost            RdK  Rds   1P   2P  ...  WrtK Wrts   1P   2P  ...
Filsys Name of the filesystem these stats apply to
Ost OST name within the filesystem
RdK KBs read/sec
Rds Reads/sec
nP Number of pages of this size read
WrtK KBs written/sec
Wrts Writes/sec
nP Number of pages of this size written

Lustre OSS and MDS Disk Buffers, collectl -sL --lustopts D

This display is very similar to the RPC buffers in that the counts of I/O requests of different sizes are reported. In this case these are requests sent to the disk driver. Note that this report is only available for HP's SFS.

# LUSTRE DISK BLOCK LEVEL DETAIL (units are 512 bytes)
#DISK RdK  Rds 0.5K   1K   2K   ...   WrtK Wrts 0.5K   1K   2K   ...
Disk Name of the disk these stats apply to
RdK KBs read/sec
Rds Reads/sec
nK Number of blocks of this size read
WrtK KBs written/sec
Wrts Writes/sec
nK Number of blocks of this size written

Network Data, collectl -sN

# NETWORK STATISTICS (/sec)
#Num    Name  KBIn  PktIn SizeIn  MultI   CmpI  ErrIn  KBOut PktOut  SizeO   CmpO ErrOut

Num Each network interface is numbered, starting with 0
Name Name of the interface
KBIn Incoming KB/sec
PktIn Incoming packets/sec
SizeIn Average incoming packet size in bytes
MultI Incoming multicast packets/sec
CmpI Incoming compressed packets/sec
ErrIn Incoming errors/sec
KBOut Outgoing KB/sec
PktOut Outgoing packets/sec
SizeO Average outgoing packet size in bytes
CmpO Outgoing compressed packets/sec
ErrOut Outgoing errors/sec

NFS Data, collectl -sF

The reporting of NFS data is a little different in that there is a verbose mode to the detail data - I hope to straighten this out in a future release. In any event, the following is the normal form.

#<----NFS MetaOps---->
#  meta commit retran
meta The number of metadata operations/second
commit The number of commit operations/second
retran The number of retransmit operations/second

These next two forms require --verbose and apply to NFS versions 2 and 3 respectively. You should consult the NFS documentation for detailed definitions of these fields.

# NFS V2 SERVER (/sec)
#NULL GETA SETA ROOT LOOK REDL READ WCAC WRIT CRE8 RMOV RENM LINK SYML MKDR RMDR RDIR FSST
# NFS V3 SERVER (/sec)
#NULL GETA SETA LOOK ACCS RLNK READ WRIT CRE8 MKDR SYML MKND RMOV RMDR RENM LINK RDIR RDR+ FSTA FINF PATH COMM

Process Data, collectl -sZ

# PROCESS SUMMARY (faults are /sec)
# PID  User     PR  PPID S   VSZ   RSS  CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
PID Pid of the process
User Name of the user this process is running under. In playback mode on a different machine, use -oP to direct collectl to use the password file named in collectl.conf (default is /etc/passwd) to look up the corresponding username. Otherwise the UID will be reported instead.
PR Process priority
PPID PID of this process's parent
S Process State: S - Sleeping, D - Uninterruptible Sleep, R - Running, Z - Zombie or T - Stopped/Traced
VSZ This is the amount of virtual memory used by this process
RSS This is the amount of resident memory used by this process
CP CPU number this process is currently running on
SysT The amount of System Time this process used during this interval
UsrT The amount of User Time this process used during this interval
Pct Percentage of the current interval taken up by this task (the User and System time are used for this calculation)
AccuTime Total accumulated System and User time since the process began execution
RKB This is the number of kilobytes of data read by each process. Both this and the WKB field are only present if the kernel has process I/O monitoring enabled, which is not the default as of 2.6.23.
WKB This is the number of kilobytes of data written by each process
MajF Major Page Faults per second
MinF Minor Page Faults per second
Command Command that is running. Path and command line options are NOT included unless --procopts w is specified
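The Pct calculation described above can be sketched in Python as follows. This is only an illustration of the stated definition (User plus System time as a share of the interval); the function name is hypothetical, and it assumes that SysT, UsrT and the interval are all expressed in seconds.

```python
def interval_pct(sys_time, usr_time, interval):
    """Percentage of the interval taken up by a task.

    Assumes all three arguments are in seconds (an assumption made for
    this sketch). A multi-threaded process running on several CPUs can
    accumulate more CPU time than the interval length, so values above
    100 are possible.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    return 100.0 * (sys_time + usr_time) / interval


# Hypothetical process: 0.5s system + 1.5s user time in a 10 second interval
print(interval_pct(0.5, 1.5, 10))
```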

Process I/O Data, collectl --procopts i

# PID  User    PPID S  SysT  UsrT   RKB   WKB  RKBC  WKBC  RSys  WSys  Cncl  Command
PID Pid of the process
User Name of the user this process is running under. In playback mode on a different machine, use -oP to direct collectl to use the password file named in collectl.conf (default is /etc/passwd) to look up the corresponding username. Otherwise the UID will be reported instead.
PPID PID of this process's parent
S Process State: S - Sleeping, D - Uninterruptible Sleep, R - Running, Z - Zombie or T - Stopped/Traced
SysT The amount of System Time this process used during this interval
UsrT The amount of User Time this process used during this interval
RKB Attempt to count the number of bytes which this process really did cause to be fetched from the storage layer by doing calls to read_bytes. This is done at the submit_bio() level, so it is accurate for block-backed filesystems.
WKB Attempt to count the number of bytes which this process caused to be sent to the storage layer by doing calls to write_bytes. This is done at page-dirtying time.
RKBC Number of bytes which were read via read, readv, pread and sendfile. Since these requests are satisfied from kernel pagecache they won't be accounted for by RKB, because they didn't require any I/O.
WKBC Number of bytes which were written via write, writev, pwrite and sendfile. Like RKBC, since the I/O uses the pagecache these values won't be accounted for by WKB.
RSys Number of read syscalls, specifically: read, pread, readv and sendfile
WSys Number of write syscalls, specifically: write, pwrite, writev and sendfile
Cncl Number of cancelled write bytes.
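The descriptions above closely follow the counters the Linux kernel exposes in /proc/<pid>/io (rchar, wchar, syscr, syscw, read_bytes, write_bytes and cancelled_write_bytes). The Python sketch below parses text in that format and relabels the counters with the column names used here. The mapping is inferred from the field descriptions rather than taken from collectl's source, and the values are left as raw cumulative counters with no per-interval or KB conversion.

```python
# Assumed mapping from /proc/<pid>/io counter names to the columns above.
COLUMN_FOR_FIELD = {
    "read_bytes": "RKB",
    "write_bytes": "WKB",
    "rchar": "RKBC",
    "wchar": "WKBC",
    "syscr": "RSys",
    "syscw": "WSys",
    "cancelled_write_bytes": "Cncl",
}


def parse_proc_io(text):
    """Parse the contents of a /proc/<pid>/io file into a dict by column."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        column = COLUMN_FOR_FIELD.get(name.strip())
        if column is not None:
            counters[column] = int(value)
    return counters


# Made-up sample in the format of /proc/<pid>/io
SAMPLE = """rchar: 2048
wchar: 1024
syscr: 7
syscw: 3
read_bytes: 4096
write_bytes: 8192
cancelled_write_bytes: 0
"""

print(parse_proc_io(SAMPLE))
```

On a live Linux system with process I/O accounting enabled, the same function could be given the contents of /proc/self/io instead of the sample string.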

Process Memory Data, collectl --procmem

# PID  User     S VmSize  VmLck  VmRSS VmData  VmStk  VmExe  VmLib MajF MinF Command
PID Pid of the process
User Name of the user this process is running under. In playback mode on a different machine, use -oP to direct collectl to use the password file named in collectl.conf (default is /etc/passwd) to look up the corresponding username. Otherwise the UID will be reported instead.
S Process State: S - Sleeping, D - Uninterruptible Sleep, R - Running, Z - Zombie or T - Stopped/Traced
VmSize Size of Virtual memory used by the entire process
VmLck Size of Locked Virtual Memory
VmRSS Size of Resident Virtual Memory
VmData Size of Virtual Memory used for heap
VmStk Size of Virtual Memory used for stack
VmExe Size of Virtual Memory used for exe and statically linked libraries
VmLib Size of Virtual Memory used for dynamically linked libraries
MajF Major Page Faults per second
MinF Minor Page Faults per second
Command Command that is running. Path and command line options are NOT included unless --procopts w is specified

Slab Data, collectl -sY

There are actually 2 different formats for slab data. This one applies to all kernels prior to 2.6.22 and contains the same fields as the Summary report for each named slab, but without the caches fields.

# SLAB DETAIL
#               <-----------Objects----------><---------Slab Allocation------>
#Name           InUse   Bytes   Alloc   Bytes   InUse   Bytes   Total   Bytes
Objects
InUse Total number of objects that are currently in use.
Bytes Total size of all the objects in use.
Alloc Total number of objects that have been allocated but not necessarily in use.
Bytes Total size of all the allocated objects whether in use or not.
Slab Allocation
InUse Number of slabs that have at least one active object in them.
Bytes Total size of all the slabs.
Total Total number of slabs that have been allocated whether in use or not.
Bytes Total size of all the slabs that have been allocated whether in use or not.

This second format applies to the new SLUB allocator starting with the 2.6.22 kernel. As with the old format slab detail report, the same fields as are found in the Slab Summary Report are shown for each named slab.

# SLAB DETAIL
#               <----------- objects -----------><--- slabs ---><----- memory ----->
#Slab Name       Size  /slab   In Use     Avail    SizeK  Number      UsedK    TotalK
Objects
Size Size of a single slab object
/slab The number of objects in a single slab
In Use The total number of objects that have been allocated to processes.
Avail The total number of objects that are available in the currently allocated slabs. This includes those that have already been allocated to processes.
Slabs
SizeK The size of one slab, which typically contains multiple objects
Number This is the number of individual slabs that have been allocated and taking physical memory.
Memory
UsedK Memory used by those objects that have been allocated to processes.
TotalK Total physical memory allocated to processes. When there is no filtering in effect, this number will be equal to the Slabs field reported by -sm.
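One way to read the SLUB fields together is sketched below in Python. The arithmetic is an assumption inferred from the definitions above (UsedK as objects in use times object size, TotalK as number of slabs times slab size, Avail as objects per slab times number of slabs) and has not been checked against collectl's implementation. The function and argument names are hypothetical.

```python
def slub_memory(size, per_slab, in_use, slab_size_k, number):
    """Derive Avail, UsedK and TotalK for one named slab.

    size        -- Size: bytes in a single object
    per_slab    -- /slab: objects in a single slab
    in_use      -- In Use: objects allocated to processes
    slab_size_k -- SizeK: size of one slab in KB
    number      -- Number: slabs currently allocated

    The formulas are inferred from the field descriptions (assumption).
    """
    avail = per_slab * number          # objects the allocated slabs can hold
    used_k = in_use * size / 1024      # memory in objects handed to processes
    total_k = number * slab_size_k     # physical memory taken by the slabs
    return avail, used_k, total_k


# Hypothetical slab: 256-byte objects, 16 per 4 KB slab, 10 slabs, 100 in use
print(slub_memory(size=256, per_slab=16, in_use=100, slab_size_k=4, number=10))
```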