
Thursday 17 October 2019

Elementary Linux Performance Monitoring

The basic tool here is top.
To monitor a single process, pass its PID with the -p option. In the next example we watch the MySQL server process:

[root@(db-master) ~]# top -p 2521
top - 15:42:54 up 40 days, 10:46,  4 users,  load average: 0.14, 0.24, 0.48
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu0  :  1.0 us,  1.0 sy,  0.0 ni, 98.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  32551020 total, 32285684 used,   265336 free,   149660 buffers
KiB Swap:  3129340 total,   402572 used,  2726768 free. 16662620 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 2521 mysql     20   0 18.725g 0.014t   4548 S 6.000 46.50   2735:03 mysqld

Load average is something of a Linux/UNIX mystery: Linux load averages are "system load averages" that show the running thread (task) demand on the system as an average number of running plus waiting threads. This measures demand, which can be greater than what the system is currently processing. The three figures in the top header are the averages over the last 1, 5 and 15 minutes.
For an excellent in-depth article on Linux load averages, refer to Brendan Gregg's blog.
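A load figure is easier to read when it is related to the number of CPUs: in the top output above, 0.14 on a 2-CPU machine is a nearly idle system. The following is only a minimal sketch of that arithmetic; the function name load_per_cpu is my own, and the live part assumes a Linux system with /proc/loadavg and getconf available.

```shell
#!/bin/sh
# load_per_cpu LOAD CPUS
# Prints LOAD divided by CPUS with two decimals (hypothetical helper).
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# Live usage: the first field of /proc/loadavg is the 1-minute average.
if [ -r /proc/loadavg ]; then
    read -r one five fifteen rest < /proc/loadavg
    cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    echo "1-min load per CPU: $(load_per_cpu "$one" "$cpus")"
fi
```

A result well below 1.00 suggests spare capacity. Keep in mind that on Linux the load average also counts tasks in uninterruptible sleep (typically waiting on disk I/O), so a high value does not necessarily mean the CPUs are busy.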

On the other hand, good old ps, which is available on all UNIX flavors and Linux distributions, can also help. The following command lists every process with its CPU usage and virtual memory size, sorted by virtual size (the third column) in ascending order, so the largest processes appear last:

[root@(db-master) ~]# ps -e -o pid,pcpu,vsz,comm= | sort -n  --key=3
...
 1669  0.0 752396 isecespd
 1759  0.0 1561472 isectpd
 2521 52.4 19634584 mysqld
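Since --key=3 sorts on the vsz column, ranking by CPU usage means sorting on the second column (pcpu) instead. A small sketch of that variation, where top_cpu is a hypothetical helper name and the default of 5 lines is an arbitrary choice:

```shell
#!/bin/sh
# top_cpu N
# Reads "pid pcpu vsz comm" lines on stdin and prints the N lines with the
# highest pcpu (column 2), highest first. LC_ALL=C keeps "." as the decimal point.
top_cpu() {
    LC_ALL=C sort -k2,2nr | head -n "${1:-5}"
}

# Live usage: the trailing "=" on every column suppresses the header line,
# so only data rows reach the sort.
if command -v ps >/dev/null 2>&1; then
    ps -e -o pid= -o pcpu= -o vsz= -o comm= | top_cpu 5
fi
```

Note that pcpu in ps is CPU time divided by the lifetime of the process, not a snapshot of the last few seconds, so it can differ from what top shows for the same PID.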

To get the process tree try pstree -aAl:

[root@(db-master) ~]# pstree -aAl
systemd --switched-root --system --deserialize 24
  |-VGAuthService -s
  |-agetty --noclear tty1 linux
  |-automount -p /var/run/automount.pid
  |   `-5*[{automount}]
  |-cron -n
  |-dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
  |-discagnt /etc/init.d/discagnt start
  |   `-discagnt
  |-haveged -w 1024 -v 0 -F
...

For systems that do not have pstree, try ps -ejH.

To get information about the threads created by processes, try ps -eLf.

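To get information about disk performance, try iostat. Here -d selects the device report, -c the CPU report and -m switches the units to megabytes. Run without an interval, as below, it reports averages since boot: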

 [root@(mmcp_prod_corp)(db-master) ~]# iostat -dcm
Linux 4.4.121-92.117-default (mo-1400a55c2)     10/17/19        _x86_64_        (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           7.22    0.00    0.59    1.19    0.00   91.00

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               1.56         0.01         0.01      44144      51244
sdb             146.49         5.48         1.79   19159479    6250758
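With more than a couple of disks it helps to pick out the busiest device automatically. The sketch below parses the device table in the format shown above; busiest_device is a hypothetical helper, and it assumes the device name is in the first column and tps in the second.

```shell
#!/bin/sh
# busiest_device
# Reads iostat device-report output on stdin and prints the device with the
# highest tps, followed by that tps value.
busiest_device() {
    awk '
        /^Device/ { in_table = 1; next }   # header row of the device table
        in_table && NF >= 2 {              # data rows: name, tps, ...
            if ($2 + 0 > max) { max = $2 + 0; dev = $1 }
        }
        END { if (dev != "") print dev, max }
    '
}

# Example (not run here because it takes 10 seconds):
#   iostat -dm 5 3 | busiest_device
```

When iostat is given an interval and a count, the first report still covers the time since boot and the later ones cover each interval, so the helper above returns the peak over all reports it is fed.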

To see all the files a process has open, such as data files, shared objects/dynamic libraries and sockets, use lsof. The following example lists all open files of the mysqld process:

[root@(db-master) ~]# lsof -p 2521
COMMAND  PID  USER   FD   TYPE             DEVICE     SIZE/OFF     NODE NAME
mysqld  2521 mysql  cwd    DIR              254,2         4096  6815769 /monsoon/mysql/data
mysqld  2521 mysql  rtd    DIR              254,0         4096        2 /
mysqld  2521 mysql  txt    REG              254,0    250387936   794500 /usr/sbin/mysqld
mysqld  2521 mysql  mem    REG              254,0        97056  1065145 /lib64/libresolv-2.22.so
mysqld  2521 mysql  mem    REG              254,0        26976  1065107 /lib64/libnss_dns-2.22.so


To see the listening TCP and UDP server sockets on a Linux server, together with the process that owns each one, use netstat -tulpn:

[root@(db-master) ~]# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN      2521/mysqld
tcp        0      0 0.0.0.0:2738            0.0.0.0:*               LISTEN      3282/discagnt
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      3289/sshd
tcp        0      0 127.0.0.2:25            0.0.0.0:*               LISTEN      3671/master
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      3671/master
tcp        0      0 127.0.0.1:6010          0.0.0.0:*               LISTEN      38622/0
tcp        0      0 :::7938                 :::*                    LISTEN      3317/nsrexecd
tcp        0      0 :::5666                 :::*                    LISTEN      1/systemd
udp     4352      0 0.0.0.0:68              0.0.0.0:*                           1521/wickedd-dhcp4
udp        0      0 10.97.6.160:123         0.0.0.0:*                           3343/ntpd


To list the active (non-listening) TCP connections, use netstat -t:

[root@(db-master) ~]# netstat -t
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 mo-1400a55c2.zone:mysql mo-6740a22da.zone:46138 ESTABLISHED
tcp        0     64 mo-1400a55c2.zone1.:ssh mo-657dabf53.zone:58606 ESTABLISHED
tcp        0      0 mo-1400a55c2.zone:mysql mo-23acddcc0.zone:50068 ESTABLISHED
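On a busy database server this list gets long, and a count per connection state is often more useful than the raw lines. A minimal sketch, assuming the six-column netstat -t layout shown above with the state in the sixth column; tcp_states is a hypothetical helper name.

```shell
#!/bin/sh
# tcp_states
# Reads netstat -t style output on stdin and prints one line per TCP state
# with the number of connections in that state, sorted by state name.
tcp_states() {
    awk '$1 ~ /^tcp/ { count[$6]++ }
         END { for (s in count) print s, count[s] }' | LC_ALL=C sort
}

# Example on a live system (-n skips name resolution, which is faster):
#   netstat -tn | tcp_states
```

The column position changes if you add options such as -p, so adjust $6 accordingly before reusing this with a different set of netstat flags.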





Tuesday 12 September 2017

disk-benchmark: a multipurpose benchmark program that can simulate your application's I/O performance

disk-benchmark tool - get it here!

Sometimes we need an estimate, in advance, of the I/O performance of a program we plan to develop or already possess.
This may be triggered by a number of reasons:
  • Order specific disk hardware in advance.
  • Plan to rent a cloud-based volume from a cloud provider.
  • Estimate the total performance of your application in order to establish operational scenarios and calculate KPIs.
  • Check the cloud provider's SLA compliance.
In the past I dealt with all those challenges using standard Linux methods for benchmarking a volume, like the classic one:


dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct

Or other similar methods or tools like iostat.

The problem with all those methods is that they give you an idea of how your disk performs in general, but not under a given scenario, for example:
  • 20 concurrent users, each of them reading and writing a random file of size between 20 KB and 1 MB, with a pause of 2 seconds, for 5 minutes.
  • 10 concurrent users, each of them reading/writing a 60 KB file, with a pause of 2 seconds after each read, repeated 100 times.
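To make the gap concrete, here is roughly what scripting the second scenario with nothing but dd and background jobs looks like. This is only a rough sketch of my own (simulate_users and its arguments are hypothetical), it reports nothing but the total wall time, and the read most likely comes from the page cache rather than from the disk.

```shell
#!/bin/sh
# simulate_users DIR USERS REPEATS SIZE_KB PAUSE_SEC
# Each "user" is a background job that writes a file of SIZE_KB kilobytes,
# reads it back, pauses, and repeats. Prints the elapsed wall time in seconds.
simulate_users() {
    dir=$1 users=$2 repeats=$3 size_kb=$4 pause=$5
    start=$(date +%s)
    u=1
    while [ "$u" -le "$users" ]; do
        (
            f="$dir/bench_user_$u.$$"      # one scratch file per user
            r=1
            while [ "$r" -le "$repeats" ]; do
                dd if=/dev/zero of="$f" bs=1024 count="$size_kb" 2>/dev/null
                dd if="$f" of=/dev/null bs=1024 2>/dev/null
                sleep "$pause"
                r=$((r + 1))
            done
            rm -f "$f"
        ) &
        u=$((u + 1))
    done
    wait                                   # block until every user has finished
    end=$(date +%s)
    echo "elapsed_sec=$((end - start))"
}

# Second scenario above (takes several minutes, so it is not run here):
#   simulate_users /var/tmp 10 100 60 2
```

Even this small script has no per-user timings, no random file sizes and no random pauses, which is exactly the kind of detail a dedicated tool should handle for you.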


Unless you turn to sophisticated tools like JMeter, you don't really have anything handy. Sophisticated tools, on the other hand, usually come with a significant learning curve, while in most cases you want something you can use in the next 5 minutes with very simple options, just like in the scenarios above. To remedy this situation, last year I developed a small C program that does the job: the disk-benchmark program, available on the Illumine IT Consulting GitHub page:

https://github.com/illumine/disk-benchmark

This is a benchmark program for testing hard drives, SSDs, HBAs, RAID adapters and storage controllers. It is a really simple C program that you can compile with the standard GNU gcc compiler that comes with your Linux distribution.

How to set up disk-benchmark on your Linux system:
Installing disk-benchmark is as simple as this:

# git clone https://github.com/illumine/disk-benchmark
# cd disk-benchmark/src/
# gcc disk-benchmark.c -o disk-benchmark  -l pthread -lrt  -O3  -Wall
# ls -l disk-benchmark
-rwxr-xr-x 1 root root 23365 Apr 15 10:23 disk-benchmark

A simple scenario implementation using disk-benchmark

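Scenario: 10 concurrent users, each writing and reading a file of size ~10 MB in /var. Each user pauses for a number of seconds picked randomly from the interval [2,10]. The command that implements this scenario is as follows, where -p sets the test path, -t the number of threads, -a the absolute file size in bytes and -E the interval from which the sleep time is picked: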

[root@mo-8f752419d src]# ./disk-benchmark -p /var -t 10 -a 10000000 -E 2:10

Test scenario:
test path=/var
Threads=10, sleep sec between write/read = 1, repeats per thread=5, random pick sleep sec from [2 10]
Lower file size=1024, Upper file size=10240, Absolute file size=10000000
Read/Write buffer size=8192,  Buff Siz W 0, Buf Siz R 0,
Do write only=0, Delete files=1
Print values only=0 dont print scenario info= 0, dont print clocks=0 dont print headers=0 print date=1
Work Continously=0  Work Continously Sleep Brake=5

T=7, Avg W=0.016134 Avg R=0.002160 Total W=0.080671 Total R=0.010801 Total Time=0.091473 Sleep=4.600000  Avg File Size =10000000.000000
T=2, Avg W=0.014436 Avg R=0.002411 Total W=0.072179 Total R=0.012056 Total Time=0.084234 Sleep=4.800000  Avg File Size =10000000.000000
T=4, Avg W=0.016104 Avg R=0.002189 Total W=0.080520 Total R=0.010943 Total Time=0.091463 Sleep=4.800000  Avg File Size =10000000.000000
T=9, Avg W=0.011966 Avg R=0.002069 Total W=0.059829 Total R=0.010347 Total Time=0.070176 Sleep=4.800000  Avg File Size =10000000.000000
T=6, Avg W=0.013065 Avg R=0.001826 Total W=0.065323 Total R=0.009128 Total Time=0.074451 Sleep=5.000000  Avg File Size =10000000.000000
T=1, Avg W=0.015399 Avg R=0.003005 Total W=0.076996 Total R=0.015025 Total Time=0.092021 Sleep=5.200000  Avg File Size =10000000.000000
T=8, Avg W=0.012883 Avg R=0.002303 Total W=0.064416 Total R=0.011513 Total Time=0.075930 Sleep=5.200000  Avg File Size =10000000.000000
T=3, Avg W=0.015850 Avg R=0.002492 Total W=0.079251 Total R=0.012458 Total Time=0.091709 Sleep=5.400000  Avg File Size =10000000.000000
T=0, Avg W=0.013430 Avg R=0.002697 Total W=0.067151 Total R=0.013487 Total Time=0.080637 Sleep=5.600000  Avg File Size =10000000.000000
T=5, Avg W=0.016659 Avg R=0.002387 Total W=0.083293 Total R=0.011934 Total Time=0.095226 Sleep=5.600000  Avg File Size =10000000.000000

T=-1, Avg W=0.014593 Avg R=0.002354 Total W=0.072963 Total R=0.011769 Total Time=0.084732 Sleep=5.100000  Avg File Size =10000000.000000
Wall time 28.000000, CPU time 0.880000
Tue Sep 12 13:36:26 2017
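The per-thread lines can be turned into a throughput figure with a little awk. This is a sketch under two assumptions of mine that the output itself does not state: that Avg W is the average time in seconds to write one file, and that 1 MB means 10^6 bytes. The helper name write_mbps is hypothetical.

```shell
#!/bin/sh
# write_mbps
# Reads disk-benchmark output on stdin and prints, for every per-thread line,
# the write throughput as "Avg File Size / Avg W" in MB/s.
# The summary line (T=-1) is skipped because the pattern requires a digit after "T=".
write_mbps() {
    awk '/^T=[0-9]/ {
        thread = $1; sub(/^T=/, "", thread); sub(/,$/, "", thread)
        split($3, w, "=")          # $3 is "W=<avg write time>"
        size = substr($NF, 2)      # $NF is "=<avg file size in bytes>"
        if (w[2] > 0) printf "T=%s %.2f MB/s\n", thread, size / w[2] / 1000000
    }'
}

# Example:
#   ./disk-benchmark -p /var -t 10 -a 10000000 -E 2:10 | write_mbps
```

Figures in the hundreds of MB/s for a 10 MB file usually mean the writes landed in the page cache, so treat the result as an upper bound rather than as raw device speed.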