Monday, February 1, 2016

solaris ZFS

ZFS is a combined file system and logical volume manager designed by Sun Microsystems. Its main features are high scalability, strong data integrity, drive pooling, and support for multiple
RAID levels. ZFS uses the concept of storage pools to manage physical storage: instead of a separate volume manager, ZFS aggregates devices into a storage pool. The storage pool describes the physical characteristics of the storage and acts as an arbitrary data store from which file systems can be created. You also do not need to predefine the size of a file system; file systems inside a ZFS pool grow automatically within the disk space allocated to the storage pool.
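As a minimal illustration (assuming a pool named unixpool already exists, as created later in this post), a file system can be carved out of the pool and given an optional quota:

bash-3.00# zfs create unixpool/data
bash-3.00# zfs set quota=500m unixpool/data
bash-3.00# zfs list -r unixpool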

High scalability 

ZFS is a 128-bit file system capable of addressing zettabytes of data (a zettabyte is a billion terabytes); no matter how many hard drives you have, ZFS is able to manage them.

Maximum Integrity 

All data inside ZFS is stored with a checksum, which ensures its integrity. You can be confident that your data will not suffer silent data corruption.

Drive pooling

ZFS storage is a bit like system RAM: whenever you need more space, you just add a new disk and it is automatically added to the pool. There is no headache of formatting, initializing, partitioning, and so on.
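For example (the device name here is only a placeholder; it depends on your system), an existing pool can be grown with a single command and the new capacity becomes available immediately:

bash-3.00# zpool add unixpool c2t0d0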

Capability of different RAID levels 

We can configure multiple RAID levels using ZFS, and performance-wise it is comparable to hardware RAID.

Configuration Part

                                                                        Striped pool

In this case data is striped across multiple disks and there is no redundancy. Data access speed is highest in this configuration.

1. In our server we have below HDD's attached 

bash-3.00# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c0d0 <DEFAULT cyl 2085 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@1,1/ide@0/cmdk@0,0
       1. c0d1 <VBOX HAR-34e30776-506a21a-0001-1.01GB>
          /pci@0,0/pci-ide@1,1/ide@0/cmdk@1,0
       2. c1d1 <DEFAULT cyl 513 alt 2 hd 128 sec 32>
          /pci@0,0/pci-ide@1,1/ide@1/cmdk@1,0

2. Create the striped pool

bash-3.00# zpool create unixpool c0d1 c1d1

3. Check the pool status

bash-3.00# zpool status unixpool
  pool: unixpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        unixpool    ONLINE       0     0     0
          c0d1         ONLINE       0     0     0
          c1d1         ONLINE       0     0     0

errors: No known data errors

bash-3.00# zpool list unixpool
NAME       SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
unixpool  1.98G    78K  1.98G     0%  ONLINE  -
bash-3.00# zfs list unixpool
NAME       USED  AVAIL  REFER  MOUNTPOINT
unixpool  73.5K  1.95G    21K  /unixpool

                                                                  Mirrored Pool

As the name suggests, this configuration mirrors the data across disks and provides redundancy. Read speed is high in this case, but writes are slower.

1. Creating the mirrored pool

bash-3.00# zpool create unixpool mirror c0d1 c1d1

2. Verifying the pool status 

bash-3.00# zpool status unixpool
  pool: unixpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        unixpool     ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
             c0d1       ONLINE       0     0     0
            c1d1        ONLINE       0     0     0

errors: No known data errors

bash-3.00# zpool list unixpool
NAME       SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
unixpool  1016M   108K  1016M     0%  ONLINE  -

bash-3.00# zfs list unixpool
NAME       USED  AVAIL  REFER  MOUNTPOINT
unixpool  73.5K   984M    21K  /unixpool

Mirroring multiple disks 
*****************
1. Creating the mirrored pool

bash-3.00# zpool create unixpool2m mirror c0d1 c1d1 mirror c2t0d0 c2t1d0

2. verifying the pool status 

bash-3.00# zpool status unixpool2m
  pool: unixpool2m
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        unixpool2m  ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            c0d1          ONLINE       0     0     0
            c1d1          ONLINE       0     0     0
          mirror-1      ONLINE       0     0     0
            c2t0d0       ONLINE       0     0     0
            c2t1d0       ONLINE       0     0     0

errors: No known data errors

bash-3.00# zpool list unixpool2m
NAME         SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
unixpool2m  2.04G    81K  2.04G     0%  ONLINE  -

bash-3.00# zfs list unixpool2m
NAME         USED  AVAIL  REFER  MOUNTPOINT
unixpool2m  76.5K  2.01G    21K  /unixpool2m 


                                                           RAID 5 (RAID-Z) pool

This needs a minimum of 3 disks and can sustain a single disk failure. If the disks are of different sizes we need to use the -f option while creating the pool.

1. Creating the raidz

bash-3.00# zpool create -f  unixpoolraidz raidz c0d1 c1d1 c2t0d0

2. Checking the status

  bash-3.00# zpool status unixpoolraidz
  pool: unixpoolraidz
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        unixpoolraidz  ONLINE       0     0     0
          raidz1-0         ONLINE       0     0     0
            c0d1             ONLINE       0     0     0
            c1d1             ONLINE       0     0     0
            c2t0d0          ONLINE       0     0     0

errors: No known data errors

bash-3.00# zpool list unixpoolraidz
NAME            SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
unixpoolraidz  2.98G   150K  2.98G     0%  ONLINE  -

bash-3.00# zfs list unixpoolraidz
NAME            USED  AVAIL  REFER  MOUNTPOINT
unixpoolraidz  93.9K  1.96G  28.0K  /unixpoolraidz


                                                     RAID 6 (RAID-Z2)

This needs a minimum of 4 disks and can sustain 2 disk failures.

1. Creating the raidz2

bash-3.00# zpool create -f unixpoolraidz2 raidz2 c0d1 c1d1 c2t0d0 c2t1d0

2. Checking the status 

bash-3.00# zpool status unixpoolraidz2
  pool: unixpoolraidz2
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        unixpoolraidz2  ONLINE       0     0     0
          raidz2-0           ONLINE       0     0     0
            c0d1              ONLINE       0     0     0
            c1d1              ONLINE       0     0     0
            c2t0d0           ONLINE       0     0     0
            c2t1d0           ONLINE       0     0     0

errors: No known data errors

bash-3.00# zpool list unixpoolraidz2
NAME             SIZE  ALLOC   FREE    CAP  HEALTH  ALTROOT
unixpoolraidz2  3.97G   338K  3.97G     0%  ONLINE  -

bash-3.00# zfs list unixpoolraidz2
NAME             USED  AVAIL  REFER  MOUNTPOINT
unixpoolraidz2   101K  1.95G  31.4K  /unixpoolraidz2


                                                         Destroying a zpool 

We can destroy a zpool even while it is in a mounted state.

bash-3.00# zpool destroy unixpoolraidz2

bash-3.00# zpool list unixpoolraidz2
cannot open 'unixpoolraidz2': no such pool

                                                   

                                         Importing and exporting the pool


As a UNIX administrator you may face situations that require storage migration. For such cases ZFS provides storage pool migration from one storage system to another.

1. We currently have a pool called unixpool

bash-3.00# zpool status unixpool
  pool: unixpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        unixpool    ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c0d1    ONLINE       0     0     0
            c1d1    ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0

errors: No known data errors
2. Exporting the unixpool

bash-3.00# zpool export unixpool
3. Checking the status 
bash-3.00# zpool status unixpool
cannot open 'unixpool': no such pool

Once the pool is exported you can see that it is no longer available on this system. Now we need to import the pool on the new system.

4. Check which pools are available to import using the zpool import command

 bash-3.00# zpool import
  pool: unixpool
    id: 13667943168172491796
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

        unixpool    ONLINE
          mirror-0  ONLINE
            c0d1    ONLINE
            c1d1    ONLINE
          mirror-1  ONLINE
            c2t0d0  ONLINE
            c2t1d0  ONLINE

5. Import the pool

bash-3.00# zpool import unixpool

bash-3.00# zpool status unixpool
  pool: unixpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        unixpool    ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c0d1    ONLINE       0     0     0
            c1d1    ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0

errors: No known data errors

We have successfully imported the unixpool now.

             Zpool scrub option for integrity check and repair

ZFS provides an option for pool integrity checking and repair, similar to fsck in a conventional UNIX file system. We can use zpool scrub to achieve this. From the command output below you can see the scrub completion status and timing.

bash-3.00# zpool scrub unixpool
bash-3.00# zpool status unixpool
  pool: unixpool
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Mon Feb  1 18:30:31 2016
config:

        NAME        STATE     READ WRITE CKSUM
        unixpool    ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c0d1    ONLINE       0     0     0
            c1d1    ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0

errors: No known data errors
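
Scrubs are not started automatically, so it is common to run them periodically from cron. A hedged illustration of a crontab entry that scrubs unixpool every Sunday at 02:00:

0 2 * * 0 /usr/sbin/zpool scrub unixpool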



Thank you. Please post your comments and suggestions.

Tuesday, January 26, 2016

Performance analysis in unix - Disk I/O

If you are a sysadmin, you will sometimes face situations where disk I/O plays the villain in overall system performance (especially on DB systems). There is a variety of reasons for this, ranging from disk issues to HBA driver issues, which we cannot always predict. Monitoring and analyzing disk performance is therefore a major part of a sysadmin's role in avoiding system performance degradation.

The primary tool used to analyze disk performance issues is iostat; sar -d provides historical performance data, and on Solaris 10 servers the bundled dtrace facility can be used as well.

iostat -xn output





device- disk details of the server
r/s - read per second
w/s - write per second
kr/s - kbytes read per second
kw/s- kbytes written per second
wait - Average number of transactions that are waiting for service (queue length)
actv - Average number of transactions that are actively being serviced
svc_t - Average service time in milliseconds
%w - Percentage of time that the queue is not empty
%b - Percentage of time that the disk is busy
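
For example, the extended statistics above can be sampled at a fixed interval using the standard interval/count arguments (here every 5 seconds, three samples; the first sample shows averages since boot):

bash-3.00# iostat -xn 5 3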

In the above output, if the svc_t (service time) value is more than 20 ms on disks that are in use, we can consider the performance sluggish.
Nowadays, with disks that have large caches, it is advisable to monitor the service time periodically even when the disk is not busy. For example,
if the reads and writes hitting the cache of a fibre-attached disk increase, the service time may rise by 3-5 ms.

Considering the %b value in the above output, if a disk shows 60% utilization continuously for a period of time, we can consider the disk saturated. Whether the application is really impacted by this disk utilization can be evaluated using the service time value from the same output.

Disk saturation 

High disk saturation can be measured from the %w value in the iostat output. High disk saturation slows system performance because the number of queued processes increases. As a rule of thumb, %w > 5 can be considered high disk saturation. In this case, setting sd_max_throttle to 64 can help (sd_max_throttle determines how many jobs can be queued on a single HBA; its default value is 256). Another reason for high %w is SCSI device precedence: devices with low SCSI IDs have lower precedence than devices with high SCSI IDs.

We also need to check whether the disk I/O behavior is random or sequential. Sequential I/O, which occurs while reading or writing large files, is considerably faster than random I/O. This behavior can be analyzed using the sar -d command: for example, if (blks/s) / (r+w/s) is < 16 KB then the I/O is random, and if it is > 128 KB then the I/O behavior is sequential.
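As an illustrative calculation (assuming sar reports blks/s in 512-byte blocks): if a device shows 800 blks/s and 100 r+w/s, the average transfer is 800 / 100 = 8 blocks, i.e. about 4 KB per request, which is well below 16 KB and therefore points to random I/O.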



Disk Errors 

iostat -eE shows the disk error details since the last reboot, and we need to consider the parameters below when evaluating disk errors.
*********************************************************************************
bash-3.00# iostat -eE
           ---- errors ---
device  s/w h/w trn tot
cmdk0   0   0   0   0
sd0     0   0   0   0
nfs1    0   0   0   0
cmdk0     Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: VBOX HARDDISK   Revision:  Serial No: VB4d87fd3f-3f00 Size: 17.18GB <17179803648 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0
sd0       Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: VBOX     Product: CD-ROM           Revision: 1.0  Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
*********************************************************************************
Possible solutions for disk I/O problems are given below:

1. Check the file system kernel parameters to make sure that inode caches are working properly.
2. Spread the I/O traffic across multiple disks (especially with a RAID setup or ZFS).
3. Redesign the problematic process to reduce the number of disk I/Os (for example via cachefs or an application-level cache).
4. Set a proper write throttle value. For example, with ufs_WRITES set to 1 (the default), if the number of outstanding writes exceeds ufs_HW, writes are suspended until the number of outstanding writes drops to ufs_LW. (ufs_WRITES: if this value is non-zero, the number of bytes outstanding for writes to a file is checked; ufs_HW: maximum number of bytes outstanding on a single file; ufs_LW: when writes complete and the number of outstanding bytes falls below this value, all pending (sleeping) processes are woken up and resume writing.) A hedged sketch follows this list.
5. Database I/O should be done to raw disk partitions (please avoid NFS).
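
A minimal hedged sketch of how the UFS write throttle tunables mentioned in point 4 could be set in /etc/system (the byte values are placeholders only and must be sized for your workload):

* UFS write throttle tunables (illustrative values)
set ufs:ufs_WRITES=1
set ufs:ufs_HW=16777216
set ufs:ufs_LW=8388608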

File system performance

When considering file system performance, an important factor is file system latency, which directly impacts I/O performance. Below are the main contributors to file system latency:

1. Disk I/O wait - This can be as short as 0 in the event of a read cache hit. For synchronous I/O this can be influenced by adjusting cache parameters.
2. File system cache misses - Misses in the block, buffer, metadata, and name lookup caches heavily impact file system latency.
3. File system locking - Most file systems use locking, and this has a major impact with large files such as database files.
4. Metadata updating - Creation, deletion, and updates of files and extents cause extra latency for file system metadata.


    As mentioned earlier, file system caches play an important role in I/O performance. The major file system caches are:

1. DNLC (Directory Name Lookup Cache) - Caches vnode-to-directory-path lookup information, avoiding a directory lookup on every access.
2. Inode cache - Stores file metadata (size, access time, and so on).
3. Rnode cache - Held on NFS clients; stores information about NFS mount points.
4. Buffer cache - The link between physical metadata (e.g. block placement on the file system) and the logical data stored in the other caches.

Physical disk Layout

The disk layout for a hard drive includes the following:

1. Boot block
2. Super block
3. Inode list (the number of inodes can be set with the mkfs command)
4. Data blocks

Inode Layout

Each inode contains the below information:

1. File type, permissions, etc.
2. Number of hard links to the file
3. UID
4. GID
5. Byte size
6. Array of block addresses
7. Generation number (incremented every time the inode is reused)
8. Access time
9. Modification time
10. Change time
11. Number of sectors
12. Shadow inode location (which can be used with ACLs)

In a nutshell, overall disk I/O performance has many dependencies, including application tuning, physical disk setup, and the various cache sizes, and as sysadmins we need to consider tuning all of these factors to improve I/O performance.









Wednesday, December 23, 2015

Performance Analysis in solaris 10- Memory

In the real world, memory performance issues play a major role in overall system performance. UNIX system memory comes in two types: physical memory, which lives in the DIMM modules of the hardware, and swap space, which is dedicated space on disk that is treated as memory by the OS (since disk I/O is much slower than memory I/O, we generally prefer to use swap space as little as possible).

Swap space is only used when physical memory is too small to accommodate the system's memory requirements. At that point, space is freed in physical memory by paging (moving) it out to swap space (also keep in mind that increased paging to swap space will degrade system CPU performance).

vmstat command
***********

vmstat reports virtual memory statistics covering kernel threads, virtual memory, disks, traps, and CPU activity. Note that on multi-CPU systems the output shows averages across the CPUs.

The fields of the output are described below:

kthr - Kernel thread details in 3 states:
    r - the number of kernel threads in the run queue
    b - the number of blocked threads waiting for I/O or paging
    w - the number of swapped-out lightweight processes (LWPs - processes running under the same kernel thread that share system resources and address space with other LWPs) waiting for resources to finish

memory - Report the usage of real and virtual memory 
    swap - available swap space ( in kb)
    free - size of the free list (in kb)

page - Report about the page faults and paging activity . Details of this section is given below 
    re - page reclaims 
    mf - minor faults 
    pi - kilobytes paged in 
    po - kilobytes paged out
    fr - kilobytes freed
    de - anticipated short term memory shortfall ( in KB)
    sr- pages scanned by clock algorithm 

disk - Reports the number of disk operations per second. There are slots for up to 4 disks, each shown as a letter and a number (the letter indicates the disk type, such as SCSI or IDE, and the number is the logical number).

faults - Reports trap/interrupt rates 
    in - interrupts
    sy - system calls
    cs- cpu context switches 

cpu - Breakdown of CPU time usage. On multiprocessor systems this is the average across all CPUs.

    us - user time
    sys - system time
    id - idle time
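
For example, statistics can be sampled every 5 seconds for 10 iterations using the standard interval/count arguments (the first line reports averages since boot and is usually ignored):

bash-3.2# vmstat 5 10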

swap usage analysis
****************
To analyze swap usage we need to use the two commands mentioned below.

bash-3.2# swap -s
total: 267032k bytes allocated + 86184k reserved = 353216k used, 964544k available

bash-3.2# swap -l
swapfile dev swaplo blocks free
/dev/zvol/dsk/rpool/swap 181,1 8 2097144 2097144

There is a major difference between these two commands. In the first one we are using 353216k out of (964544 + 353216)k, which is about 26% in use. In the second one you can see that all 2097144 blocks are free, i.e. 0% used. The first command (swap -s) also includes the portion of physical memory that is being used as swap. In general, if you are just tracking swap usage over time you can use swap -s (as long as system performance is good). But if system performance has degraded, you need to look more closely at how swap usage has changed and what caused that change. (Also keep in mind that swap -l displays its output in 512-byte blocks while swap -s displays 1024-byte blocks.)

If the system runs out of swap space it will show error messages like the ones below, and we may need to expand swap by creating a swap file. As a general rule, the swap size should be about half of the system's physical memory; for example, if the system has 8 GB of memory, the ideal swap size is 4 GB.

***********************************************************
application is out of memory

malloc error O

messages.1:Sep 21 20:52:11 mars genunix: [ID 470503 kern.warning]
WARNING: Sorry, no swap space to grow stack for pid 100295 (myprog)
***********************************************************

Creating the swap file

1. Log in as the superuser.
2. Create the swap file using mkfile <size>[k|m|g] /path/filename
3. Activate the swap file using /usr/sbin/swap -a /path/filename
4. Add the entry to /etc/vfstab:

   /path/filename - - swap - no -

5. Verify the swap file using /usr/bin/swap -l
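
A minimal worked example, assuming a hypothetical 1 GB swap file at /export/swapfile:

bash-3.2# mkfile 1g /export/swapfile
bash-3.2# /usr/sbin/swap -a /export/swapfile
bash-3.2# echo "/export/swapfile - - swap - no -" >> /etc/vfstab
bash-3.2# /usr/bin/swap -l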

In a nutshell, while configuring swap keep the points below in mind:

  • Never allocate swap with a size less than 30% of RAM.
  • Determine whether large applications (such as compilers and databases) will be using the /tmp directory. If one or more of your applications have a large demand for swap space, use the swap -s command to monitor swap resources on a similar existing system to get an estimate of the actual requirements.

Cache

If we run the free -m command on a Linux box we can see that a major portion of the memory shows up in the cached column. So what does this cache mean? Is that memory currently used by the system?

[root@testserver ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         15976      15195        781          0        167       9153
-/+ buffers/cache:       5874      10102
Swap:         2000          0       1999


In this case you can see that 9 GB is cached. These caches are called page caches (dirty pages when modified) and act as temporary memory for read and write operations. During writes, the contents of these dirty pages are periodically flushed to system storage. Up to kernel version 2.6.31, a process called pdflush ensured that data was written to storage and the dirty pages were cleared periodically. After that kernel version, there is a flusher thread per device (e.g. sda, sdb) that handles this mechanism.

root@pc:~# ls -l /dev/sda
brw-rw---- 1 root disk 8, 0 2011-09-01 10:36 /dev/sda
root@pc:~# ls -l /dev/sdb
brw-rw---- 1 root disk 8, 16 2011-09-01 10:36 /dev/sdb
root@pc:~# ps -eaf | grep -i flush
root       935     2  0 10:36 ?        00:00:00 [flush-8:0]
root       936     2  0 10:36 ?        00:00:00 [flush-8:16]



The same mechanism applies to reading: file blocks are transferred from disk to the page cache when read. For example, if you access a 100 MB file twice, the second access will be faster because it is fetched from the cache. If Linux needs more memory for normal applications than is currently available, areas of the page cache that are no longer in use are automatically reclaimed.

Log files and database dump files (data files) are often what accumulates in the page cache, since they are accessed continuously. So configuring proper log rotation, or zipping such files periodically, will let the page cache be released when the memory is really needed for system performance.
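
For testing purposes (not something to run routinely), Linux also exposes a standard interface to drop the clean page cache on demand; flush dirty data with sync first:

[root@testserver ~]# sync
[root@testserver ~]# echo 3 > /proc/sys/vm/drop_caches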



Tuesday, December 15, 2015

Performance analysis in solaris 10 - CPU

Performance analysis is one of the key tasks for every system admin and an important factor in system availability (especially for production systems with SLAs). We should periodically check the various system parameters and ensure nothing is going wrong that could hamper normal operation of the production systems.

The main factors we should consider for system performance analysis are disk I/O, CPU, memory & swap, network, and zones (I am omitting other components such as name services, NFS, and kernel tuning, which can be discussed separately).

CPU Loading

Load average is the average number of processes in the run queue over time. It is used to represent the load on the CPU, and the load average is reported as three numbers covering 1-, 5-, and 15-minute intervals. Typically the load average divided by the number of CPU cores gives the load per CPU, and a load above 1 per CPU means the CPU is fully utilized. A general rule of thumb is that a load average of 4 times the number of CPUs results in sluggish performance.
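
As an illustrative calculation: on a 4-core system, a 1-minute load average of 6.0 works out to 6.0 / 4 = 1.5 per core, so the CPUs are saturated and work is queuing behind them; the same 6.0 on a 16-core system is only about 0.38 per core and is no cause for concern.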

Load average can be monitored with the uptime command, or by monitoring the run queue of the processors using the sar -q command.

uptime
*******
bash-3.2# uptime
4:29pm  up 34 day(s), 14:45,  2 users,  load average: 0.45, 0.49, 0.54

The last 3 numbers are the load averages over 1-, 5-, and 15-minute intervals. Now we need to understand the load metric. The load at a given point in time is how many processes are queued or running (including the currently running ones). For example, a 1-minute load average of 0.50 means that for half of the last minute the CPU was idle with no runnable processes. A load average of 2.50 over the last minute means that on average 1.5 processes were waiting in the run queue and the CPU was overloaded by 150%.

The load can also be examined by analyzing the run queue length and the amount of time the queue is occupied, using the sar -q command.

Using the sar -q command we get to know the following information:
1. The average queue length while the queue is occupied.
2. The percentage of time that the queue is occupied.

If you check the command output, you will see details like the ones below from sar -q:

SunOS testsolaris 5.10 Generic_144488-05 sun4v    12/15/2015

00:00:00 runq-sz %runocc swpq-sz %swpocc
01:00:01     1.0       1     0.0       0
02:00:00     1.0       1     0.0       0
03:00:01     1.0       1     0.0       0
04:00:00     1.0       1     0.0       0
05:00:00     1.1       1     0.0       0
06:00:00     1.0       1     0.0       0
07:00:00     1.0       1     0.0       0

08:00:01     1.1       5     0.0       0
............................................

Average      1.0       3     0.0       0

runq-sz - The number of kernel threads in memory waiting for a CPU. The normal value should be less than 2; if it is consistently higher, the system CPU is fully utilized (consider adding more CPUs).

%runocc - The run queue (dispatch) occupancy. Consistently high run queue occupancy indicates CPU saturation.

swpq-sz - The average number of swapped-out processes.

%swpocc - The percentage of time in which processes are swapped out.

So overall, if %runocc is greater than 90 and the runq-sz value is greater than 2, we should consider adding more CPU for consistent system performance.

prstat
***********
This is one of the most widely used system utilities for the cases below:

1. How much of my system is utilized in terms of CPU & memory?
2. Utilization of the system (zone-wise, user-wise, process-wise).
3. How are the processes/threads utilizing the system (CPU bound, I/O bound)?



PID: the process ID of the process.


USERNAME: the real user (login) name or real user ID.


SIZE: the total virtual memory size of the process, including all mapped files and devices, in kilobytes (K), megabytes (M), or gigabytes (G).


RSS: the resident set size of the process (RSS), in kilobytes (K), megabytes (M), or gigabytes (G).


STATE: the state of the process (cpuN/sleep/wait/run/zombie/stop).


PRI: the priority of the process. Larger numbers mean higher priority.


NICE: nice value used in priority computation. Only processes in certain scheduling classes have a nice value.


TIME: the cumulative execution time for the process.


CPU: The percentage of recent CPU time used by the process. If executing in a non-global zone and the pools facility is active, the percentage will be that of the processors in the processor set in use by the pool to which the zone is bound.


PROCESS: the name of the process (name of executed file).


NLWP: the number of lwps in the process.

Also, you can sort the prstat output in ascending (-S option) or descending (-s option) order by the parameters below:

cpu - sort by CPU usage (the default)
pri - sort by process priority
rss - sort by resident set size
size - sort by size of the process image
time - sort by execution time
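
For example (standard prstat usage), to show the top 10 processes sorted by resident set size, refreshing every 5 seconds:

bash-3.2# prstat -s rss -n 10 5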

If you want the utilization report broken down by zone, use prstat -Z. Here you can see the global zone and testzone reported separately.


One more option in prstat is microstate accounting (prstat -m), which provides the CPU latency, system time, and so on for each process.

In a nutshell, we can suspect CPU performance issues when:

1. The number of processes in the run queue is greater than the number of CPUs in the system.
2. The process queue is 4 times longer than the number of available CPUs in the system.
3. The CPU idle time is 0 and system time is double the user time; in that case the system is facing a serious CPU squeeze.

We also have further performance analysis tools like DTrace (bundled with Solaris 10), which will be discussed separately on another occasion.



Tuesday, December 8, 2015

Unix useful Tips & Tricks

Python script to check dependency services of a particular service in RHEL 7

Below is a sample Python script to check the dependency services of a particular service in RHEL 7. It will prompt you to enter the service to be verified.
******************************************
#!/usr/bin/python

import subprocess

# prompt for the service whose dependencies should be listed
servname = raw_input("Please enter servicename need to be checked:")

# run systemctl with the entered service name as its argument
subprocess.call(["systemctl", "list-dependencies", "--after", servname])
************************************************
As an example, it prompted me to enter the service and I entered gdm.service.

[root@redhat7-test ~]# python2.7 service_dep.py
Please enter servicename need to be checked:gdm.service

output is below 
************
├─accounts-daemon.service
├─gdm.service
├─network.service
├─rhnsd.service
├─rtkit-daemon.service
└─multi-user.target
  ├─abrt-ccpp.service
  ├─abrt-oops.service
  ├─abrt-vmcore.service
  ├─abrt-xorg.service
  ├─abrtd.service
  ├─atd.service
  ├─avahi-daemon.service
  ├─brandbot.path
  ├─chronyd.service
  ├─crond.service
  ├─cups.path
  ├─dbus.service
  ├─hypervkvpd.service
  ├─hypervvssd.service
  ├─irqbalance.service
  ├─kdump.service
  ├─ksm.service
  ├─ksmtuned.service
  ├─libstoragemgmt.service
  ├─libvirtd.service
  ├─mdmonitor.service
  ├─ModemManager.service
  ├─netcf-transaction.service
  ├─network.service
  ├─NetworkManager.service
  ├─plymouth-quit-wait.service
  ├─plymouth-quit.service
  ├─postfix.service
  ├─rescue.service
  ├─rhel-configure.service
  ├─rhnsd.service
  ├─rhsmcertd.service
  ├─rngd.service
  ├─rsyslog.service
  ├─smartd.service
  ├─sshd.service
  ├─sysstat.service
  ├─systemd-logind.service
  ├─systemd-user-sessions.service
  ├─tuned.service
  ├─vmtoolsd.service
  ├─basic.target
  │ ├─rhel-import-state.service
  │ ├─systemd-ask-password-plymouth.path
  │ ├─paths.target
  │ │ ├─brandbot.path
  │ │ ├─cups.path
...........................................( to be continued)......................................


ASM driver issue after  RHEL patching

In many cases we face an Oracle ASM module loading issue after patching: when we run /etc/init.d/oracleasm listdisks, we get a message that the Oracle ASM module could not be loaded. In this case we patched the server from RHEL 5.6 to RHEL 5.11, which upgraded the kernel from 2.6.18-238.9.1.el5 to 2.6.18-398.el5.

1. First create a directory in below structure 

#mkdir  /lib/modules/2.6.18-398.el5/kernel/drivers/addon/oracleasm/

2. Copy the Oracle ASM module from the old kernel tree to the new one

#cp /lib/modules/2.6.18-238.9.1.el5/kernel/drivers/addon/oracleasm/oracleasm.ko /lib/modules/2.6.18-398.el5/kernel/drivers/addon/oracleasm/

3. Then load the new module using below command 

#modprobe /lib/modules/2.6.18-398.el5/kernel/drivers/addon/oracleasm/oracleasm.ko

4. Install the module using below command 
#insmod /lib/modules/2.6.18-398.el5/kernel/drivers/addon/oracleasm/oracleasm.ko

5. Now if we check the oracle ASM  rpms you can see below output 

# rpm -qa | grep oracleasm
oracleasm-2.6.18-238.9.1.el5-2.0.5-1.el5
oracleasm-support-2.1.4-1.el5
oracleasmlib-2.0.4-1.el5

6. Get the post-install scriptlet from the RPM, which needs to be executed manually
#rpm -q --scripts oracleasm-2.6.18-238.9.1.el5-2.0.5-1.el5
postinstall scriptlet (using /bin/sh):
depmod -ae 2.6.18-238.9.1.el5

7. Now install the depmod script 
#depmod -ae 2.6.18-238.9.1.el5

8. After this, reboot the server and check the Oracle ASM status; all the disks should be visible
#/etc/init.d/oracleasm listdisks
ASMDATA01
ASMDATA02
ASMDATA03
ASMDATA04
ASMDATA05
ASMDATA06
ASMDATA07
ASMDATA08
ASMDATA09
ASMDATA10
ASMDATA11
ASMDATA12
ASMDATA13
ASMDATA14
ASMDATA15
ASMDATA16
ASMDATA17
ASMDATA18
ASMDATA19
ASMDATA20
ASMDATA21
ASMDATA22
******************************************************************************

Linux Password reuse configuration is not working in RHEL 5 & 6

Generally, if we configure the password remember option in /etc/pam.d/system-auth, our assumption is that it should work as expected. Unfortunately, on RHEL 5/6 systems it does not.

For example the default /etc/pam.d/system-auth file in RHEL 6 is given below 


The configuration for password remember/reuse is 

password    sufficient    pam_unix.so sha512 shadow nullok try_first_pass use_authtok remember=1                        
This will create a file called /etc/security/opasswd, and previously used passwords will be stored in encrypted format inside this file. But in this case the file is generated yet never updated as expected, so whatever remember count we configure has no effect.

-rw-------. 1 root root    0 Aug 15  2011 opasswd_old

The solution that works in this case is to load one more module inside system-auth, called pam_pwhistory.so.

[root@rhel6 security]# locate pam_pwhistory.so

/lib64/security/pam_pwhistory.so

So the updated system-auth file will look as shown below.
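
A minimal hedged illustration of the line that gets added (the exact control flag and remember count may differ in your setup):

password    required      pam_pwhistory.so use_authtok remember=5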



Now if you check the /etc/security/opasswd file, it is getting updated as expected.

[root@rhel6 security]# cat opasswd
ratheesh:500:4:$6$tL/oHK6s$1lHNvcvFT03XB7r6BiWGSppsra9aU3/dRpXykLr9VXZuNxbB6upACW1iBipLFv3gWrM/neG884MM17ifeijSG/,$6$zkCpe2ES$Fabn1qFnLROzHI7iCDOIbKLnUzhH6kCDj5SARVzEUQ7u4DzlrlBOqmE3snXSXSnRLlhYppRlF4fa1woQtHv.o0,$6$e/AlfJGO$6B0X7zKZMwWZ0ldVg99XqjuvtyM5q3RS/M1ZLIfbOUvz77GgU2B87PijTh.ZE7DkRVUz5TNGZVni4mkZzpDNp0,$6$XNOH1T4B$545KhPkth4nfVEkFBaSJsQk/rbmoZ5JDyyws3PCftInxxmxRaRvHF9bklxRiu6raNJKGEJA5FfkJVssCDEp9c0

Thats it....
********************************************************************************

RHEL7 Graphical User Interface (GUI) is not coming

By default, when you install RHEL 7, the GUI does not come up and needs to be installed separately. In this case we need to install the GNOME packages and their dependencies using the yum repository.

1. Login to the RHEL 7 system

2. We need to check the groups available from yum repository 

[root@redhat7-test tmp]# yum grouplist
Loaded plugins: product-id, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
There is no installed groups file.
Maybe run: yum groups mark convert (see man yum)
Available environment groups:
   Minimal Install
   Infrastructure Server
   File and Print Server
   Basic Web Server
   Virtualization Host
   Server with GUI
Available Groups:
   Compatibility Libraries
   Console Internet Tools
   Development Tools
   Graphical Administration Tools
   Legacy UNIX Compatibility
   Scientific Support
   Security Tools
   Smart Card Support
   System Administration Tools
   System Management
Done

3. Now we need to install the group called "Server with GUI" 

[root@redhat7-test tmp]# yum groupinstall 'Server with GUI'

...........................output will be omitted......................................


 xorg-x11-drv-void.x86_64 0:1.4.0-23.el7                                          xorg-x11-drv-wacom.x86_64 0:0.23.0-6.el7
  xorg-x11-font-utils.x86_64 1:7.5-18.1.el7                                        xorg-x11-fonts-Type1.noarch 0:7.5-9.el7
  xorg-x11-glamor.x86_64 0:0.6.0-2.20140918git347ef4f.el7                          xorg-x11-server-common.x86_64 0:1.15.0-32.el7
  xorg-x11-server-utils.x86_64 0:7.7-4.el7                                         xorg-x11-xkb-utils.x86_64 0:7.7-9.1.el7
  yajl.x86_64 0:2.0.4-4.el7                                                        yelp-libs.x86_64 1:3.8.1-7.el7
  yelp-xsl.noarch 0:3.8.1-2.el7                                                    zenity.x86_64 0:3.8.0-4.el7

Complete!

4. In RHEL 7, init run levels are represented as "targets", so we need to check the current target using the command below

[root@redhat7-test tmp]# systemctl get-default
multi-user.target

5. Modify the target as "graphical target" using below command 

[root@redhat7-test tmp]# systemctl enable graphical.target --force
rm '/etc/systemd/system/default.target'
ln -s '/usr/lib/systemd/system/graphical.target' '/etc/systemd/system/default.target'

[root@redhat7-test tmp]# systemctl get-default
graphical.target

6. Reboot the system and accept the license agreement (this screen is a bit confusing; select the options carefully per the instructions).

That's it ....................

*********************************************************************************

Space cleanup


Sometimes while cleaning up a file system we find that even if we compress or delete big files, the freed space is not reflected in the partition, which causes a lot of confusion about what is happening.

1. First we compress or delete the files that are big in size. Here we need to compress the root mailbox.

[root@testserver mail]# ls -lrt
total 660744
----
-rw------- 1 root     root 3131137980 Dec  8 16:30 root.

[root@testserver mail] # df -h /var

Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-varvol
                      3.9G  3.5G  300M  95% /var

[root@testserver mail]# gzip root

[root@testserver mail]# df -h /var

Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-varvol
                      3.9G  3.9G  0  100% /var
2. From the above exercise we can see that the usage of /var increased to 100% even though we compressed the root mailbox. So what do we do next?

3. We need to find the processes that are still holding the deleted root mailbox file open, using the command below

[root@testserver mail]#lsof | grep -i deleted
.........................output is omitted............
gdm-rh-se  6085      root  txt       REG              253,1      49184    3801116 /usr/libexec/gdm-rh-security-token-helper.#prelink#.LcSyru (deleted)
yum-updat  6119      root  txt       REG              253,1       4736     263076 /usr/bin/python.#prelink# (deleted)
gzip      10101      root    3r      REG              253,2 3038580296     458776 /var/spool/mail/root (deleted)
gzip      10102      root    3r      REG              253,2 3038580296     458776 /var/spool/mail/root (deleted)
.............................output is omitted ...................

Here the PIDs are 10101 and 10102, which are holding the deleted file open.

4. So go into the fd directory under /proc for that PID
[root@testserver mail]#cd /proc/10101/fd

[root@testserver fd]# ls -l
total 0
lrwx------ 1 root root 64 Dec  8 16:40 0 -> /dev/pts/1
lrwx------ 1 root root 64 Dec  8 16:40 1 -> /dev/pts/1
lrwx------ 1 root root 64 Dec  8 16:40 2 -> /dev/pts/1
lr-x------ 1 root root 64 Dec  8 16:40 3 -> /var/spool/mail/root (deleted)

5. Now we need to truncate the corresponding file descriptor for these PIDs

[root@testserver fd]# > 3

6. Now check the size of the /var partition again; it has been cleaned up

[root@ testserver fd]# df -h /var
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-varvol
                      3.9G  974M  2.8G  26% /var
*******************************************************************************

How to change hostname and ip address in RHEL 7



RHEL 7 is quite different from earlier Red Hat versions, and here I am providing the steps to change the hostname and IP address in RHEL 7.


1. In RHEL 7 the hostname is saved in /etc/hostname, so changing the hostname is easy: edit this file and set the new name.

[root@redhat7-test ~]# cat /etc/hostname
redhat7-test
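
Alternatively (a standard RHEL 7 / systemd facility), hostnamectl can set the hostname and update /etc/hostname in one step; the name below is just a placeholder:

[root@redhat7-test ~]# hostnamectl set-hostname newhostname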


2. We need to change the IP address in the below file for RHEL 7

[root@redhat7-test ~]# cat /etc/sysconfig/network-scripts/ifcfg-enp0s3
TYPE=Ethernet
BOOTPROTO=none
DEFROUTE=yes
PEERDNS=yes
PEERROUTES=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=enp0s3
UUID=d65c34f9-dd5d-4190-9e4f-f7f89c4df01f
IPADDR=192.168.1.19
NETMASK=255.255.255.0
DEVICE=enp0s3
ONBOOT=yes

3. Once we update the IP address, restart the network service using the command below

[root@redhat7-test ~]#systemctl restart network

4. Check the IP address using the command below

[root@redhat7-test ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:6d:64:e7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.19/24 brd 192.168.1.255 scope global enp0s3
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe6d:64e7/64 scope link
       valid_lft forever preferred_lft forever
*************************************************************************

Creating local repo in RHEL 7


1. First mount the RHEL 7 DVD from the drive onto a local mount point

[root@redhat7-test /]#mount -o loop /dev/cdrom /tmp/

[root@redhat7-test /]# df -h
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/rhel-root   14G  4.6G  9.5G  33% /
devtmpfs               488M     0  488M   0% /dev
tmpfs                  497M     0  497M   0% /dev/shm
tmpfs                  497M  6.6M  491M   2% /run
tmpfs                  497M     0  497M   0% /sys/fs/cgroup
/dev/sda1              4.7G  125M  4.6G   3% /boot
/dev/loop0             3.7G  3.7G     0 100% /tmp

 2. Create a directory for the RHEL 7 repository

 [root@redhat7-test /]#mkdir -p /var/www/html/rhel7

3. Now we need to copy the contents to the local directory 

 [root@redhat7-test /]#cd /tmp

[root@redhat7-test /]#tar cvf - . | (cd /var/www/html/rhel7/; tar xvf -)

 [root@redhat7-test /]#cd /; umount /tmp/

4. Go to the directory where the repository configuration is 

[root@redhat7-test tmp]# cd /etc/yum.repos.d/

configure the file as below (file name is rhel7.repo)

[rhel7]
name=rhel7
baseurl=file:///var/www/html/rhel7/
enabled=1
gpgcheck=0

5. Now execute below commands 

[root@redhat7-test /]#yum clean all

[root@redhat7-test /]#yum repolist all

..............output is omitted...................................................

6. Now we need to install the createrepo command

[root@redhat7-test /]yum install -y createrepo

Loaded plugins: product-id, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
rhel7                                                                                                                                            | 4.1 kB  00:00:00
Resolving Dependencies
--> Running transaction check
---> Package createrepo.noarch 0:0.9.9-23.el7 will be installed
--> Processing Dependency: deltarpm for package: createrepo-0.9.9-23.el7.noarch
--> Processing Dependency: python-deltarpm for package: createrepo-0.9.9-23.el7.noarch
--> Running transaction check
---> Package deltarpm.x86_64 0:3.6-3.el7 will be installed
---> Package python-deltarpm.x86_64 0:3.6-3.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

========================================================================================================================================================================
 Package                                      Arch                                Version                                      Repository                          Size
========================================================================================================================================================================
Installing:
 createrepo                                   noarch                              0.9.9-23.el7                                 rhel7                               92 k
Installing for dependencies:
 deltarpm                                     x86_64                              3.6-3.el7                                    rhel7                               82 k
 python-deltarpm                              x86_64                              3.6-3.el7                                    rhel7                               31 k

Transaction Summary
========================================================================================================================================================================
Install  1 Package (+2 Dependent packages)

Total download size: 205 k
Installed size: 553 k
Downloading packages:
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                                                   9.1 MB/s | 205 kB  00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : deltarpm-3.6-3.el7.x86_64                                                                                                                            1/3
  Installing : python-deltarpm-3.6-3.el7.x86_64                                                                                                                     2/3
  Installing : createrepo-0.9.9-23.el7.noarch                                                                                                                       3/3
rhel7/productid                                                                                                                                  | 1.6 kB  00:00:00
  Verifying  : python-deltarpm-3.6-3.el7.x86_64                                                                                                                     1/3
  Verifying  : deltarpm-3.6-3.el7.x86_64                                                                                                                            2/3
  Verifying  : createrepo-0.9.9-23.el7.noarch                                                                                                                       3/3

Installed:
  createrepo.noarch 0:0.9.9-23.el7

Dependency Installed:
  deltarpm.x86_64 0:3.6-3.el7                                                     python-deltarpm.x86_64 0:3.6-3.el7

Complete!

7. Create the repository at /var/www/html/rhel7 ( this command will take some time to execute) 

[root@redhat7-test yum.repos.d]# createrepo /var/www/html/rhel7
Spawning worker 0 with 4432 pkgs

Workers Finished
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete

Our local repository is ready .. 


Tuesday, November 17, 2015

How to check IDs with sudo access in Solaris 10

Here I am providing a command to list the IDs with sudo access in Solaris 10.


*********************************************************************************
cat /etc/passwd | cut -d: -f1 | xargs -L1 sudo -l -U | ggrep -B1  "(ALL) ALL" | grep ^User | cut -d" " -f2

*********************************************************************************

Monday, November 9, 2015

Booting procedure in Solaris SPARC architecture

                       Booting procedure in Solaris  SPARC architecture 




The different phases of the Solaris boot process are described below.


Power on –> POST –> Boot device (sectors 1-15, bootblk) –> ufsboot loader –> Kernel –> root (/) file system –> /sbin/init –> /lib/svc/bin/svc.startd

Boot PROM base

The PROM displays the system identification along with the banner, host ID, MAC address, PROM chip release and version, and physical memory size. This phase also runs POST (Power On Self Test), the hardware diagnostic routine, and initializes the installed hardware.

We can see the POST messages through a serial console. If a serial console is not connected, you can review the hardware and diagnostic information with the command prtdiag -v.

Sample output is given below

After POST, the PROM loads the primary boot program called bootblk.

Boot program phase



This phase starts by reading the boot program stored in sectors 1-15 of the boot disk. The OBP (OpenBoot PROM) loads the primary boot program, bootblk, from the boot device. (If bootblk is not present it can be reinstalled by running the installboot command after booting from CD-ROM.)

ufsboot: This is the secondary boot program; it loads the kernel core image files.

kernel: The kernel file location is /platform/<arch>/kernel/sparcv9/unix (on a different architecture such as x86/amd64 the sparcv9 part changes accordingly). As part of kernel loading, the kernel banner is displayed, including the kernel version number. The kernel initializes itself and loads modules with the help of the ufsboot program until it has loaded enough modules to mount the root file system. If the system complains that it cannot write to the root file system, the boot procedure will get stuck in this phase.


The system parameters needed for booting are set in the /etc/system file. Its main directives are given below, followed by a small example:


  • moddir: Changes the search path for kernel modules.
  • forceload: Forces loading of a kernel module.
  • exclude: Excludes a particular kernel module.
  • rootfs: Specifies the file system type for the root file system (ufs is the default).
  • rootdev: Specifies the physical device path for root.
  • set: Sets the value of a tunable system parameter.
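
A minimal hedged sketch of /etc/system entries using these directives (the module and value shown are placeholders only):

* force the SCSI disk driver to load at boot (illustrative)
forceload: drv/sd
* set a tunable parameter (illustrative value)
set maxphys=1048576
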
Init initialization phase



The kernel starts /sbin/init as PID 1, and init in turn starts /lib/svc/bin/svc.startd, which is responsible for the tasks below:



a. configuring all network devices

b. mounting all file systems

c. starting all network services

d. running the rc scripts which bring the machine to multi-user mode



In Solaris 10, svc.startd is a separate boot process responsible for starting and stopping services during the boot process. Legacy services that start at boot and stop at shutdown are still configured through the /etc/init.d directory.



Different run levels in Solaris

init s –> single-user mode

init 1 –> administrative (maintenance) mode

init 2 –> multi-user mode (NFS disabled)

init 3 –> multi-user server mode (NFS shares enabled)

init 4 –> not implemented (reserved for future use)

init 5 –> shutdown & power off

init 6 –> shutdown & reboot

init 0 –> shutdown to the ok prompt
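
The current run level can be checked at any time with the standard who command:

# who -r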



To sum up, I am presenting all of these steps as a flowchart below.