Friday, May 15, 2015

7 Important CellCLI Commands for Exadata DBA


Now that you are an Exadata DBA, and I assume you know the basic Exadata components and features, your next responsibility is to manage the cell servers. This includes checking the health of the cell servers and maintaining them. Exadata has a very good health check tool called Exachk that a DBA can use. Oracle also introduced a cell command line interface, known as CellCLI, for database administrators to interact with cell servers.

Before starting with the CellCLI commands, let's first understand the storage structure of a cell server. All the storage is attached to the cell servers and is presented to the database servers only as ASM disks. First there is a physical disk, which is presented to the cell server as a LUN, and the Exadata software creates a cell disk on top of it. Each cell disk is then divided into three grid disks: data, reco and dbfs_dg. The grid disks are provided to ASM for use.

The relation between physical disk, LUN, cell disk and grid disk is given below:

[Figure: physical disk -> LUN -> cell disk -> grid disks -> ASM disks]

From the above diagram it is clear that to work at the Exadata cell level we need to use CellCLI and sometimes Linux commands.

Here I am listing some of the most common uses of CellCLI commands for Exadata DBAs to manage cell servers.

1. Check Cell Status: These commands are used to check the cell status. They list basic information about a cell, such as the cell name, its status, and the status of the cellsrv, ms and rs services.
CellCLI> list cell
exadatalcel10      online
For a detailed display of the cell configuration, use the command below.
CellCLI> list cell detail
         name:                   exadatalcel10
         bbuTempThreshold:       60
         bbuChargeThreshold:     800
         bmcType:                IPMI
         cellVersion:            OSS_11.2.3.2.1_LINUX.X64_130109
         cpuCount:               24
         diagHistoryDays:        7
         fanCount:               12/12
         fanStatus:              normal
         flashCacheMode:         WriteThrough
         id:                     1038FMKIRLN
         interconnectCount:      3
         interconnect1:          bondib0
         iormBoost:              0.0
         ipaddress1:             192.168.11.23/25
         kernelVersion:          2.6.32-400.11.1.el5uek
         locatorLEDStatus:       off
         makeModel:              Oracle Corporation SUN FIRE X4270 M2 SERVER SAS
         metricHistoryDays:      7
         offloadEfficiency:      1,000.0
         powerCount:             2/2
         powerStatus:            normal
         releaseVersion:         11.2.3.2.1
         releaseTrackingBug:     14522699
         status:                 online
         temperatureReading:     29.0
         temperatureStatus:      normal
         upTime:                 46 days, 2:31
         cellsrvStatus:          running
         msStatus:               running
         rsStatus:               running
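The Exadata DBA should check the status fields in the above output, such as status, fanStatus, powerStatus, temperatureStatus, cellsrvStatus, msStatus and rsStatus, to assess the health of the Exadata cell.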
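This health check can be scripted. The following is a minimal Python sketch, not an Oracle-provided tool. It assumes the output of list cell detail has been captured as text (for example with cellcli -e). The names parse_detail, EXPECTED and unhealthy are hypothetical helpers introduced only for illustration, and the set of attributes treated as health indicators is an assumption based on the status-like fields in the output above.

```python
# Sketch: flag attributes in `list cell detail` output that are not healthy.
# SAMPLE is an excerpt of the output shown above. EXPECTED is an assumed set
# of health-related attributes, not an official Oracle checklist.

SAMPLE = """\
         name:                   exadatalcel10
         fanStatus:              normal
         powerStatus:            normal
         status:                 online
         temperatureStatus:      normal
         upTime:                 46 days, 2:31
         cellsrvStatus:          running
         msStatus:               running
         rsStatus:               running
"""

EXPECTED = {
    "status": "online",
    "fanStatus": "normal",
    "powerStatus": "normal",
    "temperatureStatus": "normal",
    "cellsrvStatus": "running",
    "msStatus": "running",
    "rsStatus": "running",
}


def parse_detail(text):
    """Turn 'name: value' lines of a detail listing into a dict."""
    attrs = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as "2:31" stay intact.
        key, sep, value = line.strip().partition(":")
        if sep:
            attrs[key.strip()] = value.strip()
    return attrs


def unhealthy(attrs):
    """Return the attributes whose value differs from the expected one."""
    return {k: attrs.get(k) for k, want in EXPECTED.items() if attrs.get(k) != want}


problems = unhealthy(parse_detail(SAMPLE))
print("cell looks healthy" if not problems else f"check these fields: {problems}")
```

An empty result means every monitored attribute has its expected value. Any other result names the fields to look at first.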

To display only a few fields of the cell output, use the attributes keyword followed by a comma-separated list of attribute names.

CellCLI> list cell attributes name,interconnectCount
         exadatalcel10      3
2. Physical Disk Information: As shown in the figure above, all physical disks are attached to cell servers, so a DBA can list all the physical disks on a server along with their details.
CellCLI> list physicaldisk
         20:0            E15SBS          normal
         20:1            E15QMK          normal
         20:2            E18SWT          normal
         20:3            E18SW4          normal
         20:4            E18VFV          normal
         20:5            E138FM          normal
         20:6            E18SW7          normal
         20:7            E18V8C          normal
         20:8            E13HC9          normal
         20:9            E1370B          normal
         20:10           E12300          normal
         20:11           E18VG7          normal
         FLASH_1_0       0944M01FMP      normal
         FLASH_1_1       0944M01F8X      normal
         FLASH_1_2       0944M01F7A      normal
         FLASH_1_3       0944M01FRD      normal
         FLASH_2_0       1030M03TT5      normal
         FLASH_2_1       1030M03TWV      normal
         FLASH_2_2       1030M03TT6      normal
         FLASH_2_3       1030M03TTD      normal
         FLASH_4_0       1030M03T3U      normal
         FLASH_4_1       1030M03RQL      normal
         FLASH_4_2       1030M03RQM      normal
         FLASH_4_3       1030M03TEH      normal
         FLASH_5_0       0929M00Q2P      normal
         FLASH_5_1       0938M016CG      normal
         FLASH_5_2       0940M018AY      normal
         FLASH_5_3       0942M01AJA      normal
The above command shows that this cell has 12 hard disks (20:0 to 20:11) and 4 flash cards (FLASH_1, FLASH_2, FLASH_4 and FLASH_5), and each flash card has four flash modules, such as FLASH_1_0 to FLASH_1_3, for a total of 16 flash disks.
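Counting disks by eye is error-prone, so the tally can be cross-checked with a short script. This is only a sketch under the assumption that the list physicaldisk output above has been saved as text. The function name summarize is hypothetical, and the rule "names starting with FLASH_ are flash modules named FLASH_<card>_<module>" is inferred from this listing rather than taken from Oracle documentation.

```python
# Sketch: count hard disks and flash cards in a `list physicaldisk` listing.
# LISTING is copied from the output above.

LISTING = """\
         20:0            E15SBS          normal
         20:1            E15QMK          normal
         20:2            E18SWT          normal
         20:3            E18SW4          normal
         20:4            E18VFV          normal
         20:5            E138FM          normal
         20:6            E18SW7          normal
         20:7            E18V8C          normal
         20:8            E13HC9          normal
         20:9            E1370B          normal
         20:10           E12300          normal
         20:11           E18VG7          normal
         FLASH_1_0       0944M01FMP      normal
         FLASH_1_1       0944M01F8X      normal
         FLASH_1_2       0944M01F7A      normal
         FLASH_1_3       0944M01FRD      normal
         FLASH_2_0       1030M03TT5      normal
         FLASH_2_1       1030M03TWV      normal
         FLASH_2_2       1030M03TT6      normal
         FLASH_2_3       1030M03TTD      normal
         FLASH_4_0       1030M03T3U      normal
         FLASH_4_1       1030M03RQL      normal
         FLASH_4_2       1030M03RQM      normal
         FLASH_4_3       1030M03TEH      normal
         FLASH_5_0       0929M00Q2P      normal
         FLASH_5_1       0938M016CG      normal
         FLASH_5_2       0940M018AY      normal
         FLASH_5_3       0942M01AJA      normal
"""


def summarize(listing):
    """Return (hard disk names, {flash card number: [module numbers]})."""
    hard_disks = []
    flash_cards = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name = fields[0]
        if name.startswith("FLASH_"):
            _, card, module = name.split("_")
            flash_cards.setdefault(card, []).append(module)
        else:
            hard_disks.append(name)
    return hard_disks, flash_cards


hard_disks, flash_cards = summarize(LISTING)
print(len(hard_disks), "hard disks")
print(len(flash_cards), "flash cards:", sorted(flash_cards))
print(sum(len(m) for m in flash_cards.values()), "flash disks")
```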

To look at the details of an individual physical disk, use the following.
CellCLI> list physicaldisk E15SBS detail

         name:                   20:0
         deviceId:               19
         diskType:               HardDisk
         enclosureDeviceId:      20
         errMediaCount:          0
         errOtherCount:          0
         foreignState:           false
         luns:                   0_0
         makeModel:              "SEAGATE ST360057SSUN600G"
         physicalFirmware:       0B25
         physicalInsertTime:     2013-03-10T14:45:18-04:00
         physicalInterface:      sas
         physicalSerial:         E15SBS
         physicalSize:           558.9109999993816G
         slotNumber:             0
         status:                 normal

CellCLI> list physicaldisk FLASH_5_3 detail

         name:                   FLASH_5_3
         diskType:               FlashDisk
         luns:                   5_3
         makeModel:              "Sun Flash Accelerator F20 PCIe Card"
         physicalFirmware:       D21Y
         physicalInsertTime:     2013-03-10T14:45:18-04:00
         physicalSerial:         0942M01AJA
         physicalSize:           22.8880615234375G
         slotNumber:             "PCI Slot: 5; FDOM: 3"
         status:                 normal
In the two outputs above, a database administrator can easily identify whether it is a hard disk or a flash disk using the diskType field.

Some other options are:

CellCLI> list physicaldisk attributes name, disktype, errCmdTimeoutCount, errHardReadCount, errHardWriteCount
CellCLI>  list physicaldisk where diskType='Flashdisk'
CellCLI> list physicaldisk where diskType=flashdisk and status='poor performance' detail

3. LUN Details: A physical disk is presented as a LUN to the cell storage server. Presenting physical disks as LUNs is an automatic process handled by the Exadata software itself, but sometimes a LUN is not presented, and in that case the Exadata DBA has to check the LUN details.
CellCLI> list lun

         0_0     0_0     normal
         0_1     0_1     normal
         0_2     0_2     normal
         0_3     0_3     normal
         0_4     0_4     normal
         0_5     0_5     normal
         0_6     0_6     normal
         0_7     0_7     normal
         0_8     0_8     normal
         0_9     0_9     normal
         0_10    0_10    normal
         0_11    0_11    normal
         1_0     1_0     normal
         1_1     1_1     normal
         1_2     1_2     normal
         1_3     1_3     normal
         2_0     2_0     normal
         2_1     2_1     normal
         2_2     2_2     normal
         2_3     2_3     normal
         4_0     4_0     normal
         4_1     4_1     normal
         4_2     4_2     normal
         4_3     4_3     normal 
         5_0     5_0     normal
         5_1     5_1     normal
         5_2     5_2     normal
         5_3     5_3     normal 

CellCLI> list lun 0_0 detail 
         name:                   0_0
         cellDisk:               CD_00_exadatalcel10
         deviceName:             /dev/sda
         diskType:               HardDisk
         id:                     0_0
         isSystemLun:            TRUE
         lunAutoCreate:          FALSE
         lunSize:                557.861328125G
         lunUID:                 0_0
         physicalDrives:         28:0
         raidLevel:              0
         lunWriteCacheMode:      "WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU"
         status:                 normal

CellCLI> list lun 5_3 detail

         name:                   5_3
         cellDisk:               FD_15_exadatalcel10
         deviceName:             /dev/sdy
         diskType:               FlashDisk
         id:                     5_3
         isSystemLun:            FALSE
         lunAutoCreate:          FALSE
         lunSize:                22.8880615234375G
         physicalDrives:         FLASH_5_3
         status:                 normal
The differences between the two LUNs 0_0 and 5_3 above are:

 a. 0_0 is a hard disk while 5_3 is a flash disk.
 b. 0_0 is a system LUN whereas 5_3 is not a system LUN.

The important thing for an Exadata DBA to notice is the LUN status, shown as "status: normal". If the status is not normal, the DBA has to open an SR with Oracle Support.

A few more LUN-related commands:
CellCLI> list lun attributes name, cellDisk, raidLevel, status
CellCLI> list lun where disktype=harddisk 

4. Cell Disks Report: LUNs are presented as cell disks to the cell server. Cell disks are part of Exadata technology, whereas physical disks and LUNs are common Linux OS concepts.
CellCLI> list celldisk
         CD_00_exadatalcel10        normal
         CD_01_exadatalcel10        normal
         CD_02_exadatalcel10        normal
         CD_03_exadatalcel10        normal
         CD_04_exadatalcel10        normal
         CD_05_exadatalcel10        normal
         CD_06_exadatalcel10        normal
         CD_07_exadatalcel10        normal
         CD_08_exadatalcel10        normal
         CD_09_exadatalcel10        normal
         CD_10_exadatalcel10        normal
         CD_11_exadatalcel10        normal
         FD_00_exadatalcel10        normal
         FD_01_exadatalcel10        normal
         FD_02_exadatalcel10        normal
         FD_03_exadatalcel10        normal
         FD_04_exadatalcel10        normal
         FD_05_exadatalcel10        normal
         FD_06_exadatalcel10        normal
         FD_07_exadatalcel10        normal
         FD_08_exadatalcel10        normal
         FD_09_exadatalcel10        normal
         FD_10_exadatalcel10        normal
         FD_11_exadatalcel10        normal
         FD_12_exadatalcel10        normal
         FD_13_exadatalcel10        normal
         FD_14_exadatalcel10        normal
         FD_15_exadatalcel10        normal
  
CellCLI> list celldisk CD_08_exadatalcel10 detail

         name:                   CD_08_exadatalcel10
         comment:
         creationTime:           2013-03-12T13:44:44-04:00
         deviceName:             /dev/sdi
         devicePartition:        /dev/sdi
         diskType:               HardDisk
         errorCount:             0
         freeSpace:              0
         id:                     81d86abc-6e34-4775-a8a3-ccb8c7adeabe
         interleaving:           none
         lun:                    0_8
         physicalDisk:           E18SVS
         raidLevel:              0
         size:                   557.859375G
         status:                 normal
A few other cell disk commands:

CellCLI> list celldisk attributes name, devicePartition where size>200g;
CellCLI> list celldisk attributes name,status,size 

5. Grid Disk Knowledge: Each hard disk type cell disk is divided into three grid disks: reco, data and dbfs. Reco is used for the fast recovery area (archived redo logs and backups), data is used for storing database datafiles, and dbfs is used for other purposes, such as the cluster OCR and voting files and the DBFS file system.

By default, data is stored only in hard disk type grid disks, not in flash disk type grid disks. Flash disks are used for cell-level data caching.

Each Exadata cell has 12 hard disks, and if every hard disk carried three grid disks there would be 36 grid disks in total.

CellCLI> list griddisk

         DATA_DMORL_CD_00_exadatalcel10     active
         DATA_DMORL_CD_01_exadatalcel10     active
         DATA_DMORL_CD_02_exadatalcel10     active
         DATA_DMORL_CD_03_exadatalcel10     active
         DATA_DMORL_CD_04_exadatalcel10     active
         DATA_DMORL_CD_05_exadatalcel10     active
         DATA_DMORL_CD_06_exadatalcel10     active
         DATA_DMORL_CD_07_exadatalcel10     active
         DATA_DMORL_CD_08_exadatalcel10     active
         DATA_DMORL_CD_09_exadatalcel10     active
         DATA_DMORL_CD_10_exadatalcel10     active
         DATA_DMORL_CD_11_exadatalcel10     active
         DBFS_DG_CD_02_exadatalcel10        active
         DBFS_DG_CD_03_exadatalcel10        active
         DBFS_DG_CD_04_exadatalcel10        active
         DBFS_DG_CD_05_exadatalcel10        active
         DBFS_DG_CD_06_exadatalcel10        active
         DBFS_DG_CD_07_exadatalcel10        active
         DBFS_DG_CD_08_exadatalcel10        active
         DBFS_DG_CD_09_exadatalcel10        active
         DBFS_DG_CD_10_exadatalcel10        active
         DBFS_DG_CD_11_exadatalcel10        active
         RECO_DMORL_CD_00_exadatalcel10     active
         RECO_DMORL_CD_01_exadatalcel10     active
         RECO_DMORL_CD_02_exadatalcel10     active
         RECO_DMORL_CD_03_exadatalcel10     active
         RECO_DMORL_CD_04_exadatalcel10     active
         RECO_DMORL_CD_05_exadatalcel10     active
         RECO_DMORL_CD_06_exadatalcel10     active
         RECO_DMORL_CD_07_exadatalcel10     active
         RECO_DMORL_CD_08_exadatalcel10     active
         RECO_DMORL_CD_09_exadatalcel10     active
         RECO_DMORL_CD_10_exadatalcel10     active
         RECO_DMORL_CD_11_exadatalcel10     active 
From the above output, the Exadata DBA can see that there are 12 DATA grid disks and 12 RECO grid disks but only 10 DBFS_DG grid disks, which makes 34 grid disks in total instead of 36. The first two disks (CD_00 and CD_01) have no DBFS_DG grid disk because the corresponding space on them is used for the cell operating system and software and its RAID configuration.
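The same count can be made programmatically. The sketch below is an illustration, not part of CellCLI. It rebuilds the grid disk names from the listing above (DATA and RECO on CD_00 to CD_11, DBFS_DG on CD_02 to CD_11) and assumes every grid disk name has the form <prefix>_CD_<nn>_<cell name>. The helper name group_by_prefix is hypothetical.

```python
# Sketch: group grid disk names by disk group prefix and find which cell
# disks have no grid disk for a given prefix.
from collections import defaultdict

CELL = "exadatalcel10"

# Names rebuilt from the `list griddisk` output above.
GRID_DISKS = (
    [f"DATA_DMORL_CD_{i:02d}_{CELL}" for i in range(12)]
    + [f"DBFS_DG_CD_{i:02d}_{CELL}" for i in range(2, 12)]
    + [f"RECO_DMORL_CD_{i:02d}_{CELL}" for i in range(12)]
)


def group_by_prefix(names):
    """Map each prefix (text before '_CD_') to the set of cell disks it uses."""
    groups = defaultdict(set)
    for name in names:
        prefix, sep, rest = name.partition("_CD_")
        if sep:
            groups[prefix].add("CD_" + rest)
    return groups


groups = group_by_prefix(GRID_DISKS)
all_cell_disks = set().union(*groups.values())

print("total grid disks:", len(GRID_DISKS))
for prefix in sorted(groups):
    missing = sorted(all_cell_disks - groups[prefix])
    print(prefix, len(groups[prefix]), "grid disks; none on:", missing)
```

For this cell the only prefix with missing cell disks is DBFS_DG, and the two cell disks it skips are CD_00 and CD_01.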

Now, let's cross-check the size of a cell disk against the sum of the grid disks created on it.

First the DBA lists the cell disk, and then each grid disk made out of it.
CellCLI> list celldisk CD_02_exadatalcel10 detail
         name:                   CD_02_exadatalcel10
         comment:
         creationTime:           2013-03-12T13:44:41-04:00
         deviceName:             /dev/sdc
         devicePartition:        /dev/sdc
         diskType:               HardDisk
         errorCount:             0
         freeSpace:              0
         id:                     2b5fc449-581a-431d-96ed-9fea2b1ae7dd
         interleaving:           none
         lun:                    0_2
         physicalDisk:           E15QLT
         raidLevel:              0
         size:                   557.859375G
         status:                 normal

CellCLI> list griddisk DATA_DMORL_CD_02_exadatalcel10 detail
         name:                   DATA_DMORL_CD_02_exadatalcel10
         asmDiskgroupName:       DATA_DMORL
         asmDiskName:            DATA_DMORL_CD_02_exadatalcel10
         asmFailGroupName:       exadatalcel10
         availableTo:
         cachingPolicy:          default
         cellDisk:               CD_02_exadatalcel10
         comment:
         creationTime:           2013-03-12T13:48:24-04:00
         diskType:               HardDisk
         errorCount:             0
         id:                     12a03284-bec7-4187-856b-b1aa1d8a112f
         offset:                 32M
         size:                   423G
         status:                 active

CellCLI> list griddisk DBFS_DG_CD_02_exadatalcel10 detail
         name:                   DBFS_DG_CD_02_exadatalcel10
         asmDiskgroupName:       DBFS_DG
         asmDiskName:            DBFS_DG_CD_02_exadatalcel10
         asmFailGroupName:       exadatalcel10
         availableTo:
         cachingPolicy:          default
         cellDisk:               CD_02_exadatalcel10
         comment:
         creationTime:           2013-03-12T13:47:15-04:00
         diskType:               HardDisk
         errorCount:             0
         id:                     123ba6ce-b761-4645-9d58-08ee669777e8
         offset:                 528.734375G
         size:                   29.125G
         status:                 active

CellCLI> list griddisk RECO_DMORL_CD_02_exadatalcel10 detail
         name:                   RECO_DMORL_CD_02_exadatalcel10
         asmDiskgroupName:       RECO_DMORL
         asmDiskName:            RECO_DMORL_CD_02_exadatalcel10
         asmFailGroupName:       exadatalcel10
         availableTo:
         cachingPolicy:          none
         cellDisk:               CD_02_exadatalcel10
         comment:
         creationTime:           2013-03-12T13:48:31-04:00
         diskType:               HardDisk
         errorCount:             0
         id:                     cc9c6665-822c-4f92-84b1-b6e63411c3a6
         offset:                 423.046875G
         size:                   105.6875G
         status:                 active
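Here, cell disk CD_02_exadatalcel10 has a size of 557.859375G (about 558G), which is divided into three grid disks: DATA_DMORL_CD_02_exadatalcel10 with 423G, RECO_DMORL_CD_02_exadatalcel10 with 105.6875G and DBFS_DG_CD_02_exadatalcel10 with 29.125G. The three together total 557.8125G. The remaining 48M is accounted for by the 32M offset before the first grid disk and a 16M gap before the second, and freeSpace is 0. Hence this is verified.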
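The arithmetic can be checked exactly by working in megabytes with the sizes and offsets from the detail output above rather than rounded values. This is a small sketch. The variable and function names are made up for illustration, and it assumes the G and M units reported by CellCLI are binary (1G = 1024M).

```python
# Sketch: check that the grid disks on CD_02_exadatalcel10 fill the cell disk.
# Values are (offset in G, size in G), copied from the detail output above.

CELL_DISK_GIB = 557.859375

GRID_DISKS = {
    "DATA_DMORL_CD_02_exadatalcel10": (32 / 1024, 423.0),  # offset is 32M
    "RECO_DMORL_CD_02_exadatalcel10": (423.046875, 105.6875),
    "DBFS_DG_CD_02_exadatalcel10": (528.734375, 29.125),
}


def to_mib(gib):
    """Convert G to whole M, assuming 1G = 1024M."""
    return round(gib * 1024)


cell_mib = to_mib(CELL_DISK_GIB)
allocated_mib = sum(to_mib(size) for _, size in GRID_DISKS.values())
end_of_last_mib = max(to_mib(off) + to_mib(size) for off, size in GRID_DISKS.values())

print("cell disk size      :", cell_mib, "M")
print("sum of grid disks   :", allocated_mib, "M =", allocated_mib / 1024, "G")
print("not in any grid disk:", cell_mib - allocated_mib, "M")
print("last grid disk ends at the end of the cell disk:", end_of_last_mib == cell_mib)
```

The last grid disk (DBFS_DG) ends exactly at the cell disk size, which agrees with freeSpace being 0.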

Some other commands:

CellCLI> list griddisk attributes name, cellDisk, diskType where disktype='harddisk'
CellCLI> list griddisk attributes name where asmdeactivationoutcome='yes' 

6. Display Exadata Alerts: Just as the database writes error messages to the database alert log file, the cell server writes cell-related alerts to the cell alert log file.

The location of the cell alert log file for an Exadata cell is /opt/oracle/cell/log/diag/asm/cell/{node name}/trace/alert.log, or if the CELLTRACE variable is set, just do cd $CELLTRACE. An Exadata DBA can also check cell alerts using CellCLI commands.

CellCLI> list alerthistory

         1       2013-03-10T14:45:37-04:00       info            "Factory defaults restored for Adapter 0"
         2       2013-03-10T14:45:39-04:00       info            "Factory defaults restored for Adapter 0"
         3_1     2013-03-10T14:50:02-04:00       critical        "Cell configuration check discovered the following problems:   Check Exadata configuration via ipconf utility Verifying of Exadata configuration file /opt/oracle.cellos/cell.conf Error. Exadata configuration file not found /opt/oracle.cellos/cell.conf [INFO] The ipconf check may generate a failure for temporary inability to reach NTP or DNS server. You may ignore this alert, if the NTP or DNS servers are valid and available. [INFO] You may ignore this alert, if the NTP or DNS servers are valid and available. [INFO] As root user run /usr/local/bin/ipconf -verify -semantic to verify consistent network configurations."
         3_2     2013-03-10T15:04:30-04:00       clear           "The cell configuration check was successful."
         4       2013-03-10T14:59:18-04:00       critical        "RS-7445 [Required IP parameters missing] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []"

CellCLI> list alerthistory 8_1 detail

         name:                   8_1
         alertMessage:           "Cell configuration check discovered the following problems:   Check Exadata configuration via ipconf utility Verifying of Exadata configuration file /opt/oracle.cellos/cell.conf Checking DNS server on 192.135.82.132                                                            : FAILED Error. Overall status of verification of Exadata configuration file: FAILED [INFO] The ipconf check may generate a failure for temporary inability to reach NTP or DNS server. You may ignore this alert, if the NTP or DNS servers are valid and available. [INFO] You may ignore this alert, if the NTP or DNS servers are valid and available. [INFO] As root user run /usr/local/bin/ipconf -verify -semantic to verify consistent network configurations."
         alertSequenceID:        8
         alertShortName:         Software
         alertType:              Stateful
         beginTime:              2013-05-03T22:46:50-04:00
         endTime:                2013-05-04T22:46:35-04:00
         examinedBy:
         metricObjectName:       checkconfig
         notificationState:      0
         sequenceBeginTime:      2013-05-03T22:46:50-04:00
         severity:               critical
         alertAction:            "Correct the configuration problems. Then run cellcli command:   ALTER CELL VALIDATE CONFIGURATION   Verify that the new configuration is correct."
The DBA has to give priority to alerts with "severity: critical".
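When the alert history is long, the critical entries can be picked out of saved output with a few lines of Python. This is a sketch under assumptions: the list alerthistory output has been captured as text, each line has the four columns shown above (name, time, severity, quoted message), and the names parse_alerts and URGENT are hypothetical. The sample uses three of the alert lines from the listing above.

```python
# Sketch: extract high-severity entries from `list alerthistory` output.

LISTING = """\
         1       2013-03-10T14:45:37-04:00       info            "Factory defaults restored for Adapter 0"
         3_2     2013-03-10T15:04:30-04:00       clear           "The cell configuration check was successful."
         4       2013-03-10T14:59:18-04:00       critical        "RS-7445 [Required IP parameters missing] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []"
"""

URGENT = {"critical", "warning"}  # assumed set of severities worth a first look


def parse_alerts(listing):
    """Split each line into name, time, severity and message."""
    alerts = []
    for line in listing.splitlines():
        parts = line.split(None, 3)  # the message keeps its internal spaces
        if len(parts) == 4:
            name, when, severity, message = parts
            alerts.append({
                "name": name,
                "time": when,
                "severity": severity,
                "message": message.strip().strip('"'),
            })
    return alerts


alerts = parse_alerts(LISTING)
urgent = [a for a in alerts if a["severity"] in URGENT]
for a in urgent:
    print(a["name"], a["time"], a["severity"], a["message"])
```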

A few other commands:

CellCLI>  list alerthistory where severity like '[warning|critical]'
CellCLI>  list alertdefinition detail
 

7. Restart Cell Services: A cell server runs three services: cellsrv, ms and rs. Sometimes the Exadata database administrator has to check their status and restart them.

To check the service status, use:
CellCLI> list cell detail

         name:                   exadatalcel10
         bbuTempThreshold:       60
         bbuChargeThreshold:     800
         bmcType:                IPMI
         cellVersion:            OSS_11.2.3.2.1_LINUX.X64_130109
         cpuCount:               24
         diagHistoryDays:        7
         fanCount:               12/12
         fanStatus:              normal
         flashCacheMode:         WriteThrough
         id:                     1038FMM04N
         interconnectCount:      3
         interconnect1:          bondib0
         iormBoost:              0.0
         ipaddress1:             192.168.10.19/22
         kernelVersion:          2.6.32-400.11.1.el5uek
         locatorLEDStatus:       off
         makeModel:              Oracle Corporation SUN FIRE X4270 M2 SERVER SAS
         metricHistoryDays:      7
         offloadEfficiency:      1,000.0
         powerCount:             2/2
         powerStatus:            normal
         releaseVersion:         11.2.3.2.1
         releaseTrackingBug:     14522699
         status:                 online
         temperatureReading:     29.0
         temperatureStatus:      normal
         upTime:                 47 days, 3:25
         cellsrvStatus:          running
         msStatus:               running
         rsStatus:               running 
Here the last three lines of the output show the status of the services, which are all running. To restart a service, use the commands below.
CellCLI> alter cell restart services rs

Stopping RS services...
The SHUTDOWN of RS services was successful.
Starting the RS services...
Getting the state of RS services...  running

CellCLI> alter cell restart services ms
CellCLI> alter cell restart services cellsrv
 
To restart all services with one command, or to shut down an individual service, use the commands below.
CellCLI> alter cell restart services all

Stopping the RS, CELLSRV, and MS services...
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services...
Getting the state of RS services...  running
Starting CELLSRV services...
The STARTUP of CELLSRV services was successful.
Starting MS services...
The STARTUP of MS services was successful.
CellCLI> alter cell shutdown services rs
CellCLI> alter cell shutdown services ms
CellCLI> alter cell shutdown services cellsrv 
These are important commands from the perspective of Exadata DBA cell operations. Share in the comments if you know a few more useful commands.

