Thursday, February 28, 2019

Consolidating Error Logs from multiple SQL Servers using SSIS

Consolidating Error Logs from multiple SQL Servers using SSIS

Problem
SQL Server stores all informational messages, warning messages, and errors in operating system and application log files. As a database administrator (DBA), it is our responsibility to review the information in the error log files on a daily basis for any errors that have occurred in our SQL Server environment. However, manually reviewing the log files daily is not an easy task in a busy SQL Server environment, and it can quickly consume a large part of your day. Therefore, most DBAs tend to skip this when they are busy with other production problems. In this tip, I'll show you how we can consolidate error log entries from multiple servers into a central location, and then report on them from central location.
Solution
Before we talk about our custom solution for consolidating SQL error log, I will first state the purpose of SQL Server error log and show different ways for viewing it on SQL Server.

Purpose of SQL Server ErrorLog

SQL Server keeps logs for Database Mail, SQL Server (database server) and SQL Server Agent (error log) in operating system and application log files. SQL Server maintains a current log and archive logs. By default, it keeps six archive logs. Only logins that are members of the sysadmin and securityadmin fixed server role are able to view SQL Server logs.

Viewing SQL Server Logs Using SQL Server Management Studio

To open the Log File Viewer in Management Studio:
  1. Launch SQL Server Management Studio and then connect to a SQL Server instance.
  2. In Object Explorer, expand Management, SQL Server Logs.
  3. Right-click the current log or an archive log and choose View SQL Server Log.
errorlog_01.jpg
To open the Log File Viewer from the SQL Server Agent Error Logs:
  1. Expand SQL Server Agent, Error Logs.
  2. Right-click the current log or an archive log and choose View Agent Log.

Log File Viewer

The Log File Viewer is shown below:
errorlog_02.jpg
The Select logs window lets you choose from:
  • SQL Server
  • SQL Server Agent
  • Database Mail
  • Windows logs
You can view the Windows Application and System log  and with sufficient permissions the Security logs too. Log entry details display in the lower pane when you select a log entry. Most of the messages in the SQL Server logs are informational only and require no corrective action.

Viewing SQL Server Logs Using T-SQL

You can use sp_readerrorlog that is an undocumented stored procedure to view SQL Server error logs. It accepts log file number as a parameter. For example, to view the current log file, execute the sp_readerrorlog as follow:
EXEC [sp_readerrorlog] 0
GO

errorlog_03.jpg

Solution for Consolidating SQL Server Logs

Now that we understand the purpose of SQL Server logs, let's begin with our solution. This solution is built using SSIS, and will help you consolidate and merge error logs from multiple SQL Servers.
For the purpose of this article, create a database called ErrorLogMonitor on a SQL Server instance, where you want to consolidate the error logs.
Execute the following T-SQL script to create this database:
CREATE DATABASE [ErrorLogMonitor]
 ON  PRIMARY 
( NAME = N'ErrorLogMonitor'
, FILENAME = N'\ErrorLogMonitor.mdf' 
, SIZE = 4160KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
 LOG ON 
( NAME = N'ErrorLogMonitor_log'
, FILENAME = N'\ErrorLogMonitor_log.ldf' 
, SIZE = 1040KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO
Next, create the following two tables inside this database:
  • SSISServersList- The table stores the names of each SQL Server for which you want to read SQL Server error log.
  • ErrorLog- The table will store the error log data for all of the SQL Servers you are monitoring.
To create the tables, execute the following:
USE [ErrorLogMonitor]
GO
CREATE TABLE [dbo].[SSISServersList](
 [ServerName] [nvarchar](128) NOT NULL,
 [IsTest] [bit] NULL,
 [Active] [bit] NULL,
 [Port] [int] NULL) ON [PRIMARY]
GO
CREATE TABLE [dbo].[ErrorLog](
 [Server] [nvarchar](128) NOT NULL,
 [Date] [datetime] NULL,
 [ProcessInfo] [varchar](50) NULL,
 [Message] [nvarchar](4000) NULL,
 [AuditDate] [smalldatetime] NOT NULL,
 [LoadID] [int] NOT NULL
) ON [PRIMARY]
GO
Finally, we are ready to design our SSIS package. To do that launch SQL Server Data tools and create a new Integration Services project. Once the package is created, it will create an empty package. Rename this package to ConsolidateErrorLog_Package.dtsx.
Follow these steps to configure the package.

1: Defining Package Variables

Add the following two variables for the SSIS package:
  • Variable 1: SQLServer_Connection
    • Scope: ConsolidateErrorLog_Package
    • Data Type: String
    • Value:<You SQL Server instance name> (Specify the instance name where ErrorLogMonitor database is located).
  • Variable 2: SQLServer_ResultSet
    • Scope: ConsolidateErrorLog_Package
    • Data Type: Object
    • Value: System.Object
To do that, right-click anywhere on the empty panel on the Control Flow Tab, then select Variables and then click Add Variables icon (see below).
errorlog_04.jpg

2: Defining Package Connections

First create a new ADO.NET connection to the ErrorLogMonitor database on your SQL Server. To do this, click anywhere on the Connection Managers, and choose New ADO.NET Connection. Configure the ADO.NET connection as follow:
errorlog_05.jpg
Rename the connection to ADO_ErrorLogMonitor.
Next, create the dynamic OLE-DB connection, this connection allows us to connect to the different SQL Server instances. This connection is passed with SQLServer_Connection variable that contains SQL Server instance name from the SSISServerList table, which was created earlier. Rename the connection to Dynamic_ErrorLogMonitor.
Now right-click the Dynamic_ErrorLogMonitor connection and then choose Properties. First change the initial catalog to master, and then click (...) box next to expression and specify the following expression for ServerName property:
@[User::SQLServer_Connection]

errorlog_06.jpg

3: Defining Package Tasks

3.1: Configuring "Truncate ErrorLog table" - Execute SQL Task

Add an "Execute SQL Task" to the control flow. Double-click the task and configure properties in the "General" page of the task as follows:
  • Rename "Execute SQL Task" to "Get SQL Server ErrorLog".
  • Set "ResultSet" property to "None".
  • Set "Connection Type" property to "ADO.NET".
  • Set "Connection" property to "ADO_ErrorLogMonitor".
  • Set "BypassPrepare" to "True".
  • Set "SQLStatement" property with the error log collection script below:
TRUNCATE TABLE [dbo].[ErrorLog]

errorlog_07.jpg

3.2: Configuring "Get Server Names" - Execute SQL Task

Add "Execute SQL Task" to control flow. Double-click the task and configure properties in the "General" page of the task as follows:
  • Rename "Execute SQL Task" to "Get SQL Server Names".
  • Set "ResultSet" property to "Full result set".
  • Set "Connection Type" property to "ADO.NET".
  • Set "Connection" property to "ADO_ErrorLogMonitor".
  • Set "BypassPrepare" to "False".
  • Set "SQLStatement" to the following:
SELECT [ServerName] 
  + CASE ISNULL(CAST([Port] AS [varchar](10)), '')
   WHEN ''
    THEN ''
   ELSE ',' + CAST([Port] AS [varchar](10))
    END AS [ServerName]
FROM [dbo].[SSISServersList]
WHERE [Active] = 1
GO

errorlog_08.jpg
Click ResultSet in left pane and then click Add to add the "User::SQLServer_ResultSet" variable as shown below. Change Result Name value to 0, and then click OK to save the configuration for the task.
errorlog_09.jpg
Then in the SSIS package, connect the "Truncate ErrorLog table" task to the "Get Server Names" task.

3.3: Configuring - ForEach loop container

Drag and drop a ForEach loop container to the package control flow. Double-click the ForEach loop container, and then click Collection in left pane. Here configure the properties of this window as follows:
  • Change "Enumerator" property to "Foreach ADO Enumerator".
  • Set "Enumeration Mode" to "Rows in the first table".
  • Specify "ADO object source variable" to the "User::SQLServer_ResultSet" package variable. 
errorlog_10.jpg
Now click "Variable Mappings" and select "User::SQLServer_Connection" package variable to map to the collection value.
errorlog_11.jpg
Then in the SSIS package, connect the "Get Server Names" task with the "ForEach loop" container.

3.4: Configuring "Get SQL Server ErrorLog" - Execute SQL Task

Add an "Execute SQL Task" inside the ForEach loop container. Double-click the task and configure properties in the "General" page of the task as follows:
  • Rename "Execute SQL Task" to "Get SQL Server ErrorLog".
  • Set "ResultSet" property to "None".
  • Set "Connection Type" property to "OLE DB".
  • Set "Connection" property to "Dynamic_ErrorLogMonitor".
  • Set "BypassPrepare" to "True".
  • Set "SQLStatement" property with the error log collection script below:
SET NOCOUNT ON;
SET DATEFORMAT MDY; 
DECLARE @LoadID   [int]
    ,@FileNumber  [tinyint]
    ,@ErrorLogSize [bigint]
    ,@ExtractDate [datetime]
    ,@SQLServer  [nvarchar](128)
IF NOT EXISTS (SELECT * FROM [tempdb].[sys].[objects] 
    WHERE [name] = 'ERROR_LOG' AND [type] IN (N'U'))
BEGIN
 CREATE TABLE [tempdb].[dbo].[ERROR_LOG]
  ([Server]  [nvarchar](128) NULL
  ,[LogDate]  [datetime] NULL
  ,[ProcessInfo]  [varchar](50) NULL
  ,[Text]   [nvarchar](4000) NULL)
END
ELSE 
BEGIN
 TRUNCATE TABLE [tempdb].[dbo].[ERROR_LOG]
END
IF OBJECT_ID('tempdb..#ErrorLogs') IS NOT NULL
    DROP TABLE #ErrorLogs
CREATE TABLE #ErrorLogs 
 ([ArchiveNumber]  [int]
 ,[DateLastChanged]  [datetime] NOT NULL
 ,[LogFileSizeInBytes]  [bigint])
-- Reading the error logs files details
INSERT INTO #ErrorLogs 
EXEC [master]..[xp_enumerrorlogs]
-- Assigning values to variables
SELECT @FileNumber = [ArchiveNumber]
   ,@ErrorLogSize = [LogFileSizeInBytes]
FROM #ErrorLogs
WHERE [ArchiveNumber] = 0
-- Set extract date for 1 day
SET @ExtractDate = DATEADD(dd, -1, CURRENT_TIMESTAMP)
IF (@ErrorLogSize <= 536870912)
BEGIN
 -- Loading the latest errorlog information inside table
 INSERT INTO [tempdb].[dbo].[ERROR_LOG]
      ([LogDate]
      ,[ProcessInfo]
      ,[Text])
 EXEC [sp_readerrorlog] 0
END 
ELSE 
BEGIN
 INSERT INTO [tempdb].[dbo].[ERROR_LOG]
   ([LogDate]
   ,[ProcessInfo]
   ,[Text])
 VALUES (CURRENT_TIMESTAMP
     ,'Server'
     ,'ErrorLog is two big to load.')
END
DELETE FROM [tempdb].[dbo].[ERROR_LOG]
WHERE [LogDate] < @ExtractDate
DELETE FROM [tempdb].[dbo].[ERROR_LOG]
WHERE [ProcessInfo] = 'Backup'
DELETE FROM [tempdb].[dbo].[ERROR_LOG]
WHERE [ProcessInfo] = 'Logon'
UPDATE [tempdb].[dbo].[ERROR_LOG]
SET  [Server] = CAST(SERVERPROPERTY('ServerName') AS [nvarchar](128))
SET NOCOUNT OFF;
GO

errorlog_12.jpg

3.5: Configuring "Process ErrorLog" - Data Flow Task

Add a "Data Flow Task" inside the ForEach loop container and rename it to "Process ErrorLog". Connect the "Data Flow Task" with the "Get SQL Server ErrorLog".
Now, double-click the "Data Flow Task", and drag the "OLE DB Source" and "ADO NET Destination" to the "Data Flow Task". Next, connect the "OLE DB Source" with the "ADO NET Destination. Then double-click "OLE DB Source" and specify the settings as follows:
  • Set "OLE DB Connection Manager" to "Dynamic_ErrorLogMonitor".
  • Change "Data access mode" to "SQL command"
  • Specify the following query in "SQL command text":
SELECT [Server]
      ,[LogDate]
      ,[ProcessInfo]
      ,[Text]
      ,CURRENT_TIMESTAMP AS [AuditDate]
FROM [tempdb].[dbo].[ERROR_LOG]
GO
Once done, click OK to save the settings and return to Data Flow.
errorlog_13.jpg
Finally, double-click "ADO NET Destination" and configure as follow:
  • Set "Connection manager" to "ADO_ErrorLogMonitor".
  • Select "dbo.ErrorLog" table in our "ErrorLogMonitor" database, where data will be copied.
  • Then map source table to destination table. 
errorlog_14.jpg

errorlog_15.jpg
Click OK to save settings.

Package Review

All done, our package is fully configured, and it is now ready for us to execute the package. The package control flow and data flow look similar to the following figures:

Control Flow - ConsolidateErrorLog_Package.dtsx

errorlog_16.jpg

Data Flow - ConsolidateErrorLog_Package.dtsx

errorlog_17.jpg

Testing

To test the package, simply add a few servers in the SSISServerList table and then execute the package using the play button in SQL Server Data Tools (see below):
errorlog_18.jpg
Once successfully executed, query the data of dbo.ErrorLog inside our ErrorLogMonitor database to retrieve all SQL servers error logs (see below):
errorlog_19.jpg
Next Steps

Tuesday, February 26, 2019

Replace failed drive in Hardware Raid 1 with MegaCli

Replace failed drive in Hardware Raid 1 with MegaCli


1. First, find out, which disk has failed using the below command:
/path/to/MegaCli64 -PDList -aALL
If the physical disk is healthy, it will have a "Firmware state" of "Online, Spun Up". If any of the disk shows as failed, note down the "Enclosure Device ID" and "Slot Number".

In the below example, we will use the "Enclosure Device ID" of the failed disk as "EID" and "Slot Number" as "SN".

2. Secondly, take the failed disk offline:

/path/to/MegaCli64 -PDOffline -PhysDrv [EID:N] -a0
say, if EID is 250 and SN is 2, our command would be:
/path/to/MegaCli64 -PDOffline -PhysDrv [250:2] -a0
3. Now, mark the failed disk as missing.

/path/to/MegaCli64 -PDMarkMissing -PhysDrv [EID:N]'-a0

/path/to/MegaCli64 -PDMarkMissing -PhysDrv [250:2] -a0
4. Finally, prepare to remove the disk from the array.
/path/to/MegaCli64 -PDPrpRmv -PhysDrv [EIDN] -a0

/path/to/MegaCli64 -PDPrpRmv -PhysDrv [250:2] -a0
5. Manually unplug the failed hard disk and replace it with brand new one. The RAID controller should automatically see the new disk and start the rebuild.

6. Check the physical status of the disk again

/path/to/MegaCli64 -PDList -aALL
You should now see the status of the "Firmware state" as "Rebuild".

7. Give it some time and once the rebuild is complete, the "Firmware state" will show as "Online, Spun Up".

If the disk is marked as foreign, do a full rebuild.
/path/to/MegaCli64 -PDMakeGood -PhysDrv [EID:N] -aALL

/path/to/MegaCli64 -CfgForeign -Clear -aALL

/path/to/MegaCli64 -PDHSP -Set -PhysDrv [EID:N] -aALL
Finally, you can check the status of the rebuild using the below command:


/path/to/MegaCli64 -PDRbld -ShowProg -PhysDrv[EID:N] -a0   

Wednesday, February 6, 2019

Oracle ASM : Shell script to map physical disk devices to ASMLIB disks

Oracle ASM : Shell script to map physical disk devices to ASMLIB disks

ASMLIB is a support library for the Automatic Storage Management (ASM) feature of the Oracle Database and is available for the Linux operating system. When ASMLIB is used to manage ASM disks, the device path information is not presented in GV$ASM_DISK.PATH. This post shows how to get the physical disk device name associated with the ASMLIB disk name.

Finding asmlib disk name and asm disk name

By default when you scan disks with oracleasm scandisks, the corresponding disk device files (ASMLIB disk names) are stored into the path /dev/oracleasm/disks. For example :
# ls -lrt /dev/oracleasm/disks
total 0
brw-rw---- 1 oracle oinstall 201, 40112 Oct  3 11:08 REDO_1
brw-rw---- 1 oracle oinstall 201, 31232 Oct  3 11:08 REDO_2
brw-rw---- 1 oracle oinstall 201, 15408 Oct  3 11:09 DATA_1
brw-rw---- 1 oracle oinstall 201,  6336 Oct  3 11:09 DATA_2
brw-rw---- 1 oracle oinstall 201, 62432 Oct  3 11:09 DATA_3
...
You can use the the ASM utility kfed, to read the disk headers and get the disk name that is being used in the ASM disk group. for example :
export ORACLE_HOME=[grid home]
# cd $ORACLE_HOME/bin
# ./kfed read /dev/dm-17
....
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:           DATADG_DISK01 ; 0x028: length=19
kfdhdb.grpname:                  DATADG ; 0x048: length=14
kfdhdb.fgname:            DATADG_DISK01 ; 0x068: length=19
....
As shown above, we can grep for dskname, to get the ASM disk name.

Finding physical disk device

From the output of command “ls -lrt /dev/oracleasm/disks” we get the major and minor number which can be used to find the physical device corresponding to the asm disk. for example :
# ls -lrt /dev/oracleasm/disks/DATA_1
brw-rw---- 1 oracle oinstall 201, 15408 Oct  3 11:09 DATA_1
From above output grep the major,minor numbers – 201, 15408 in the directory /dev to get the physical devices.
# ls -lrt /dev/ | grep "201,15408"
brw------- 1 root root 201,  15408 Oct  3 10:48 dm-17

script to map ASM disk to physical disk

Here is a small script from oracle (I modified it a little to include path of kfed command). The script can be run as root or any other user (oracle or grid).
#!/bin/bash
## ASMLIB_DISK  -- disk name in ASMLIB
## ASM_DISK -- disk name in ASM
## DEVICE -- physical disk name
GRID_HOME=`cat /etc/oratab  | grep ^+ASM | awk -F":" '{print $2}'`
for ASMLIB_DISK in `ls /dev/oracleasm/disks/*`
  do
    ASM_DISK=`$GRID_HOME/bin/kfed read $ASMLIB_DISK | grep dskname | tr -s ' '| cut -f2 -d' '`
    majorminor=`ls -l $ASMLIB_DISK | tr -s ' ' | cut -f5,6 -d' '`
    device=`ls -l /dev/ | tr -s ' ' | grep -w "$majorminor" | cut -f10 -d' '`
    echo "ASMLIB disk name : $ASMLIB_DISK"
    echo "ASM_DISK name : $ASM_DISK"
    echo "Physical disk device : /dev/$device"
done
Sample output from above script :
# ./asm_device_mapping.sh
ASMLIB disk name : DATA_01
ASM_DISK name : DATA_01
Physical disk device : /dev/dm-6476
ASMLIB disk name : DATA_02
ASM_DISK name : DATA_02
Physical disk device : /dev/dm-6473
...

script to find deleted ASMLIB disk

If an ASMLIB disk was already deleted, it will not show in /dev/oracleasm/disks. We can check for devices that are (or were) associated with ASM(LIB) with the following shell script:
#!/bin/bash
GRID_HOME=`cat /etc/oratab  | grep ^+ASM | awk -F":" '{print $2}'`
for device in `ls /dev/sd*`
  do
    asmdisk=`$GRID_HOME/bin/kfed read $device | grep ORCL | tr -s ' ' | cut -f2 -d' ' | cut -c1-4`
    if [ "$asmdisk" = "ORCL" ]
      then
      echo "Disk device $device may be an ASM disk"
    fi
done
The scripts takes a peek at sd devices in /dev. The script needs to be run as privileged user. Of course we can look at /dev/dm*, /dev/mapper, etc or all devices in /dev, although that may not be a good idea. Sample kfed read on a ASM disk :
./kfed read /dev/dm-7 | grep ORCL
kfdhdb.driver.provstr:ORCLDISKTESTDB_ARCH_08 ; 0x000: length=23
Note : The kfed binary should be available in RDBMS home (prior to version 11.2) and in the Grid Infrastructure home (in version 11.2 and later). For the purpose of this post I have assumed Grid Infrastructure version to be 11.2 or higher.
The same can be achieved without kfed with a script like this:
#!/bin/bash
for device in `ls /dev/sd*`
do
  asmdisk=`od -c $device | head | grep 0000040 | tr -d ' ' | cut -c8-11`
  if [ "$asmdisk" = "ORCL" ]
    then
    echo "Disk device $device may be an ASM disk"
  fi
done
the od command dumps the files in octal and other formats. -c option is to select ASCII characters or backslash escapes.

  How to Change Instance Type & Security Group of EC2 in AWS By David Taylor Updated April 29, 2023 EC2 stands for Elastic Compute Cloud...