The other day I was troubleshooting an issue with log shipping where the copy and restore jobs were failing. Here is an example of the error for the copy job, which was also similar to the restore job.
2010-06-29 09:41:52.80 *** Error: Could not retrieve copy settings for secondary ID '[removed]'.(Microsoft.SqlServer.Management.LogShipping) ***
2010-06-29 09:41:52.81 *** Error: The specified agent_id BECBBCC0-6867-4398-BD96-830D62D88558 or agent_type 1 do not form a valid pair for log shipping monitoring processing.(.Net SqlClient Data Provider) ***
While it is not completely clear, this means that the copy job was not able to login to the instance and query the log shipping tables in MSDB. When I was first trying to gain access to the instance I had noticed that it was configured a bit strange. Basically they had a default instance and a named instance. The default instance was used as the secondary for log shipping while the named instance was being used for dev work. Well that part isn’t so strange… However they had an alias configured to redirect the server name to the named instance. So the log shipping jobs thought it was connecting to the default instance, it was actually connecting to the named instance. To work around this I just had to add the port number to the jobs as in the example below.
"C:\Program Files\Microsoft SQL Server\90\Tools\Binn\sqllogship.exe" -Copy BECBBCC0-6867-4398-BD96-830D62D88558 -server SERVER01,1433
log shipping, troubleshooting
At work our first line of defense for our SQL Servers is our system admins and engineers. Overtime I have written a decent number of scripts to try and help out our operations people so that when an alert comes in they can diagnose issues quickly and accurately. The following script checks and reports on the configuration of the log shipping primary server. It also does a simple analysis showing the most immedaite issue.
--The Job Status shows if the backup job has failed in the last 2 hours.
select
ISNULL(db.name,'') as [Database Name],
CASE
WHEN db.name IS NULL THEN 'DB is missing.'
WHEN db.state_desc <> 'ONLINE' THEN 'DB is not online.'
WHEN db.recovery_model_desc = 'SIMPLE' THEN 'DB is using simple recovery model.'
WHEN pri.primary_database IS NULL THEN 'Missing log shipping primary config.'
WHEN sec.secondary_database IS NULL THEN 'Missing log shipping primary secondary config.'
WHEN job.name IS NULL THEN 'Backup job is missing.'
WHEN job.enabled = 0 THEN 'Backup job is disabled.'
WHEN schedule.name IS NULL THEN 'Job is missing a schedule.'
WHEN schedule.enabled = 0 THEN 'Job schedule is disabled.'
WHEN jobhist.run_status = 0 THEN 'Job has failed in the last 2 hours.'
ELSE ''
END AS [Immediate Issue],
ISNULL(db.state_desc,'') as [Database State],
ISNULL(db.recovery_model_desc,'') as [Recovery Model],
ISNULL(pri.primary_database,'') as [Primary DB],
ISNULL(sec.secondary_database,'') as [Secondary DB],
ISNULL(sec.secondary_server,'') as [Secondary Server],
ISNULL(job.name,'') as [Backup Job Name],
CASE
WHEN job.enabled = 0 THEN 'No'
WHEN job.enabled = 1 THEN 'Yes'
WHEN job.enabled IS NULL THEN ''
END as [Job Enabled],
CASE
WHEN schedule.enabled = 0 THEN 'No'
WHEN schedule.enabled = 1 THEN 'Yes'
WHEN schedule.enabled IS NULL THEN ''
ELSE 'Unknown'
END AS [Schedule Enabled],
CASE
WHEN jobhist.run_status = 0 THEN 'Failure'
WHEN jobhist.run_status = 1 THEN 'Successful'
WHEN schedule.enabled=0 OR job.enabled=0 THEN 'Disabled'
WHEN jobhist.run_status IS NULL THEN ''
ELSE 'Unknown'
END AS [Job Status]
from sys.databases as db
full outer join msdb.dbo.log_shipping_primary_databases as pri
on db.name=pri.primary_database
full outer join msdb.dbo.log_shipping_primary_secondaries as sec
on db.name=sec.secondary_database
OR pri.primary_id=sec.primary_id
left join msdb.dbo.sysjobs as job
on pri.backup_job_id=job.job_id
left join msdb.dbo.sysjobschedules as jobschedule
on job.job_id=jobschedule.job_id
left join msdb.dbo.sysschedules as schedule
on jobschedule.schedule_id=schedule.schedule_id
left join (select job_id,
min(run_status) AS [run_status] --0 is failed 1 is successful
from msdb.dbo.sysjobhistory
WHERE step_name='(Job outcome)'
AND DATEDIFF(hh,convert(datetime,
stuff(stuff(cast(run_date as nchar(8)), 7, 0, '-'), 5, 0, '-') + N' ' +
stuff(stuff(substring(cast(1000000 + run_time as nchar(7)), 2, 6), 5, 0, ':'), 3, 0, ':'),
120),GETDATE())<=2 --looks at the job history for the last 2 hours
GROUP BY job_id) AS jobhist
ON job.job_id=jobhist.job_id
WHERE
pri.primary_database is NOT NULL
OR sec.secondary_database is NOT NULL
ORDER BY
db.name ASC
log shipping, script
I wrote this script quite awhile ago and it seems like I use it pretty much every day. I figured if I used it so much, then others would probably find a good use for it to. It parses the backup history tables in MSDB and returns the relevant information like how long the backup took, what was the size of the backup, what user took the backup, is it a copy only backup, etc.
As written below it returns information for full and differential backups taken in the last week for all databases.
SELECT bu.database_name as [DB Name],
name as [Backup Name],
CASE bu.type
WHEN 'D' THEN 'FULL'
WHEN 'I' THEN 'Differential'
WHEN 'L' THEN 'Transaction Log'
END as [Backup Type],
CASE bu.is_copy_only
WHEN '1' THEN 'Copy'
ELSE ''
END as [Copy],
bf.physical_device_name as [Physcial Device Name],
bu.backup_start_date as [Start Date],
bu.backup_finish_date as [Finish Date],
CAST(DATEDIFF(s, bu.backup_start_date, bu.backup_finish_date)/3600 AS varchar) + ' hours, '
+ CAST((DATEDIFF(s, bu.backup_start_date, bu.backup_finish_date))/60%60 AS varchar)+ ' minutes, '
+ CAST((DATEDIFF(s, bu.backup_start_date, bu.backup_finish_date))%60 AS varchar)+ ' seconds'
AS [Total Time],
CAST((bu.backup_size/1024/1024) AS int) AS [Size in MB],
bu.user_name as Username
FROM msdb.dbo.backupset as bu
INNER JOIN msdb.dbo.backupmediafamily as bf ON bu.media_set_id = bf.media_set_id
WHERE
type!='L'
--AND database_name = 'AdventureWorks'
AND DATEDIFF(d, backup_start_date, GETDATE()) <= 7
ORDER BY database_name ASC, backup_start_date DESC
backups, script