The other day I was troubleshooting an issue with log shipping where the copy and restore jobs were failing. Here is an example of the error for the copy job, which was also similar to the restore job.
2010-06-29 09:41:52.80 *** Error: Could not retrieve copy settings for secondary ID '[removed]'.(Microsoft.SqlServer.Management.LogShipping) ***
2010-06-29 09:41:52.81 *** Error: The specified agent_id BECBBCC0-6867-4398-BD96-830D62D88558 or agent_type 1 do not form a valid pair for log shipping monitoring processing.(.Net SqlClient Data Provider) ***
While it is not completely clear, this means that the copy job was not able to login to the instance and query the log shipping tables in MSDB. When I was first trying to gain access to the instance I had noticed that it was configured a bit strange. Basically they had a default instance and a named instance. The default instance was used as the secondary for log shipping while the named instance was being used for dev work. Well that part isn’t so strange… However they had an alias configured to redirect the server name to the named instance. So the log shipping jobs thought it was connecting to the default instance, it was actually connecting to the named instance. To work around this I just had to add the port number to the jobs as in the example below.
"C:\Program Files\Microsoft SQL Server\90\Tools\Binn\sqllogship.exe" -Copy BECBBCC0-6867-4398-BD96-830D62D88558 -server SERVER01,1433
log shipping, troubleshooting
At work our first line of defense for our SQL Servers is our system admins and engineers. Overtime I have written a decent number of scripts to try and help out our operations people so that when an alert comes in they can diagnose issues quickly and accurately. The following script checks and reports on the configuration of the log shipping primary server. It also does a simple analysis showing the most immedaite issue.
--The Job Status shows if the backup job has failed in the last 2 hours.
select
ISNULL(db.name,'') as [Database Name],
CASE
WHEN db.name IS NULL THEN 'DB is missing.'
WHEN db.state_desc <> 'ONLINE' THEN 'DB is not online.'
WHEN db.recovery_model_desc = 'SIMPLE' THEN 'DB is using simple recovery model.'
WHEN pri.primary_database IS NULL THEN 'Missing log shipping primary config.'
WHEN sec.secondary_database IS NULL THEN 'Missing log shipping primary secondary config.'
WHEN job.name IS NULL THEN 'Backup job is missing.'
WHEN job.enabled = 0 THEN 'Backup job is disabled.'
WHEN schedule.name IS NULL THEN 'Job is missing a schedule.'
WHEN schedule.enabled = 0 THEN 'Job schedule is disabled.'
WHEN jobhist.run_status = 0 THEN 'Job has failed in the last 2 hours.'
ELSE ''
END AS [Immediate Issue],
ISNULL(db.state_desc,'') as [Database State],
ISNULL(db.recovery_model_desc,'') as [Recovery Model],
ISNULL(pri.primary_database,'') as [Primary DB],
ISNULL(sec.secondary_database,'') as [Secondary DB],
ISNULL(sec.secondary_server,'') as [Secondary Server],
ISNULL(job.name,'') as [Backup Job Name],
CASE
WHEN job.enabled = 0 THEN 'No'
WHEN job.enabled = 1 THEN 'Yes'
WHEN job.enabled IS NULL THEN ''
END as [Job Enabled],
CASE
WHEN schedule.enabled = 0 THEN 'No'
WHEN schedule.enabled = 1 THEN 'Yes'
WHEN schedule.enabled IS NULL THEN ''
ELSE 'Unknown'
END AS [Schedule Enabled],
CASE
WHEN jobhist.run_status = 0 THEN 'Failure'
WHEN jobhist.run_status = 1 THEN 'Successful'
WHEN schedule.enabled=0 OR job.enabled=0 THEN 'Disabled'
WHEN jobhist.run_status IS NULL THEN ''
ELSE 'Unknown'
END AS [Job Status]
from sys.databases as db
full outer join msdb.dbo.log_shipping_primary_databases as pri
on db.name=pri.primary_database
full outer join msdb.dbo.log_shipping_primary_secondaries as sec
on db.name=sec.secondary_database
OR pri.primary_id=sec.primary_id
left join msdb.dbo.sysjobs as job
on pri.backup_job_id=job.job_id
left join msdb.dbo.sysjobschedules as jobschedule
on job.job_id=jobschedule.job_id
left join msdb.dbo.sysschedules as schedule
on jobschedule.schedule_id=schedule.schedule_id
left join (select job_id,
min(run_status) AS [run_status] --0 is failed 1 is successful
from msdb.dbo.sysjobhistory
WHERE step_name='(Job outcome)'
AND DATEDIFF(hh,convert(datetime,
stuff(stuff(cast(run_date as nchar(8)), 7, 0, '-'), 5, 0, '-') + N' ' +
stuff(stuff(substring(cast(1000000 + run_time as nchar(7)), 2, 6), 5, 0, ':'), 3, 0, ':'),
120),GETDATE())<=2 --looks at the job history for the last 2 hours
GROUP BY job_id) AS jobhist
ON job.job_id=jobhist.job_id
WHERE
pri.primary_database is NOT NULL
OR sec.secondary_database is NOT NULL
ORDER BY
db.name ASC
log shipping, script
The other day I had a situation where the drive containing the files for a log shipping secondary database was full and the log restores were failing. Normally for an online database you modify the file locations, take the db offline, physically move the files, and then bring the db back online. However if the database is in restoring mode it will not allow you to take it offline. Luckily we can work around this. In this case modify the file locations and then stop sql. Physically move the files and then restart sql. When sql starts it will find the files in the new location.
-Jeremy
database files, databases