I had an interesting issue the other day where a customer’s maintenance plan was failing. The maintenance plan had 5 steps to it. The first three were all executing Agent Jobs using the Execute SQL Server Agent Job Task. The last two steps were a History Cleanup and Maintenance Cleanup tasks. Each step used a connector that specified to go to the next step after completion of the last.

Sometimes one or two of the Agent Jobs steps would fail stating that they were chosen as the deadlock victim. Looking further in the steps they were pretty simple, just executing some backups and restores.

I enabled the deadlock trace flags to capture the relevant info and waited for it to fail again. Once it failed again I saw that sometimes when the backup or restore was trying to update the history tables in MSDB it was deadlocking with the queries executed in the History Cleanup. At first I thought this was weird as how could multiple steps possibly be executing at the same time.

It turns out that the Execute SQL Server Agent Job Task uses the stored procedure sp_start_job. This stored procedure starts the job and immediately reports if the job started successfully or not. It does not report if the job completed successfully. Therefore the first three steps plus the History Cleanup were all being executed at the same time.

Luckily the fix in this case was quite easy. I just removed the History Cleanup task to its own maintenance plan and we scheduled it to run at a different time. In other situations you may need to write a loop that checks the status of the job executed so that you don’t move on until the job has completed.

The fact that the Execute SQL Server Agent Job Task can run multiple steps asynchronously does not seem to be well documented anywhere, so I thought this would be a good reminder for myself and anyone else that runs into similar issues.


, ,

The other day I had an issue with sql 2005 running on Server 2008 cluster. This was a new cluster that was recently updated from RTM to SP3. After SP3 was applied you could no longer view or create any Maintenance Plans. The error in SSMS was that two columns were invalid from_msx and has_targets. After some research this seems to be fairly common but the work around was difficult to find. What I found was that these two new columns are added to msdb.dbo.sysmaintplan_plans. It seems that on a Server 2008 cluster the SP3 installer does not run the sysdbupg.sql script located in C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Install folder. After manually running the script I was then able to fully utilize the maintenance plan feature in SSMS.

Of course this wasn’t the end of it and another issue sprang up. Once the maintenance plans were created the agent jobs gave an error. The message was that the SSIS subsystem failed to load. In order to fix this I ran the following code.

use msdb
go
delete from msdb.dbo.syssubsystems
exec msdb.dbo.sp_verify_subsystems 1
go

-Jeremy


, ,