Updated missing index dmv script

One tool I have in my toolbox is the great missing index script from Bart Duncan. However for me there were a few things I wanted to tweak to make it a better tool. I needed just the absolutely relevant information and I needed away to cut my time spent looking at the results.

1.) I changed the index naming convention so that it is more standard instead of the gobbly gook it normally displays. I used to spend a lot of time changing that gobbly gook to something more meaningful. Personally I generally stick to naming my indexes something like IX__TableName__Col1__Col2 etc. This way I can easily see what table and columns are used in it. I also use two underscores to separate objects as so many people seem to use underscores in their object names.

2.) Sometimes I see it return 200, 300, or more indexes. I definately don’t want to add that many indexes or even consider them. So generally I limit myself to looking at the top 20. This is just one tool in my toolbox and I don’t want to spend too much time just on it.

3.) The original script only ordered by the improvement measure. This lets you see the indexes which should provide the most improvement. That’s great but sometimes the recommendations are duplicates of each other or are similar and could be combined. Therefore I take the top 20 ordered by improvement measure and then order them again by database and table name. It makes it a little easier to work through them as I verify them against the existing indexes, queries the recommendations came from, etc.

4.) The original script has a lot of good info but generally it was too much. I also found it a pain to have to right click and copy the index create statements that I was interested in. In my script you can save the results to a text file and work on it from there.

So now that you have this great tool, make sure you use it in moderation and make sure you heed a few warnings. Bart’s blog that I linked to above lists a few important caveats that you should look into before just blindly adding everything the DMVs recommend.

WITH misind (improvement_measure, database_id, table_name, create_index_statement)
AS
(
SELECT TOP 20
CONVERT (decimal (28,1),
migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans)
) AS improvement_measure,
mid.database_id, object_name(mid.object_id,mid.database_id) as [table_name],
'CREATE INDEX [ix_' + object_name(mid.object_id,mid.database_id) + '__'
+ REPLACE(REPLACE(REPLACE(ISNULL(mid.equality_columns,''),', ','__'),'[',''),']','') +
CASE
	WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN '__'
	ELSE ''
END
+ REPLACE(REPLACE(REPLACE(ISNULL(mid.inequality_columns,''),', ','__'),'[',''),']','')
+ ']'
+ ' ON ' + mid.statement
+ ' (' + ISNULL (mid.equality_columns,'')
+ CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE
'' END
+ ISNULL (mid.inequality_columns, '')
+ ')'
+ ISNULL (' INCLUDE (' + mid.included_columns + ')', '') AS create_index_statement
FROM sys.dm_db_missing_index_groups mig
INNER JOIN sys.dm_db_missing_index_group_stats migs
ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details mid
ON mig.index_handle = mid.index_handle
WHERE CONVERT (decimal (28,1), migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks
+ migs.user_scans)) > 10000
AND last_user_seek>DATEADD(wk,-1,GETDATE())
ORDER BY migs.avg_total_user_cost * migs.avg_user_impact * (migs.user_seeks + migs.user_scans) DESC
)
SELECT
	--improvement_measure,
	create_index_statement
FROM misind
ORDER BY database_id, table_name

,

TSQL Tuesday
It’s time once again for that monthly dose of TSQL Tuesday. This month we have database whisperer Michael Swart hosting and he has asked everyone to blog about indexes. As Michael so keenly pointed out indexing covers a huge swath of possibilities. For my part I want to speak about getting rid of those pesky, no good indexes. OK, well really I want to speak about first considering the existing indexes before adding new ones. All too often I find people who have just gone crazy adding an index for every column or have just listed every possible combination of columns they could think of.

So how do we do this? The first thing I like to do is look at the overall number of heaps vs clustered vs non-clustered indexes. This gives me some feel for how well the database may or may not be indexed. Have they taken the time to create clustered indexes on most of their tables? Is their a disproportionately large number of non-clustered vs clustered indexes? The following script will show you the count for each of the heaps, clustered, and non-clustered indexes.

--show a count of indexes by type
select
i.type_desc,
count(*) as cnt
from sys.indexes i
inner join sys.objects o
	on i.object_id = o.object_id
WHERE o.type<>'S'
GROUP BY i.type_desc

Once I get a feel for the overall numbers and types of indexes I like to dive a little deeper and look at the individual tables. I pay particular attention to the tables with the most indexes, especially if they have gone past the single digits into 15, 20, 30, or more indexes on a table for an OLTP workload.

--show a count of indexes per table
--does not count the clustered or heap
select
	o.name as [Table Name],
	(select count(*)
from sys.indexes i
	 WHERE i.object_id = o.object_id
AND i.index_id>1)as cnt
from sys.objects o
WHERE o.type='U'
order by cnt desc

Once I am armed with a list of the “over” indexed tables I compare that to the actual index usage. Indexes come at a cost and they are not cheap. Every Time you insert, update, or delete data the indexes also have to be kept up. Then of course you have to perform regular index maintenance such as rebuilds and reorgs, not to mention keeping the statistics updated. So it is important to insure that your indexes are actually of use and not just costing you expensive cpu and disk io to maintain.

The following script is one I picked up from Glenn Berry’s excellent blog. It will show those indexes which have been written to more than they have been read. These are good candidates to consider dropping when you need to cut your index bloat. I wanted to note that I have modified Glenn’s script in a couple of ways. First I have added a column which contains the DROP INDEX statement which makes it easy to drop the index once I have VERIFIED it is actually not needed. Secondly you have to keep in mind that sys.dm_db_index_usage_stats only has statistics for indexes which are in cache. This is important for two reasons. First you may have indexes which are only used once in a while say for an end of month report. You definitely don’t want to drop indexes which prolong the CEO’s favorite report. Secondly you may have indexes that just aren’t even being used at all. Why waste disk space and time maintaining indexes that are not even being used?

-- Possible Bad Indexes (writes > reads)
SELECT
sch.name + '.' + t.name AS [Table Name], i.name AS [Index Name], i.type_desc,
ISNULL(user_updates,0) AS [Total Writes], ISNULL(user_seeks + user_scans + user_lookups,0) AS [Total Reads],
        ISNULL(user_updates,0) - ISNULL((user_seeks + user_scans + user_lookups),0) AS [Difference],
        'DROP INDEX [' + i.name + '] ON [' + sch.name + '].[' + t.name + ']' AS [Drop_CMD]
FROM sys.indexes AS i WITH (NOLOCK)
LEFT OUTER JOIN sys.dm_db_index_usage_stats AS s WITH (NOLOCK)
	ON s.object_id = i.object_id
	AND i.index_id = s.index_id
	AND s.database_id=db_id()
AND objectproperty(s.object_id,'IsUserTable') = 1
INNER JOIN sys.tables AS t WITH (NOLOCK)
	on i.object_id=t.object_id
INNER JOIN sys.schemas AS sch WITH (NOLOCK)
	on t.schema_id=sch.schema_id
WHERE
--ISNULL(user_updates,0) >= ISNULL((user_seeks + user_scans + user_lookups),0) --shows all indexes including those that have not been used
ISNULL(user_updates,0) - ISNULL((user_seeks + user_scans + user_lookups),0)>0 --only shows those indexes which have been used
AND i.index_id > 1
AND i.is_primary_key<>1
AND i.is_unique_constraint<>1
ORDER BY [Table Name], [index name]

So there you have it. Before I ever even consider adding new indexes I take a dive into the existing indexes. I need to insure that what already exists is running optimally and that I am not going to bring things to a grinding halt by piling on new indexes. As always be incredibly careful before you make any changes to production. Adding and removing indexes can both have huge effects on performance so be doubly sure before you execute those CREATE and DROP statements.

***Bonus*** Hypothetical Indexes
Occasionally I still come across databases that have hypothetical indexes. What are hypothetical indexes? These are created by the Database Engine Tuning Advisor while it makes its recommendations. While normally DTA will remove these when it is done, occasionally it will leave them in place. You can read a little more (but not much) about them over at MSDN. The following query will show if the database  has hypothetical indexes. Once the hypothetical indexes have been identified, they can be safely removed.

--shows hypothetical indexes
SELECT object_name(object_id), name AS Hypothetical FROM sys.indexes
WHERE Is_Hypothetical = 1

The last script I have for you will generate all of the drop commands for the hypothetical indexes.

SELECT 'DROP INDEX ['+i.name+'] ON ['+s.name+'].['+t.name+']' as drop_cmd from sys.indexes i
inner join sys.tables t
	ON i.object_id=t.object_id
inner join sys.schemas s
	on t.schema_id=s.schema_id
where i.Is_Hypothetical=1

, ,

It’s moving day!

Occasionally I come across a situation where someone wants to move their non-clustered indexes to a new filegroup. Why would you want to do this? Typically it has to do with disk space or performance issues. By adding filegroups you can create the additional files on separate raid groups thereby giving you access to the space and speed of the additional spindles. File groups also gives you control over which objects are created on this new file group. This control can be very good if you have identified particular hot spots for certain tables/indexes you wish to isolate.

The first step we will need to complete is creating the new file group. For the purposes of my example I am using the AdventureWorks database and creating a new filegroup named IndexData.

ALTER DATABASE [AdventureWorks] ADD FILEGROUP IndexData;

Once the file group is created you then need to create the file(s). You will need to change the filename to the path you wish to use. You may also wish to change the initial file size rather than relying on the autogrow.

ALTER DATABASE [AdventureWorks] ADD FILE
(NAME = IndexData,
FILENAME = 'F:\sqldata\AdventureWorks_IndexData.ndf',
SIZE = 5MB,
FILEGROWTH = 250MB
) TO FILEGROUP IndexData;

Finally once the prep-work of preparing the filegroup and files is complete, we can work on moving the objects to the newly created filegroup. As is the following script will generate all of the necessary statements for SQL 2008 to move all of the non clustered indexes to the new filegroup. With a bit of tweaking it can work with 2005 or be used to move specific indexes or even the clustered indexes.

WITH index_cte (table_name, index_name, type_desc, index_columns, index_included_columns,
filter_definition, is_unique, is_primary_key, is_unique_constraint,fill_factor,
is_padded, allow_row_locks, allow_page_locks, ignore_dup_key, data_compression_desc,
no_recompute)
as(
select
sch.name+'.'+obj.name [table_name],
i.name [index_name],
i.type_desc,
REPLACE(
(select '['+col.name+'] '+
	CASE WHEN k.is_descending_key = 1 THEN 'DESC'
	ELSE 'ASC'
	END as [data()]
from sys.index_columns as k
inner join sys.columns col
		on k.object_id=col.object_id
		and k.column_id=col.column_id
	where k.object_id = i.object_id
	and k.index_id = i.index_id
	and k.index_id<>0
	and k.is_included_column<>1
	order by key_ordinal, k.column_id
for xml path(''))
,'C [','C, [') as [index_columns],
isnull(
REPLACE(
	(select '['+col.name+']' as [data()]
from sys.index_columns as k
inner join sys.columns col
		on k.object_id=col.object_id
		and k.column_id=col.column_id
	where k.object_id = i.object_id
	and k.index_id = i.index_id
	and k.index_id<>0
	and k.is_included_column<>0
	order by key_ordinal, k.column_id
for xml path(''))
,'] [','], [')
,'') as [index_included_columns],
ISNULL(i.filter_definition,'') as filter_definition,
is_unique,
is_primary_key,
is_unique_constraint,
fill_factor,
is_padded,
allow_row_locks,
allow_page_locks,
ignore_dup_key,
data_compression_desc,
no_recompute
from sys.indexes as i
inner join sys.objects obj
	on i.object_id=obj.object_id
inner join sys.schemas sch
	on obj.schema_id=sch.schema_id
inner join sys.partitions part
	on i.object_id=part.object_id
	and i.index_id=part.index_id
inner join sys.stats stats
	on i.object_id=stats.object_id
	and i.index_id=stats.stats_id
where i.name is not null
and obj.type<>'S'
and sch.name<>'sys'
and i.index_id>1
--and sch.name='HumanResources'
--and obj.name='Employee'
)
SELECT
'CREATE ' +
CASE is_unique
	WHEN 1 THEN 'UNIQUE '
	ELSE ''
END
+ type_desc
+' INDEX '
+ index_name COLLATE DATABASE_DEFAULT
+' ON '
+table_name
+' ('+index_columns+')'+
CASE index_included_columns
	WHEN '' THEN ''
	ELSE ' INCLUDE (' + index_included_columns + ') '
END+
CASE filter_definition
	WHEN '' THEN ''
	ELSE ' WHERE ' + filter_definition
END+
' WITH ('+
'FILLFACTOR = ' + CASE fill_factor WHEN 0 THEN '100' ELSE CAST(fill_factor as varchar) END+
', PAD_INDEX = ' + CASE is_padded WHEN 1 THEN 'ON' ELSE 'OFF' END+
', ALLOW_PAGE_LOCKS = ' + CASE allow_page_locks WHEN 1 THEN 'ON' ELSE 'OFF' END+
', ALLOW_ROW_LOCKS = ' + CASE allow_row_locks WHEN 1 THEN 'ON' ELSE 'OFF' END+
', IGNORE_DUP_KEY = ' + CASE ignore_dup_key WHEN 1 THEN 'ON' ELSE 'OFF' END+
', STATISTICS_NORECOMPUTE = ' + CASE no_recompute WHEN 1 THEN 'ON' ELSE 'OFF' END+
', DATA_COMPRESSION = ' + data_compression_desc +
', DROP_EXISTING = ON'+
')' + ' ON IndexData'
as [create_statement]
from index_cte
order by table_name, type_desc, index_name

And there you have it. Once you run the newly generated script the indexes will be moved to the new filegroup. Make sure you test, test, test before running this in production so you are completely certain of what you are moving and where you are moving it.


, ,