SQL Server 2005 -- How to move 10 million rows in 1 millisecond, and without incurring any I/O load !!

Most developers / administrators would use a SELECT INTO or an INSERT INTO statement to load a destination table.  That is still a great way of accomplishing the task at hand, but it is not nearly as fast as what I am about to show you.

The method I am about to show you is not for all scenarios, but it can be very useful at times.  It derives its power from the partitioning functionality in SQL Server 2005 / 2008.
SQL Server 2005 has built-in functionality that allows a table to be split, or divided, into what I will call virtual tables, whose contents are determined by predefined boundaries. When the partitioning column is used in a query predicate, the optimizer knows which partition the data resides in, which makes queries more I/O efficient.

This is amazing functionality because it does not require application changes and significantly reduces the amount of data SQL Server has to sift through.
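To illustrate what that looks like (this is not required for the SWITCH example below), here is a minimal partitioning sketch. The names pfOrderDate, psOrderDate, and dbo.Orders, as well as the boundary dates, are hypothetical, and everything is placed on the PRIMARY filegroup for simplicity.

--A minimal partitioning sketch (hypothetical names, all on PRIMARY for simplicity)
CREATE PARTITION FUNCTION pfOrderDate (DATETIME)
AS RANGE RIGHT FOR VALUES ('2008-01-01', '2009-01-01');
GO

CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate ALL TO ([PRIMARY]);
GO

--A table created on the scheme; each row lands in a partition based on OrderDate
CREATE TABLE dbo.Orders(
OrderId INT IDENTITY(1,1) NOT NULL,
OrderDate DATETIME NOT NULL,
Amount MONEY NOT NULL) ON psOrderDate (OrderDate);
GO

--A predicate on the partitioning column lets the optimizer touch only the relevant partition
SELECT COUNT(*) FROM dbo.Orders WHERE OrderDate >= '2009-01-01';
GO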

The partitioning feature I will be focusing on is the one that allows SQL Server to switch, or trade, partitions out, named SWITCH.  This is commonly used when you want to move data to a different partition, either because the boundaries have changed or because you need to phase data out.  The real benefit of using SWITCH is that SQL Server does not actually move the data; it updates the metadata pointers to the data.  Because no data is physically moved, the operation completes nearly instantaneously, regardless of the number of rows.
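For reference, SWITCH comes in a few forms. The example in this post uses the table-to-table form (both tables non-partitioned), but the partition-to-table and table-to-partition forms behave the same way. The object names below are hypothetical.

--Non-partitioned table to non-partitioned table (the form used in this post)
ALTER TABLE dbo.SourceTable SWITCH TO dbo.TargetTable;

--A partition of a partitioned table out to an empty staging table
ALTER TABLE dbo.PartitionedTable SWITCH PARTITION 1 TO dbo.StagingTable;

--A staging table back in to a specific partition
ALTER TABLE dbo.StagingTable SWITCH TO dbo.PartitionedTable PARTITION 1;
GO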

Example:

Create a table and populate it with 10000000 rows

CREATE TABLE [dbo].[TestingIO-1](
RowNum INT PRIMARY KEY,
SomeId INT,
SomeCode CHAR(2));
GO

Insert 10 million rows into the new table

INSERT INTO [dbo].[TestingIO-1]
SELECT TOP 10000000
ROW_NUMBER() OVER (ORDER BY t1.NAME) AS RowNumber,
ABS(CHECKSUM(NEWID()))%2500+1 AS SomeId,
CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65) AS SomeCode
FROM
Master.dbo.SysColumns t1,
Master.dbo.SysColumns t2
GO

--Create New Table To Move Data To

CREATE TABLE [dbo].[TestingIO-2](
RowNum INT PRIMARY KEY,
SomeId INT,
SomeCode CHAR(2));
GO

Now comes the interesting part !!

--Move data to the new table
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

ALTER TABLE [dbo].[TestingIO-1] SWITCH TO [dbo].[TestingIO-2];

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
GO

/*
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 1 ms.

*/
Next, I will verify the results.

SELECT COUNT(*) FROM [dbo].[TestingIO-1]; --0
SELECT COUNT(*) FROM [dbo].[TestingIO-2]; --10,000,000

/*
-----------
0

(1 row(s) affected)


-----------
10000000

(1 row(s) affected)
*/
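If you prefer to confirm the result without scanning either table, here is a rough sketch of a metadata-only check using sys.dm_db_partition_stats, which reports row counts straight from the catalog.

--Row counts from metadata only (no table scan)
SELECT OBJECT_NAME(object_id) AS TableName, SUM(row_count) AS TotalRows
FROM sys.dm_db_partition_stats
WHERE object_id IN (OBJECT_ID('[dbo].[TestingIO-1]'), OBJECT_ID('[dbo].[TestingIO-2]'))
AND index_id IN (0, 1)
GROUP BY object_id;
GO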


That’s it.  I have successfully moved 10 million rows into a new table in 1 ms, and according to the I/O statistics I incurred no I/O; some I/O must still occur to update the metadata, but it is minimal.  SWITCH does have a number of requirements; for the comprehensive list please refer to this link,
http://technet.microsoft.com/en-us/library/ms191160.aspx .
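One of those requirements is that the source and target tables must reside on the same filegroup. A quick sketch of a catalog-view check for that, using the table names from this example, might look like this.

--Check that both tables sit on the same filegroup (a SWITCH requirement)
SELECT OBJECT_NAME(i.object_id) AS TableName, f.name AS FileGroupName
FROM sys.indexes i
JOIN sys.filegroups f ON i.data_space_id = f.data_space_id
WHERE i.object_id IN (OBJECT_ID('[dbo].[TestingIO-1]'), OBJECT_ID('[dbo].[TestingIO-2]'))
AND i.index_id IN (0, 1);
GO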

Thanks to Adam Haines; I got this tip by reading one of his articles.


Thanks
Huzeifa Bhai
