Sunday, May 4, 2014

Copy master-detail data when destination master table has an 'identity' column in SQL Server

This is the one of good old tricks, which become possible in SQL Server 2008 with new MERGE operator.
I want to insert the set of records into 2 tables: master and detail, and master table has IDENTITY column as a primary key and detail table references this PK.
Define target tables as follows and put some data in them:
CREATE TABLE #destination_order (
	OrderId	INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED,
	OrderName VARCHAR(100) NOT NULL
)
CREATE TABLE #destination_orderitem (
	ItemId INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED,
	OrderId	INT NOT NULL,
	ItemName VARCHAR(100) NOT NULL
)
INSERT #destination_order (OrderName)
VALUES
	('Existing 1')
,	('Existing 2')
And here is the data we want to insert into the destination master and detail tables:
CREATE TABLE #source_order (
	OrderId	INT NOT NULL PRIMARY KEY CLUSTERED,
	OrderName VARCHAR(100) NOT NULL
)
CREATE TABLE #source_orderitem (
	ItemId INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED,
	OrderId	INT NOT NULL,
	ItemName VARCHAR(100) NOT NULL
)

INSERT #source_order (OrderId, OrderName)
VALUES
	(1, 'Source 1')
,	(2, 'Source 2')
,	(3, 'Source 3')
INSERT #source_orderitem (OrderId, ItemName)
VALUES
	(1, 'Source 1 Item 1')
,	(1, 'Source 1 Item 2')
,	(1, 'Source 1 Item 3')
,	(2, 'Source 2 Item 1')
,	(2, 'Source 2 Item 2')
,	(3, 'Source 3 Item 1')
So we want to insert data from #source_order and #source_orderitem tables into #destination_order and #destination_orderitem. The straightforward approach is to use cursor, insert master records one by one, get SCOPE_IDENTITY() and insert detail records for each master. But this can be done in more elegant and faster way.
Firstly, suppose we can insert all master records in one pass. The problem is that we cannot find out which new primary key #destination_order.Orderid corresponds to the #source_order.OrderId. The following construct is invalid, as OUTPUT statement can have only INSERTED pseudo-table fields when applied to INSERT operator:
INSERT #destination_order (OrderName)
OUTPUT #source_order.OrderId, INSERTED.OrderId 
SELECT OrderName
FROM #source_order
But, we can use MERGE which allow to specify source fields in the OUTPUT list, and we can simply emulate INSERT with MERGE:
DECLARE @ids TABLE (
	sourceOrderId INT NOT NULL,
	destinationOrderId INT NOT NULL)

;WITH source AS (
	SELECT OrderId, OrderName 
	FROM #source_order
)
MERGE #destination_order target
USING source ON 1=0
WHEN NOT MATCHED THEN
	INSERT (OrderName)
	VALUES (source.Ordername)
OUTPUT source.OrderId, INSERTED.OrderId 
INTO @ids (sourceOrderId, destinationOrderId)
;

INSERT #destination_orderitem (OrderId, ItemName)
SELECT ids.destinationOrderId, source.ItemName
FROM #source_orderitem source
INNER JOIN @ids ids ON ids.sourceOrderId = source.OrderId
This is simple and effective way and no cursor is required.