Update or insert rows in SQL
Some operations are not as easy in SQL as in procedural or imperative languages. For example while it is very intuitive to make an update or insert (also known as “upsert”) operation on a std::vector in C++, it is tricky in SQL. The upsert operation is: update data if already in the set and insert it otherwise.
As for recursion in SQL, the way to perform an upsert depends on the DBMS and its version.
This article introduces different methods to upsert data in a table. Many solutions are introduced because not all DBMS are compatible with SQL:2008 and the MERGE statement. As of today, it is compatible from these versions:
MySQL does not provide this statement but its own SQL extentions:
Microsoft SQL Server 2000 and 2005 don't provide this feature but upsert is possible using the output clause for the UPDATE statement.
Finally, the only solution I am aware of for PostgreSQL is to use a cursor and test each line: rows existing in the target table are updated and the others are inserted.
Beware: all examples were written under Microsoft SQL Server 2008 (except the MySQL part). They may use some specific code but should remain fully understandable by everyone used to SQL.
Dataset
Before introducing all methods, let's set up the sample data.
@tUsers is the working table. It contains a list of users with one particular field: active. When inserting data, this field is not set.
declare @tUsers table ( id int NOT NULL, firstName varchar (30) NOT NULL, lastName varchar (30) NOT NULL, active tinyint NULL ) insert into @tUsers (id, firstname, lastname) values (1, 'Clément', 'Perrod') insert into @tUsers (id, firstname, lastname) values (2, 'Thomas', 'Dupont') insert into @tUsers (id, firstname, lastname) values (3, 'John', 'Doe')
The table contains the following rows:
id firstName lastName active 1 Clément Perrod NULL 2 Thomas Dupont NULL 3 John Doe NULL
@tStaging will be the source table for upsert in @tUsers. Some rows are new and others are just an update for the active field.
declare @tStaging table ( id int NOT NULL, firstName varchar (30) NOT NULL, lastName varchar (30) NOT NULL, active tinyint NULL ) insert into @tStaging (id, firstname, lastname, active) values (1, 'Clément', 'Perrod', 1) insert into @tStaging (id, firstname, lastname, active) values (3, 'John', 'Doe', 0) insert into @tStaging (id, firstname, lastname) values (4, 'Elise', 'Duffin') insert into @tStaging (id, firstname, lastname) values (5, 'Anonie', 'Mousse')
The table contains following rows:
id firstName lastName active 1 Clément Perrod 1 3 John Doe 0 4 Elise Duffin NULL 5 Anonie Mousse NULL
Methods
MERGE statement
The easiest way to perform upsert is to use the MERGE statement introduced by SQL:2008. Its syntax is pretty intuitive: different treatments are applied on the target table using data from the source table depending on a merging clause. In this article it is used for upsert, but it is also possible to use it to delete data in target table.
MERGE
[ TOP ( expression ) [ PERCENT ] ]
[ INTO ] <target_table> [ WITH ( <merge_hint> ) ] [ [ AS ] table_alias ]
USING <table_source>
ON <merge_search_condition>
[ WHEN MATCHED [ AND <clause_search_condition> ]
THEN <merge_matched> ] [ ...n ]
[ WHEN NOT MATCHED [ BY TARGET ] [ AND <clause_search_condition> ]
THEN <merge_not_matched> ]
[ WHEN NOT MATCHED BY SOURCE [ AND <clause_search_condition> ]
THEN <merge_matched> ] [ ...n ]
[ <output_clause> ]
[ OPTION ( <query_hint> [ ,...n ] ) ]
;
The upsert is as easy as this:
merge @tUsers as us using @tStaging as st on us.id = st.id /* Update */ when matched then update set us.active = st.active /* Insert */ when not matched then insert (id, firstName, lastName, active) values (st.id, st.firstName, st.lastName, st.active) /* Mandatory semi colonn */ ;
ON DUPLICATE KEY clause of the INSERT statement
This upsert method is based on the ON DUPLICATE KEY clause for the INSERT statement, which is available in MySQL 5 and above.
This solution only works if the table @tUsers has a primary key. Thus, for this example to work, the table declaration is:
CREATE TABLE `tUsers` ( `id` int(11) NOT NULL, `firstName` varchar(30) NOT NULL, `lastName` varchar(30) NOT NULL, `active` tinyint(4) DEFAULT NULL, PRIMARY KEY (`id`,`firstName`,`lastName`) ) ; INSERT INTO tUsers (id, firstname, lastname) VALUES (1, 'Clément', 'Perrod') ; INSERT INTO tUsers (id, firstname, lastname) VALUES (2, 'Thomas', 'Dupont') ; INSERT INTO tUsers (id, firstname, lastname) VALUES (3, 'John', 'Doe') CREATE TABLE `tStaging` ( `id` int(11) NOT NULL, `firstName` varchar(30) NOT NULL, `lastName` varchar(30) NOT NULL, `active` tinyint(4) DEFAULT NULL ) ; INSERT INTO tStaging (id, firstname, lastname, active) VALUES (1, 'Clément', 'Perrod', 1) ; INSERT INTO tStaging (id, firstname, lastname, active) VALUES (3, 'John', 'Doe', 0) ; INSERT INTO tStaging (id, firstname, lastname) VALUES (4, 'Elise', 'Duffin') ; INSERT INTO tStaging (id, firstname, lastname) VALUES (5, 'Anonie', 'Mousse')
Using the clause is as simple as follows:
insert into tUsers select st.id, st.firstName, st.lastName, st.active from tStaging st ON DUPLICATE KEY UPDATE active = st.active
OUTPUT clause of the UPDATE statement
This upsert method is based on the OUTPUT clause of the UPDATE statement, which is available in Microsoft SQL Server 2000 and 2005 and above.
First, an update is performed on @tUsers, using the rows of @tStaging. The OUTPUT clause keeps all updated rows in the @tUpdated table. We are now able to see which rows of @tStaging have been used and which have to be inserted.
declare @tUpdated table ( id int ) update @tUsers set active = st.active output inserted.id into @tUpdated from @tStaging st, @tUsers us where us.id = st.id
The second step is a simple INSERT statement, in which rows that are not in @tUpdated are selected from @tStaging using a LEFT OUTER JOIN (or a NOT EXISTS statement).
insert into @tUsers select st.id, st.firstName, st.lastName, st.active from @tStaging st left join @tUpdated up on up.id = st.id WHERE up.id is null
Final points
In this article, I have been through the most famous solutions to update or insert rows in a table (except the cursor one). Even if some methods are easier to use, it all depends on the installed DBMS.
MERGE statement |
ON DUPLICATE KEY clause for INSERT statement |
output clause for UPDATE statement |
cursors |
|
|
|
|
Stay tuned: we will discuss later why there are so many differences between DBMS.
Read more about merge
- MS SQL Server 2008: MERGE
- IBM DB2 v9: MERGE
- Oracle 11g: MERGE
- MS SQL Server 2000 and 2005: output
clause - MySQL 5: ON DUPLICATE KEY clause

Your post is top notch.Will bookmark your web page for additional visits,As an aside,where did you get this remarkable theme for your blog site?