INSERT ... ON DUPLICATE KEY UPDATE behavior in an emergency
- Translation
A few weeks ago I was working on a problem for a client whose database suffered performance drops, and even outright failures, roughly every 4 weeks. There was nothing special about the environment, the hardware, or the queries. Essentially, most of the database was a single table which, among other things, had an INT AUTO_INCREMENT PRIMARY KEY and a UNIQUE KEY. The queries working with this table were almost all of the type INSERT ... ON DUPLICATE KEY UPDATE (hereinafter INSERT ODKU), where the columns listed in the INSERT corresponded to the columns of the UNIQUE KEY. And they were executed at a rate of approximately 1500-2000 requests per second, continuously, 24 hours a day. If you are good at math, you have probably already guessed what the matter was.
For discussion purposes, we will use the following table as an illustration of the situation:
CREATE TABLE update_test (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  username VARCHAR(20) NOT NULL,
  host_id TINYINT UNSIGNED NOT NULL,
  last_modified TIMESTAMP NULL DEFAULT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY (username)
) ENGINE=InnoDB;
Now imagine the following sequence of events:
(root@localhost) [test]> INSERT INTO update_test (username, host_id, last_modified) VALUES ('foo', 3, NOW());
Query OK, 1 row affected (0.00 sec)
(root@localhost) [test]> select * from update_test;
+----+----------+---------+---------------------+
| id | username | host_id | last_modified |
+----+----------+---------+---------------------+
| 1 | foo | 3 | 2012-10-05 22:36:30 |
+----+----------+---------+---------------------+
Nothing out of the ordinary, right? We inserted one row into an empty table, and if we run SHOW CREATE TABLE, we will see that the AUTO_INCREMENT counter now has the value 2. If we run an INSERT ODKU against this table, we will see the following:
(root@localhost) [test]> insert into update_test (username, host_id) values ('foo', 1) on duplicate key update last_modified=NOW();
Query OK, 2 rows affected (0.00 sec)
(root@localhost) [test]> select * from update_test;
+----+----------+---------+---------------------+
| id | username | host_id | last_modified |
+----+----------+---------+---------------------+
| 1 | foo | 3 | 2012-10-05 22:58:28 |
+----+----------+---------+---------------------+
1 row in set (0.00 sec)
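The counter bump can be confirmed directly: SHOW CREATE TABLE prints the next value the counter will hand out. A sketch against the demo table above (the exact tail of the output varies by MySQL version and charset settings):

```sql
-- After the initial INSERT the counter was 2; after the INSERT ODKU
-- above it is 3, even though the table still holds a single row.
SHOW CREATE TABLE update_test;
-- The printed definition ends with something like:
--   ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=...
```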
And now, even though we did not insert a new row, our AUTO_INCREMENT counter has grown to 3. This is, in fact, the expected behavior. InnoDB checks constraints in the order in which they were defined, and the PRIMARY KEY always comes first. So MySQL processes our INSERT, sees that the next AUTO_INCREMENT value is available and claims it, but then checks the UNIQUE KEY and finds a violation, so instead of an INSERT it performs an UPDATE. If we look at the handler status counters, we can see that there was one insert request that did not go through and one update request that succeeded (which explains why 2 rows were affected, not 1):
(root@localhost) [test]> show status like 'handler%';
*** some rows omitted ***
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_commit | 1 |
| Handler_rollback | 0 |
| Handler_update | 1 |
| Handler_write | 1 |
+----------------------------+-------+
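Since every such INSERT ODKU burns one counter value without adding a row, it is worth watching the counter itself, not just the row count. One way to do that (a sketch, assuming the demo table lives in the `test` schema) is to query information_schema:

```sql
-- Next AUTO_INCREMENT value for update_test.
-- Caveat: on MySQL 8.0 this value may be served from cached table
-- statistics unless information_schema_stats_expiry is set to 0.
SELECT AUTO_INCREMENT
  FROM information_schema.TABLES
 WHERE TABLE_SCHEMA = 'test'
   AND TABLE_NAME   = 'update_test';
```

Graphing this value over time would have shown the client's counter marching toward the INT UNSIGNED ceiling long before anything failed.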
At this point you might think, "So what?" Let's get back to our customer: 1500 INSERT ODKU statements per second, continuously, 24 hours a day. The PRIMARY KEY of their table was the same as in my demo table: INT UNSIGNED. Let's count. The maximum value for INT UNSIGNED is 4294967295. Divide that by 1500 requests per second, then by 86400, the number of seconds in a day, and we get 33.1 days, or a little more than 4 weeks. Coincidence? I think not. So what exactly happens when we run past that value? Some of the behavior may surprise you. Let's return to our demo table, insert a row with the maximum value in the AUTO_INCREMENT column, and then insert one more:
(root@localhost) [test]> insert into update_test (id, username, host_id) values (4294967295, 'bar', 10);
Query OK, 1 row affected (0.00 sec)
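The back-of-the-envelope estimate above is easy to double-check in SQL itself:

```sql
-- Days until an INT UNSIGNED counter is exhausted at 1500 burned
-- values per second: max value / rate / seconds per day.
SELECT 4294967295 / 1500 / 86400 AS days_to_exhaustion;
-- Roughly 33.14 days, i.e. a little over 4 weeks.
```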
(root@localhost) [test]> flush status;
(root@localhost) [test]> insert into update_test (username, host_id) values ('baz', 10);
ERROR 1062 (23000): Duplicate entry '4294967295' for key 'PRIMARY'
(root@localhost) [test]> show status like 'handler%';
*** some rows omitted ***
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Handler_rollback | 1 |
| Handler_write | 1 |
+----------------------------+-------+
So, we tried to insert a row and it failed: the AUTO_INCREMENT counter had already reached its maximum value, and the statement was rejected. But what happens if we try an INSERT ODKU instead? First, let's see what we have in the table:
(root@localhost) [test]> select * from update_test;
+------------+----------+---------+---------------------+
| id | username | host_id | last_modified |
+------------+----------+---------+---------------------+
| 1 | foo | 3 | 2012-10-05 22:58:28 |
| 4294967295 | bar | 10 | NULL |
+------------+----------+---------+---------------------+
2 rows in set (0.00 sec)
(root@localhost) [test]> INSERT INTO update_test (username, host_id) VALUES ('foo', 7) ON DUPLICATE KEY UPDATE host_id=7, last_modified=NOW();
Query OK, 2 rows affected (0.00 sec)
Looks fine, huh? 2 rows affected, so obviously the row matching the condition username = 'foo' had its host_id and last_modified updated, and we can be happy. Unfortunately, that is not the case:
(root@localhost) [test]> select * from update_test;
+------------+----------+---------+---------------------+
| id | username | host_id | last_modified |
+------------+----------+---------+---------------------+
| 1 | foo | 3 | 2012-10-05 22:58:28 |
| 4294967295 | bar | 7 | 2012-10-05 23:24:49 |
+------------+----------+---------+---------------------+
2 rows in set (0.00 sec)
Oops. The row that was updated is the last one, whose id equals the maximum AUTO_INCREMENT value, while the UNIQUE KEY on the username column was ignored. Now we can easily understand the problem of the client whose database served as the inspiration for this post: 1500 requests per second, all trying to lock and update the same row, will not lead to anything good. Of course, there is a simple fix: change the data type of the AUTO_INCREMENT column from INT to BIGINT.
It turns out that this behavior is documented. The manual says that an INSERT ODKU on a table with multiple unique indexes is equivalent to the query
UPDATE update_test SET host_id = 7, last_modified = NOW() WHERE id = 4294967295 OR username = 'foo' LIMIT 1;
and of course the optimizer will choose the PRIMARY key rather than the secondary UNIQUE one.
What have we learned here?
- Exhausting AUTO_INCREMENT is much easier than it sounds. The real customer table contained fewer than 500k rows.
- Using SIGNED types for AUTO_INCREMENT is almost always a bad idea: you lose half the range of available values.
- Intuition, like the laws of physics, often fails in extreme situations.
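For the client's table, the simple fix mentioned above looks like the sketch below. Keep in mind that on a large table this ALTER rebuilds the table and can block writes for a long time, so it needs a maintenance window or an online schema change tool; the table and column names are from the demo above:

```sql
-- Widen the AUTO_INCREMENT column from INT UNSIGNED to BIGINT UNSIGNED.
-- BIGINT UNSIGNED tops out at 18446744073709551615; at 1500 burned
-- values per second that lasts on the order of hundreds of millions
-- of years instead of 33 days.
ALTER TABLE update_test
  MODIFY id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT;
```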