Tuesday, January 03, 2006

Grow Your Own Concurrency Problem

What's that? the sound of ORA-00001: approaching...
...
 FUNCTION key_not_in_table(pkey IN INT) RETURN BOOLEAN
 IS
  countkey INT;
 BEGIN
  SELECT count(key) INTO countkey
  FROM key_values WHERE key = pkey;

  IF countkey > 0 THEN
   RETURN FALSE;
  END IF;
  RETURN TRUE;

 END key_not_in_table;
 
 PROCEDURE insert_or_update(pkey IN INT,
   pval IN INT)
 IS
 BEGIN
  IF key_not_in_table(pkey) THEN
   INSERT INTO key_values
   VALUES (key, value, 0);
  ELSE
   UPDATE key_values
   SET value =  pval
   WHERE key = pkey;
  END IF;
 END insert_or_update;

14 comments:

Tony Andrews said...

FIRST!

Oh sorry, I was thinking this was the Daily WTF!

Bob B said...

And the correct way:

PROCEDURE insert_or_update(
pkey IN INT,
pval IN INT
)
IS
BEGIN
BEGIN
INSERT INTO key_value(
KEY,
VALUE
) VALUES (
pkey,
pvalue
);
EXCEPTION
WHEN DUP_VAL_ON_INDEX THEN
UPDATE KEY_VALUES kv
SET kv.VALUE = p_val
WHERE kv.KEY = pkey;

END;

END insert_or_update;

or even more simply (10g and up, maybe even 9ir1 or 9ir2):

PROCEDURE insert_or_update(
pkey IN INT,
pval IN INT
)
IS
BEGIN
MERGE INTO KEY_VALUES old_kv
USING (
SELECT pkey KEY, pvalue VALUE
FROM DUAL
) new_kv
ON (
old_kv.KEY = new_kv.KEY
)
WHEN NOT MATCHED
THEN INSERT(
KEY,
VALUE
) VALUES (
new_kv.KEY,
new_kv.VALUE
)
WHEN MATCHED
THEN UPDATE
SET
old_kv.KEY = new_kv.KEY,
old_kv.VALUE = new_kv.VALUE;

END insert_or_update;

Tony Andrews said...

Actually your first procedure looks simpler than the MERGE version!

And just for completeness:

procedure insert_or_update
( pkey in int
, pval in int
)
is
begin
update key_values kv
set kv.value = p_val
where kv.key = pkey;
if sql%rowcount = 0 then
insert into key_value( key, value ) values ( pkey, pvalue);
end if;
end;

Adrian said...

Ok, careful now contributors.

Only the first example provided by Bob b works.

Either you're being too ironic for me, or you've WTFed yourselves.

Tony Andrews said...

Thai, if you are referring to typos (pval and p_val for example) in my code then they are also in Bob B's first example (which I just copied and rearranged a little). Once those are corrected, it works:

SQL> create table key_value (key int, value int);

Table created.

SQL> create or replace procedure insert_or_update
2 ( pkey in int
3 , pval in int
4 )
5 is
6 begin
7 update key_value kv
8 set kv.value = pval
9 where kv.key = pkey;
10 if sql%rowcount = 0 then
11 insert into key_value( key, value ) values ( pkey, pval);
12 end if;
13 end;
14 /

Procedure created.

SQL> exec insert_or_update(1,77);

PL/SQL procedure successfully completed.

SQL> select * from key_value;

KEY VALUE
---------- ----------
1 77

SQL> exec insert_or_update(1,88);

PL/SQL procedure successfully completed.

SQL> select * from key_value;

KEY VALUE
---------- ----------
1 88

Adrian said...

Nope. Skip the typos.

Open two sessions.

first session:
exec insert_or_update(1,88)
second session:
exec insert_or_update(1,77)
first session:
commit;
second session:
[ fill in the gap ]

Tony Andrews said...

Since the table will of course have a primary key, second session will fail with:

SQL> exec insert_or_update(1,77);
BEGIN insert_or_update(1,77); END;

*
ERROR at line 1:
ORA-00001: unique constraint (TANDREWS.KEY_VALUE_PK) violated
ORA-06512: at "TANDREWS.INSERT_OR_UPDATE", line 11
ORA-06512: at line 1

I'm not giving up on my version without a fight ;-) If I'm wrong then so is Tom Kyte here:

http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:6618304976523

Tony Andrews said...

Actually, I'll concede that I am wrong (but in good company!) The whole point is that the procedure should either insert or update the row; to fail to insert or update the row because it already exists is a nonsense.

Bob B said...

Yeah, sorry for the typos. Was just throwing up the general idea for "correct" solutions without testing them. Called the table key_values in one half and key_value in the other, among other typos. I will say that my first solution is the only one of the 3 that replicates the same behavior.

The merge will do whatever it wants. Merge's performance could be better, neutral or worse, but its a hell of a lot more straightforward on what its doing and will likely be the best performer over all proportions of insert to update. The third solution changes the nature of the algorithm and could be a huge performance bottleneck if the rate of updates relative to inserts is low (as could the original solution if the rate of updates is high).

Adrian said...

Personally I prefer my code to do "what I want" rather than "whatever it wants".

Adrian said...

Nope. You're totally wrong.

"If the transactions overlap you'll know" - I don't care if the transactions overlap, just like I don't care if two updates occur at the same time. That's the point.

"look at the PK value - if NULL then INSERT" What are you talking about? that's just nonsense.

I would argue further, but I lost confidence in anything else you might have to say when you I got to:

"...to create a file of INSERTs and a file of UPDATEs."

That's just plain spooky.

Read consistency is not a 10g feature. It was there in 7.3

What is it that you think is giving you "guaranteed serialised access"?

Adrian said...

Sigh.

If you've used a surrogate key and you're inserting rows based on whether the application supplied surrogate value is [null] then you're going to end up with duplicate natural keys with different surrogate keys.
Unless you also define a unique constraint on the natural key, in which case the [null]ness of the supplied surrogate is irrelevant.

So I don't get your point at all.

And what guarantees serialised access is that the two jobs run sequentially and nothing else writes to the database.

That doesn't "guarantee" anything.

Lots quicker

I was taking issue with "generating a file of inserts and a file of updates".

Adrian said...

Dear Pauln,

Great. So you return the error to the client.

What then?

Lets take the case where your client is a person in the call centre. Ever worked in/near a call centre operation? Upon being notified that "the user being created already exists" ( or as some posters to this blog would have it "SQL exception ocured[sic]", or even just "") the call centre user looks at the screen blankly for a few seconds and then thinks "what the hell is the system on about?" and re-submits the instruction, at which time the update occurs.

How does this help?

Or in the case of the batch job running in the basement, it has to deal with the exception some how. How might this be? Either discard the change, or process the exception with an update. Granted you could flag an exception at this point, which you could do in the update code if you so chose.

So imagine for a second that we did create a list of exceptions where two conflicting copies of a user's credentials were processed. To be consistent you'd have to do this when both a user and the batch job performed at update at the same time: how would you identify that? what would you gain?

And once you had the information what would you do with it? Ring the client? and say what? "excuse me Mr. X, but our batch processing system thinks your phone number is XYZ whereas you called the help centre to day and said it was HJG? what's going on?" - which number would you even call?

Lets avoid the issue of the correct identification of a natural key for a client. I think this has been done to death on certain other forums not so long ago. However the "good" natural key you identified is actually a bad natural key. It limits you to dealing with clients who have a valid NI or SSN number. Additionally neither of these are actually unique.

Adrian said...

"My point was only that you can't universally condemn the code without knowing more about its environment."

I can when the code is universally wrong for any environment.

You DON'T count the number of rows in a table with a key value and then insert if you get "0".

That's just wrong.

If you wanted to report the exception back to the user then you don't write a procedure called "insert_or_update" you write one called "add_user" and allow the exception to propagate back.

If you DO want to process an insert_or_update then you write an INSERT and bounce off the dup_val_on_index to do the update.

There is no other option.

The code mentioned also suffers from a concurrent delete issue. "updated 0 records" which cannot be handled as it stands.

It's wrong. Did I mention it? Just wrong.