Tuesday, January 03, 2006

Grow Your Own Concurrency Problem

What's that? The sound of ORA-00001 approaching...
 FUNCTION key_not_in_table(pkey IN INT) RETURN BOOLEAN
 IS
  countkey INT;
 BEGIN
  SELECT count(key) INTO countkey
  FROM key_values WHERE key = pkey;

  IF countkey > 0 THEN
   RETURN FALSE;
  ELSE
   RETURN TRUE;
  END IF;
 END key_not_in_table;

 PROCEDURE insert_or_update(pkey IN INT,
   pval IN INT)
 IS
 BEGIN
  IF key_not_in_table(pkey) THEN
   INSERT INTO key_values
   VALUES (pkey, pval, 0);
  ELSE
   UPDATE key_values
   SET value = pval
   WHERE key = pkey;
  END IF;
 END insert_or_update;


Tony Andrews said...


Oh sorry, I was thinking this was the Daily WTF!

Bob B said...

And the correct way:

PROCEDURE insert_or_update(
pkey IN INT,
pval IN INT
) IS
BEGIN
INSERT INTO key_value( KEY, VALUE )
VALUES ( pkey, pval );
EXCEPTION
WHEN DUP_VAL_ON_INDEX THEN
UPDATE key_value kv
SET kv.VALUE = p_val
WHERE kv.KEY = pkey;
END insert_or_update;

or even more simply (10g and up, maybe even 9iR1 or 9iR2):

PROCEDURE insert_or_update(
pkey IN INT,
pval IN INT
) IS
BEGIN
MERGE INTO key_value old_kv
USING (
SELECT pkey KEY, pvalue VALUE FROM dual
) new_kv
ON (
old_kv.KEY = new_kv.KEY
)
WHEN MATCHED THEN UPDATE SET
old_kv.VALUE = new_kv.VALUE
WHEN NOT MATCHED THEN INSERT
( KEY, VALUE )
VALUES ( new_kv.KEY, new_kv.VALUE );

END insert_or_update;

Tony Andrews said...

Actually your first procedure looks simpler than the MERGE version!

And just for completeness:

procedure insert_or_update
( pkey in int
, pval in int
)
is
begin
update key_values kv
set kv.value = p_val
where kv.key = pkey;
if sql%rowcount = 0 then
insert into key_value( key, value ) values ( pkey, pvalue);
end if;
end insert_or_update;

Thai Rices said...

Ok, careful now contributors.

Only the first example provided by Bob b works.

Either you're being too ironic for me, or you've WTFed yourselves.

Tony Andrews said...

Thai, if you are referring to typos (pval and p_val for example) in my code then they are also in Bob B's first example (which I just copied and rearranged a little). Once those are corrected, it works:

SQL> create table key_value (key int, value int);

Table created.

SQL> create or replace procedure insert_or_update
2 ( pkey in int
3 , pval in int
4 )
5 is
6 begin
7 update key_value kv
8 set kv.value = pval
9 where kv.key = pkey;
10 if sql%rowcount = 0 then
11 insert into key_value( key, value ) values ( pkey, pval);
12 end if;
13 end;
14 /

Procedure created.

SQL> exec insert_or_update(1,77);

PL/SQL procedure successfully completed.

SQL> select * from key_value;

       KEY      VALUE
---------- ----------
         1         77

SQL> exec insert_or_update(1,88);

PL/SQL procedure successfully completed.

SQL> select * from key_value;

       KEY      VALUE
---------- ----------
         1         88

Thai Rices said...

Nope. Skip the typos.

Open two sessions.

first session:
exec insert_or_update(1,88)
second session:
exec insert_or_update(1,77)
first session:
second session:
[ fill in the gap ]
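The gap Thai leaves can be reproduced outside Oracle. Here is a minimal sketch in Python with SQLite standing in for the database (the locking details differ from Oracle's, and the names are illustrative): the count-then-insert check passes in both sessions, yet the second session's insert still dies on the unique constraint.

```python
import os
import sqlite3
import tempfile

# Two connections to the same database file play the two sessions.
path = os.path.join(tempfile.mkdtemp(), "kv.db")
s1 = sqlite3.connect(path)  # first session
s2 = sqlite3.connect(path)  # second session
s1.execute("CREATE TABLE key_value (key INTEGER PRIMARY KEY, value INTEGER)")
s1.commit()

def key_not_in_table(conn, pkey):
    # The count-then-decide check from the original code.
    (n,) = conn.execute(
        "SELECT count(key) FROM key_value WHERE key = ?", (pkey,)).fetchone()
    return n == 0

# Both sessions run the check before either inserts: both see "no row".
assert key_not_in_table(s1, 1)
assert key_not_in_table(s2, 1)

# First session inserts and commits...
s1.execute("INSERT INTO key_value VALUES (1, 88)")
s1.commit()

# ...so the second session's insert, chosen on stale information,
# hits the unique constraint (SQLite's equivalent of ORA-00001).
try:
    s2.execute("INSERT INTO key_value VALUES (1, 77)")
    s2.commit()
    outcome = "inserted"
except sqlite3.IntegrityError:
    outcome = "unique constraint violated"

print(outcome)  # unique constraint violated
```

The check and the insert are two separate statements, and nothing stops another session from writing in between them.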

Tony Andrews said...

Since the table will of course have a primary key, second session will fail with:

SQL> exec insert_or_update(1,77);
BEGIN insert_or_update(1,77); END;

ERROR at line 1:
ORA-00001: unique constraint (TANDREWS.KEY_VALUE_PK) violated
ORA-06512: at "TANDREWS.INSERT_OR_UPDATE", line 11
ORA-06512: at line 1

I'm not giving up on my version without a fight ;-) If I'm wrong then so is Tom Kyte here:


Tony Andrews said...

Actually, I'll concede that I am wrong (but in good company!). The whole point is that the procedure should either insert or update the row; to fail to insert or update the row because it already exists is a nonsense.

Bob B said...

Yeah, sorry for the typos. Was just throwing up the general idea for "correct" solutions without testing them. Called the table key_values in one half and key_value in the other, among other typos. I will say that my first solution is the only one of the 3 that replicates the same behavior.

The merge will do whatever it wants. Merge's performance could be better, neutral or worse, but it's a hell of a lot more straightforward about what it's doing and will likely be the best performer over all proportions of insert to update. The third solution changes the nature of the algorithm and could be a huge performance bottleneck if the rate of updates relative to inserts is low (as could the original solution if the rate of updates is high).
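The single-statement property that makes MERGE attractive exists in other engines too. A sketch in Python using SQLite's `INSERT ... ON CONFLICT DO UPDATE` (SQLite 3.24+, so long after this thread; purely illustrative, with invented names):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE key_value (key INTEGER PRIMARY KEY, value INTEGER)")

def insert_or_update(pkey, pval):
    # One atomic statement: no window between the existence check and
    # the write, which is the property MERGE buys you in Oracle.
    conn.execute(
        "INSERT INTO key_value (key, value) VALUES (?, ?) "
        "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
        (pkey, pval))
    conn.commit()

insert_or_update(1, 77)   # insert path
insert_or_update(1, 88)   # update path
row = conn.execute("SELECT key, value FROM key_value").fetchone()
print(row)  # (1, 88)
```

Because the database resolves the conflict inside one statement, there is no interleaving of two sessions' checks to go wrong.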

Thai Rices said...

Personally I prefer my code to do "what I want" rather than "whatever it wants".

pauln said...

Of course all this techie discussion hides a much bigger problem: that you have concurrent access to the data, but aren't managing it. In some ways the WTF code is better than most of the solutions; at least it stands a chance of spotting when the data has changed underneath you, and failing.

It may be that the code doesn't need to care about concurrent access, but this is a minority case.

I work on a system that does the WTF code - in two separate batch jobs - to create a file of INSERTs and a file of UPDATEs. Works fine because we're guaranteed serialised access. (Oracle 7.3, and it's about 8 years old, so don't tell me about 10g features...)

A step up is to look at the incoming PK value: if NULL then INSERT, else UPDATE. Then if the transactions overlap you'll know. Even better is row versioning - no hassle, as your framework can cope with that, if you have one. You still have to look at what's causing the concurrent access, and decide if really it should be happening at all.
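The row versioning mentioned above is usually implemented as optimistic locking: keep a version column and make every update conditional on the version you read. A sketch in Python with SQLite (table and column names invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer "
    "(id INTEGER PRIMARY KEY, phone TEXT, version INTEGER NOT NULL)")
conn.execute("INSERT INTO customer VALUES (1, '555-0100', 1)")
conn.commit()

def save_phone(cust_id, new_phone, version_read):
    # The update only succeeds if the row is still at the version this
    # actor read; a concurrent writer bumps the version, so a stale
    # write matches zero rows instead of silently overwriting.
    cur = conn.execute(
        "UPDATE customer SET phone = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_phone, cust_id, version_read))
    conn.commit()
    return cur.rowcount == 1

# Both actors read version 1; the first save wins, the second is told
# its data is stale.
first = save_phone(1, "555-0111", 1)
second = save_phone(1, "555-0222", 1)
print(first, second)  # True False
```

The losing writer gets a clear "someone got there first" signal and can re-read and retry, rather than clobbering the other write.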

Thai Rices said...

Nope. You're totally wrong.

"If the transactions overlap you'll know" - I don't care if the transactions overlap, just like I don't care if two updates occur at the same time. That's the point.

"look at the PK value - if NULL then INSERT" What are you talking about? That's just nonsense.

I would argue further, but I lost confidence in anything else you might have to say when I got to:

"...to create a file of INSERTs and a file of UPDATEs."

That's just plain spooky.

Read consistency is not a 10g feature. It was there in 7.3.

What is it that you think is giving you "guaranteed serialised access"?

pauln said...

"Nope. You're totally wrong."
No I'm not (oh yes you are...oh no I'm not)

"I don't care if the transactions overlap, just like I don't care if two updates occur at the same time. That's the point."
OK, fine, but in lots of cases you WILL care.
The problem is that in the failure mode (two overlapping calls), the two "actors" calling with key,value pairs to this function may both have used the fact that there was no value there already to decide what value to set. (Consider a web page with some settings, or where "KEY" is a batch filename to load and "VALUE" is the processing state.)
If you can cope with longer transactions this can be solved with the "Serializable" isolation level, but it's not very friendly.
Anyway, I'm not saying his code is right, or yours is wrong; I'm saying that you can't know whether either is right or wrong unless you know the (probably non-functional) requirements. (And that applies to your spooky comment too.)

"look at the PK value - if NULL then INSERT" What are you talking about? that's just nonsense."
OK, that wasn't clear - look at the value in the PK field of the passed record, so you know if the passed record (say a Customer) is NEW (id = null, 0, whatever) or existing (not null, > 0, etc). It doesn't apply in this case because you have a natural PK, but in the general case you have artificial keys.

"...to create a file of INSERTs and a file of UPDATEs."
"That's just plain spooky."
Nope, perfectly reasonable GIVEN THE REQUIREMENTS. We have lots of jobs that load a few million records, each of which will either update an existing aggregation record (5%) or create a new one (95%). By splitting into two you can direct-path load the inserts, and do the updates in the normal way - lots quicker. That's a 7.3 database. And what guarantees serialised access is that the two jobs run sequentially and nothing else writes to the database. Read consistency has nothing to do with this.
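The splitter described above can be sketched in a few lines. Assuming the set of keys already in the aggregation table is known up front, each incoming (key, value) record goes to either the bulk-load file or the update file (names and data shapes are illustrative, not from the real system):

```python
def split_batch(incoming, existing_keys):
    # Partition incoming (key, value) records: keys already present go
    # to the update file, new keys go to the direct-path insert file.
    inserts, updates = [], []
    for key, value in incoming:
        (updates if key in existing_keys else inserts).append((key, value))
    return inserts, updates

existing = {10, 20}
inserts, updates = split_batch([(10, 5), (30, 7), (40, 9)], existing)
print(inserts)  # [(30, 7), (40, 9)] - the 95% that bulk-load
print(updates)  # [(10, 5)] - the 5% that update in the normal way
```

The split is only safe under the stated requirement that nothing else writes to the table between the split and the two loads.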

"Read consistency is not a 10g feature. It was there in 7.3" - I know - annoyingly, though, it's not in SQL Server 2000!

Thai Rices said...


If you've used a surrogate key and you're inserting rows based on whether the application supplied surrogate value is [null] then you're going to end up with duplicate natural keys with different surrogate keys.
Unless you also define a unique constraint on the natural key, in which case the [null]ness of the supplied surrogate is irrelevant.

So I don't get your point at all.

"And what guarantees serialised access is that the two jobs run sequentially and nothing else writes to the database."

That doesn't "guarantee" anything.

"Lots quicker"

I was taking issue with "generating a file of inserts and a file of updates".

pauln said...

Yes, you might end up with duplicate natural keys - is this necessarily any less right than replacing the value, when the agent at the far end has passed data based on the fact that there was no data with the same natural key in the database?

Call centre gets call

operator can't find person in DB (<- txn should start here ideally, but it'd be too long)
puts the user's details in - takes a minute or so
(meanwhile..in the cellar)
job runs to load data from mail processing centre and creates a record for the customer calling
call centre op presses save to create the customer record

If you HAVE a good natural key (NI Number/SSN) your code might work - exception raised on duplicate key SSN, update info instead - but the DB gets a "random" set, either the operator's set or the mail data, depending on timing. Not good (a bit unpredictable for my liking).

In this case, your code will never error the client, but will populate the DB with an arbitrary set.
The WTF case will error the client if the requests hit very close together.
In all other cases either the WTF code or your code will do the same.
But the right thing to do in this case would be to reject back to the operator - make him search again (i.e. by using a surrogate key in addition to the natural key, or having the client call "add" or "update") - or back to the batch, if the batch should be new customers only.

With NO natural key then in this case with "my" code you'd get two rows for the same customer. Haha! you say - that's the problem. Yes, but there's no way to prevent that in the scenario above.

Insert and update files? Well, that's for two reasons: the splitter job is then decoupled from the database - you feed a set of existing ids and the new aggregations in (in files from the ETL tool) and get two files out. Then you have two jobs, one to sqlldr the inserts and one to "sqlplus" the updates. Usual goals: loose coupling, atomicity (the splitter can easily be coded to succeed or fail completely, not leave half a mess, etc).

It's guaranteed as far as any software is: db access is controlled, and there's a scheduling engine doing all the job dependencies to make sure the jobs don't overlap. Yes, someone could pop in and screw around with the data, but then we can't help that.

Thai Rices said...

Dear Pauln,

Great. So you return the error to the client.

What then?

Let's take the case where your client is a person in the call centre. Ever worked in/near a call centre operation? Upon being notified that "the user being created already exists" (or, as some posters to this blog would have it, "SQL exception ocured[sic]", or even just "") the call centre user looks at the screen blankly for a few seconds, thinks "what the hell is the system on about?", and re-submits the instruction, at which time the update occurs.

How does this help?

Or in the case of the batch job running in the basement, it has to deal with the exception somehow. How might this be? Either discard the change, or process the exception with an update. Granted, you could flag an exception at this point, which you could do in the update code if you so chose.

So imagine for a second that we did create a list of exceptions where two conflicting copies of a user's credentials were processed. To be consistent you'd have to do this when both a user and the batch job performed an update at the same time: how would you identify that? What would you gain?

And once you had the information, what would you do with it? Ring the client? And say what? "Excuse me Mr. X, but our batch processing system thinks your phone number is XYZ, whereas you called the help centre today and said it was HJG - what's going on?" Which number would you even call?

Let's avoid the issue of the correct identification of a natural key for a client. I think this has been done to death on certain other forums not so long ago. However, the "good" natural key you identified is actually a bad natural key. It limits you to dealing with clients who have a valid NI or SSN number. Additionally, neither of these is actually unique.

pauln said...

What then? Well, maybe your suggestion, although I'd knock the user back to the search point - then when he searched again the customer would be found, and the client logic can operate on the correct assumption that the user exists.
In the batch - again, maybe; it depends on the circumstances, and again just resubmitting may be enough.

My point was only that you can't universally condemn the code without knowing more about its environment. It may be that it's perfectly acceptable to blindly update the value that you didn't know was there, maybe it's not.

OK, NI Number may not be a good key IN THIS CASE (only over-16s have one, and people can have two if they're defrauding the govt) - if you were building a social security system for the DSS it might be - but again you've made a sweeping generalisation without knowing the requirements... (Unless you've built a system for the DSS and know that NI number isn't unique amongst DSS customers, in which case I cede to your experience.)

Thai Rices said...

"My point was only that you can't universally condemn the code without knowing more about its environment."

I can when the code is universally wrong for any environment.

You DON'T count the number of rows in a table for a key value and then insert if you get 0.

That's just wrong.

If you wanted to report the exception back to the user then you don't write a procedure called "insert_or_update" you write one called "add_user" and allow the exception to propagate back.

If you DO want to process an insert_or_update then you write an INSERT and bounce off the dup_val_on_index to do the update.

There is no other option.

The code mentioned also suffers from a concurrent delete issue: "updated 0 records", which cannot be handled as it stands.

It's wrong. Did I mention it? Just wrong.
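The pattern Thai describes - insert, bounce off the duplicate-key exception into an update, and go round again if a concurrent delete leaves the update matching zero rows - can be sketched in Python, with SQLite standing in for Oracle and IntegrityError playing the role of dup_val_on_index (illustrative only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE key_value (key INTEGER PRIMARY KEY, value INTEGER)")

def insert_or_update(pkey, pval, max_tries=3):
    for _ in range(max_tries):
        try:
            # Try the insert first...
            conn.execute("INSERT INTO key_value VALUES (?, ?)", (pkey, pval))
            conn.commit()
            return
        except sqlite3.IntegrityError:
            # ...and bounce off the duplicate key into the update.
            cur = conn.execute(
                "UPDATE key_value SET value = ? WHERE key = ?", (pval, pkey))
            conn.commit()
            if cur.rowcount == 1:
                return
            # 0 rows updated: a concurrent delete removed the row
            # between our insert and our update, so go round again.
    raise RuntimeError("insert_or_update: gave up after retries")

insert_or_update(1, 77)   # insert path
insert_or_update(1, 88)   # duplicate key, update path
final = conn.execute("SELECT value FROM key_value WHERE key = 1").fetchone()
print(final)  # (88,)
```

Unlike the count-then-insert version, the existence check here is the insert itself, so there is no window in which another session's insert can invalidate the decision.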
