Monday, February 27, 2006

Hey, we do the WTFs Part II...

At this rate, we might need to make Connor an honorary member. Here's another gem from his web site that he's picked up on his travels. PL/SQL doesn't get much "better" than this...

Sunday, February 26, 2006

Tip: add DISTINCT to every query

We all know that SQL can be a harsh mistress, and relational theorists such as Chris Date have long argued that the language is fundamentally flawed and that vendors have been misapplying relational theory from the start.

Now a Perl developer on perlmonks.org has been reading Date's book and finds that it explains all of his frustrations with databases. One tip for addressing their shortcomings is to add the handy DISTINCT keyword to every single query, because stupid old SQL doesn't automatically apply the degree of uniqueness you might have in mind:

In fact, one of the founders of relational theory, C.J. Date, recommends that every SQL SELECT statement be written as SELECT DISTINCT ... Unfortunately, many folks get upset with this for a variety of reasons. First, DBMS do a horrible job of optimizing DISTINCT queries. There are many cases where the DISTINCT can be safely ignored but in practice, using DISTINCT with large sets of data and complicated queries can have an abysmal impact on performance. I think this can be put into the "premature optimization" category, though. Until we know that DISTINCT is causing an impact, isn't it better to improve the chance of getting correct results and worry about performance afterwards?

I had to read that a couple of times to realise that adding a DISTINCT to every single query in the hope that it might mask some unknown deficiency in your model, your query, or the SQL language itself is not the "premature optimization" being referred to here - he means the idea that doing so might affect performance. After all, it might not, right?

Read the rest of the discussion at perlmonks.org: "Why SQL Sucks (with a little Perl to fix it)". It also appears on Database Debunkings, "On the sins of SQL".

Friday, February 24, 2006

Hey, we do the WTFs...

I found this beauty on Connor McDonald's web site. It's perfect fodder for Oracle WTF and Connor is happy for us to include it here.

Stop Press: Oracle Granted License To Extend February

Yes, it's official. Oracle has been granted permission to extend February by 3 days. Shame no-one told the developers responsible for INTERVAL arithmetic.

SQL> SELECT DATE '2006-01-31' + INTERVAL '1' MONTH
  2  FROM   dual;
SELECT DATE '2006-01-31' + INTERVAL '1' MONTH
                           *
ERROR at line 1:
ORA-01839: date not valid for month specified

Sunday, February 19, 2006

Wanna Date?

Dates are known to be exceedingly difficult and avoiding them at all costs is something of a skill. This function, in the spirit of Never do in SQL what you can do in PL/SQL, calculates a date range before calling another procedure that also avoided using dates for its input date parameters. So despite its absence, it is at least partially responsible for this mess.

I stripped the code down to its date handling which studiously avoids using date calculations wherever possible, and uses string handling instead, leaving in the comments because they are also the documentation.

create or replace function start_date (
p_range         in      varchar2,
p_in_date       in      varchar2    -- DD-MON-YYYY format String
)
return varchar2
as
l_out_date varchar2(11);
month varchar2(10) := to_char(to_date(p_in_date,'DD-MON-YYYY'),'MON');
year varchar2(10)  := to_char(to_date(p_in_date,'DD-MON-YYYY'),'YYYY');
v_cnt_yr        number;
v_end_date      date    := to_date(p_in_date,'DD-MON-YYYY');
v_start_date    varchar2(11);
begin
if p_range = 'QTD' THEN

  if month in ('JAN','FEB','MAR') then
     -- if given month = march and date is 31
     -- then data for jan,feb and march.
     l_out_date := '01-JAN-'||year;
  elsif month in ('APR','MAY','JUN')  then
     l_out_date := '01-APR-'||year;
  elsif month in ('JUL','AUG','SEP')  then
     l_out_date := '01-JUL-'||year;
  elsif month in ('OCT','NOV','DEC')  then
     l_out_date := '01-OCT-'||year;
  end if;

elsif p_range = 'YTD' then

  -- beginning of the year.
   l_out_date := '01-JAN-'||year;

elsif p_range = 'M' or p_range='MTD' then

  -- beginning of month
   l_out_date := '01-'||month||'-'||year;

elsif p_range like 'B%' then
   v_cnt_yr     := substr(ltrim(rtrim(p_range)),2);
   -- We take the start date as the first day after trailing
   -- back the required no. of months
   l_out_date   := to_char(last_day(add_months(
                      last_day(v_end_date), -v_cnt_yr)) + 1,
                      'DD-MON-YYYY');
end if;
return l_out_date;
end;

After reading it I thought, "So what does this do that TRUNC doesn't?" Apparently not a lot when you need to get the month, quarter or year to date. If you try this at home remember to format the return of START_DATE for readability since it usefully returns a 4000 character string.

SQL> exec :d := '17-JUN-2006'

PL/SQL procedure successfully completed.

SQL> select start_date('QTD',:d) start_date,
2    trunc(to_date(:d),'Q') from dual;

START_DATE  TRUNC(TO_DA
----------- -----------
01-APR-2006 01-APR-2006

SQL> select start_date('YTD',:d) start_date,
2    trunc(to_date(:d),'Y') from dual;

START_DATE  TRUNC(TO_DA
----------- -----------
01-JAN-2006 01-JAN-2006

SQL> select start_date('MTD',:d) start_date,
2    trunc(to_date(:d),'MM') from dual;

START_DATE  TRUNC(TO_DA
----------- -----------
01-JUN-2006 01-JUN-2006

But what about the mysterious 'B%' format mask? This calculates the first day of the month, where 'Bn' is n-1 months ago, tricky eh? In SQL we are forced to call two functions instead of having START_DATE call ADD_MONTHS for us with two bonus LAST_DAYS thrown in for good measure. The n-1 bit could even be a bug but who knows?

SQL> select start_date('B12',:d) start_date,
2    trunc(add_months(to_date(:d),-11),'MM') from dual;

START_DATE  TRUNC(ADD_M
----------- -----------
01-JUL-2005 01-JUL-2005

The convenience obviously outweighs the problem of having to deal with an undocumented date function that accepts and returns strings. Also with this function I have the luxury of substituting 'M' for 'MTD', but not 'Y' for 'YTD' or 'Q' for 'QTD' though I suspect these could be improvements for versions 2.0 and 3.0.

Tuesday, February 14, 2006

The Phantom's Gonna Git Ya

I know I'm asking for trouble here by offering an AskTom page for a WTF, but I couldn't resist. If there was ever a time you wanted your spelling to be spot on, it would be when posting a link to a spell-checker...

Thursday, February 09, 2006

EAV Returns: The Concrete Elephant approach

Anyone who has read Tales Of The Oak Table, not to mention Tony Andrews' blog or any of the countless articles and discussions on the subject on AskTom and elsewhere, will know two things about the fabled "Entity-Attribute-Value" approach to database design, in which you model all "things" in one table with a "thing ID" and a "thing type", plus a second table holding one row per "attribute", and thus create an application that can model any conceivable type of thing, ever:

  1. It seems like a clever idea at first.
  2. It isn't.

But wait. A poster on OTN forums ("SIMPLE Database Design Problem") has solved the major problems inherent in the original Entity-Attribute-Value approach, by simply denormalising away the Attribute-Value part.

Now the ENTITIES table will have all the columns for every entity type. Maybe a lot of them will be null because "INVOICE" rows will use mainly different columns to "TROPICAL_DISEASE" rows, but disk space is cheap, and look at the simplification we have achieved by not having to babysit all those old-fashioned tables. And it's not a generic design any more, is it? It's concrete.

The table would look something like this:

ENTITYID ENTITYTYPE NAME      PRICE DIET  COLOUR ANNUAL_TURNOVER
-------- ---------- --------- ----- ----- ------ ---------------
1        PERSON     William
2        FRUIT      Banana                Yellow
3        COMPANY    Megacorp                     100000000
4        ANIMAL     Fruitbat        Fruit
5        SNACK      Snickers  0.4

accompanied by a generic RELATIONS table like this:

ENTITY1 ENTITY2 RELATIONSHIP
------- ------- ------------
3       1       EMPLOYS
1       2       EATS
1       5       EATS

Want to list the snacks eaten by Megacorp employees? Simple:

SELECT emp.entityid, emp.name, snack.name, snack.price
FROM   entities emp
       JOIN relations emprel
       ON  emprel.entity2 = emp.entityid
       AND emprel.relationship = 'EMPLOYS'
 
       JOIN entities com
       ON com.entityid = emprel.entity1
       AND com.entitytype = 'COMPANY'
 
       JOIN relations snrel
       ON  snrel.entity1 = emp.entityid
       AND snrel.relationship = 'EATS'
 
       JOIN entities snack
       ON  snack.entityid = snrel.entity2
       AND snack.entitytype = 'SNACK'
 
WHERE  emp.entitytype = 'PERSON'
AND    com.name = 'Megacorp';

Want to make FRUITBAT an employee of SNICKERS?

INSERT INTO relations VALUES (5, 4, 'EMPLOYS');

The thread becomes increasingly surreal as more and more posters suggest likely issues, from performance (he's prototyped it and the slowdown is insignificant) to complexity (the code will be generated dynamically from an object library) and the limited number of columns per table in Oracle (he might go with MySQL) while Erdem remains cheerfully confident that it will work (it won't).

My thanks to 3360 for sharing this. Send your WTFs to us at OracleWTF@bigfoot.com.

Monday, February 06, 2006

Counting Sheep

I'd not seen this Oracle Forums thread before, though it started in 2002 and now has 198 replies. Somebody once asked for some PL/SQL coding standards, someone else offered to email some, and then for ever after gets bombarded with requests from other people saying "Please send same to me at another-idiot-sheep@nobrain.com" Every now and then someone kindly posts a URL to some PL/SQL standards on the web, or points out that these people are just getting their email addresses onto a lot of spam mailing lists, but on and on they go asking for a copy to be sent direct to them. It's surprisingly funny.

Sunday, February 05, 2006

I Object, Your Honour...

Erm, excuse me for interrupting, but what exactly is this?
There are 2 ways to construct an ANYDATA. The CONVERT* calls enable construction of the ANYDATA in its entirety with a single call. They serve as explicit CAST functions from any type in the Oracle ORDBMS to ANYDATA.
(Found here in the 10.1 documentation).