Monday, February 27, 2006
Hey, we do the WTFs Part II...
Sunday, February 26, 2006
Tip: add DISTINCT to every query
We all know that SQL can be a harsh mistress, and relational theorists such as Chris Date have long argued that the language is fundamentally flawed and that vendors have been misapplying relational theory from the start.
Now a Perl developer on perlmonks.org has been reading Date's book and finds that it explains all of his frustrations with databases. One tip for addressing their shortcomings is to add the handy DISTINCT keyword to every single query, because stupid old SQL doesn't automatically apply the degree of uniqueness you might have in mind:
In fact, one of the founders of relational theory, C.J. Date, recommends that every SQL SELECT statement be written as SELECT DISTINCT ... Unfortunately, many folks get upset with this for a variety of reasons. First, DBMS do a horrible job of optimizing DISTINCT queries. There are many cases where the DISTINCT can be safely ignored but in practice, using DISTINCT with large sets of data and complicated queries can have an abysmal impact on performance. I think this can be put into the "premature optimization" category, though. Until we know that DISTINCT is causing an impact, isn't it better to improve the chance of getting correct results and worry about performance afterwards?
I had to read that a couple of times to realise that adding a DISTINCT to every single query in the hope that it might mask some unknown deficiency in your model, your query, or the SQL language itself is not the "premature optimization" being referred to here - he means the idea that doing so might affect performance. After all, it might not, right?
Read the rest of the discussion at perlmonks.org: "Why SQL Sucks (with a little Perl to fix it)". It also appears on Database Debunkings, "On the sins of SQL".
Friday, February 24, 2006
Hey, we do the WTFs...
Stop Press: Oracle Granted License To Extend February
SQL> SELECT DATE '2006-01-31' + INTERVAL '1' MONTH
2 FROM dual;
SELECT DATE '2006-01-31' + INTERVAL '1' MONTH
*
ERROR at line 1:
ORA-01839: date not valid for month specified
Sunday, February 19, 2006
Wanna Date?
Dates are known to be exceedingly difficult and avoiding them at all costs is something of a skill. This function, in the spirit of Never do in SQL what you can do in PL/SQL, calculates a date range before calling another procedure that also avoided using dates for its input date parameters. So despite its absence, it is at least partially responsible for this mess.
I stripped the code down to its date handling which studiously avoids using date calculations wherever possible, and uses string handling instead, leaving in the comments because they are also the documentation.
create or replace function start_date (
p_range in varchar2,
p_in_date in varchar2 -- DD-MON-YYYY format String
)
return varchar2
as
l_out_date varchar2(11);
month varchar2(10) := to_char(to_date(p_in_date,'DD-MON-YYYY'),'MON');
year varchar2(10) := to_char(to_date(p_in_date,'DD-MON-YYYY'),'YYYY');
v_cnt_yr number;
v_end_date date := to_date(p_in_date,'DD-MON-YYYY');
v_start_date varchar2(11);
begin
if p_range = 'QTD' THEN
if month in ('JAN','FEB','MAR') then
-- if given month = march and date is 31
-- then data for jan,feb and march.
l_out_date := '01-JAN-'||year;
elsif month in ('APR','MAY','JUN') then
l_out_date := '01-APR-'||year;
elsif month in ('JUL','AUG','SEP') then
l_out_date := '01-JUL-'||year;
elsif month in ('OCT','NOV','DEC') then
l_out_date := '01-OCT-'||year;
end if;
elsif p_range = 'YTD' then
-- beginning of the year.
l_out_date := '01-JAN-'||year;
elsif p_range = 'M' or p_range='MTD' then
-- beginning of month
l_out_date := '01-'||month||'-'||year;
elsif p_range like 'B%' then
v_cnt_yr := substr(ltrim(rtrim(p_range)),2);
-- We take the start date as the first day after trailing
-- back the required no. of months
l_out_date := to_char(last_day(add_months(
last_day(v_end_date), -v_cnt_yr)) + 1,
'DD-MON-YYYY');
end if;
return l_out_date;
end; After reading it I thought, "So what does this do that TRUNC doesn't?" Apparently not a lot when you need to get the month, quarter or year to date. If you try this at home remember to format the return of START_DATE for readability since it usefully returns a 4000 character string.
SQL> exec :d := '17-JUN-2006'
PL/SQL procedure successfully completed.
SQL> select start_date('QTD',:d) start_date,
2 trunc(to_date(:d),'Q') from dual;
START_DATE TRUNC(TO_DA
----------- -----------
01-APR-2006 01-APR-2006
SQL> select start_date('YTD',:d) start_date,
2 trunc(to_date(:d),'Y') from dual;
START_DATE TRUNC(TO_DA
----------- -----------
01-JAN-2006 01-JAN-2006
SQL> select start_date('MTD',:d) start_date,
2 trunc(to_date(:d),'MM') from dual;
START_DATE TRUNC(TO_DA
----------- -----------
01-JUN-2006 01-JUN-2006 But what about the mysterious 'B%' format mask? This calculates the first day of the month, where 'Bn' is n-1 months ago, tricky eh? In SQL we are forced to call two functions instead of having START_DATE call ADD_MONTHS for us with two bonus LAST_DAYS thrown in for good measure. The n-1 bit could even be a bug but who knows?
SQL> select start_date('B12',:d) start_date,
2 trunc(add_months(to_date(:d),-11),'MM') from dual;
START_DATE TRUNC(ADD_M
----------- -----------
01-JUL-2005 01-JUL-2005 The convenience obviously outweighs the problem of having to deal with an undocumented date function that accepts and returns strings. Also with this function I have the luxury of substituting 'M' for 'MTD', but not 'Y' for 'YTD' or 'Q' for 'QTD' though I suspect these could be improvements for versions 2.0 and 3.0.
Tuesday, February 14, 2006
The Phantom's Gonna Git Ya
Thursday, February 09, 2006
EAV Returns: The Concrete Elephant approach
Anyone who has read Tales Of The Oak Table, not to mention Tony Andrews' blog or any of the countless articles and discussions on the subject on AskTom and elsewhere, will know two things about the fabled "Entity-Attribute-Value" approach to database design, in which you model all "things" in one table with a "thing ID" and a "thing type", plus a second table holding one row per "attribute", and thus create an application that can model any conceivable type of thing, ever:
- It seems like a clever idea at first.
- It isn't.
But wait. A poster on OTN forums ("SIMPLE Database Design Problem") has solved the major problems inherent in the original Entity-Attribute-Value approach, by simply denormalising away the Attribute-Value part.
Now the ENTITIES table will have all the columns for every entity type. Maybe a lot of them will be null because "INVOICE" rows will use mainly different columns to "TROPICAL_DISEASE" rows, but disk space is cheap, and look at the simplification we have achieved by not having to babysit all those old-fashioned tables. And it's not a generic design any more, is it? It's concrete.
The table would look something like this:
ENTITYID ENTITYTYPE NAME PRICE DIET COLOUR ANNUAL_TURNOVER -------- ---------- --------- ----- ----- ------ --------------- 1 PERSON William 2 FRUIT Banana Yellow 3 COMPANY Megacorp 100000000 4 ANIMAL Fruitbat Fruit 5 SNACK Snickers 0.4
accompanied by a generic RELATIONS table like this:
ENTITY1 ENTITY2 RELATIONSHIP ------- ------- ------------ 3 1 EMPLOYS 1 2 EATS 1 5 EATS
Want to list the snacks eaten by Megacorp employees? Simple:
SELECT emp.entityid, emp.name, snack.name, snack.price
FROM entities emp
JOIN relations emprel
ON emprel.entity2 = emp.entityid
AND emprel.relationship = 'EMPLOYS'
JOIN entities com
ON com.entityid = emprel.entity1
AND com.entitytype = 'COMPANY'
JOIN relations snrel
ON snrel.entity1 = emp.entityid
AND snrel.relationship = 'EATS'
JOIN entities snack
ON snack.entityid = snrel.entity2
AND snack.entitytype = 'SNACK'
WHERE emp.entitytype = 'PERSON'
AND com.name = 'Megacorp';
Want to make FRUITBAT an employee of SNICKERS?
INSERT INTO relations VALUES (5, 4, 'EMPLOYS');
The thread becomes increasingly surreal as more and more posters suggest likely issues, from performance (he's prototyped it and the slowdown is insignificant) to complexity (the code will be generated dynamically from an object library) and the limited number of columns per table in Oracle (he might go with MySQL) while Erdem remains cheerfully confident that it will work (it won't).
My thanks to 3360 for sharing this. Send your WTFs to us at OracleWTF@bigfoot.com.
Monday, February 06, 2006
Counting Sheep
Sunday, February 05, 2006
I Object, Your Honour...
There are 2 ways to construct an ANYDATA. The CONVERT* calls enable construction of the ANYDATA in its entirety with a single call. They serve as explicit CAST functions from any type in the Oracle ORDBMS to ANYDATA.(Found here in the 10.1 documentation).