Oracle WTF

Friday, October 26, 2007

Frameworkia

We thought long and hard about possible titles for this new PL/SQL development standard proposed on OTN, but we couldn't improve on the one it came with.

I want share a new IDEA to create a new standard PL/SQL developing:
Function ( Standard Buffer) return number
    variable Number;
    variable1 Varchar2;
begin
    variable := get from Standard Buffer;
    variable1 := get from Standard Buffer;

    { make your business }

    put in standard buffer your results
end;
Give me feedback if you are interested at the new STANDARD called "FRAMEWORKIA".

A lot possibilities are ready.

Do you see the genius of it?

Er, no.

Sensing that there were people who still needed convincing, user601181 posted some sample code developed using the new Frameworkia:

CREATE OR REPLACE FUNCTION iacore
    ( eobufferia IN OUT typeeodata )
    RETURN NUMBER
IS
    CURSOR cur_getroutingcondition
        ( idc_workflow IN VARCHAR2
        , idc_operation_node IN VARCHAR2 ) IS
        SELECT *
        FROM   wf_condition
        WHERE  id_workflow = idc_workflow
        AND    id_operation_node = idc_operation_node;

    rec_getroutingcondition cur_getroutingcondition%ROWTYPE;

    CURSOR cur_dobufferiaassign
        ( idc_workflow IN VARCHAR2
        , idc_operation_node IN VARCHAR2 ) IS
        SELECT *
        FROM   wf_assignement
        WHERE  id_workflow = idc_workflow
        AND    id_operation_node = idc_operation_node;

    rec_dobufferiaassign cur_dobufferiaassign%ROWTYPE;

    next_node         NUMBER;
    next_node_ck      NUMBER;
    stop_node         NUMBER;
    operation         VARCHAR2(256);
    operation_call    VARCHAR2(256);
    type_node         VARCHAR2(32);
    workflow          VARCHAR2(32);
    line              VARCHAR2(256);
    status_wf_v       VARCHAR2(3);
    pid_chain_node    NUMBER;
    ia_tid            VARCHAR2(64);
    ia_tid_micro      VARCHAR2(64);
    ret_code_default  NUMBER;
    ret_code          NUMBER;
    retval1           NUMBER;
    statementexc      VARCHAR2(256);
    schema_function   VARCHAR2(32);
    package_function  VARCHAR2(32);
    dblink_function   VARCHAR2(32);
    first_node_flag   VARCHAR2(2) := 'NO';
    id_debug_source   NUMBER;
    mapin_keyp        VARCHAR2(1024);

    headerbufferia typebufferia;
    assignbufferia typebufferia;
    checkbufferia  typebufferia;

    rec_wfnode               wf_node%ROWTYPE;
    rec_wffunctionsourcecode wf_function_source_code%ROWTYPE;
    rec_wflogger             wf_logger%ROWTYPE;
    rec_wfbusiness           wf_business%ROWTYPE;
    rec_wffieldmapping       wf_fieldmapping%ROWTYPE;
BEGIN
    headerbufferia := eobufferia(1);

    workflow := frameworkia.getvalue(headerbufferia,'ID_WORKFLOW');

    ---- DETERMINO QUALE NODO INVOCARE
    pid_chain_node := frameworkia.getvalue(headerbufferia,'WF_NODE_ID');

    ----- SE IL NODO E' NULL ALLORA E' IL PRIMO NODO
    IF pid_chain_node IS NULL
    THEN
        -------DETERMINO HANDLER E FILENAME PER IL LOGGER
        SELECT *
        INTO   rec_wflogger
        FROM   wf_logger
        WHERE  id_workflow = workflow;
        -- rec_WfLogger.ID_WORKFLOW
        -- rec_WfLogger.ID_DEBUG_LEVEL
        -- rec_WfLogger.ID_DIRHANDLER
        -- rec_WfLogger.ID_FILENAME

        --------INSERISCO NELL'HEADER
        frameworkia.setvalue
        ( headerbufferia
        , 'ID_DEBUG_WF'
        , rec_wflogger.id_debug_level );

        frameworkia.setvalue
        ( headerbufferia
        , 'ID_DIRHANDLER'
        , rec_wflogger.id_dirhandler );

        frameworkia.setvalue
        ( headerbufferia
        , 'ID_FILENAME'
        , rec_wflogger.id_filename );

        frameworkia.setvalue
        ( headerbufferia
        , 'CHARACTER_EVIDENCE'
        , '§§§§§§§§§§§§§§§§§§§§' );

        -------DETERMINO L'ID NODE
        SELECT wf_node_id
        INTO   pid_chain_node
        FROM   wf_node
        WHERE  id_workflow = workflow
        AND    wf_first_node = 'YES';

        SELECT *
        INTO   rec_wfnode
        FROM   wf_node
        WHERE  id_workflow = workflow
        AND    wf_first_node = 'YES';

        frameworkia.setvalue
        ( headerbufferia
        , 'WF_NODE_ID'
        , rec_wfnode.wf_node_id );

        SELECT b.status
        INTO   status_wf_v
        FROM   wf_node a
             , wf_name b
        WHERE  a.id_workflow = workflow
        AND    a.wf_node_id = rec_wfnode.wf_node_id
        AND    a.id_workflow = b.id_workflow;

        IF status_wf_v = 'OFF'
        THEN
            RETURN -1;
        END IF;

        ia_tid := frameworkia.getvalue(headerbufferia,'IA_TID');
        ret_code_default := 0;
        ret_code         := 0;

        frameworkia.setvalue
        ( headerbufferia
        , 'RET_CODE_DEFAULT'
        , ret_code_default );

        frameworkia.setvalue
        ( headerbufferia
        , 'RET_CODE'
        , ret_code);

        IF ia_tid IS NULL
        THEN
            ia_tid := 'TIA' || dbms_random.STRING('U',1 * 1 + 6) ||
                      TO_CHAR(SYSTIMESTAMP,'YYYYMMDDHH24MISSFF6');

            frameworkia.setvalue
            ( headerbufferia
            , 'IA_TID'
            , ia_tid );
        END IF;

That's just the first hundred lines, and I've formatted it. The complete iacore function was well over 600 lines long.

I for one welcome the new standard.

Friday, August 10, 2007

Welcome back

Our guest administrator "Splogger" has now left the building, along with his page of helpful links to items on Amazon.com and a range of gentlemen's health products.

Suspiciously, a couple of days before he arrived we were taken off air by Blogger's spambots, presumably alerted by the amount of irrelevant, repetitive, and nonsensical text and links to Viagra sites they found here. From what I read, it seems possible that the Blogger automated suspension to prevent blog spam might have actually left the account vulnerable to blog spammers. As ironies go, that is up there with rain on your wedding day and good advice that you just didn't take.

Thursday, May 03, 2007

One liner

I was untangling a query when I came across eight variations of this code where only the literals 'a' to 'e' changed. Two of these were nested within an additional NVL so that the second would execute if the first returned null.

nvl(decode(2, 1, 'a', 2, 'b', 3, 'c', 4, 'd', 5, 'e', ' '), 'na')

Sunday, April 08, 2007

DATE comparisons: the scenic route

Need to calculate the number of minutes between two dates? Yes, the dull way is to subtract one from the other and multiply by 1440. But why do that, when you can simply convert each date into Julian format by converting it into a string and then back to a date, and converting the resulting date into a Julian string, and (in a separate step) applying TO_NUMBER to the result to get an integer, so that you can simply subtract one from the other and multiply by 1440.

Except that rounds to the nearest day, so an additional step is to calculate the minutes since midnight for each of the two dates, which you can do by simply converting each date into an 'HH:MI AM' string, then back to a date, then back to an 'SSSSS' string, converting it to a number, dividing by 60 and adding it to the result of the first calculation. I think.

Anyway, consider the following library of handy date functions our Oracle WTF Easter gift to you, the online development community.

CREATE PACKAGE dates_pkg
AS
    FUNCTION julian_date
        ( date_to_convert DATE )
        RETURN NUMBER;

    FUNCTION minutes_since_midnight
        ( timevalue DATE )
        RETURN NUMBER;

    FUNCTION minutes_elapsed
        ( lowdate DATE
        , highdate DATE )
        RETURN NUMBER;

END dates_pkg;
/

CREATE PACKAGE BODY dates_pkg
AS
    FUNCTION julian_date
        ( date_to_convert DATE)
        RETURN NUMBER
    IS
        varch_value VARCHAR (10);
        num_value NUMBER (20);
    BEGIN
        SELECT TO_CHAR
               ( TO_DATE(TO_CHAR(date_to_convert,'MM/DD/YYYY'),'MM/DD/YYYY')
               , 'J')
        INTO   varch_value
        FROM   dual;

        SELECT TO_NUMBER (varch_value)
        INTO   num_value
        FROM   dual;

        RETURN (num_value);
    END julian_date;


    FUNCTION minutes_since_midnight (
        timevalue DATE)
        RETURN NUMBER
    IS
        secs_elapsed NUMBER (20);
        mins_elapsed NUMBER (20);
    BEGIN
        SELECT TO_NUMBER
               ( TO_CHAR(TO_DATE(TO_CHAR(timevalue,'HH:MI AM'),'HH:MI AM')
               , 'SSSSS') )
        INTO   secs_elapsed
        FROM   dual;

        SELECT (secs_elapsed / 60)
        INTO   mins_elapsed
        FROM   dual;

        RETURN (mins_elapsed);
    END minutes_since_midnight;


    FUNCTION minutes_elapsed
        ( lowdate DATE
        , highdate DATE )
        RETURN NUMBER
    IS
        final_number NUMBER (20);
        low_julian NUMBER (20);
        high_julian NUMBER (20);
        num_days NUMBER (20);
        num_minutes NUMBER (20);
        temp_mins NUMBER (20);
        min_low NUMBER (20);
        min_high NUMBER (20);
    BEGIN
        SELECT julian_date (lowdate)
        INTO   low_julian
        FROM   dual;

        SELECT julian_date (highdate)
        INTO   high_julian
        FROM   dual;

        SELECT (high_julian - low_julian)
        INTO   num_days
        FROM   dual;

        SELECT (num_days * 1440)
        INTO   num_minutes
        FROM   dual;

        SELECT minutes_since_midnight (lowdate)
        INTO   min_low
        FROM   dual;

        SELECT minutes_since_midnight (highdate)
        INTO   min_high
        FROM   dual;

        SELECT (min_high - min_low)
        INTO   temp_mins
        FROM   dual;

        SELECT (num_minutes + temp_mins)
        INTO   final_number
        FROM   dual;

        RETURN (final_number);

    END minutes_elapsed;
END dates_pkg;

Just for fun, let's test it:

CREATE TABLE wtf_test (start_date NOT NULL, end_date NOT NULL) AS
SELECT DATE '2006-12-25' + DBMS_RANDOM.VALUE(1,365)
     , DATE '2007-12-25' + DBMS_RANDOM.VALUE(1,365)
FROM   dual CONNECT BY LEVEL <= 1000;

-- ...several runs here to allow for caching etc, last set of results shown...

SQL> set timing on autotrace traceonly stat

SQL> SELECT dates_pkg.minutes_elapsed(start_date,end_date) FROM wtf_test;

1000 rows selected.

Elapsed: 00:00:03.96

Statistics
----------------------------------------------------------
  16000  recursive calls
      0  db block gets
     74  consistent gets
      0  physical reads
      0  redo size
   9330  bytes sent via SQL*Net to client
    809  bytes received via SQL*Net from client
     68  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
   1000  rows processed

SQL> SELECT (end_date - start_date) * 1440 FROM wtf_test;

1000 rows selected.

Elapsed: 00:00:00.16

Statistics
----------------------------------------------------------
      0  recursive calls
      0  db block gets
     74  consistent gets
      0  physical reads
      0  redo size
  25485  bytes sent via SQL*Net to client
    809  bytes received via SQL*Net from client
     68  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
   1000  rows processed

So the handy package version takes 25 times as long as the 1-line SQL version.

And in the interests of fairness, in case you're thinking perhaps that is just the normal overhead of calling PL/SQL functions in SQL, let's try our own function:

CREATE FUNCTION minutes_elapsed
    ( lowdate DATE
    , highdate DATE )
    RETURN NUMBER
AS
BEGIN
    RETURN (highdate - lowdate) * 1440;
END minutes_elapsed;
/

SQL> SELECT minutes_elapsed(start_date,end_date) FROM wtf_test;

1000 rows selected.

Elapsed: 00:00:00.26

Statistics
----------------------------------------------------------
     21  recursive calls
      0  db block gets
    107  consistent gets
      0  physical reads
      0  redo size
  25496  bytes sent via SQL*Net to client
    809  bytes received via SQL*Net from client
     68  SQL*Net roundtrips to/from client
      0  sorts (memory)
      0  sorts (disk)
   1000  rows processed

Still 15 times faster.

Many thanks to Padders for sharing this one.

Friday, March 23, 2007

Quote For The Day

"PL/SQL development can be a tedious and time-consuming job – often monopolizing the valuable time and efforts of Oracle developers." - SQL Navigator page, Quest Software

Spare a thought today for PL/SQL developers, having their precious time tediously monopolised by, umm, PL/SQL development.

Monday, March 19, 2007

423,551 Invalid Indexes

A friend received an alarming email from a commendably proactive DBA who was concerned that he had discovered a potentially serious performance issue on the production system:

The following result shows there are many invalid indexes. For example, in the first row we see that there are nearly 423551 invalid indexes for just one row of a table. So if we rebuilt the index it would improve the performance.

Wow, that sure is a lot of invalid indexes for just one row of a table. For the sake of anonymity, I've run his query against the SCOTT/TIGER schema, where fortunately there are only 14 invalid indexes (or perhaps 10, if that's how you count them). Perhaps you can see what the problem might be:

SELECT DISTINCT
       t.table_name
     , i.index_name
     , t.num_rows table_rows
     , i.num_rows index_rows
FROM   dba_tables t
     , dba_indexes i
WHERE  t.table_name = 'DEPT'
AND    t.owner = 'SCOTT'
AND    i.owner = t.owner
AND    i.index_type = 'NORMAL';

TABLE_NAME                     INDEX_NAME                     TABLE_ROWS INDEX_ROWS
------------------------------ ------------------------------ ---------- ----------
DEPT                           PK_EMP                                  4         14
DEPT                           EMP_DEPT_IX                             4         14
DEPT                           PK_DEPT                                 4          4
DEPT                           PK_SALGRADE                             4          5

4 rows selected.

There is worse to come. It's not just DEPT that has invalid indexes. He goes on:

The following result shows there are many invalid indexes, so if we rebuild the index, it would improve the performance. At the same time from the first and last rows, indexes are not being used, so we need to look into it.

TABLE_NAME                     INDEX_NAME                     TABLE_ROWS INDEX_ROWS
------------------------------ ------------------------------ ---------- ----------
SALGRADE                       PK_DEPT                                 5          4
SALGRADE                       PK_EMP                                  5         14
SALGRADE                       EMP_DEPT_IX                             5         14
SALGRADE                       PK_SALGRADE                             5          5

TABLE_NAME                     INDEX_NAME                     TABLE_ROWS INDEX_ROWS
------------------------------ ------------------------------ ---------- ----------
EMP                            PK_DEPT                                14          4
EMP                            PK_EMP                                 14         14
EMP                            EMP_DEPT_IX                            14         14
EMP                            PK_SALGRADE                            14          5

I just can't help wondering what he thinks an index is.

Wednesday, January 17, 2007

Performance Tuning and The Big O

Oracle Performance Tuning is a big subject, as anyone will appreciate who has read any of a number of books that set out to help you to understand what problem you are trying to solve, what factors may affect the performance you are seeing, what strategies are available to the query optimizer, and so on.

While Cary Millsap's Optimizing Oracle Performance focusses on finding, tracing and prioritising specific problems in the face of vague reports that the system seems a bit slow this week, and Jonathan Lewis' Cost-Based Oracle Fundamentals takes us on a tour of the CBO to help answer such questions as Why isn't my EXISTS query using an index? (and why isn't it faster than the IN version?) a poster on OraFAQ has an approach we've not seen before:

Hi,

The following is a problem I need help with and I am willing to pay for help if necessary. Any info would be greatly appreciated.

Two tables in a database:

Table1 contains a list of phone numbers
Table2 contains a list of phone numbers as well

I would like to create a Table3, in which Table3 contains all numbers from Table1 that is not in Table2. I am looking for the shortest runtime possible, keeping in mind that you can use whatever method(s) you deem necessary.

Table1 contains 30 Million rows,
Table2 contains 2000 rows.

given a regular SQL expression, it will yield Big O(m*n)

Where m = rowcount of Table1
and n = rowcount of Table2

Generate for me, a method in which, runtime will yield Big O (m log2 n).

I don't need code, I want to hear your logic. Table1 is customers, Table2 is a list of prepaid phone numbers. Table3 is list of people to bill.

As usual no database version is given. The first suggestion, as you might expect, is the quite reasonable:

select phone_no from table1
minus
select phone_no from table2;

accompanied by a comment that it doesn't seem like a great piece of schema design to have two tables in the first place. However, the reply comes back:

The guy who wrote this problem just told me that this answer is incorrect.

His response to me was:

it's not too simplistic, but it is incorrect. This will still yield a big O(m*n).

Give it one more try, you are thinking too much in terms of DB.

Ask yourself, what are the only structures that would yield BigO(nlog2n)? Answer that, and you will get your answer.

Any ideas?

Resident mathematician Ross Leishman tried to explain to me what Big O (m log2n) means, and I can confirm that it is not after all a When Harry Met Sally reference as most of us would probably assume. Apparently the version with an EXISTS subquery was what was wanted, which seemed odd to me on a number of levels, not least that an IN subquery would probably produce the same plan, especially in 10g where the new HASH JOIN RIGHT ANTI allows the database to build its hash table from the 2,000-row table2 rather than the 30 million-row table1. But of course we don't know the database version, do we? Or whether the columns are nullable, unique or indexed, or how values are distributed, or really anything about the actual environment that would help in solving a real-world performance problem. I can see where I'm going wrong though. I am thinking too much in terms of DB.

Saturday, January 13, 2007

Short loop

It's good to see a GOTO every now and then. However, the sender of this one (thanks!) was most impressed by the loop that goes from 1 to v_totalcntr, and indeed the fact that there is a loop and a v_totalctr variable at all, with the variable carefully set from the cursor's %ROWCOUNT, when it can only ever have one value:

DECLARE
  CURSOR cur_pricing IS
      SELECT col1, col2
      FROM   sometable;

  var_pricing cur_pricing%ROWTYPE;

BEGIN
  OPEN cur_pricing;
  FETCH cur_pricing INTO var_pricing;

  IF cur_pricing%NOTFOUND THEN
      GOTO continue;
  END IF;

  v_totalcntr := cur_pricing%ROWCOUNT;

  FOR r IN 1..v_totalcntr
  LOOP
      -- loads of stuff here
      -- but no fetch from cur_pricing
      -- not even for the one time this loop will execute :-)
  END LOOP;

  <<continue>>
  NULL;
END;

Of course you could just fetch the value you want and process it, but where would be the fun in that?

PS The person who sent this in emailed me with a point I must admit hadn't occurred to me:

Saw you posted this one - thanks. But did you pick up on what they probably thought they were doing? I think whoever wrote it thought that %rowcount would have the TOTAL number of rows that the cursor would return - so they thought they would be looping around ALL the records in the cursor. The fact that they also forgot to fetch again in the loop just adds to the problem of course.