Dave Webster's Blog

Tinkering with Oracle

Why don’t more people know about the benefits of Scalar Subqueries?

Posted by anotherdavewebster on August 24, 2010

Every now and then, I get involved in interviewing Oracle developers when we’re specifically after “performance” skills. I ask a range of fairly expansive questions but I always ask if they know what Scalar Subqueries are and why they can deliver amazing performance benefits. Not many people seem to have heard of Scalar Subqueries are and even fewer are aware of the caching benefits.

I originally planned to go through all the usual stuff to demonstrate how the caching works. I was then going to link you to a page I found a few years back which contains some detail on how effective the caching is. On closer inspection of the page, I realised there was no point in writing it up myself as Mr Chadderton covers it all rather nicely.

Have a read of this and welcome to the wonderful world of Scalar Subqueries!

I would just like to add that the article from Martin focusses on caching the results of functions (see here for another option) but I tend to use Scalar Subqueries when I’m repeatedly looking up the same data over and over again:

CREATE TABLE fact_table
AS
SELECT 'some data' text_column
     , mod(rownum, 10) a_fk_id
  FROM DUAL CONNECT BY ROWNUM <= 5000;

CREATE TABLE lookup_table
AS
SELECT ROWNUM an_id
     , 'something to lookup: ' || TO_CHAR(ROWNUM) text_column
  FROM DUAL
CONNECT BY ROWNUM <= 100000;

alter table lookup_table add constraint lookup_table_pk primary key (an_id);

analyze table fact_table compute statistics;
analyze table lookup_table compute statistics;

DECLARE
   CURSOR c_standard
   IS
      SELECT fct.text_column
           , lup.text_column lup_text_column
        FROM fact_table fct, lookup_table lup
       WHERE fct.a_fk_id = lup.an_id(+);

   CURSOR c_ssq
   IS
      SELECT fct.text_column
           , ( SELECT lup.text_column
                FROM lookup_table lup
               WHERE fct.a_fk_id = lup.an_id ) lup_text_column
        FROM fact_table fct;
BEGIN
   runstats_pkg.rs_start;

   FOR i IN 1 .. 100 LOOP
      FOR rec IN c_standard LOOP
         NULL;
      END LOOP;
   END LOOP;

   runstats_pkg.rs_middle;

   FOR i IN 1 .. 100 LOOP
      FOR rec IN c_ssq LOOP
         NULL;
      END LOOP;
   END LOOP;

   runstats_pkg.rs_stop;
END;

So let’s just look at the plans quickly. The standard version is a Hash Join:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |              |  5000 |   200K|   102   (3)|
|*  1 |  HASH JOIN OUTER   |              |  5000 |   200K|   102   (3)|
|   2 |   TABLE ACCESS FULL| FACT_TABLE   |  5000 | 55000 |     5   (0)|
|   3 |   TABLE ACCESS FULL| LOOKUP_TABLE |   100K|  2929K|    95   (2)|
------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("FCT"."A_FK_ID"="LUP"."AN_ID"(+))

But the Scalar Subquery is quite different:

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |  5000 | 55000 |     5   (0)|
|   1 |  TABLE ACCESS BY INDEX ROWID| LOOKUP_TABLE    |     1 |    30 |     2   (0)|
|*  2 |   INDEX UNIQUE SCAN         | LOOKUP_TABLE_PK |     1 |       |     1   (0)|
|   3 |  TABLE ACCESS FULL          | FACT_TABLE      |  5000 | 55000 |     5   (0)|
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("LUP"."AN_ID"=:B1)

The runstats timing confirms the big win

Run1 ran in 149 hsecs
Run2 ran in 48 hsecs
run 1 ran in 310.42% of the time

and a big lump of that comes from reducing the IO (just about my favourite tuning aim):

Name                                                      Run1        Run2        Diff
STAT...session logical reads                            53,727       8,535     -45,192
About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
Follow

Get every new post delivered to your Inbox.

%d bloggers like this: