
12 Sep 2010

What are SQL Profiles and why do we need them?

If you use the DBMS_XPLAN package to analyze execution plans for your SQL statements (and you really should, these days), you might have noticed that, at times, the following line is displayed along with your execution plan:

Note
-----
   - SQL profile "SYS_SQLPROF_012ad8267d9c0000" used for this statement

This seems to mean that:

  1. An external ‘helper’ object was used during evaluation of your SQL statement
  2. This ‘helper’ object changed (or, at least, influenced) its execution plan
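
For reference, plan output of this kind (including the Note section) typically comes from a call along these lines, where the sql_id is a placeholder:

SELECT * FROM TABLE(dbms_xplan.display_cursor('&sql_id', NULL, 'TYPICAL'));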

While having some extra help is nice, one has to wonder: what exactly is this “profile” object? What does it do? And, in the bigger picture, why do we need “external” help evaluating SQL statements at all?


The problem with the ORACLE optimizer

Traditionally (if we forget about the ancient rule-based days), the ORACLE optimizer relies on data statistics to generate execution plans.

Statistics, in a nutshell, are very simplified descriptions of the real data. For example, column statistics for a very large, multi-gigabyte table include just a few items, such as the total number of values, the number of distinct values, the min/max values, etc. In other words, statistics capture the overall shape of the data, but many low level details are lost.

To compensate for this loss of detail and still provide reasonably accurate estimations, the optimizer employs a number of assumptions about the data. In particular, the optimizer usually believes that:

  1. Data values are distributed uniformly (in other words, value 2 is as frequent as value 5)
  2. Records are distributed uniformly (in other words, there is no physical clustering or ordering of data)
  3. The range of values is continuous (in other words, there are no “holes” in the range)
  4. Etc
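
To make assumption #1 concrete, imagine a hypothetical table (the names here are made up) with 1,000,000 rows and 10 distinct values in a STATUS column. Without a histogram, the optimizer simply divides rows by distinct values:

SELECT * FROM big_table WHERE status = 'ERROR';
-- Estimated cardinality: 1,000,000 / 10 = 100,000 rows,
-- even if 'ERROR' actually occurs in just a handful of records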

This “statistical analysis” tends to work surprisingly well for the majority of data but, inevitably, there are always exceptions that break the rules … For example, while 98 percent of your data might be distributed in a perfectly random way across the segment, the remaining 2 percent might be concentrated in only a few data blocks (skewing index clustering factors) … or the records that you are looking for might have uncharacteristically dependent values among your predicates …

Unfortunately, statistics are simply too crude an instrument to record this level of detail.

Paraphrasing Stephen Hawking:

Statistics have no hair

and that means that the optimizer will likely miss out on potentially better execution plans for this exceptional data.

Finally, there are cases where all the statistics in the world will NOT help the optimizer make a better decision. For example, up until ORACLE 11g, the optimizer could not properly estimate the number of records coming out of a PL/SQL table function and, instead, took a wild guess, essentially creating execution plans that had no basis in reality whatsoever.
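
For example (the function name here is made up), for a query such as:

SELECT * FROM TABLE(my_pipelined_fn(:arg));

the optimizer would typically just plug in a block-size based constant (8168 rows with the default 8K block size), no matter how many rows the function actually returned.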

So, the question becomes: what can be done to correct that? How do we make the optimizer work efficiently not just for “generic” but for “special” data as well?

What are SQL Profiles

The answer that ORACLE came up with was twofold:

  1. Create and record individual execution plans for “special case” SQLs
  2. Use actual runtime statistics to help design these individual plans

SQL Profiles are an implementation (more correctly, the end result) of this approach, and the way they work is actually pretty simple.

First of all, you need to tell ORACLE which SQLs hit “special data” and can potentially be executed better. This can be done either explicitly (specifying SQLs by hand in dbms_sqltune.create_tuning_task) or implicitly, by letting ORACLE pick the “most active” SQLs for analysis (this is what one of the 11g auto tasks does).

Second, you need to prove to the optimizer that its estimations are “out of whack” and “it can do better”. The way to do it is to run dbms_sqltune.execute_tuning_task (or let auto task do it for you). This is the “meat” of the process and it takes 3 major steps:

  • Step 1: Sample the data and get “the real numbers” (most importantly, Cardinality)
  • Step 2: Supply these numbers to the optimizer and let it reevaluate execution plan
  • Step 3: If the optimizer comes up with a different execution plan, run the SQL with both the original and the new plan (multiple times, to be sure) and compare performance

If the results of this experiment are that:

  • a) The optimizer came up with a new execution plan
  • b) Performance while using this new plan is substantially better

then we have proof that, for this particular SQL, default statistics “lie” and cause the optimizer to miss. Moreover, we now know what the true numbers are and, of course, it makes sense to save them for the future.

How can we save the new (“real cardinality”) numbers? Well, we can’t “update” regular statistics as, again, they are too crude for this level of detail. So, instead, we will use a separate object (a “SQL Profile”) that will attach to our SQL and store this kind of information.

How exactly are the new cardinality values stored? Well, there is no black magic here: they are stored in the form of Hints. You might be thinking of the tried and true /*+ cardinality */ hint that can be used to specify a fixed cardinality number, but ORACLE actually uses the newer (and slightly more advanced) /*+ opt_estimate */ hint, which scales the default cardinality UP or DOWN instead, i.e.

/*+ opt_estimate(table, t, scale_rows=100) */
-- OR
/*+ opt_estimate(index_scan, t, t_n_idx, scale_rows=0.001) */

So, now, if the SQL Profile is used, the estimated default cardinality will be multiplied by these numbers and, as a result, the optimizer will operate with (hopefully) a more realistic view of our data during SQL evaluation.
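
The arithmetic behind the scaling is simple: if the default table estimate is, say, 500,001 rows and the profile supplies scale_rows=4.19e-05, the adjusted estimate becomes 500,001 * 0.0000419 ≈ 21 rows. Not coincidentally, these are almost exactly the numbers we will see in the example below.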

This is, in a nutshell, how SQL Profiles work in theory. Let’s now see if we can use this knowledge to fight some real SQL problems …

A very real case of mishandled cardinality

The problem that we are going to “fight” today, although somewhat artificial, is also something that can be seen fairly frequently. It is known as the optimizer’s “range estimation deficiency” and it goes something like this:

Let’s say that we have a table with a DATE column that registers, say “orders shipped” …

CREATE TABLE orders (ord_no, shipped)
AS SELECT level, CAST (sysdate-level/24 AS DATE)
FROM dual
CONNECT BY level <= 500000
/
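
(This generates one “order” per hour, going back 500,000 / 24 ≈ 20,833 days, or roughly 57 years.)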

We also have an index on the shipped column:

CREATE INDEX orders_shipped_idx ON orders (shipped);

so that we can, very quickly, get to the latest orders by running this simple SQL:

SELECT * FROM orders WHERE shipped >= trunc(sysdate, 'DD');

Except, our data has one caveat in it: some of our orders are not yet shipped. To mark them as such, developers decided to use a “special” date far out in the future: January 1st, 9999 as, clearly, we do not expect that ‘real’ orders will ever ship on that date …

Let’s say that we have just one order that is unshipped as of right now:

INSERT INTO orders VALUES (-1, DATE '9999-01-01');
COMMIT;

Finally, to provide the optimizer with fresh information about the data, we are going to collect statistics:

EXEC dbms_stats.gather_table_stats(USER, 'orders', cascade => TRUE);

We are now ready to request the latest orders. Since we have 500K records in our table and, at most, ~24 of them were shipped today (see how the table was created), obviously the most efficient way to get to the data is to use the index:

SELECT * FROM orders WHERE shipped >= trunc(sysdate, 'DD');

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |   496K|  6302K|   452  (24)| 00:00:06 |
|*  1 |  TABLE ACCESS FULL| ORDERS |   496K|  6302K|   452  (24)| 00:00:06 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("SHIPPED">=TRUNC(SYSDATE@!,'fmdd'))

Well, it didn’t work – we hit a FULL TABLE SCAN instead. And, if you look at the AUTOTRACE output, the reason becomes clear: our cardinality estimation (how many records we expect to be returned) is obviously “out of whack”: 496K, instead of what we know should be true: <= 24!

Why did we get this grotesquely miscalculated cardinality? The answer is simple: the optimizer’s “Continuous Range” assumption.

This is what is going on:

We know what the data in the SHIPPED column really looks like: a dense, uniform band of values stretching from the oldest shipment date to today, plus a single, lonely outlier far out at January 1st, 9999.

But to the optimizer, it looks like one continuous range, running all the way from the oldest shipment date to the year 9999, with values spread evenly across it.

The optimizer does not care that our “unshipped” record is special, it simply sees a (much) bigger range. And just like that, one ‘innocent’ record and the optimizer’s precision (some would say: sanity) went completely down the toilet.
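
You can see the optimizer’s point of view with a quick sanity check:

SELECT MIN(shipped), MAX(shipped) FROM orders;
-- MIN: roughly 57 years ago, MAX: 9999-01-01

For a “>=” predicate, the optimizer computes selectivity roughly as (max - value) / (max - min). With max pushed out to the year 9999, that works out to about 0.99 and, multiplied by our ~500K rows, lands right around the 496K estimate we saw above.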

Of course, the root cause here is bad data design and the correct way to fix it is to mark “unshipped” records differently (e.g. by adding a separate column that records shipping status).

But suppose we are stuck with it. So, let’s see if we can “fix” this problem by asking ORACLE to create a SQL Profile for this “very special” SQL.

Increasing optimizer precision with SQL Profiles

The first thing we need to do is tell ORACLE that we have a potentially special SQL on our hands, one that requires better precision from the optimizer:

var task_name varchar2(30)
EXEC :task_name := dbms_sqltune.create_tuning_task(
  sql_text => 'SELECT * FROM orders WHERE shipped >= trunc(sysdate, ''DD'')'
);
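
As a side note: if the statement is already in the shared pool, the task can also be created by sql_id instead of by full text (the sql_id here is a placeholder):

EXEC :task_name := dbms_sqltune.create_tuning_task(sql_id => '&sql_id');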

Next, we will let ORACLE analyze the SQL. To see what exactly is going on, we will also enable SQL trace:

ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';
EXEC dbms_sqltune.execute_tuning_task(:task_name);
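
(If you want to locate the resulting trace file, in 11g the easiest way is to ask v$diag_info:)

SELECT value FROM v$diag_info WHERE name = 'Default Trace File';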

Once the tuning task is complete, let’s see if ORACLE found a better execution plan for our SQL:

SET linesize 180
SET longchunksize 180
SET pagesize 900
SET long 1000000
SELECT dbms_sqltune.report_tuning_task(:task_name) FROM dual;

-------------------------------------------------------------------------------
FINDINGS SECTION (1 finding)
-------------------------------------------------------------------------------

1- SQL Profile Finding (see explain plans section below)
--------------------------------------------------------
  A potentially better execution plan was found for this statement.

  Recommendation (estimated benefit: 99.53%)
  ------------------------------------------
  - Consider accepting the recommended SQL profile.
    execute dbms_sqltune.accept_sql_profile(task_name => 'TASK_910',
            task_owner => 'MAXYM1', replace => TRUE);

  Validation results
  ------------------
  The SQL profile was tested by executing both its plan and the original plan
  and measuring their respective execution statistics. A plan may have been
  only partially executed if the other could be run to completion in less time.

                           Original Plan  With SQL Profile  % Improved
                           -------------  ----------------  ----------
  Completion Status:            COMPLETE          COMPLETE
  Elapsed Time(us):              234447               444      99.81 %
  CPU Time(us):                  236666                 0        100 %
  User I/O Time(us):                  0                 0
  Buffer Gets:                      838                 4      99.52 %
  Physical Read Requests:             0                 0
  Physical Write Requests:            0                 0
  Physical Read Bytes:                0                 0
  Physical Write Bytes:               0                 0
  Rows Processed:                    13                13
  Fetches:                           13                13
  Executions:                         1                 1

  Notes
  -----
  1. The original plan was first executed to warm the buffer cache.
  2. Statistics for the original plan were averaged over the next 2 executions.
  3. The SQL profile plan was first executed to warm the buffer cache.
  4. Statistics for the SQL profile plan were averaged over the next 9 executions.

And, if we look further at the EXPLAIN PLANS section:

2- Using SQL Profile
--------------------
Plan hash value: 1899245095

--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |    21 |   273 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ORDERS             |    21 |   273 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ORDERS_SHIPPED_IDX |    13 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("SHIPPED">=TRUNC(SYSDATE@!,'fmdd'))

We can see not only the correct cardinality but also the INDEX RANGE SCAN that we wanted to achieve.

How exactly did ORACLE come up with these correct numbers? Let’s look at the ORACLE SQL trace to find out:

-- Step 1: Sample the data (this is very similar to what dynamic sampling does)

-- Sample all data in the column
SELECT /* DS_SVC */ SUM(C1) FROM (
  SELECT /*+ NO_INDEX_FFS( "ORDERS")  */ 1 AS C1
  FROM "ORDERS" SAMPLE BLOCK(62.5, 8) SEED(1)
)

-- Sample NOT NULL data in the column
SELECT /* DS_SVC */ SUM(C1) FROM (
  SELECT /*+ INDEX_FFS( "ORDERS" "ORDERS_SHIPPED_IDX")  */ 1 AS C1
  FROM "ORDERS" SAMPLE BLOCK(58.7803, 8) SEED(1)
  WHERE ( "ORDERS".SHIPPED IS NOT NULL)
)

-- Sample "Relevant" data in the column
SELECT /* DS_SVC */ SUM(C1) FROM (
  SELECT /*+ NO_INDEX_FFS( "ORDERS")  */ 1 AS C1
  FROM "ORDERS" SAMPLE BLOCK(62.5, 8) SEED(1)
  WHERE ("ORDERS"."SHIPPED">=TRUNC(SYSDATE@!,'fmdd'))
)

-- Step 2: "Scale" rows based on the results of "sampling"
SELECT /* DS_SVC */
  /*+ OPT_ESTIMATE(TABLE, "ORDERS", SCALE_ROWS=4.189686512e-05) */ C1, C2, C3
FROM (
  SELECT /*+ INDEX( "ORDERS" "ORDERS_SHIPPED_IDX")  */ COUNT(*) AS C1,
    4294967295 AS C2, COUNT(*) AS C3  
  FROM "ORDERS"
  WHERE ("ORDERS"."SHIPPED">=TRUNC(SYSDATE@!,'fmdd'))
)

-- Step 3: Verify whether the new plan is indeed better

-- Execute with the ORIGINAL plan: 3 times
/* SQL Analyze(144,0) */ SELECT * FROM orders WHERE shipped >= trunc(sysdate, 'DD')

-- Then, execute with the NEW plan 10 times
/* SQL Analyze(144,0) */ SELECT * FROM orders WHERE shipped >= trunc(sysdate, 'DD')

So, ORACLE sampled the data, came up with the new plan and verified that the new plan is better than the old one. It then saved the results of this work in a SQL Profile.

The final step for us is to accept the profile, so that all subsequent runs of the same SQL can reuse this superior execution plan.

EXEC dbms_sqltune.accept_sql_profile(:task_name);
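
Once accepted, the profile is visible in the data dictionary, and it can be dropped later if the “special” data it was built for ever changes (the profile name below is just an example):

-- Check that our new profile exists and is enabled
SELECT name, status, created FROM dba_sql_profiles;

-- And, should it ever outlive its usefulness:
-- EXEC dbms_sqltune.drop_sql_profile(name => 'SYS_SQLPROF_012ad8267d9c0000');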

Conclusion

SQL Profiles are NOT a panacea for all performance problems, but they can be a big help with SQLs that hit data that “does not fit” the optimizer’s simplistic expectations.

One might think that these “special” SQLs are rare but, surprisingly, we have found that not to be the case. Even on our mostly OLTP systems we find a fair number of SQL Profile suggestions, and many more in environments with lots of ad-hoc reporting.

And considering that a SQL Profile can only be created once its performance benefit is proven, it is no wonder that SQLs with created and accepted profiles usually perform better.

Do SQL Profiles have any downsides? It is too early to tell, but there are a couple of minor ones, I suppose: you’ll need to pay attention to them and make sure that the “special” data that caused them to exist in the first place has not changed.
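
One simple way to keep an eye on them is to periodically check which cached statements currently run with a profile attached:

-- v$sql exposes the profile name for statements that use one
SELECT sql_id, sql_profile,
       elapsed_time / NULLIF(executions, 0) AS avg_elapsed_us
FROM v$sql
WHERE sql_profile IS NOT NULL;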

In addition to that, there is also a licensing issue: you can only use SQL Profiles if you license the EM Tuning (and, by extension, EM Diagnostics) pack … But that is a completely different story.

Comments (30)
  1. A very good article. Appreciate your work.

  2. While you allude to the potential maintenance downside to profiles, you forget to mention how they could become a permanent crutch. Rather than fix the application or design, you’ll now be doing a standardized task of maintaining what could turn into hundreds of specialized instructions which will have to be supported release to release, platform to platform. Also, their behavior COULD change due to environmental or software changes by the vendors.

  3. Good article, by the way.

  4. Good article clearly written – also good point re long term support

  5. John, thanks for your kind words

  6. This article was very helpful……….thanks for sharing……

  7. short and simple. Straight to the point.

  8. Nice article…. very helpful….

  9. Very good article and the example provided solved many of my doubts about sql profiles.

  10. Thanks Rags. I’m glad that you found it useful.

  11. Wow.. I landed here searching for SQL Profile. I decided to go for this instead of AskTom, and I found this one to be very informative and useful. Thanks for your time.

  12. Thanks Sreenivas. Glad you liked it!

  13. It will be useful for many developers who are looking to do new things with SQL.

    thanks.

  14. Hi ,

    It is one of the best articles I have found on SQL profiles. However, I just want to clear up my doubts. In a production database we will have big SQL texts of more than 1000 lines; how can we go about those? Also, we will have 1000s of SQL queries hitting the database, so do we have to create that many SQL profiles? And how do we know about the special data?

    Sorry for asking this many doubts.

    Thanks and Regards,
    Sagar Dayama

  15. Hello Sagar,

    Not sure what you mean by “special data”.

    To your other questions: SQL profile is basically just a bunch of hints that gets attached to a SQL statement. So, it is irrelevant how big your SQL is. Granted, ORACLE can behave in weird ways sometimes when statements are large, but I have not noticed anything that affects profiles. Btw, 1000 characters in a SQL is not that large …

    As for 1000s of SQLs in your database, I think you misunderstand a bit what profiles are for. Profiles are designed to fix problems with SQL optimizer when statistics and optimizer assumptions do not work. This should not happen often as optimizer can generally do a pretty decent job creating execution plans (and is getting better at it).
    If, in your database system, there are 1000s of queries that need SQL profiles, there is definitely something wrong with this picture. I would take a hard look at the statistics and make sure they are collected properly, all the necessary histograms are in place, etc.

    Regards,
    Maxym Kharchenko

  16. Wonderful article and you have clearly shown how SQL profiles work. Please keep on doing your good work and enhance the vision of DBAs, which will help in correcting misunderstandings of concepts and make DBAs’ lives easier.

  17. Thanks Prashanth, glad you found my writings useful.

  18. Excellent description. However, if the profile is accepted, how long is it persisted for, indefinitely? What if bind parameters change on subsequent calls to the sql? I was hoping the tool would provide the ‘alternative sql’ so I, as a developer, can see what it did to enhance the performance. Then I could incorporate those changes into my code for permanent use. Am I missing something here? Thanks for your help.

  19. Hello Halty,

    When SQL profile is accepted, it persists until removed, so, for all intents and purposes – indefinitely.

    Profiles are “unconditional” (unlike SQL plan baselines), they will attach to the statements regardless (i.e. of the actual bind values). This might not be the ideal scenario, of course.

    As for seeing how profiles change query plans – there are several ways to do it.

    To see all the hints that can reproduce execution plan (including the ones from profile), run:

    SELECT * FROM table(dbms_xplan.display_cursor('&SQL_ID', &CHILD_NUMBER, 'typical +outline'));

    Or, if you want to see only the hints from the profile:

    select hint from (
      select s.sql_id, sd.obj_type,
        row_number() over (partition by sd.signature, sd.category order by sd.signature) row_num,
        extractValue(value(t), '/hint') hint
      from sqlobj$data sd, v$sql s,
        table(xmlsequence(extract(xmltype(sd.comp_data), '/outline_data/hint'))) t
      where sd.obj_type = 1
        and s.exact_matching_signature = sd.signature
        and s.sql_id = '&sql_id'
    )
    order by row_num
    /

  20. Thank you, good work. Good article, helped

  21. Oh my goodness! Amazing article dude! Thank you so much,
    However I am experiencing issues with your RSS. I don’t understand the reason why I can’t
    join it. Is there anybody else having similar RSS problems?

    Anyone who knows the answer can you kindly respond?
    Thanks!!

  22. Hi Maxym,

    This is one of the most understandable articles I have come across on performance tuning!!
    Nice work. Looking up to read more stuff.

    In regards to a question asked by Sagar (in Aug 2012), by the term ‘special data’, I believe he meant a “special sql” which hits data in an unorganized fashion, i.e., data fetched by that sql does not match the Optimizer’s assumptions. So I have the same question: say more than 1K sqls are running on the database, then how do we find out which SQL has a weird data pattern?!

    As the optimizer’s assumptions may go false if the data is not distributed evenly, as we comprehended above. Thus the SQL profile comes into the picture, taking multiple samples of the fetched records and using hints for an accurate execution plan.

    Kindly correct me if I am incorrect in my understanding.

    Regards,
    Pratik

  23. Hi,

    this is a very interesting post, indeed.
    But! When I tried to follow the steps,
    Oracle used the INDEX RANGE SCAN
    Executions: 1 | is_bind_sensitive:N | is_bind_aware: N | Parsing schema name: HR
    SQL_ID gcz2wwj2c2c60, child number 0
    -------------------------------------
    SELECT * FROM orders WHERE shipped >= trunc(sysdate, 'DD')

    Plan hash value: 234899233

    ---------------------------------------------------------------------------------------------------------
    | Id | Operation                    | Name               | Starts | E-Rows | A-Rows |   A-Time    | Buffers |
    ---------------------------------------------------------------------------------------------------------
    |  0 | SELECT STATEMENT             |                    |      1 |        |      2 | 00:00:00.01 |       5 |
    |  1 |  TABLE ACCESS BY INDEX ROWID | ORDERS             |      1 |   1968 |      2 | 00:00:00.01 |       5 |
    |* 2 |   INDEX RANGE SCAN           | ORDERS_SHIPPED_IDX |      1 |   1968 |      2 | 00:00:00.01 |       3 |
    ---------------------------------------------------------------------------------------------------------

    Predicate Information (identified by operation id):
    ---------------------------------------------------

    2 - access("SHIPPED">=TRUNC)

    Operation_id: 1, last output rows: 2
    Operation_id: 2, last output rows: 2
    ================================================================================================

    Even though I created the tuning task and executed it, there was no recommendation.
    I issued an
    alter system flush shared_pool;
    at the beginning, of course
    My version is
    Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – Production
    PL/SQL Release 11.2.0.3.0 – Production
    CORE 11.2.0.3.0 Production
    TNS for Linux: Version 11.2.0.3.0 – Production
    NLSRTL Version 11.2.0.3.0 – Production
    What could have happened?
    Any ideas?
    Thanks
    Laszlo

  24. Great article, very informative for a newb DBA such as myself.
    Obviously the flattery is to allow me to pose a question 😉

    If you were seeing a lot of these recommendations on a specific instance, what would you be thinking?
    I often see these alongside a ‘create an index’ recommendation in OEM, and generally speaking I’ve been going for the index option as it seems a more rounded way to approach the performance issue – does that sound sensible to you?

  25. Thanks.

    Even though I am not a big believer in auto tuning, if I saw a lot of “sql profile” recommendations, I would be thinking that my statistics are not properly collected 🙂 Or that, for whatever reason, they are not “good enough” to calculate correct cardinality. Then I would try to figure out why this is happening.

    SQL profiles are a good way to tune edge cases (they are also an awesome way to add (any random) hints to your sql statements), but I would not use them to tune the system overall (I think SPM baselines are a much better tool).

    Regards,
    Maxym Kharchenko

  26. Hi

    Excellent article on SQL profiles. I have been using them for a long time now but always wondered if some post would help me understand what magic the profiles actually do. Thanks a ton.

    Regards
    Shreas SR

  27. Very, very nice article, it cleared most of my doubts regarding SQL profiles, which I was seeing after running the tuning advisor. Thank you for sharing this great information.

  28. Thanks, Pankaj. I’m glad you found it useful.

  29. Thanks a lot. Seems the mystery has been resolved.

  30. It is wonderful. Keep up the good work. Thank you so much.

