Hybrid Histograms & Unpadded NCHAR Predicates

EDIT:2018-03-16 Seems WordPress or SQLPlus messed up my testcase, I have updated tihs post to include unedited testcase as well as output.

EDIT:2018-03-22 Oracle Support have created a new bug for this issue, Bug 27730925 – WRONG CARDINALITY ESTIMATION FROM HYBRID HISTOGRAM ON NCHAR COLUMN

Paraphrasing Jonathan Lewis it’s very easy to start ‘breaking’ Oracle as soon as you start combining a few features that most people are not running. The following demonstrates that if you are using NCHAR datatypes (in our case this is forced on us because due to our JDEdwards application) and if you have a hybrid histogram then cardinality estimates vary greatly depending on whether or not you pad your predicates.

Of course with NCHAR datatypes, the results and hence cardinality should be unchanged whether or not predicates are padded.  The issue does not reproduce the symptoms with CHAR datatypes.

SQL> CREATE TABLE HH_NC_TAB (COL NCHAR(10));

Table created.

SQL> insert /*+APPEND*/ into HH_NC_TAB
  2  SELECT 'AAA'||mod(rownum, 10) FROM DUAL CONNECT BY LEVEL < 100000
  3  union all
  4  SELECT 'AAA'||rownum FROM DUAL CONNECT BY LEVEL < 100000; -- > comment to avoid WP format issue

199998 rows created.

SQL> COMMIT;

Commit complete.

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(NULL, 'HH_NC_TAB', METHOD_OPT=>'FOR COLUMNS COL SIZE 254');

PL/SQL procedure successfully completed.

SQL> EXPLAIN PLAN FOR SELECT * FROM HH_NC_TAB WHERE COL = 'AAA1';

Explained.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2981390417

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     1 |    21 |   206   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| HH_NC_TAB |     1 |    21 |   206   (2)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

   1 - filter("COL"=U'AAA1')

13 rows selected.

SQL> EXPLAIN PLAN FOR SELECT * FROM HH_NC_TAB WHERE COL = RPAD('AAA1', 10);

Explained.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2981390417

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           | 10236 |   209K|   204   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| HH_NC_TAB | 10236 |   209K|   204   (1)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

   1 - filter("COL"=U'AAA1      ')

13 rows selected.

SQL>

Issue reproduces on 12.1, 12.2 and I’ve uploaded to LiveSQL so you can verify for yourself on 18.1

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s