Problems with Binding Arrays

Recently some of our developers have moved forwards with fixing some code that has been using literals, instead of bind variables.  A complication is that they are using lists of values, so the SQL they were generating was of format

WHERE COL1 IN (1,2,3,4)

They resolved this by using an array in Java and construct similar to the following:

WHERE COL1 IN (SELECT * FROM TABLE(?))

However when moving to the new method performance degraded, they were getting full table scans where previously they were getting index access.
Initially I thought we could resolve this by using a cardinality hint on the TABLE select ie:

WHERE COL1 IN (SELECT /*+CARDINALITY(t 1) */ * FROM TABLE(?) t)

However this didn’t help much. I’ve managed to reproduce the problem to the testcase below (running on 18c):

SQL> CREATE TABLE t1
  2      AS
  3          SELECT
  4              ROWNUM id,
  5              rpad('x', 100) padding
  6          FROM
  7              dual
  8          CONNECT BY
  9              level  <= 4000; -- comment to avoid WordPress format issue  

Table T1 created.

SQL> create index t1_i1 on t1(id);

Index T1_I1 created.

SQL> exec dbms_stats.gather_table_stats(null, 't1');

PL/SQL procedure successfully completed.

SQL> create or replace type n_t as table of number;
  2  /

Type N_T compiled

SQL> SELECT /*+ gather_plan_statistics */  NULL
  2      FROM
  3        (SELECT DISTINCT a.id
  4              FROM t1   a)
  5      WHERE
  6          id IN (1)
  7  /
NULL



SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
SQL_ID  a18bwsyqx37gs, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */  NULL     FROM       (SELECT
DISTINCT a.id             FROM t1   a)     WHERE         id IN (1)

Plan hash value: 405044659

---------------------------------------------------------------------------------------
| Id  | Operation           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |      1 |        |      1 |00:00:00.01 |       2 |
|   1 |  VIEW               |       |      1 |      1 |      1 |00:00:00.01 |       2 |
|   2 |   SORT UNIQUE NOSORT|       |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  3 |    INDEX RANGE SCAN | T1_I1 |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("A"."ID"=1)



21 rows selected.

SQL> SELECT /*+ gather_plan_statistics */  NULL
  2      FROM
  3        (SELECT DISTINCT a.id
  4              FROM t1   a)
  5      WHERE
  6          id IN (SELECT /*+cardinality(nt 1) */ column_value FROM TABLE ( n_t(1) ) nt)
  7  /
NULL



SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
SQL_ID  cwcsdhm543ph2, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */  NULL     FROM       (SELECT
DISTINCT a.id             FROM t1   a)     WHERE         id IN (SELECT
/*+cardinality(nt 1) */ column_value FROM TABLE ( n_t(1) ) nt)

Plan hash value: 1445712880

----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |         |      1 |        |      1 |00:00:00.01 |      64 |       |       |          |
|*  1 |  HASH JOIN RIGHT SEMI                   |         |      1 |      1 |      1 |00:00:00.01 |      64 |  2546K|  2546K|  303K (0)|
|   2 |   JOIN FILTER CREATE                    | :BF0000 |      1 |      1 |      1 |00:00:00.01 |       0 |       |       |          |
|   3 |    COLLECTION ITERATOR CONSTRUCTOR FETCH|         |      1 |      1 |      1 |00:00:00.01 |       0 |       |       |          |
|   4 |   VIEW                                  |         |      1 |   4000 |      1 |00:00:00.01 |      64 |       |       |          |
|   5 |    HASH UNIQUE                          |         |      1 |   4000 |      1 |00:00:00.01 |      64 |  2294K|  2294K|  514K (0)|
|   6 |     JOIN FILTER USE                     | :BF0000 |      1 |   4000 |      1 |00:00:00.01 |      64 |       |       |          |
|*  7 |      TABLE ACCESS FULL                  | T1      |      1 |   4000 |      1 |00:00:00.01 |      64 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ID"=VALUE(KOKBF$))

PLAN_TABLE_OUTPUT
   7 - filter(SYS_OP_BLOOM_FILTER(:BF0000,"A"."ID"))



27 rows selected.

SQL>

Fundmentally I think the problem is the optimzer is unable to push the TABLE function into the inner select.
For the moment we’re having to revert to a hybrid solution where they generate SQL such as the following:

WHERE COL1 IN (?,?,?,?)

Each value has to be bound seperately. It’s not ideal as shared pool is somewhat full of each variation depending on how many values there are.  Also they are having to do a soft parse of the cursor each time, rather than parsing once and re-using.

However other than getting the developers to rewrite all their queries I don’t see any better solution at the moment.

2 thoughts on “Problems with Binding Arrays”

  1. HI Patrick,

    What’s wrong with moving “distinct” a bit?

    SELECT distinct NULL
    FROM
    (SELECT a.id
    FROM t1 a)
    WHERE
    id IN (SELECT /*+cardinality(nt 1) */ column_value FROM TABLE ( n_t(1) ) nt)
    /

    1. Hi Stew, Sure, but that’s not really the point, the actual SQL statements are horrifically more complex than this example, I have just simplified down to most trivial testcase I can devise. I think fundamentally in this instance the point of the distinct is to prevent the subquery from simply being transformed away.

Leave a Reply to stewashton Cancel Reply

Your email address will not be published. Required fields are marked *

Scroll to Top