Adaptive Cursor Sharing not working as expected

We’ve been experiencing some issues with multiple child cursors for a given SQL statement, and I’ve just spent some time working on building a reproducible testcase of one problem that I thought I’d share in the hopes of documenting the behavior.

It is a Python script connecting to an Oracle Database (tested against 12.2 and 19.6), so it requires the cx_Oracle module.

import cx_Oracle
import argparse 

def setup_t1(con):
    cursor = con.cursor()
    try:
        cursor.execute("drop table t1")
    except Exception:
        pass
    cursor.execute("create table t1 (c1 number, n1 nchar(2), n2 nchar(2))")
    cursor.execute("insert into t1 select 1, to_char(round(rownum/10)),  to_char(mod(rownum, 2)) from all_objects where rownum <= 20")
    con.commit()
    cursor.callproc("sys.dbms_stats.gather_table_stats", (None, 't1'))
    cursor.close()                    

def setup_t2(con):
    cursor = con.cursor()
    try:
        cursor.execute("drop table t2")
    except Exception:
        pass
    cursor.execute("create table t2(c1 number)")
    cursor.execute("create index i2 on t2(c1)")
    cursor.execute("insert into t2(c1) select rownum from all_objects where rownum <= 1000")
    con.commit()
    cursor.callproc("sys.dbms_stats.gather_table_stats", \
                    keywordParameters = dict(ownname = None, tabname = "t2", method_opt = "for all columns size auto"))
    cursor.close()                    

def run_query(con, p1, p2):
    cursor = con.cursor()
    query = "select  count(*) from t1, t2 where t1.n1 = :1 and t1.n2 = :2 and t1.c1 = t2.c1"
    cursor.execute(query,[p1, p2])
    cursor.fetchall()
    cursor.close()

def run_query_0(con):
    run_query(con, '0','0')    

def run_query_0_space(con):
    run_query(con, '0 ','0 ')  

def get_sql_id(con):
    cursor = con.cursor()
    query = "select prev_sql_id from v$session where sid = userenv('SID')"
    cursor.execute(query)     
    row = cursor.fetchone()    
    cursor.close()
    return row[0]

def dump_sql(con, sql_id):
    cursor = con.cursor()    
    cursor.execute("select sql_id, plan_hash_value, count(*) from v$sql where sql_id = :1 and is_shareable = 'Y' and is_obsolete = 'N' group by sql_id, plan_hash_value", [sql_id])
    rows = cursor.fetchall()    
    for row in rows:
        print (row)
    cursor.close()

parser = argparse.ArgumentParser()
parser.add_argument('--connection', type=str)
args = parser.parse_args()

con = cx_Oracle.connect(args.connection)
setup_t1(con)
setup_t2(con)
run_query_0(con)
run_query_0_space(con)
run_query_0(con)
sql_id = get_sql_id(con)
con.close()

con = cx_Oracle.connect(args.connection)
for x in range(0, 80):
    run_query_0(con)
    run_query_0_space(con)

dump_sql(con, sql_id)
con.close()

Results are as below, showing 81 active child cursors.

('66dbrch2wu34f', 3482066175, 1)
('66dbrch2wu34f', 2913850814, 80)

I’m not sure why it’s necessary to close and re-open the connection between the first three executions (after which the cursor becomes bind aware) and the subsequent executions, but it is. If both nchar columns are instead char (or nvarchar2), the issue doesn’t reproduce.
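The trailing space matters here because NCHAR is a blank-padded type: under blank-padded comparison semantics '0' and '0 ' compare equal, so both bind sets hit exactly the same rows and should be able to share a plan. A pure-Python illustration of the padding semantics (my own sketch, not Oracle code):

```python
def nchar_equal(a, b, length=2):
    # Blank-padded (CHAR/NCHAR) comparison: both operands are
    # space-padded to the column length before being compared.
    return a.ljust(length) == b.ljust(length)

print(nchar_equal('0', '0 '))   # True  - the two bind sets used above
print('0' == '0 ')              # False - VARCHAR2-style comparison differs
```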

It seems the database is not correctly handling the Cursor Selectivity Cubes:

SELECT
    predicate,
    range_id,
    low,
    high,
    COUNT(*)
FROM
    v$sql_cs_selectivity
WHERE
    sql_id = '66dbrch2wu34f'
GROUP BY
    predicate,
    range_id,
    low,
    high;

PREDICATE                                  RANGE_ID LOW        HIGH         COUNT(*)
---------------------------------------- ---------- ---------- ---------- ----------
=1                                                0 0.300000   0.366667          317

It also seems that the plan being adaptive is a factor: if I hint the query to disallow adaptive plans, with OPT_PARAM('_optimizer_adaptive_plans','false'), I don't see the problem.

select * from dbms_xplan.display_cursor('66dbrch2wu34f', null, format=>'+adaptive');

-----------------------------------------------------------------------------------------
|   Id  | Operation                      | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT               |      |       |       |     4 (100)|          |
|     1 |  SORT AGGREGATE                |      |     1 |    17 |            |          |
|  *  2 |   HASH JOIN                    |      |     3 |    51 |     4   (0)| 00:00:01 |
|-    3 |    NESTED LOOPS                |      |     3 |    51 |     4   (0)| 00:00:01 |
|-    4 |     STATISTICS COLLECTOR       |      |       |       |            |          |
|  *  5 |      TABLE ACCESS STORAGE FULL | T1   |     3 |    39 |     2   (0)| 00:00:01 |
|- *  6 |     INDEX RANGE SCAN           | I2   |     1 |     4 |     2   (0)| 00:00:01 |
|     7 |    INDEX STORAGE FAST FULL SCAN| I2   |  1000 |  4000 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."C1"="T2"."C1")
   5 - storage(("T1"."N1"=SYS_OP_C2C(:1) AND "T1"."N2"=SYS_OP_C2C(:2)))
       filter(("T1"."N1"=SYS_OP_C2C(:1) AND "T1"."N2"=SYS_OP_C2C(:2)))
   6 - access("T1"."C1"="T2"."C1")

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)

ORDS Issues in Oracle 19c? Check your Service Names

We upgraded our first database from 12.2 to 19c last week, and encountered a nasty issue with ORDS.  Credit goes to my colleagues Mingda Lu and Au Chun Kei for doing the hard work in understanding what was causing the issue.

The issue can be demonstrated with the Oracle DB Developer VM.  I have created a RESTful Web Service following the oracle-base guide.

A quick test with wget shows that everything is OK with the default settings:

[oracle@localhost ~]$ wget http://localhost:8080/ords/hr/hrmod/employees/100
--2020-02-15 04:23:46-- http://localhost:8080/ords/hr/hrmod/employees/100
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... 200 OK

Note, however, that ORDS is configured with the PDB's default service name, ‘orcl’:

[oracle@localhost ~]$ grep servicename /u01/userhome/oracle/ords/vmconfig/ords/defaults.xml
<entry key="db.servicename">orcl</entry>

What happens if we use a different service name for ORDS?  I’ll create a couple of services to demonstrate the issue.

[oracle@localhost ~]$ sql system/oracle@localhost:1521/orcl

SQLcl: Release 19.1 Production on Sat Feb 15 04:58:57 2020
Copyright (c) 1982, 2020, Oracle. All rights reserved.
Last Successful login time: Sat Feb 15 2020 04:59:02 -05:00
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> exec sys.dbms_service.create_service('orcl_ords', 'orcl_ords');
PL/SQL procedure successfully completed.

SQL> exec sys.dbms_service.start_service('orcl_ords');
PL/SQL procedure successfully completed.

SQL> exec sys.dbms_service.create_service('orcl.ords', 'orcl.ords');
PL/SQL procedure successfully completed.

SQL> exec sys.dbms_service.start_service('orcl.ords');
PL/SQL procedure successfully completed.

If I change the db.servicename entry to ‘orcl_ords’ in defaults.xml, restart ORDS and retest, all is OK.

[oracle@localhost ~]$ wget http://localhost:8080/ords/hr/hrmod/employees/100
--2020-02-15 05:03:17-- http://localhost:8080/ords/hr/hrmod/employees/100
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... 200 OK

However, if I change the service name to the other new service, orcl.ords, and restart ORDS, it starts up without any problems, but testing the service returns an error.

[oracle@localhost ~]$ wget http://localhost:8080/ords/hr/hrmod/employees/100
--2020-02-15 05:05:57-- http://localhost:8080/ords/hr/hrmod/employees/100
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... 503 Service Unavailable
2020-02-15 05:06:01 ERROR 503: Service Unavailable.

The error stack from ORDS gives some clues as to what’s going on.

WARNING: The database user for the connection pool named |apex|pu|, is not authorized to proxy to the schema named HR
oracle.dbtools.common.jdbc.ConnectionPoolConfigurationException: The database user for the connection pool named |apex|pu|, is not authorized to proxy to the schema named HR
at oracle.dbtools.common.jdbc.ConnectionPoolExceptions.from(ConnectionPoolExceptions.java:46)
at oracle.dbtools.common.jdbc.ConnectionPoolExceptions.from(ConnectionPoolExceptions.java:53)
Caused by: oracle.dbtools.common.ucp.ConnectionLabelingException: Error occurred when attempting to configure url: jdbc:oracle:thin:@//localhost:1521/orcl.ords with labels: {oracle.dbtools.jdbc.label.schema=HR}
at oracle.dbtools.common.ucp.LabelingCallback.handle(LabelingCallback.java:147)
at oracle.dbtools.common.ucp.LabelingCallback.proxyToSchema(LabelingCallback.java:210)
at oracle.dbtools.common.ucp.LabelingCallback.configure(LabelingCallback.java:76)
Caused by: java.sql.SQLException: ORA-01017: invalid username/password; logon denied
at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:494)
at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:441)
at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:436)
at oracle.jdbc.driver.T4CTTIfun.processError(T4CTTIfun.java:1027)

This seems to imply there’s some issue with the proxy authentication mechanism when using the orcl.ords service; however, testing the proxy connection from SQLcl, all seems to be OK.

[oracle@localhost ords]$ sql ords_public_user[hr]/oracle@localhost:1521/orcl.ords

SQLcl: Release 19.1 Production on Sat Feb 15 05:09:46 2020

Copyright (c) 1982, 2020, Oracle. All rights reserved.

Last Successful login time: Fri May 31 2019 16:29:03 -04:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0


SQL>

From our testing, any service with a name with the format <pdb_name>.<any_text> exhibits the problem. We have used service names with such a format in 12.2 without issues, so it seems this is new behaviour introduced in 18c or 19c.

We’ve also noticed that when checking v$services, the value for con_id for the ‘problem’ service is 1 which may give a clue as to what’s going on, although it only seems to cause a problem for ORDS.

SQL> select con_id, network_name from v$services where network_name in ('orcl_ords', 'orcl.ords');

   CON_ID NETWORK_NAME
_________ _______________
        1 orcl.ords
        3 orcl_ords
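Given that observation, it would be easy to script a pre-flight check on candidate service names before pointing ORDS at them. A hypothetical helper (the function name and logic are my own, based only on the pattern we observed in our testing):

```python
def risky_ords_service_names(services, pdb_name):
    """Flag service names of the form <pdb_name>.<suffix>, which in our
    19c testing registered with CON_ID = 1 and broke ORDS proxy auth."""
    pdb = pdb_name.lower()
    return [s for s in services
            if '.' in s and s.lower().split('.', 1)[0] == pdb]

print(risky_ords_service_names(['orcl_ords', 'orcl.ords'], 'orcl'))
# ['orcl.ords']
```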

ords.enable_schema fails with “ORA-06598: insufficient INHERIT PRIVILEGES privilege”

This is an issue I always hit setting up a test environment using Oracle REST Data Services (ORDS). Using the SYSTEM user to call ords.enable_schema throws ORA-06598, although according to the documentation it should succeed (note the SYSTEM user has the DBA role).

Only database users with the DBA role can enable or disable a schema other than their own.

[oracle@localhost ~]$ sql system/oracle

SQLcl: Release 19.1 Production on Sat Feb 15 02:20:33 2020
Copyright (c) 1982, 2020, Oracle.  All rights reserved.
Last Successful login time: Sat Feb 15 2020 02:20:36 -05:00
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> SHOW USER
USER is "SYSTEM"

SQL> select role from session_roles where role = 'DBA';
   ROLE
_______
DBA
SQL> BEGIN
  2     ords.enable_schema(
  3        p_enabled             => TRUE,
  4        p_schema              => 'HR',
  5        p_url_mapping_type    => 'BASE_PATH',
  6        p_url_mapping_pattern => 'hr',
  7        p_auto_rest_auth      => FALSE);
  8
  9    COMMIT;
 10  END;
 11  /
BEGIN
*
ERROR at line 1:
ORA-06598: insufficient INHERIT PRIVILEGES privilege
ORA-06512: at "ORDS_METADATA.ORDS", line 1
ORA-06512: at line 2

The solution is simple: grant INHERIT PRIVILEGES on the current user (SYSTEM) to the ORDS_METADATA user.

SQL> show user
USER is "SYSTEM"
SQL> grant inherit privileges on user SYSTEM to ORDS_METADATA;
Grant succeeded.

Once this is completed, the call to ords.enable_schema is successful.

SQL> BEGIN
  2     ords.enable_schema(
  3        p_enabled             => TRUE,
  4        p_schema              => 'HR',
  5        p_url_mapping_type    => 'BASE_PATH',
  6        p_url_mapping_pattern => 'hr',
  7        p_auto_rest_auth      => FALSE);
  8
  9    COMMIT;
 10  END;
 11  /

PL/SQL procedure successfully completed.

ORA-01017 starting DB with srvctl? Consider firing your (oracle) agent!

Warning: please don’t blindly follow the steps here without doing your own analysis of the risks involved, and ideally not without getting Oracle Support involved. I was hesitant to publish this, but as it has already helped someone else work around (to some extent) an issue they were having, I think it’s worth putting out there.

We had a problem during a Data Guard switchover (luckily a planned switchover for patching rather than a disaster situation) where Grid Infrastructure (clusterware) was unable to bring up one of the databases; it kept throwing “ORA-01017: invalid username/password”. Starting the database the ‘traditional way’ using “sqlplus / as sysdba” had no such problems.

Reviewing Oracle Support, particularly Doc ID 2313555.1, we identified some non-standard configuration items in the Oracle home used for this database, but even after resolving them the error persisted.

At times like these you realize (or at least I did) how little is published about the internals of how clusterware and the oracle databases it manages interact.

I suspected that restarting the entire clusterware stack would resolve the issue but that was difficult as this node also managed a production database which we didn’t want to take down.

However, I guessed that restarting the clusterware agent for the oracle user might fix the problem. The executable is oraagent.bin and the process owner is oracle. I believe this is the process clusterware uses to actually start the database (you’ll probably also notice a similar oraagent.bin process owned by grid, and orarootagent.bin running as root).

I killed the oracle agent process and crossed my fingers. Luckily, clusterware re-spawned the process, and afterwards we were able to start the problem instance without further issues.

Please re-read the first paragraph if you are considering applying this workaround, and don’t blame me if you break anything; if it helps, though, I’m happy to take the credit!

Adding covering fields to a Primary Key Index

This is something to file under the (admittedly rather large) category of things that I wasn’t aware that the Oracle database could do.

While tuning a query, I wanted to use the common technique of adding fields to an index to eliminate a “Table Access by Index RowID” operation; however, this particular case was complicated by the fact that the index was supporting the primary key, and the table was large and frequently accessed.

This is probably easiest demonstrated by the (much simplified) example below:

SQL> create table singles(id number generated always as identity,
  2                       artist varchar2(255),
  3                       title varchar2(255),
  4                       constraint singles_pk primary key (id));

Table SINGLES created.

SQL>
SQL> insert into singles (artist,
  2                        title)
  3              values ('Chesney Hawkes',
  4                      'The One And Only');

1 row inserted.

SQL> commit;

Commit complete.

SQL> select index_name from user_indexes where table_name = 'SINGLES';
INDEX_NAME
_____________
SINGLES_PK

SQL> select artist from singles where id = 1;
ARTIST
_________________
Chesney Hawkes

SQL> select * from dbms_xplan.display_cursor(format=>'BASIC');
PLAN_TABLE_OUTPUT
_____________________________________________________
EXPLAINED SQL STATEMENT:
------------------------
select artist from singles where id = 1

Plan hash value: 3923658952

--------------------------------------------------
| Id  | Operation                   | Name       |
--------------------------------------------------
|   0 | SELECT STATEMENT            |            |
|   1 |  TABLE ACCESS BY INDEX ROWID| SINGLES    |
|   2 |   INDEX UNIQUE SCAN         | SINGLES_PK |
--------------------------------------------------

14 rows selected.

Note that adding a new index on (id, artist) makes the plan more efficient:

SQL> create index i_singles_covering on singles(id, artist);

Index I_SINGLES_COVERING created.

SQL> select artist from singles where id = 1;
ARTIST
_________________
Chesney Hawkes

SQL> select * from dbms_xplan.display_cursor(format=>'BASIC');
PLAN_TABLE_OUTPUT
__________________________________________________
EXPLAINED SQL STATEMENT:
------------------------
select artist from singles where id = 1

Plan hash value: 1012019734

-----------------------------------------------
| Id  | Operation        | Name               |
-----------------------------------------------
|   0 | SELECT STATEMENT |                    |
|   1 |  INDEX RANGE SCAN| I_SINGLES_COVERING |
-----------------------------------------------

13 rows selected.

However, we’ve now got two indexes: SINGLES_PK on (id) and I_SINGLES_COVERING on (id, artist).  SINGLES_PK is redundant, but it is being used to support the primary key:

SQL> select index_name from user_indexes where table_name = 'SINGLES';
           INDEX_NAME
_____________________
SINGLES_PK
I_SINGLES_COVERING

It is possible for the primary key to be supported by I_SINGLES_COVERING, but initially I thought I’d have to choose between dropping and re-creating the primary key to use the new index, or leaving the system in the non-optimal state of having two indexes.

However, I came across this blog post from Richard Foote, which referenced another post from Jonathan Lewis.  It describes the technique of modifying the constraint to use the new index without needing to re-create it.  It’s worth noting that the index SINGLES_PK, which the database automatically created to initially support the primary key, gets dropped during this operation.

SQL> alter table singles
  2        modify constraint singles_pk
  3        using index i_singles_covering;
Table SINGLES altered.


SQL> select index_name from user_indexes where table_name = 'SINGLES';
INDEX_NAME
_____________________
I_SINGLES_COVERING

One thing I observed in my testing was that if I created i_singles_covering as a unique index (id is unique as it’s the primary key, so obviously the combination of id and artist must also be unique), the database was unwilling to use this index to support the primary key:

SQL> create unique index i_singles_covering on singles(id, artist);

Index I_SINGLES_COVERING created.

SQL> alter table singles
  2        modify constraint singles_pk
  3        using index i_singles_covering;

ORA-14196: Specified index cannot be used to enforce the constraint.
14196. 00000 - "Specified index cannot be used to enforce the constraint."
*Cause: The index specified to enforce the constraint is unsuitable
for the purpose.
*Action: Specify a suitable index or allow one to be built automatically.

This case is documented by Oracle Support Document ID 577253.1 which states:

We cannot use a prefix of a unique index to enforce a unique constraint. We can use a whole unique index or a prefix of a non-unique index to do that. This is the way Oracle was designed.

However, I can’t offhand think of any technical reason for this limitation.

SQL Plan Directives: Gotta Purge ‘Em All

Here’s a tip I picked up from Nigel Bayliss regarding purging SQL Plan Directives. I’ve been using it a lot recently and can’t see it documented elsewhere.

As some background, these records, exposed via the DBA_SQL_PLAN_DIRECTIVES view, are cardinality corrections created and used when running with Adaptive Statistics enabled.  There is a job that should automatically purge any record unused for longer than the value of SPD_RETENTION_WEEKS, but we’ve experienced occasions when this job doesn’t work as expected.

The records can be individually purged by calling DBMS_SPD.DROP_SQL_PLAN_DIRECTIVE, but that’s a pain if you’ve got a lot of them.

However what the documentation doesn’t mention is that you can call the procedure, passing in NULL for the mandatory directive_id parameter:

exec sys.DBMS_SPD.DROP_SQL_PLAN_DIRECTIVE (NULL);

This will purge all records following the same retention rules as the auto-purge job.  If you really want to Purge ‘Em All, you can set the retention to 0 before calling the procedure:

exec sys.DBMS_SPD.SET_PREFS('SPD_RETENTION_WEEKS', '0');


OGB Appreciation Day: Implicit Cursors

Recently I’ve been working with SQL Server, and while it’s not all bad, it does help to highlight some of the neat features available in Oracle.  One of these is implicit cursors, which I shall demonstrate.

First I’ll show how to populate some data in a table and then loop over it in T-SQL (SQL Server’s equivalent of PL/SQL):

1> CREATE TABLE demo (col1 TINYINT, col2 TINYINT, col3 TINYINT);
2> GO

1> INSERT INTO demo (col1, col2, col3)
2>           VALUES (11,   12,   13),
3>                  (21,   22,   23),
4>                  (31,   32,   33);
5> GO
(3 rows affected)

1>  DECLARE @col1 tinyint, @col2 tinyint, @col3 tinyint;
2>  DECLARE explicit_cursor CURSOR LOCAL FAST_FORWARD FOR
3>     SELECT col1, col2, col3 FROM dbo.demo;
4>
5>  OPEN explicit_cursor;
6>  FETCH NEXT FROM explicit_cursor INTO @col1, @col2, @col3;
7>
8>  WHILE @@FETCH_STATUS = 0
9>  BEGIN
10>    PRINT CONCAT(@col1, ':', @col2, ':', @col3);
11>    FETCH NEXT FROM explicit_cursor INTO @col1, @col2, @col3;
12> END
13>
14> CLOSE explicit_cursor;
15> DEALLOCATE explicit_cursor;
16>
17> GO
11:12:13
21:22:23
31:32:33

By the way, note the neat way it’s possible to insert 3 records with a single INSERT statement. I didn’t say there weren’t some things that SQL Server does a little better 🙂

Next, check out the equivalent SQL statements and PL/SQL code in the Oracle Database. Note that my Oracle demos are running on an Autonomous Transaction Processing database in Oracle Cloud, although they should work in all versions, including Oracle XE, the free-to-use database.

SQL> CREATE TABLE demo (col1 NUMBER, col2 NUMBER, col3 NUMBER);
Table DEMO created.

SQL> INSERT INTO demo (col1, col2, col3) VALUES (11, 12, 13);
1 row inserted.

SQL> INSERT INTO demo (col1, col2, col3) VALUES (21, 22, 23);
1 row inserted.

SQL> INSERT INTO demo (col1, col2, col3) VALUES (31, 32, 33);
1 row inserted.

SQL> COMMIT;
Commit complete.

SQL> SET SERVEROUTPUT ON SIZE UNLIMITED;
SQL>
SQL> DECLARE
  2      CURSOR explicit_cursor IS
  3         SELECT col1, col2, col3 FROM demo;
  4      explicit_record explicit_cursor%ROWTYPE;
  5  BEGIN
  6      OPEN explicit_cursor;
  7      LOOP
  8          FETCH explicit_cursor INTO explicit_record;
  9          EXIT WHEN explicit_cursor%NOTFOUND;
 10          sys.dbms_output.put_line(explicit_record.col1 || ':' ||
 11                                   explicit_record.col2 || ':' ||
 12                                   explicit_record.col3          );
 13      END LOOP;
 14      CLOSE explicit_cursor;
 15  END;
 16  /

11:12:13
21:22:23
31:32:33

PL/SQL procedure successfully completed.

Already I prefer a few things about the Oracle solution: the ability to use a cursor %ROWTYPE rather than having to define and use variables for individual columns, the fact that only one FETCH statement is required, and the use of the %NOTFOUND cursor attribute rather than the somewhat arbitrary @@FETCH_STATUS = 0 check.

However Oracle offers an even better method, namely an implicit cursor.

SQL> BEGIN
  2      FOR implicit_record IN (SELECT col1, col2, col3 FROM demo)
  3      LOOP
  4          sys.dbms_output.put_line(implicit_record.col1 || ':' ||  
  5                                   implicit_record.col2 || ':' ||  
  6                                   implicit_record.col3          );
  7      END LOOP;
  8  END;
  9  /

11:12:13
21:22:23
31:32:33

PL/SQL procedure successfully completed.

A few things to note: we’re down from 15 lines of code to 8, which makes this easier to write and, just as importantly, leaves less chance of bugs.  There’s no need to worry about defining rowtypes, or opening and closing cursors; Oracle just does the right thing under the covers, including tidying up if exceptions are thrown.
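As an aside, Python’s DB-API offers much the same convenience as the implicit cursor: you can iterate directly over an executed statement with no explicit open/fetch/close. A self-contained sketch using the standard-library sqlite3 module (cx_Oracle cursors iterate the same way):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE demo (col1, col2, col3)")
con.executemany("INSERT INTO demo VALUES (?, ?, ?)",
                [(11, 12, 13), (21, 22, 23), (31, 32, 33)])

# Iterate directly over the result set -- no explicit OPEN/FETCH/CLOSE,
# and the cursor is tidied up for us, much like the implicit cursor.
for col1, col2, col3 in con.execute("SELECT col1, col2, col3 FROM demo"):
    print(f"{col1}:{col2}:{col3}")
```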

ORDS Under Siege: Introduction

I’ve been researching ORDS over the summer, particularly trying to optimize its performance on Tomcat. Trying something a little different, I’ve created a Vagrant box that should allow anybody interested to verify my findings, find mistakes I’ve made, or identify performance optimizations I’ve missed.

If you’re new to Vagrant, Tim Hall provides a good introduction for the Oracle DBA.

You can clone or download the Vagrant box from my GitHub page; hopefully the instructions are clear. You need to download the Oracle 18c XE and ORDS releases and put them into the software directory.  I’ve allocated 6GB RAM and 4 CPUs to the virtual machine; you may need to adjust these values depending on your test machine’s resources.  Running “vagrant up” should automatically configure the database, and configure ORDS running in Tomcat behind some reverse proxies.  It will also generate a self-signed certificate and configure the SSL handling in both Tomcat and the reverse proxies.  Most of the database and ORDS configuration scripts were taken from the Oracle Vagrant boxes or Tim’s Vagrant boxes.

The benchmarking tool I am using is Siege.  There are many alternatives available, but I chose Siege for a few reasons.  Firstly, it is Free and Open Source software.  Secondly, it is easy to configure: simply populate a file, urls.txt, with the URLs to hit and then run the executable with suitable parameters.  Finally, it is lightweight, being written in C, whereas many other similar tools are written in Java; since I am running the benchmarking tool on the same virtual machine that hosts the software components I’m trying to measure, this is important.
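As an illustration, a minimal urls.txt might look like the following (the URLs below are placeholders, and the HTTPS port is an assumption; substitute whatever endpoints your test suite exposes):

```text
# urls.txt -- one URL per line
http://localhost:8080/ords/hr/hrmod/employees/100
https://localhost:8443/ords/hr/hrmod/employees/100
```

A typical invocation is then `siege -c 10 -t 30S -f urls.txt` (10 concurrent users for 30 seconds, reading URLs from the file).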

Once the Vagrant machine is up, you can connect to it via “vagrant ssh” and then type “ords-demo” to run the entire test suite. I’ll go through the individual tests in the following blog posts and share my findings.


mysql_clear_password & authentication_ldap_simple password encryption

Preparing for my talk at the HK Open Source Conference, I wanted to confirm some of the things I had read about these plugins, mostly from Matthias Crauwels’ excellent article on the subject.

My lab environment consists of Windows 2016 Domain controller and client machines, with MySQL 8 running on Oracle Linux 7.

First I configure the database to use the server-side plugin, configure the plugin to point to the domain controller, and create a database user associated via the plugin with my Windows account.

mysql> INSTALL PLUGIN authentication_ldap_simple SONAME 'authentication_ldap_simple.so';
Query OK, 0 rows affected (0.05 sec)

mysql> SET GLOBAL authentication_ldap_simple_server_host='win-dc.windows.domain';
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE USER 'patrick'@'%'
    ->        IDENTIFIED WITH authentication_ldap_simple
    ->        BY 'CN=patrick,CN=Users,DC=WINDOWS,DC=DOMAIN';
Query OK, 0 rows affected (0.08 sec)

Next I successfully connect from my Windows client to this database account passing in my Windows credentials.

[patrick@WIN-CLIENT] C:\> mysql --host=lnx-mysql8.windows.domain `
>>                              --user=patrick                   `
>>                              --password=Password123           `
>>                              --enable-cleartext-plugin
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 63
Server version: 8.0.13-commercial MySQL Enterprise Server - Commercial

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Checking the network packets between the Windows client and the database server, we can observe that (almost all of) the traffic is encrypted, using a self-signed certificate.

[root@lnx-mysql8 ~]# tcpflow -c port 3306
tcpflow: listening on enp0s3
192.168.002.004.03306-192.168.002.002.49801: U
<UdYcaching_sha2_passwordLR%
192.168.002.002.49801-192.168.002.004.03306:
192.168.002.002.49801-192.168.002.004.03306: ANi<l[}y~yn+,/0#'$(g@kj
239-.<>%=h&/05612?)i*17
U


0H1212013500Z0@1>0<U5MySQL_Server_8.0.13_Auto_Generated_Server_Certificate0"0
DH<@_Cn%VR9A)[QLMVjAYe1Kju2\)Wk7SqOG02Xl;>n)i&gjV:/,J^f("qBDH8kW:lKQ+B 3;K^!.$5BxJ=0XWD,00U00
^uBey@/e;m4sQ
lJDiU{?s;[72FeLkS
p{WFXdr**yLPp'ij_(z`E"{Lxu|1DX$Jp`w;Ti<}BlLA@?
0H1212013500Z0<1:08U1MySQL_Server_8.0.13_Auto_Generated_CA_Certificate0"0 2 |s?jupq&GG]5`-2$1'$,AT"`OA/^d((~:n0Z'~O?$+az]y(De"5Klwiv(B"ST~rE0'7qIWZc%R$D8v9MzF|\blAK00U00 9()~I?-Nq(#LRACVU>eOWB["IOm$\]fvNa7W\m?
dYt^dT)-Y&UWUBlnmOA}?%YW>D
C*<*f,OD^^G"
'#-F*/=@=]mnDRVQ,RG(a|la^!fH)5"{EbynK4{q:CiV%#(f_hr_/-X~S/:
(@C{jB&{%ddU-F0fG/2t_aPUw\A%&+:{K"t(+}Q~+|#XxJNS\XhDz);=79.o{Q<sp??D^0\v:6[a|$ooBZ(K0Zt}. WjeoWOi4AE]YkQXUH0E2;'US-/Dw0[4@&3]]/c`GCEEzmU@oG2{%Z&`0!A}]A[.:m 7q@w]Rv3-nKZP$N''}yjKssvFvc:O$'rU)f(=@,Wup,>b+xF[Lv6;
192.168.002.004.03306-192.168.002.002.49801: ,
{xsGdN>
XEUyH`?T'd7fI JwN%:eq2#;y'Nh(hm}c$dG'zs
zT@(W$]#Wm4nw2t7(`X-5lK'SXwk0qS3
192.168.002.002.49801-192.168.002.004.03306: =@,Wa{q%)}aH9;.%~k$hoKI+a8\B}@NR@Dp`JFDwK\(1%9 %5XqO f:Pgmvi|>N^&=k/~egl]i@s;p
\8&?4AKg>r63E
192.168.002.004.03306-192.168.002.002.49801: H`XmX2:@~Oq)BY-|<`GS6ew 192.168.002.002.49801-192.168.002.004.03306: 0=@,W.la:QC0,%>G~L6
192.168.002.004.03306-192.168.002.002.49801: #`X-$~v4_L3
|s
192.168.002.002.49801-192.168.002.004.03306: ==@,W.BFu?6'F6|P)EC]?%n)ww
nSHK*+@6FS(9l|Y2>apy;-192.168.002.002.49801: |`X,

Note, however, that it is possible to disable this encryption with the --ssl-mode=disabled flag.

[patrick@WIN-CLIENT] C:\> mysql --host=lnx-mysql8.windows.domain `
>>                              --user=patrick                   `
>>                              --password=Password123           `
>>                              --enable-cleartext-plugin        `
>>                              --ssl-mode=disabled
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 64
Server version: 8.0.13-commercial MySQL Enterprise Server - Commercial

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Observe that the password is now transmitted unencrypted during authentication:

[root@lnx-mysql8 ~]# tcpflow -c port 3306
tcpflow: listening on enp0s3
192.168.002.004.03306-192.168.002.002.49899: U
8.0.13-commercialAi95M2)P3(f}3caching_sha2_password
192.168.002.002.49899-192.168.002.004.03306: patrick lv
J5T{W0J-rcaching_sha2_passwordq_pid172program_namemysql_client_namelibmysql_thread2640_client_version8.0.13_osWin64_platformx86_64
192.168.002.004.03306-192.168.002.002.49899: mysql_clear_password
192.168.002.002.49899-192.168.002.004.03306: Password123
192.168.002.004.03306-192.168.002.002.49899:
192.168.002.002.49899-192.168.002.004.03306: !select @@version_comment limit 1
192.168.002.004.03306-192.168.002.002.49899: 'def@@version_comment$%$MySQL Enterprise Server - Commercial

Such connections can be prevented by setting the require_secure_transport variable to ON

mysql> SET GLOBAL require_secure_transport=ON;
Query OK, 0 rows affected (0.00 sec)
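One caveat worth adding (my note, not from the original session): SET GLOBAL does not survive a server restart. On MySQL 8.0 the same setting can be persisted with SET PERSIST, or placed in the server option file; a sketch:

```
mysql> SET PERSIST require_secure_transport=ON;

# or, equivalently, in the server option file (e.g. /etc/my.cnf):
[mysqld]
require_secure_transport=ON
```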

In this case connections which disable encryption will be rejected

[patrick@WIN-CLIENT] C:\> mysql --host=lnx-mysql8.windows.domain `
>>                              --user=patrick                   `
>>                              --password=Password123           `
>>                              --enable-cleartext-plugin        `
>>                              --ssl-mode=disabled
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 3159 (HY000): Connections using insecure transport are prohibited while --require_secure_transport=ON.
[patrick@WIN-CLIENT] C:\>

Unfortunately, even though the connection is rejected, the password is still transmitted unencrypted during the authentication process

[root@lnx-mysql8 ~]# tcpflow -c port 3306
tcpflow: listening on enp0s3
192.168.002.004.03306-192.168.002.002.49867: U
8.0.13-commercial@=l?K/sbq!Qa&sG{{Bcaching_sha2_password
192.168.002.002.49867-192.168.002.004.03306: patrick Au'"?..`%85n]~caching_sha2_passwordr_pid3416program_namemysql_client_namelibmysql_thread4992_client_version8.0.13_osWin64_platformx86_64
192.168.002.004.03306-192.168.002.002.49867: mysql_clear_password
192.168.002.002.49867-192.168.002.004.03306: Password123
192.168.002.004.03306-192.168.002.002.49867: aW#HY000Connections using insecure transport are prohibited while --require_secure_transport=ON.

However, if we re-run the original connection attempt (with encrypted traffic between database client and server) while capturing the network traffic between the database server and the domain controller, we can see the password is transferred unencrypted at this point

[root@lnx-mysql8 .passwords]# tcpflow -c port 389
tcpflow: listening on enp0s3
192.168.002.004.43068-192.168.002.001.00389: 0PcK

NtVer0mainWINDOWS.DOMAIN
netlogon
192.168.002.001.00389-192.168.002.004.43070: 0d00znetlogon1jhl)X0K4fWINDOWSDOMAINWIN-DCWINDOWSWIN-DCDefault-First-Site-NameE0e

192.168.002.004.43070-192.168.002.001.00389: 0B
192.168.002.004.42990-192.168.002.001.00389: 0?`:(CN=patrick,CN=Users,DC=WINDOWS,DC=DOMAINPassword123
192.168.002.001.00389-192.168.002.004.42990: 0a

The documentation alludes to some of these restrictions, although to my mind it focuses on the encryption between the database client and server, not between the database server and the Domain Controller (unless I’ve missed something).

The server-side authentication_ldap_simple plugin performs simple LDAP authentication. For connections by accounts that use this plugin, client programs use the client-side mysql_clear_password plugin, which sends the password to the server in clear text. No password hashing or encryption is used, so a secure connection between the MySQL client and server is recommended to prevent password exposure.

Based on the above observations, and as I expected, this plugin combination is not suitable for any environment which takes security seriously.

If that wasn’t enough reason to avoid the plugin, during the course of my investigation I discovered some other surprising behavior. Based on discussion with Oracle support this should be resolved in the next release of MySQL Server, so I’ll wait until that is released before sharing.

 

Problems with Binding Arrays

Recently some of our developers have moved forward with fixing some code that had been using literals instead of bind variables. A complication is that they are using lists of values, so the SQL they were generating was of the format

WHERE COL1 IN (1,2,3,4)

They resolved this by using an array in Java and a construct similar to the following:

WHERE COL1 IN (SELECT * FROM TABLE(?))
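The developers are working in Java, but the same TABLE(?) approach can be driven from Python with cx_Oracle. The helper below only builds the SQL text; the commented-out binding code is a sketch with hypothetical connection details (cx_Oracle's gettype/newobject are the documented way to bind a nested-table collection such as the N_T type created in the testcase further down):

```python
def in_list_via_collection_sql(table, column):
    """Build the TABLE(:ids) form of the IN-list query described above."""
    return (f"select count(*) from {table} "
            f"where {column} in (select column_value from table(:ids))")

# Sketch of binding the collection with cx_Oracle (hypothetical
# credentials; needs a live database and a nested-table type like N_T):
#
#   import cx_Oracle
#   con = cx_Oracle.connect("user", "password", "localhost/pdb1")
#   ids = con.gettype("N_T").newobject([1, 2, 3, 4])
#   cur = con.cursor()
#   cur.execute(in_list_via_collection_sql("t1", "id"), ids=ids)
#   print(cur.fetchone()[0])
```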

However, when they moved to the new method performance degraded: they were getting full table scans where previously they had been getting index access.
Initially I thought we could resolve this by using a cardinality hint on the TABLE select, i.e.:

WHERE COL1 IN (SELECT /*+CARDINALITY(t 1) */ * FROM TABLE(?) t)

However this didn’t help much. I’ve managed to reproduce the problem with the testcase below (running on 18c):

SQL> CREATE TABLE t1
  2      AS
  3          SELECT
  4              ROWNUM id,
  5              rpad('x', 100) padding
  6          FROM
  7              dual
  8          CONNECT BY
  9              level  <= 4000; -- comment to avoid WordPress format issue  

Table T1 created.

SQL> create index t1_i1 on t1(id);

Index T1_I1 created.

SQL> exec dbms_stats.gather_table_stats(null, 't1');

PL/SQL procedure successfully completed.

SQL> create or replace type n_t as table of number;
  2  /

Type N_T compiled

SQL> SELECT /*+ gather_plan_statistics */  NULL
  2      FROM
  3        (SELECT DISTINCT a.id
  4              FROM t1   a)
  5      WHERE
  6          id IN (1)
  7  /
NULL



SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
SQL_ID  a18bwsyqx37gs, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */  NULL     FROM       (SELECT
DISTINCT a.id             FROM t1   a)     WHERE         id IN (1)

Plan hash value: 405044659

---------------------------------------------------------------------------------------
| Id  | Operation           | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |      1 |        |      1 |00:00:00.01 |       2 |
|   1 |  VIEW               |       |      1 |      1 |      1 |00:00:00.01 |       2 |
|   2 |   SORT UNIQUE NOSORT|       |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  3 |    INDEX RANGE SCAN | T1_I1 |      1 |      1 |      1 |00:00:00.01 |       2 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("A"."ID"=1)



21 rows selected.

SQL> SELECT /*+ gather_plan_statistics */  NULL
  2      FROM
  3        (SELECT DISTINCT a.id
  4              FROM t1   a)
  5      WHERE
  6          id IN (SELECT /*+cardinality(nt 1) */ column_value FROM TABLE ( n_t(1) ) nt)
  7  /
NULL



SQL> select * from table(dbms_xplan.display_cursor(format=>'allstats last'));
PLAN_TABLE_OUTPUT
SQL_ID  cwcsdhm543ph2, child number 0
-------------------------------------
SELECT /*+ gather_plan_statistics */  NULL     FROM       (SELECT
DISTINCT a.id             FROM t1   a)     WHERE         id IN (SELECT
/*+cardinality(nt 1) */ column_value FROM TABLE ( n_t(1) ) nt)

Plan hash value: 1445712880

----------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |         |      1 |        |      1 |00:00:00.01 |      64 |       |       |          |
|*  1 |  HASH JOIN RIGHT SEMI                   |         |      1 |      1 |      1 |00:00:00.01 |      64 |  2546K|  2546K|  303K (0)|
|   2 |   JOIN FILTER CREATE                    | :BF0000 |      1 |      1 |      1 |00:00:00.01 |       0 |       |       |          |
|   3 |    COLLECTION ITERATOR CONSTRUCTOR FETCH|         |      1 |      1 |      1 |00:00:00.01 |       0 |       |       |          |
|   4 |   VIEW                                  |         |      1 |   4000 |      1 |00:00:00.01 |      64 |       |       |          |
|   5 |    HASH UNIQUE                          |         |      1 |   4000 |      1 |00:00:00.01 |      64 |  2294K|  2294K|  514K (0)|
|   6 |     JOIN FILTER USE                     | :BF0000 |      1 |   4000 |      1 |00:00:00.01 |      64 |       |       |          |
|*  7 |      TABLE ACCESS FULL                  | T1      |      1 |   4000 |      1 |00:00:00.01 |      64 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ID"=VALUE(KOKBF$))

PLAN_TABLE_OUTPUT
   7 - filter(SYS_OP_BLOOM_FILTER(:BF0000,"A"."ID"))



27 rows selected.

SQL>

Fundamentally, I think the problem is that the optimizer is unable to push the predicate from the TABLE function into the inner select.
For the moment we’re having to revert to a hybrid solution where they generate SQL such as the following:

WHERE COL1 IN (?,?,?,?)

Each value has to be bound separately. It’s not ideal: the shared pool ends up holding a cursor for each variation in the number of values, and they are having to do a soft parse of the cursor each time, rather than parsing once and re-using it.
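One mitigation for the cursor-per-variation problem (my suggestion, not something from the original fix) is to round the number of placeholders up to a small set of bucket sizes, padding the bind list by repeating the last value. IN-lists ignore duplicates, so the result is unchanged, and only a handful of distinct SQL texts ever reach the shared pool. A minimal sketch:

```python
import bisect

# Allowed placeholder counts -- an assumption; tune to your workload.
BUCKETS = [1, 5, 10, 20]

def build_in_clause(values):
    """Pad the bind list up to the next bucket size so only a few
    distinct SQL texts (and hence cursors) are ever generated."""
    if not values or len(values) > BUCKETS[-1]:
        raise ValueError("unsupported number of IN-list values")
    size = BUCKETS[bisect.bisect_left(BUCKETS, len(values))]
    padded = list(values) + [values[-1]] * (size - len(values))
    placeholders = ", ".join(f":{i + 1}" for i in range(size))
    return f"COL1 IN ({placeholders})", padded

sql, binds = build_in_clause([7, 8, 9])
# -> "COL1 IN (:1, :2, :3, :4, :5)" with binds [7, 8, 9, 9, 9]
```

With at most len(BUCKETS) cursor shapes, the statements can also be held open and re-executed, avoiding the repeated soft parse.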

However, other than getting the developers to rewrite all their queries, I don’t see a better solution at the moment.