Adaptive Cursor Sharing not working as expected

We’ve been experiencing some issues with multiple child cursors for a given SQL statement, and I’ve just spent some time building a reproducible test case for one of the problems, which I thought I’d share in the hope of documenting the behavior.

It is a Python script connecting to an Oracle database (tested against 12.2 and 19.6), so it requires the cx_Oracle module.

import cx_Oracle
import argparse 

def setup_t1(con):
    cursor = con.cursor()
    try:
        cursor.execute("drop table t1")
    except Exception:
        pass
    cursor.execute("create table t1 (c1 number, n1 nchar(2), n2 nchar(2))")
    cursor.execute("insert into t1 select 1, to_char(round(rownum/10)),  to_char(mod(rownum, 2)) from all_objects where rownum <= 20")
    con.commit()
    cursor.callproc("sys.dbms_stats.gather_table_stats", (None, 't1'))
    cursor.close()                    

def setup_t2(con):
    cursor = con.cursor()
    try:
        cursor.execute("drop table t2")
    except Exception:
        pass
    cursor.execute("create table t2(c1 number)")
    cursor.execute("create index i2 on t2(c1)")
    cursor.execute("insert into t2(c1) select rownum from all_objects where rownum <= 1000")
    con.commit()
    cursor.callproc("sys.dbms_stats.gather_table_stats", \
                    keywordParameters = dict(ownname = None, tabname = "t2", method_opt = "for all columns size auto"))
    cursor.close()                    

def run_query(con, p1, p2):
    cursor = con.cursor()
    query = "select  count(*) from t1, t2 where t1.n1 = :1 and t1.n2 = :2 and t1.c1 = t2.c1"
    cursor.execute(query,[p1, p2])
    cursor.fetchall()
    cursor.close()

def run_query_0(con):
    run_query(con, '0','0')    

def run_query_0_space(con):
    run_query(con, '0 ','0 ')  

def get_sql_id(con):
    cursor = con.cursor()
    query = "select prev_sql_id from v$session where sid = userenv('SID')"
    cursor.execute(query)     
    row = cursor.fetchone()    
    cursor.close()
    return row[0]

def dump_sql(con, sql_id):
    cursor = con.cursor()    
    cursor.execute("select plan_hash_value, count(*) from v$sql where sql_id = :1 and is_shareable = 'Y' and is_obsolete = 'N' group by plan_hash_value", [sql_id])     
    rows = cursor.fetchall()    
    for row in rows:
        print (row)
    cursor.close()

parser = argparse.ArgumentParser()
parser.add_argument('--connection', type=str)
args = parser.parse_args()

con = cx_Oracle.connect(args.connection)
setup_t1(con)
setup_t2(con)
run_query_0(con)
run_query_0_space(con)
run_query_0(con)
sql_id = get_sql_id(con)
con.close()

con = cx_Oracle.connect(args.connection)
for x in range(0, 80):
    run_query_0(con)
    run_query_0_space(con)

dump_sql(con, sql_id)
con.close()

Results are as below, showing 81 active child cursors.

('66dbrch2wu34f', 3482066175, 1)
('66dbrch2wu34f', 2913850814, 80)

I’m not sure why it’s necessary to close and re-open the connection between the first three executions (when the cursor becomes bind aware) and the subsequent executions, but it is. If both nchar columns are defined as char (or nvarchar2) instead, the issue doesn’t reproduce.

It seems the database is not correctly handling the Cursor Selectivity Cubes:

SELECT
    predicate,
    range_id,
    low,
    high,
    COUNT(*)
FROM
    v$sql_cs_selectivity
WHERE
    sql_id = '66dbrch2wu34f'
GROUP BY
    predicate,
    range_id,
    low,
    high

PREDICATE                                  RANGE_ID LOW        HIGH         COUNT(*)
---------------------------------------- ---------- ---------- ---------- ----------
=1                                                0 0.300000   0.366667          317

SQL>
It also seems that the plan being adaptive is a factor. If I hint the query to disallow adaptive plans, with OPT_PARAM('_optimizer_adaptive_plans','false'), I don't see the problem.
select * from dbms_xplan.display_cursor('66dbrch2wu34f', null, format=>'+adaptive')

-----------------------------------------------------------------------------------------
|   Id  | Operation                      | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT               |      |       |       |     4 (100)|          |
|     1 |  SORT AGGREGATE                |      |     1 |    17 |            |          |
|  *  2 |   HASH JOIN                    |      |     3 |    51 |     4   (0)| 00:00:01 |
|-    3 |    NESTED LOOPS                |      |     3 |    51 |     4   (0)| 00:00:01 |
|-    4 |     STATISTICS COLLECTOR       |      |       |       |            |          |
|  *  5 |      TABLE ACCESS STORAGE FULL | T1   |     3 |    39 |     2   (0)| 00:00:01 |
|- *  6 |     INDEX RANGE SCAN           | I2   |     1 |     4 |     2   (0)| 00:00:01 |
|     7 |    INDEX STORAGE FAST FULL SCAN| I2   |  1000 |  4000 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("T1"."C1"="T2"."C1")
   5 - storage(("T1"."N1"=SYS_OP_C2C(:1) AND "T1"."N2"=SYS_OP_C2C(:2)))
       filter(("T1"."N1"=SYS_OP_C2C(:1) AND "T1"."N2"=SYS_OP_C2C(:2)))
   6 - access("T1"."C1"="T2"."C1")

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)
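
For reference, this is roughly how the hinted version of the query used by run_query looks (just a sketch, adding the hint mentioned above to the existing statement); with it in place I don't see the excessive child cursors:

select /*+ OPT_PARAM('_optimizer_adaptive_plans','false') */ count(*)
from   t1, t2
where  t1.n1 = :1 and t1.n2 = :2 and t1.c1 = t2.c1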

ORDS Issues in Oracle 19c? Check your Service Names

We upgraded our first database from 12.2 to 19c last week, and encountered a nasty issue with ORDS.  Credit goes to my colleagues Mingda Lu and Au Chun Kei for doing the hard work in understanding what was causing the issue.

The issue can be demonstrated with the Oracle DB Developer VM.  I have created a RESTful Web Service following the oracle-base guide.

A quick test with wget shows that everything is OK with the default settings:

[oracle@localhost ~]$ wget http://localhost:8080/ords/hr/hrmod/employees/100
--2020-02-15 04:23:46-- http://localhost:8080/ords/hr/hrmod/employees/100
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... 200 OK

Note, however, that ORDS is configured with the PDB default service name ‘orcl’.

[oracle@localhost ~]$ grep servicename /u01/userhome/oracle/ords/vmconfig/ords/defaults.xml
<entry key="db.servicename">orcl</entry>

What happens if we use a different service name for ORDS?  I’ll create a couple of services to demonstrate the issue.

[oracle@localhost ~]$ sql system/oracle@localhost:1521/orcl

SQLcl: Release 19.1 Production on Sat Feb 15 04:58:57 2020
Copyright (c) 1982, 2020, Oracle. All rights reserved.
Last Successful login time: Sat Feb 15 2020 04:59:02 -05:00
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> exec sys.dbms_service.create_service('orcl_ords', 'orcl_ords');
PL/SQL procedure successfully completed.

SQL> exec sys.dbms_service.start_service('orcl_ords');
PL/SQL procedure successfully completed.

SQL> exec sys.dbms_service.create_service('orcl.ords', 'orcl.ords');
PL/SQL procedure successfully completed.

SQL> exec sys.dbms_service.start_service('orcl.ords');
PL/SQL procedure successfully completed.

If I change the db.servicename entry to ‘orcl_ords’ in defaults.xml, restart ORDS and retest, all is OK.

[oracle@localhost ~]$ wget http://localhost:8080/ords/hr/hrmod/employees/100
--2020-02-15 05:03:17-- http://localhost:8080/ords/hr/hrmod/employees/100
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... 200 OK

However if I change the service name to the other new service, orcl.ords, and restart, ORDS starts up without any problems, but testing the web service returns an error.

[oracle@localhost ~]$ wget http://localhost:8080/ords/hr/hrmod/employees/100
--2020-02-15 05:05:57-- http://localhost:8080/ords/hr/hrmod/employees/100
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:8080... connected.
HTTP request sent, awaiting response... 503 Service Unavailable
2020-02-15 05:06:01 ERROR 503: Service Unavailable.

The error stack from ORDS gives some clues as to what’s going on.

WARNING: The database user for the connection pool named |apex|pu|, is not authorized to proxy to the schema named HR
oracle.dbtools.common.jdbc.ConnectionPoolConfigurationException: The database user for the connection pool named |apex|pu|, is not authorized to proxy to the schema named HR
at oracle.dbtools.common.jdbc.ConnectionPoolExceptions.from(ConnectionPoolExceptions.java:46)
at oracle.dbtools.common.jdbc.ConnectionPoolExceptions.from(ConnectionPoolExceptions.java:53)
Caused by: oracle.dbtools.common.ucp.ConnectionLabelingException: Error occurred when attempting to configure url: jdbc:oracle:thin:@//localhost:1521/orcl.ords with labels: {oracle.dbtools.jdbc.label.schema=HR}
at oracle.dbtools.common.ucp.LabelingCallback.handle(LabelingCallback.java:147)
at oracle.dbtools.common.ucp.LabelingCallback.proxyToSchema(LabelingCallback.java:210)
at oracle.dbtools.common.ucp.LabelingCallback.configure(LabelingCallback.java:76)
Caused by: java.sql.SQLException: ORA-01017: invalid username/password; logon denied
at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:494)
at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:441)
at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:436)
at oracle.jdbc.driver.T4CTTIfun.processError(T4CTTIfun.java:1027)

This seems to imply there’s some issue with the proxy authentication mechanism when using the orcl.ords service; however, testing the same proxy connection from SQLcl, all seems to be OK.

[oracle@localhost ords]$ sql ords_public_user[hr]/oracle@localhost:1521/orcl.ords

SQLcl: Release 19.1 Production on Sat Feb 15 05:09:46 2020

Copyright (c) 1982, 2020, Oracle. All rights reserved.

Last Successful login time: Fri May 31 2019 16:29:03 -04:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0


SQL>

From our testing, any service with a name of the format <pdb_name>.<any_text> exhibits the problem. We have used service names with this format in 12.2 without issues, so it seems this is new behaviour introduced in 18c or 19c.

We’ve also noticed that, when checking v$services, the con_id for the ‘problem’ service is 1, which may give a clue as to what’s going on, although it only seems to cause a problem for ORDS.

SQL> select con_id, network_name from v$services where network_name in ('orcl_ords', 'orcl.ords');
   CON_ID NETWORK_NAME
_________ _______________
        1 orcl.ords
        3 orcl_ords

ords.enable_schema fails with “ORA-06598: insufficient INHERIT PRIVILEGES privilege”

This is an issue I always hit setting up a test environment using Oracle REST Data Services (ORDS).  Using the SYSTEM user to call ords.enable_schema throws ORA-06598.  According to the documentation it should succeed (note that the SYSTEM user has the DBA role):

Only database users with the DBA role can enable or disable a schema other than their own.

[oracle@localhost ~]$ sql system/oracle

SQLcl: Release 19.1 Production on Sat Feb 15 02:20:33 2020
Copyright (c) 1982, 2020, Oracle.  All rights reserved.
Last Successful login time: Sat Feb 15 2020 02:20:36 -05:00
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> SHOW USER
USER is "SYSTEM"

SQL> select role from session_roles where role = 'DBA';
   ROLE
_______
DBA
SQL> BEGIN
  2     ords.enable_schema(
  3        p_enabled             => TRUE,
  4        p_schema              => 'HR',
  5        p_url_mapping_type    => 'BASE_PATH',
  6        p_url_mapping_pattern => 'hr',
  7        p_auto_rest_auth      => FALSE);
  8
  9    COMMIT;
 10  END;
 11  /
BEGIN
*
ERROR at line 1:
ORA-06598: insufficient INHERIT PRIVILEGES privilege
ORA-06512: at "ORDS_METADATA.ORDS", line 1
ORA-06512: at line 2

The solution is simple: grant INHERIT PRIVILEGES on the current user (SYSTEM) to the ORDS_METADATA user.

SQL> show user
USER is "SYSTEM"
SQL> grant inherit privileges on user SYSTEM to ORDS_METADATA;
Grant succeeded.

Once this is completed, the call to ords.enable_schema is successful.

SQL> BEGIN
  2     ords.enable_schema(
  3        p_enabled             => TRUE,
  4        p_schema              => 'HR',
  5        p_url_mapping_type    => 'BASE_PATH',
  6        p_url_mapping_pattern => 'hr',
  7        p_auto_rest_auth      => FALSE);
  8
  9    COMMIT;
 10  END;
 11  /

PL/SQL procedure successfully completed.

ORA-01017 starting DB with srvctl? Consider firing your (oracle) agent!

Warning: please don’t blindly follow the steps here without doing your own analysis of the risks involved and, ideally, without getting Oracle Support involved. I was hesitant to publish this, but since I’ve been in contact with someone else whom it has helped work around (to some extent) an issue they were having, I think it’s worth putting out there.

We had a problem during a Data Guard switchover (luckily a planned switchover for patching rather than a disaster situation) where Grid Infrastructure (clusterware) was unable to bring up one of the databases; it kept throwing “ORA-01017: invalid username/password”. Starting the database the ‘traditional way’ using “sqlplus / as sysdba” caused no such problems.

Reviewing Oracle Support notes, particularly Doc ID 2313555.1, we identified some non-standard configuration items in the Oracle home used for this database, but even after resolving them the error persisted.

At times like these you realize (or at least I did) how little is published about the internals of how clusterware and the oracle databases it manages interact.

I suspected that restarting the entire clusterware stack would resolve the issue but that was difficult as this node also managed a production database which we didn’t want to take down.

However I guessed that restarting the clusterware agent for the oracle user might fix the problem. The executable is oraagent.bin and the process owner is oracle. I believe this is the process clusterware uses to actually start the database (you’ll probably also notice a similar process owned by grid, and orarootagent.bin running as root).

I killed the oracle agent process and crossed my fingers. Luckily clusterware re-spawned this process and afterwards we were able to restart the problem instance without any problems.

Please re-read the first paragraph if you are considering applying this workaround, and don’t blame me if you break anything; if it helps, though, I’m happy to take the credit!

Adding covering fields to a Primary Key Index

This is something to file under the (admittedly rather large) category of things that I wasn’t aware that the Oracle database could do.

While tuning a query, I wanted to use the common technique of adding columns to an index to eliminate a “Table Access by Index RowID” operation; however, this particular case was complicated by the fact that the index was supporting the primary key, and the table was large and frequently accessed.

This is probably easiest demonstrated by the (much simplified) example below:

SQL> create table singles(id number generated always as identity,
  2                       artist varchar2(255),
  3                       title varchar2(255),
  4                       constraint singles_pk primary key (id));

Table SINGLES created.

SQL>
SQL> insert into singles (artist,
  2                       title)
  3              values ('Chesney Hawkes',
  4                      'The One And Only');

1 row inserted.

SQL> commit;

Commit complete.

SQL> select index_name from user_indexes where table_name = 'SINGLES';
INDEX_NAME
_____________
SINGLES_PK

SQL> select artist from singles where id = 1;
ARTIST
_________________
Chesney Hawkes

SQL> select * from dbms_xplan.display_cursor(format=>'BASIC');
PLAN_TABLE_OUTPUT
_____________________________________________________
EXPLAINED SQL STATEMENT:
------------------------
select artist from singles where id = 1

Plan hash value: 3923658952

--------------------------------------------------
| Id  | Operation                   | Name       |
--------------------------------------------------
|   0 | SELECT STATEMENT            |            |
|   1 |  TABLE ACCESS BY INDEX ROWID| SINGLES    |
|   2 |   INDEX UNIQUE SCAN         | SINGLES_PK |
--------------------------------------------------

14 rows selected.

Note that adding a new index on (id, artist) makes the plan more efficient:

SQL> create index i_singles_covering on singles(id, artist);

Index I_SINGLES_COVERING created.

SQL> select artist from singles where id = 1;
ARTIST
_________________
Chesney Hawkes

SQL> select * from dbms_xplan.display_cursor(format=>'BASIC');
PLAN_TABLE_OUTPUT
__________________________________________________
EXPLAINED SQL STATEMENT:
------------------------
select artist from singles where id = 1

Plan hash value: 1012019734

-----------------------------------------------
| Id  | Operation        | Name               |
-----------------------------------------------
|   0 | SELECT STATEMENT |                    |
|   1 |  INDEX RANGE SCAN| I_SINGLES_COVERING |
-----------------------------------------------

13 rows selected.

However we’ve now got two indexes: SINGLES_PK on (id) and I_SINGLES_COVERING on (id, artist).  SINGLES_PK is now redundant, but it is being used to support the primary key:

SQL> select index_name from user_indexes where table_name = 'SINGLES';
           INDEX_NAME
_____________________
SINGLES_PK
I_SINGLES_COVERING

Now, it is possible for the primary key to be supported by I_SINGLES_COVERING, but initially I thought I’d have to choose between dropping and re-creating the primary key to use the new index, or leaving the system in the non-optimal state of having two indexes.

However I came across this blog post from Richard Foote, which referenced another post from Jonathan Lewis.  It describes the technique of modifying the constraint to use the new index without needing to re-create it.  It’s worth noting that the index SINGLES_PK, which the database automatically created to support the primary key initially, gets dropped during this operation.

SQL> alter table singles
  2        modify constraint singles_pk
  3        using index i_singles_covering;
Table SINGLES altered.


SQL> select index_name from user_indexes where table_name = 'SINGLES';
INDEX_NAME
_____________________
I_SINGLES_COVERING

One thing I observed in my testing was that if I created i_singles_covering as a unique index (id is unique as it’s the primary key, so obviously the combination of id and artist must also be unique) then the database was unwilling to use this index to support the primary key:

SQL> create unique index i_singles_covering on singles(id, artist);

Index I_SINGLES_COVERING created.

SQL> alter table singles
  2        modify constraint singles_pk
  3        using index i_singles_covering;

ORA-14196: Specified index cannot be used to enforce the constraint.
14196. 00000 - "Specified index cannot be used to enforce the constraint."
*Cause: The index specified to enforce the constraint is unsuitable
for the purpose.
*Action: Specify a suitable index or allow one to be built automatically.

This case is documented by Oracle Support Document ID 577253.1 which states:

We cannot use a prefix of a unique index to enforce a unique constraint. We can use a whole unique index or a prefix of a non-unique index to do that. This is the way Oracle was designed.

However I can’t off-hand think of any technical reason for this limitation.


SQL Plan Directives: Gotta Purge ‘Em All

A tip I picked up from Nigel Bayliss regarding purging SQL Plan Directives; I’ve been using it a lot recently and can’t see it documented elsewhere.

As some background, these records, exposed via the DBA_SQL_PLAN_DIRECTIVES view, are cardinality corrections created and used when running with Adaptive Statistics enabled.  There is a job that should automatically purge any record unused for longer than the value of SPD_RETENTION_WEEKS, but we’ve experienced occasions when this job doesn’t work as expected.
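
As a quick sanity check for that, a query along these lines (a sketch, assuming the default retention of 53 weeks) shows whether directives that should have been purged are still hanging around:

select state, count(*), min(last_used)
from   dba_sql_plan_directives
where  last_used < sysdate - (53 * 7)
group by state;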

The records can be individually purged by calling DBMS_SPD.DROP_SQL_PLAN_DIRECTIVE, but that’s a pain if you’ve got a lot of them.

However what the documentation doesn’t mention is that you can call the procedure, passing in NULL for the mandatory directive_id parameter:

exec sys.DBMS_SPD.DROP_SQL_PLAN_DIRECTIVE (NULL);

This will purge all records following the same retention rules as the auto-purge job.  If you really want to Purge ‘Em All, you can set the retention to 0 before calling the procedure.

exec sys.DBMS_SPD.SET_PREFS('SPD_RETENTION_WEEKS', '0');
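
Putting that together, the full “purge them all” sequence looks something like this (a sketch; 53 is the documented default for SPD_RETENTION_WEEKS, so adjust the last call if your system uses a different value):

exec sys.DBMS_SPD.SET_PREFS('SPD_RETENTION_WEEKS', '0');
exec sys.DBMS_SPD.DROP_SQL_PLAN_DIRECTIVE(NULL);
exec sys.DBMS_SPD.SET_PREFS('SPD_RETENTION_WEEKS', '53');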


Planned (Cursor) Obsolescence

I’ll start this blog-post by posing a question. Is it possible to have multiple records in v$sql for a given sql_id and child_number combination? While the title of this blog post may give you some clues, I’ll admit I’d always assumed that those values uniquely identified a child cursor.

As a bit of background, we had a database availability incident this week, which we narrowed down to SGA issues, specifically bug 15881004 “Excessive SGA memory usage with Extended Cursor Sharing”. Some of our more complex SQL statements were getting many (more than 700) child cursors. The reported reason for the new child cursors was “Bind mismatch(33)”, probably caused by bug 14176247 “Many child cursors using Adaptive Cursor Sharing with binds (due to BIND_EQUIV_FAILURE)”, although that is listed as fixed in 12.1 and this instance is running 12.2.
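
For anyone wanting to dig into those reasons themselves, the mismatch flags (and a more detailed REASON column) are exposed in v$sql_shared_cursor; a sketch of the kind of query we used, substituting your own sql_id:

select bind_mismatch, bind_equiv_failure, count(*)
from   v$sql_shared_cursor
where  sql_id = '&sql_id'
group by bind_mismatch, bind_equiv_failure;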

We resolved the immediate issue by flushing the shared pool (admittedly not a great solution, but sometimes you’ve got to do what you’ve got to do), and created SQL Plan Baselines for the problem SQL statements so they would each get just one plan and one child cursor.

We plan to monitor more closely for any SQL statements that do get many child cursors; however, we need to make sure that even if that does happen it doesn’t break the system again.  One thing that seemed promising is the _cursor_obsolete_threshold parameter.  We had already reduced this parameter to 1024 from its default of 8192 based on Mike Dietrich’s blog post, but with this incident we were considering reducing it further.  I think it’s wise to be wary of messing too much with underscore parameters, but per Doc ID 2431353.1 Oracle Support say “the … parameter can be adjusted case-to-case basis should there be a problem”. We certainly had a significant problem with the setting at 1024, so we plan to reduce it further to 512.
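
For the monitoring piece, something as simple as the following is enough to flag statements heading towards trouble (a sketch; pick a threshold that suits your system):

select sql_id, count(*) as child_cursors
from   v$sql
group by sql_id
having count(*) > 100
order by child_cursors desc;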

We involved super-consultant Stefan Koehler to review our findings and action plan; he was broadly in agreement, even recommending a further reduction of the parameter value to 256.  However, something that puzzled me, which I asked him, was: what actually happens if the number of child cursors hits the value specified by this parameter?  His answer: “Well what happens is this … if your parent cursor got more than _cursor_obsolete_threshold child cursors it invalidates the parent (and in consequence all childs) and it starts from 0 again”.

I was skeptical; my expectation was that Oracle would just invalidate the oldest unused child cursor and then re-use that child number. Another thing puzzling me was what happens if some of the child cursors are still held open? Time to test this out for myself…

First let me demonstrate how I can get 4 child cursors for a given SQL statement, using different values of optimizer_index_cost_adj as a quick hack.
SQL> alter system flush shared_pool;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> select prev_sql_id from v$session where sid=sys_context('userenv','sid');
PREV_SQL_ID
9tz4qu4rj9rdp

SQL> alter session set optimizer_index_cost_adj=1;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> alter session set optimizer_index_cost_adj=2;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> alter session set optimizer_index_cost_adj=3;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> select sql_id, child_number, executions from v$sql where sql_id = '9tz4qu4rj9rdp';
SQL_ID            CHILD_NUMBER   EXECUTIONS
9tz4qu4rj9rdp                0            1
9tz4qu4rj9rdp                1            1
9tz4qu4rj9rdp                2            1
9tz4qu4rj9rdp                3            1

Let me reduce _cursor_obsolete_threshold at session level and re-run the test.

SQL> alter system flush shared_pool;
SQL> alter session set "_cursor_obsolete_threshold"=2;
SQL> alter session set optimizer_index_cost_adj=100;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> alter session set optimizer_index_cost_adj=1;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> alter session set optimizer_index_cost_adj=2;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> alter session set optimizer_index_cost_adj=3;
SQL> select count(*) from all_objects;
  COUNT(*)
     74807

SQL> select sql_id, child_number, executions from v$sql where sql_id = '9tz4qu4rj9rdp';
SQL_ID            CHILD_NUMBER   EXECUTIONS
9tz4qu4rj9rdp                0            1
9tz4qu4rj9rdp                1            1
9tz4qu4rj9rdp                0            1
9tz4qu4rj9rdp                1            1

Whoa… each combination of sql_id and child_number has two entries (not what I was expecting to see). To get a fuller picture we need to look at a couple of additional columns, namely ‘address’ and ‘is_obsolete’.

SQL> select sql_id,  address, child_number, is_obsolete from v$sql where sql_id = '9tz4qu4rj9rdp';
SQL_ID          ADDRESS              CHILD_NUMBER IS_OBSOLETE
9tz4qu4rj9rdp   00000000610AB500                0 Y
9tz4qu4rj9rdp   00000000610AB500                1 Y
9tz4qu4rj9rdp   0000000073DDE788                0 N
9tz4qu4rj9rdp   0000000073DDE788                1 N

Although we tend to use sql_id as our handle for the parent cursor, Oracle actually uses the ‘Address’ field, and when the _cursor_obsolete_threshold value is exceeded, Oracle allocates a new parent cursor with a new ‘Address’.  This explains how Oracle copes when old child cursors are held open: they stay in the shared pool, keeping their address, but are marked as obsolete and can be aged out once they are no longer in use.

The other lessons here: firstly, that Stefan knows his stuff; but also, whenever someone tells you something, don’t just take it on trust. It’s normally easy to validate for yourself, and you may learn something about how Oracle works along the way.

OGB Appreciation Day: Implicit Cursors

Recently I’ve been working with SQL Server, and while it’s not all bad, sometimes it does help to highlight some of the neat features available in Oracle.  One of these is implicit cursors, which I shall demonstrate.

First I’ll show how to populate some data in a table and then loop over it in T-SQL (SQL Server’s equivalent of PL/SQL):

1> CREATE TABLE demo (col1 TINYINT, col2 TINYINT, col3 TINYINT);
2> GO

1> INSERT INTO demo (col1, col2, col3)
2>           VALUES (11,   12,   13),
3>                  (21,   22,   23),
4>                  (31,   32,   33);
5> GO
(3 rows affected)

1>  DECLARE @col1 tinyint, @col2 tinyint, @col3 tinyint;
2>  DECLARE explicit_cursor CURSOR LOCAL FAST_FORWARD FOR
3>     SELECT col1, col2, col3 FROM dbo.demo;
4>
5>  OPEN explicit_cursor;
6>  FETCH NEXT FROM explicit_cursor INTO @col1, @col2, @col3;
7>
8>  WHILE @@FETCH_STATUS = 0
9>  BEGIN
10>    PRINT CONCAT(@col1, ':', @col2, ':', @col3);
11>    FETCH NEXT FROM explicit_cursor INTO @col1, @col2, @col3;
12> END
13>
14> CLOSE explicit_cursor;
15> DEALLOCATE explicit_cursor;
16>
17> GO
11:12:13
21:22:23
31:32:33

By the way, note the neat way it’s possible to insert 3 records with a single INSERT statement. I didn’t say there weren’t some things that SQL Server does a little better 🙂
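
The closest Oracle equivalent I’m aware of is a multi-table INSERT ALL pressed into service against a single table, which is rather more verbose; a sketch for comparison:

INSERT ALL
    INTO demo (col1, col2, col3) VALUES (11, 12, 13)
    INTO demo (col1, col2, col3) VALUES (21, 22, 23)
    INTO demo (col1, col2, col3) VALUES (31, 32, 33)
SELECT * FROM dual;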

Next check out the equivalent SQL statements and PL/SQL code in the Oracle Database. Note my Oracle demos are running on an Autonomous Transaction Processing database in Oracle Cloud, although they should work in all editions, including Oracle XE, the free-to-use database.

SQL> CREATE TABLE demo (col1 NUMBER, col2 NUMBER, col3 NUMBER);
Table DEMO created.

SQL> INSERT INTO demo (col1, col2, col3) VALUES (11, 12, 13);
1 row inserted.

SQL> INSERT INTO demo (col1, col2, col3) VALUES (21, 22, 23);
1 row inserted.

SQL> INSERT INTO demo (col1, col2, col3) VALUES (31, 32, 33);
1 row inserted.

SQL> COMMIT;
Commit complete.

SQL> SET SERVEROUTPUT ON SIZE UNLIMITED;
SQL>
SQL> DECLARE
  2      CURSOR explicit_cursor IS
  3         SELECT col1, col2, col3 FROM demo;
  4      explicit_record explicit_cursor%ROWTYPE;
  5  BEGIN
  6      OPEN explicit_cursor;
  7      LOOP
  8          FETCH explicit_cursor INTO explicit_record;
  9          EXIT WHEN explicit_cursor%NOTFOUND;
 10          sys.dbms_output.put_line(explicit_record.col1 || ':' ||
 11                                   explicit_record.col2 || ':' ||
 12                                   explicit_record.col3          );
 13      END LOOP;
 14      CLOSE explicit_cursor;
 15  END;
 16  /

11:12:13
21:22:23
31:32:33

PL/SQL procedure successfully completed.

Already I prefer a few things about the Oracle solution: the ability to use a cursor %ROWTYPE rather than having to define and use variables for individual columns, the fact that only one FETCH command is required, and the use of the %NOTFOUND cursor attribute rather than the somewhat arbitrary @@FETCH_STATUS = 0 check.

However Oracle offers an even better method, namely an implicit cursor.

SQL> BEGIN
  2      FOR implicit_record IN (SELECT col1, col2, col3 FROM demo)
  3      LOOP
  4          sys.dbms_output.put_line(implicit_record.col1 || ':' ||  
  5                                   implicit_record.col2 || ':' ||  
  6                                   implicit_record.col3          );
  7      END LOOP;
  8  END;
  9  /

11:12:13
21:22:23
31:32:33

PL/SQL procedure successfully completed

A few things to note.  We’re down from 15 lines of code to 8, which makes this easier to write and, just as importantly, leaves less chance of bugs.  There’s no need to worry about defining rowtypes or opening and closing cursors; Oracle just does the right thing under the covers, including tidying up if exceptions are thrown.

ORDS Under Siege: Introduction

I’ve been researching ORDS over the summer, particularly trying to optimize its performance on Tomcat. Trying something a little different, I’ve created a Vagrant box that should allow anybody interested to verify my findings, find mistakes I’ve made, or identify performance optimizations I’ve missed.

If you’re new to Vagrant, Tim Hall provides a good introduction for the Oracle DBA.

You can clone or download the Vagrant box from my GitHub page; hopefully the instructions are clear. You need to download the Oracle 18c XE and ORDS releases and put them into the software directory.  I’ve allocated 6GB RAM and 4 CPUs to the virtual machine; you may need to adjust these values depending on your test machine’s resources.  Doing “vagrant up” should automatically configure the database, and configure ORDS running in Tomcat, fronted by some reverse proxies.  It will also generate a self-signed certificate and configure the SSL handling in both Tomcat and the reverse proxies.  Most of the database and ORDS configuration scripts were taken from the Oracle Vagrant boxes or Tim’s Vagrant boxes.

The bench-marking tool I am using is Siege.  There are many alternatives available, but I chose Siege for a few reasons.  Firstly, it is free and open-source software.  Secondly, it is easy to configure: simply populate a file, urls.txt, with the URLs to hit and then run the executable with suitable parameters.  Finally, it is lightweight, being written in C, whereas many similar tools are written in Java; as I am running the bench-marking tool on the same virtual machine that hosts the software components I’m trying to measure, this matters.

Once the Vagrant machine is up, you can connect to it via “vagrant ssh” and then type “ords-demo” to run the entire test suite. I’ll go through the individual tests in the following blog posts and share my findings.

Adding a little TLS complexity to authentication_ldap_simple

In a previous post I showed that, by default, when authentication_ldap_simple communicates with a Windows Domain Controller (or any other LDAP service), the password is transmitted unencrypted during authentication.

This time I’ll demonstrate how to close this loophole.  A prerequisite is that the Domain Controller needs to be configured to accept secure connections. This is done by installing a certificate; the process is well documented elsewhere so I won’t repeat it here.

There are two different ways to configure secure communication.  The first method is to set authentication_ldap_simple_server_port to 636.

As of MySQL 8.0.14, if the LDAP port number is configured as 636 or 3269, the plugin uses LDAPS (LDAP over SSL) instead of LDAP. (LDAPS differs from startTLS.)

The second method is to set authentication_ldap_simple_tls to ON.

For simple LDAP authentication, whether connections by the plugin to the LDAP server are secure. If this variable is enabled, the plugin uses TLS to connect securely to the LDAP server.

In both cases we have to set authentication_ldap_simple_ca_path to point to the certificate authority file used when securing the domain controller.  (Pro tip: ensure that the file permissions on both this certificate and the directory it sits in allow the mysql process to read it; you won’t believe how long I wasted due to this.)

Of the two methods, I have been informed that the TLS method is optimal, so that is what I will demonstrate.  Note that I have found it better to load the plugin and set the variables in the MySQL configuration file (my.cnf) and restart the service, rather than setting them dynamically (it seems that otherwise the values do not correctly propagate to the appropriate processes due to LDAP connection pooling), so that’s what I’ll show you.

[root@lnx-mysql8 ~]# tail -5 /etc/my.cnf
plugin-load-add=authentication_ldap_simple.so
authentication_ldap_simple_server_host='win-dc.windows.domain'
authentication_ldap_simple_group_search_attr=
authentication_ldap_simple_tls=ON
authentication_ldap_simple_ca_path='/etc/certs/win-dc.crt'
[root@lnx-mysql8 ~]# 
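
After restarting mysqld it’s worth sanity-checking that the variables have taken effect; a quick check from a client session (output omitted here):

mysql> SHOW VARIABLES LIKE 'authentication_ldap_simple%';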

Let’s have a look at the network traffic on port 389 (LDAP) when the MySQL service starts.  Observe that, after some initial negotiation regarding the certificate, all traffic is encrypted.

[root@lnx-mysql8 ~]# tcpflow -C port 389
tcpflow: listening on enp0s3
0w1.3.6.1.4.1.1466.20037
0(x
1.3.6.1.4.1.1466.20037
]mK?$(UI0,($
kjih98762.*&=5/+'#g@?>3210EDCB1-)%</A C 0B10UXX10UDefault City10U 0B)JJ:7 }$kwIuI_0"%21-dc.windows.domain00 F#qI925rgCbG?{ O{R?_zm+(cx7Ju&+C0A0U0U%0 d&,(V5zVmBj2ZhZw%m@VX}5A2nDf!a)n[wn:~JTm:!0`jL4yMv"8'LH+BHQ K#F _3f]t'u)5B.^/fEKIb.Tj2?03`g5.0RJaF'pH&i=QSa[m3&j~~10&=kv)S%oiH3RvK'wE-tbJ8Tl#:B9tw;MB!FT]7AA7Gn>adwCR#I:x*#IIk8.g62~hd|N_L%OIBC#V|@)o+O_Afo-At~XHt`<fV'r]"u'}GF@<h}- eBA]vEJAu|=-t7ATiPhz(stn`[9U[_s@" (O,tD/'1RT{g6}I3% 0w1.3.6.1.4.1.1466.20037 0(x 1.3.6.1.4.1.1466.20037 )=?s@9fj& *x"80,($ kjih98762.*&=5/+'#g@?>3210EDCB1-)%</A
C

To save you having to refer to previous post here is how to create the AD authenticated user using the LDAP DN (Distinguished Name):

mysql> CREATE USER 'patrick'@'%'
    ->             IDENTIFIED WITH authentication_ldap_simple
    ->             BY 'CN=patrick,CN=Users,DC=WINDOWS,DC=DOMAIN';
Query OK, 0 rows affected (0.01 sec)

I am now able to connect using my Windows username and password:

[root@lnx-mysql8 ~]# mysql                                  \
>                          --host=lnx-mysql8.windows.domain \
>                          --enable-cleartext-plugin        \
>                          --user=patrick                   \
>                          --password=Password123
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 16
Server version: 8.0.16-commercial MySQL Enterprise Server - Commercial

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

And let’s check the network traffic to the domain controller during authentication:

[root@lnx-mysql8 ~]# tcpflow -C port 389
tcpflow: listening on enp0s3
Whl+E]X=n0ucbDOBt'
.rvY* CKWLS+

No more password in plain sight. It’s still in my bash history, but that’s something I can easily resolve by not passing it as a command line argument 🙂