Another DBA was investigating a performance difference for a query between two systems and asked me to take a look. The execution plan, a full table scan, was the same in both cases. We ran Snapper on the problem system and observed that most of the time was spent on the event ‘db file sequential read’.
-- Session Snapper v4.24 - by Tanel Poder ( http://blog.tanelpoder.com/snapper ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)

---------------------------------------------------------------------------------------------------------------
  ActSes   %Thread | INST | SQL_ID          | SQL_CHILD | EVENT                               | WAIT_CLASS
---------------------------------------------------------------------------------------------------------------
     .98     (98%) |    1 | 2kcksks9ht14t   | 0         | db file sequential read             | User I/O
     .03      (3%) |    1 | 2kcksks9ht14t   | 0         | ON CPU                              | ON CPU

--  End of ASH snap 1, end=2017-11-24 16:12:18, seconds=5, samples_taken=40, AAS=1
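For reference, an ASH snap like the one above can be taken with an invocation along the following lines (a sketch: ‘5 1’ requests a single 5-second sample, and the SID is an assumption, taken from the stats snap further down):

-- a sketch: SID 30 is assumed; substitute the SID of the session running the query
SQL> @snapper ash 5 1 30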
‘db file sequential read’ is a single-block read, not usually associated with a full table scan. However, from previous experience I guessed that we had chained rows (a single row split across multiple blocks), so we reran Snapper and checked the value of the statistic ‘table fetch continued row’.
-- Session Snapper v4.24 - by Tanel Poder ( http://blog.tanelpoder.com/snapper ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    SID, USERNAME  , TYPE, STATISTIC                               ,         DELTA, HDELTA/SEC,    %TIME, GRAPH       , NUM_WAITS,  WAITS/SEC,   AVERAGES
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     30, GLOGOWNER , STAT, session logical reads                   ,          8477,      1.56k,         ,             ,          ,           ,      8.48k total buffer visits
     30, GLOGOWNER , STAT, user I/O wait time                      ,           531,      98.03,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, non-idle wait time                      ,           531,      98.03,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, non-idle wait count                     ,          8449,      1.56k,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, physical read total IO requests         ,          8449,      1.56k,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, physical read total bytes               ,      69214208,     12.78M,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, cell physical IO interconnect bytes     ,      69214208,     12.78M,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, consistent gets                         ,          8477,      1.56k,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, consistent gets direct                  ,          8477,      1.56k,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, physical reads                          ,          8449,      1.56k,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, physical reads direct                   ,          8449,      1.56k,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, physical read IO requests               ,          8449,      1.56k,         ,             ,          ,           ,      8.19k bytes per request
     30, GLOGOWNER , STAT, physical read bytes                     ,      69214208,     12.78M,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, file io service time                    ,         16898,      3.12k,         ,             ,          ,           ,        2us bad guess of IO service time per IO request
     30, GLOGOWNER , STAT, file io wait time                       ,       5285794,    975.84k,         ,             ,          ,           ,   625.61us bad guess of IO wait time per IO request
     30, GLOGOWNER , STAT, no work - consistent read gets          ,          8478,      1.57k,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, table scan rows gotten                  ,          3798,     701.17,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, table scan blocks gotten                ,           882,     162.83,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, table fetch continued row               ,          7567,       1.4k,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , STAT, buffer is not pinned count              ,          8477,      1.56k,         ,             ,          ,           ,          ~ per execution
     30, GLOGOWNER , TIME, DB CPU                                  ,        716890,   132.35ms,    13.2%, [@@        ],          ,           ,
     30, GLOGOWNER , TIME, sql execute elapsed time                ,       6001826,      1.11s,   110.8%, [##########],          ,           ,
     30, GLOGOWNER , TIME, DB time                                 ,       6001826,      1.11s,   110.8%, [##########],          ,           ,      -10.8 % unaccounted time
     30, GLOGOWNER , WAIT, db file sequential read                 ,       5300427,   978.54ms,    97.9%, [WWWWWWWWWW],      8409,      1.55k,   630.33us average wait

--  End of Stats snap 1, end=2017-11-24 16:12:29, seconds=5.4
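Snapper reads these counters from V$SESSTAT, so the ‘table fetch continued row’ figure can also be checked directly with a query along these lines (a sketch; SID 30 is the session from the snap above):

-- a sketch: check the chained/migrated row fetch counter for one session
select sn.name, ss.value
  from v$statname sn, v$sesstat ss
 where ss.statistic# = sn.statistic#
   and ss.sid        = 30    -- SID taken from the Snapper output above
   and sn.name       = 'table fetch continued row';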
We were starting to understand the problem we were facing, especially when the DBA observed that the table in question had more than 255 columns: Oracle stores at most 255 columns per row piece, so rows in such tables are always chained. However, we still didn’t understand why the query ran very fast, mostly on CPU, on the well-performing system, which also had chained rows.
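To check whether a schema contains such wide tables, the column counts can be pulled from the data dictionary; a minimal sketch (the schema name is a placeholder):

-- a sketch: find tables with more than 255 columns in a given schema
select owner, table_name, count(*) as column_count
  from dba_tab_columns
 where owner = 'GLOGOWNER'    -- placeholder: schema of the table in question
 group by owner, table_name
having count(*) > 255
 order by column_count desc;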
I reran the query with SQL trace enabled on the problem system.
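For reference, one way to enable extended SQL trace with wait events for a session is DBMS_MONITOR (a sketch: the SID and serial# are placeholders, and this may not be exactly how the trace here was produced):

-- a sketch: SID and serial# are placeholders for the session running the query
exec dbms_monitor.session_trace_enable(session_id => 30, serial_num => 1234, waits => true, binds => false)
-- ... rerun the query in the traced session ...
exec dbms_monitor.session_trace_disable(session_id => 30, serial_num => 1234)

A small portion of the resulting trace file is shown below: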
WAIT #139940851571352: nam='direct path read' ela= 3354 file number=11 first dba=257026 block cnt=62 obj#=131201 tim=1988902071975
WAIT #139940851571352: nam='db file sequential read' ela= 316 file#=11 block#=13034 blocks=1 obj#=131201 tim=1988902073088
WAIT #139940851571352: nam='db file sequential read' ela= 576 file#=11 block#=13034 blocks=1 obj#=131201 tim=1988902073711
WAIT #139940851571352: nam='db file sequential read' ela= 472 file#=11 block#=118100 blocks=1 obj#=131201 tim=1988902074209
WAIT #139940851571352: nam='db file sequential read' ela= 382 file#=11 block#=13034 blocks=1 obj#=131201 tim=1988902074629
WAIT #139940851571352: nam='db file sequential read' ela= 558 file#=11 block#=118140 blocks=1 obj#=131201 tim=1988902075214
WAIT #139940851571352: nam='db file sequential read' ela= 479 file#=11 block#=13034 blocks=1 obj#=131201 tim=1988902075732
WAIT #139940851571352: nam='db file sequential read' ela= 532 file#=11 block#=141498 blocks=1 obj#=131201 tim=1988902076290
WAIT #139940851571352: nam='db file sequential read' ela= 605 file#=11 block#=13034 blocks=1 obj#=131201 tim=1988902076933
WAIT #139940851571352: nam='db file sequential read' ela= 512 file#=11 block#=12950 blocks=1 obj#=131201 tim=1988902077472
WAIT #139940851571352: nam='direct path read' ela= 4326 file number=11 first dba=259074 block cnt=62 obj#=131201 tim=1988902081885
WAIT #139940851571352: nam='db file sequential read' ela= 754 file#=11 block#=13035 blocks=1 obj#=131201 tim=1988902082905
WAIT #139940851571352: nam='db file sequential read' ela= 544 file#=11 block#=13035 blocks=1 obj#=131201 tim=1988902083483
WAIT #139940851571352: nam='db file sequential read' ela= 617 file#=11 block#=12955 blocks=1 obj#=131201 tim=1988902084132
WAIT #139940851571352: nam='db file sequential read' ela= 546 file#=11 block#=13035 blocks=1 obj#=131201 tim=1988902084723
WAIT #139940851571352: nam='db file sequential read' ela= 976 file#=11 block#=141559 blocks=1 obj#=131201 tim=1988902085730
WAIT #139940851571352: nam='direct path read' ela= 3240 file number=11 first dba=264450 block cnt=62 obj#=131201 tim=1988902089070
WAIT #139940851571352: nam='db file sequential read' ela= 380 file#=11 block#=13036 blocks=1 obj#=131201 tim=1988902090605
WAIT #139940851571352: nam='db file sequential read' ela= 812 file#=11 block#=13036 blocks=1 obj#=131201 tim=1988902091452
WAIT #139940851571352: nam='db file sequential read' ela= 452 file#=11 block#=118118 blocks=1 obj#=131201 tim=1988902091936
WAIT #139940851571352: nam='db file sequential read' ela= 608 file#=11 block#=13036 blocks=1 obj#=131201 tim=1988902092588
WAIT #139940851571352: nam='db file sequential read' ela= 566 file#=11 block#=13008 blocks=1 obj#=131201 tim=1988902093186
While I was trying to figure out why the same block (block# 13034, for example) was being reread from disk on subsequent waits, rather than being found in the buffer cache, another of my colleagues (thanks Kei) pointed out that the event associated with the table scan itself was ‘direct path read’, rather than the usual ‘db file scattered read’.
‘direct path reads’ bypass the buffer cache, reading blocks directly into the session’s PGA. It appears that the subsequent ‘db file sequential read’s needed to fetch the chained row pieces behave the same way: in the Snapper output above, every consistent get was a ‘consistent gets direct’, so the row-piece blocks are never cached and must be read from disk again each time they are needed, potentially causing some significant performance issues.
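One way to test this hypothesis is to disable serial direct path reads for the session and see whether the scan falls back to buffered (‘db file scattered read’) I/O. A sketch of such an experiment; this relies on a hidden parameter, so treat it as a diagnostic test rather than a fix:

-- diagnostic sketch: force buffered reads for this session only (hidden parameter, use with care)
alter session set "_serial_direct_read" = never;
-- rerun the query and re-check the wait events with Snapper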
As is often the case, Jonathan Lewis has already blogged about this issue; however, it’s always more rewarding, and more fun, to (re)discover these issues for oneself!