Cartesian Join – the usual suspect (part1)

Cartesian Join & Hash Join – Bonnie and Clyde

“This movie is based on a true story”


Some time ago, I was approached to provide a solution to a performance issue.
To illustrate this case, I will later simulate the scenario, highlighting the critical aspects and providing a detailed explanation of the underlying causes.
For now, let me begin with an introduction.

In summary, a query that typically executed within a few seconds unexpectedly began running for over an hour.
Skipping some intermediate steps of the analysis, the problem could technically be pinned on the Cartesian join as the only culprit. In fact, the Cartesian join was just the prelude to a much more painful circumstance which, to be fair, was caused by it.

Whenever a SQL query is evaluated (hard parsed) and the Cost-Based Optimizer (CBO) calculates an extremely low cardinality (typically 1) for the joined rowsets accessed with full table scans, it often determines that a Cartesian join is the most efficient join method.
In practice, this cardinality of 1 often serves as a placeholder for 0; the CBO explicitly notes in the 10053 trace file that it rounds the cardinality up to 1.
Under which circumstances would that happen and, most of all, when would that become a problem?
The answer is: any time the CBO is presented with misestimated statistics derived from the assumption that the table(s) in question is/are empty.
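
By the way, if you want to see that "rounded up to 1" note with your own eyes, a 10053 (CBO) trace taken while the statement is hard parsed will contain it. A minimal sketch, assuming you may set events in your own session:

alter session set events '10053 trace name context forever, level 1';

-- hard-parse the statement in question here
-- (e.g. add a unique comment to the SQL text to force a hard parse)

alter session set events '10053 trace name context off';

-- location of the trace file written for the current session
select value from v$diag_info where name = 'Default Trace File';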

When would that happen?
Consider the following scenario: some tables were truncated (or completely purged in some other way), and then the statistics were gathered.
Later these tables got filled up again, and ultimately the SQLs involving them were executed.
The statistics gathered right after truncating will obviously report the number of rows (and blocks) as 0.
If these tables are then populated without refreshing the stats, the CBO will be presented with false data where (and this is what produces the hassle) the number of rows = 0 (rounded up to 1 in the SQL Plan).
The problem is magnified with three tables participating in two joins.
Chances are, the first join will be a Cartesian join due to the assumed cardinality of 1.
But this in itself does not depict the entire problem.
Since in the meantime the table(s) was/were populated (and the CBO obviously has no clue about this fact), the cardinality of such a Cartesian join may be huge.
With such a huge rowset, the subsequent join based on it receives an unexpectedly large set of rows to handle, be it a Hash Join or any other.
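
A quick way to spot such a mismatch before it bites is to compare what the dictionary believes with what the tables actually contain. A minimal sketch; the owner and table names below are just the ones used in the test later in this post, so adjust them to your case:

-- what the CBO believes (dictionary statistics)
select owner, table_name, num_rows, blocks, last_analyzed, stale_stats
from   dba_tab_statistics
where  owner = 'TSTUS' and table_name like 'STAG%';

-- what one of the tables really contains right now
select count(*) from TSTUS.STAG01;

-- DML recorded since the last stats gathering
-- (run dbms_stats.flush_database_monitoring_info first to make it current)
select table_owner, table_name, inserts, deletes, truncated
from   dba_tab_modifications
where  table_owner = 'TSTUS' and table_name like 'STAG%';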

How did I know the CBO got very wrong information?
Spotting an estimated cardinality = 1 in the SQL Plan, accompanied by a full table scan without filtering, should typically raise suspicion.
The analysis gets much easier if you have any historical information showing the true cardinalities per Plan Id.
Such info can be derived from e.g. SQL Monitor (dbms_sqltune.report_sql_monitor) or the arguably less known dbms_auto_report.report_repository_detail.
But even v$active_session_history, with some creative usage of its column SQL_PLAN_LINE_ID, should be sufficient (you'll find the script and sample output later).
If you need to reach deeper back in time and neither of the above can serve you well (although dbms_auto_report.report_repository_detail "may be loyal" enough), the real game changer can be DBA_HIST_ACTIVE_SESS_HISTORY, as it also contains data relevant to SQL_PLAN_LINE_ID and typically retains data weeks old (I will also share a script with a sample execution and a short explanation).
Yes, for all these tools to be utilized, the "Diagnostics and Tuning Packs" need to be purchased.
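
How far back DBA_HIST_ACTIVE_SESS_HISTORY can take you depends on the AWR settings of your database. A quick check could look like this (the licensing caveat above still applies):

-- AWR snapshot interval and retention currently in effect
select snap_interval, retention from dba_hist_wr_control;

-- oldest and newest ASH samples actually kept in AWR
select min(sample_time), max(sample_time) from dba_hist_active_sess_history;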

What if, for whatever reason, you can't use any of those tools?
I had the luxury of being able to use dbms_auto_report.report_repository_detail, as it still retained the relevant data (I was made aware of the issue several days later, so the data from both ASH and SQL Monitor was already gone).
I am sure, though, that DBA_HIST_ACTIVE_SESS_HISTORY would also have proven useful.

So what if you are left with just the standard tools?
Well, try to be suspicious. Before reaching for dbms_auto_report.report_repository_detail (whose output resembles SQL Monitor) I was alerted by two facts.
Firstly, as mentioned, seeing the SQL Plan (well, yes, the SQL Plan needs to be at hand..) with an estimated cardinality = 1 for a full table scan without any filtering (no ‘*’ on the left-hand side of the plan line) raised the alert.
Secondly – I admit, I was in some way lucky. The names of the involved tables, namely “%_STG”, sparked the idea that these might be staging tables, and thus statistically more likely to undergo some purging cycle.
Later I found out they had indeed been truncated, then the stats were gathered (with 0 rows), then in the meantime the tables got filled up again, and ultimately the SQL was executed (hard-parsed beforehand with the new plan, but based on the wrong stats).

So, as mentioned, since the SQL Monitor data typically gets flushed after several hours, try dba_hist_reports together with dbms_auto_report.report_repository_detail.
If you’re lucky, you’ll get yourself a goldmine of information which looks like SQL Monitor output.
You’ll need dba_hist_reports to supply the REPORT_ID to dbms_auto_report.report_repository_detail.
Here’s the script :

col sql_exec_start form a25
col sql_exec_id form a20

 
select
KEY1 as sql_id,
to_char(key2) as sql_exec_id,
to_char( to_date(key3,'mm:dd:yyyy hh24:mi:ss') , 'yyyy-mm-dd hh24:mi:ss') sql_exec_start,
REPORT_ID, INSTANCE_NUMBER, SESSION_ID, SESSION_SERIAL#, PERIOD_START_TIME, PERIOD_END_TIME
from dba_hist_reports where KEY1 = '&sql_id' order by sql_exec_start
/
 
SET LONG 1000000
SET LONGCHUNKSIZE 1000000
SET LINESIZE 330
SET PAGESIZE 0
SET TRIM ON
SET TRIMSPOOL ON
SET ECHO OFF
SET FEEDBACK OFF
 
select dbms_auto_report.report_repository_detail(RID=>&REPORT_ID,  type=>'TEXT') as report from dual;
 
 
set pages 80
SET FEEDBACK on


Simulating the problem

I pre-created three tables, each with an ID column and two columns of type VARCHAR2.
Then I truncated all of them, gathered the statistics, and finally populated them with 1'000, 10'000 and 100'000 rows, respectively.
No indexes involved.
Here is the prep script with its output, plus some checks :



SQL> desc delme.stag01
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- ------------------------------------
 ID                                                             NUMBER
 PADDIE                                                         VARCHAR2(32)
 PAD                                                            VARCHAR2(512)

SQL> desc delme.stag02
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- ------------------------------------
 ID                                                             NUMBER
 PADDIE                                                         VARCHAR2(32)
 PAD                                                            VARCHAR2(512)

SQL> desc delme.stag03
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- ------------------------------------
 ID                                                             NUMBER
 PADDIE                                                         VARCHAR2(32)
 PAD                                                            VARCHAR2(512)




Preparing the environment :
- truncate all three tables
- gather the statistics for all of them
- populate stag01, stag02 and stag03 with 1'000, 10'000 and 100'000 rows, respectively



script: testcart_prep.sql
-----------------------------------------------------------------

set echo on

truncate table TSTUS.STAG01 ;
truncate table TSTUS.STAG02 ;
truncate table TSTUS.STAG03 ;

-- gathering statistics right after truncating the three tables :
@gts
@gts
@gts

-- re-populating them with 1000, 10'000 and 100'000 rows, respectively :

insert into TSTUS.STAG01 select rownum, rpad('X',32,'X'), rpad('Y',512,'Y') from dual connect by level <= &rows_in_stg01 ;
commit ;

insert into TSTUS.STAG02 select rownum, rpad('X',32,'X') , rpad('Y',512,'Y') from dual connect by level <= &rows_in_stg02 ;
commit ;

insert into TSTUS.STAG03 select rownum, rpad('X',32,'X') , rpad('Y',512,'Y') from dual connect by level <= &rows_in_stg03 ;
commit ;

set echo off
-----------------------------------------------------------------


gts.sql :
exec DBMS_STATS.GATHER_TABLE_STATS (ownname => '&OWNER' , tabname => '&tablename',cascade => true, estimate_percent => &estimate_percent, granularity => 'ALL', degree => &parallel_slaves);



Here is the output of the execution of the above script: 



20:59:58 SYS@CDBTST1->PDBPS> @testcart_prep.sql
21:00:01 SYS@CDBTST1->PDBPS>
21:00:01 SYS@CDBTST1->PDBPS>
21:00:01 SYS@CDBTST1->PDBPS> truncate table TSTUS.STAG01 ;

Table truncated.

Elapsed: 00:00:00.09
21:00:01 SYS@CDBTST1->PDBPS> truncate table TSTUS.STAG02 ;

Table truncated.

Elapsed: 00:00:00.05
21:00:01 SYS@CDBTST1->PDBPS> truncate table TSTUS.STAG03 ;

Table truncated.

Elapsed: 00:00:00.04
21:00:02 SYS@CDBTST1->PDBPS>
21:00:02 SYS@CDBTST1->PDBPS> @gts
21:00:02 SYS@CDBTST1->PDBPS> exec DBMS_STATS.GATHER_TABLE_STATS (ownname => '&OWNER' , tabname => '&tablename',cascade => true, estimate_percent => &estimate_percent, granularity => 'ALL', degree => &parallel_slaves);
Enter value for owner: TSTUS
Enter value for tablename: stag01
Enter value for estimate_percent: 100
Enter value for parallel_slaves: 1

PL/SQL procedure successfully completed.


21:00:16 SYS@CDBTST1->PDBPS> @gts
21:00:16 SYS@CDBTST1->PDBPS> exec DBMS_STATS.GATHER_TABLE_STATS (ownname => '&OWNER' , tabname => '&tablename',cascade => true, estimate_percent => &estimate_percent, granularity => 'ALL', degree => &parallel_slaves);
Enter value for owner: TSTUS
Enter value for tablename: stag02
Enter value for estimate_percent: 100
Enter value for parallel_slaves: 1

PL/SQL procedure successfully completed.

21:00:24 SYS@CDBTST1->PDBPS> @gts
21:00:24 SYS@CDBTST1->PDBPS> exec DBMS_STATS.GATHER_TABLE_STATS (ownname => '&OWNER' , tabname => '&tablename',cascade => true, estimate_percent => &estimate_percent, granularity => 'ALL', degree => &parallel_slaves);
Enter value for owner: TSTUS
Enter value for tablename: stag03
Enter value for estimate_percent: 100
Enter value for parallel_slaves: 1

PL/SQL procedure successfully completed.


Enter value for rows_in_stg01: 1000
old   1: insert into TSTUS.STAG01 select rownum, rpad('X',32,'X'), rpad('Y',512,'Y') from dual connect by level <= &rows_in_stg01
new   1: insert into TSTUS.STAG01 select rownum, rpad('X',32,'X'), rpad('Y',512,'Y') from dual connect by level <= 1000

1000 rows created.

Commit complete.

Enter value for rows_in_stg02: 10e3
old   1: insert into TSTUS.STAG02 select rownum, rpad('X',32,'X') , rpad('Y',512,'Y') from dual connect by level <= &rows_in_stg02
new   1: insert into TSTUS.STAG02 select rownum, rpad('X',32,'X') , rpad('Y',512,'Y') from dual connect by level <= 10e3

10000 rows created.

Commit complete.

Enter value for rows_in_stg03: 100e3
old   1: insert into TSTUS.STAG03 select rownum, rpad('X',32,'X') , rpad('Y',512,'Y') from dual connect by level <= &rows_in_stg03
new   1: insert into TSTUS.STAG03 select rownum, rpad('X',32,'X') , rpad('Y',512,'Y') from dual connect by level <= 100e3

100000 rows created.

Commit complete.


Statistics reported : 

TABLE_NAME                                       TABLESPACE_NAME COMPRESS COMPRESS_FOR       BLKSIZE           BLOCKS         NUM_ROWS AVG_ROW_LEN   
------------------------------------------------ --------------- -------- --------------- ---------- ---------------- ---------------- ----------- 
TSTUS.STAG01                                     PSTS            DISABLED                       8192                0                0           0 
TSTUS.STAG02                                     PSTS            DISABLED                       8192                0                0           0 
TSTUS.STAG03                                     PSTS            DISABLED                       8192                0                0           0 


Last check of the cardinalities of all the test tables:

21:10:51 SYS@CDBTST1->PDBPS>  select count(*) cnt from  TSTUS.stag&nr ;
Enter value for nr: 01
old   1:  select count(*) cnt from  TSTUS.stag&nr
new   1:  select count(*) cnt from  TSTUS.stag01

             CNT
----------------
           1,000

Elapsed: 00:00:00.00
21:10:56 SYS@CDBTST1->PDBPS> /
Enter value for nr: 02
old   1:  select count(*) cnt from  TSTUS.stag&nr
new   1:  select count(*) cnt from  TSTUS.stag02

             CNT
----------------
          10,000

Elapsed: 00:00:00.01
21:10:59 SYS@CDBTST1->PDBPS> /
Enter value for nr: 03
old   1:  select count(*) cnt from  TSTUS.stag&nr
new   1:  select count(*) cnt from  TSTUS.stag03

             CNT
----------------
         100,000



query from dba_segments (how much space do they occupy?):

SEGMENT_NAME                                                 SEGMENT_TYPE       TABLESPACE_NAME                      SEG_MB   LOBSIZE_MB     TOTAL_MB
------------------------------------------------------------ ------------------ ------------------------------ ------------ ------------ ------------
TSTUS.STAG01                                                 TABLE              PSTS                                      1                         1
TSTUS.STAG02                                                 TABLE              PSTS                                      7                         7
TSTUS.STAG03                                                 TABLE              PSTS                                     62                        62
                                                                                                               ------------ ------------ ------------
sum                                                                                                                      70                        70

Then I used this SQL query for my test :

# cat testcartnew.sql

select /* &comment */
/*+
monitor
*/
a.pad, b.pad, c.pad
--a.paddie, b.paddie, c.paddie
from TSTUS.STAG01 a, TSTUS.STAG02 b, TSTUS.STAG03 c
where a.id = b.id and b.id = c.id
/


Test with Cartesian Join

First, let's see what the CBO "thinks" - always good to know what we should be prepared for.


22:04:01 SYS@CDBTST1->PDBPS> set autotr trace expl
22:04:01 SYS@CDBTST1->PDBPS> @testcartnew
Enter value for comment: NEW01
old   1: select /* &comment */
new   1: select /* NEW01 */

Execution Plan
----------------------------------------------------------
Plan hash value: 3256225140

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |     1 |   813 |     6   (0)| 00:00:01 |
|*  1 |  HASH JOIN                   |        |     1 |   813 |     6   (0)| 00:00:01 |
|   2 |   MERGE JOIN CARTESIAN       |        |     1 |   542 |     4   (0)| 00:00:01 |
|   3 |    TABLE ACCESS STORAGE FULL | STAG01 |     1 |   271 |     2   (0)| 00:00:01 |
|   4 |    BUFFER SORT               |        |     1 |   271 |     2   (0)| 00:00:01 |
|   5 |     TABLE ACCESS STORAGE FULL| STAG03 |     1 |   271 |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS STORAGE FULL  | STAG02 |     1 |   271 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."ID"="B"."ID" AND "B"."ID"="C"."ID")


Ah, indeed - it chose the MERGE JOIN CARTESIAN.
So this time, a real execution :

22:06:01 SYS@CDBTST1->PDBPS> @setts
22:06:05 SYS@CDBTST1->PDBPS> SET AUTOTRACE TRACE EXPLAIN STATISTICS
22:06:07 SYS@CDBTST1->PDBPS> @ssqlid
22:06:09 SYS@CDBTST1->PDBPS> set  echo on
22:06:09 SYS@CDBTST1->PDBPS>  set feedback on sql_id
22:06:09 SYS@CDBTST1->PDBPS> set echo off


22:06:36 SYS@CDBTST1->PDBPS> @testcartnew
Enter value for comment: NEW001
old   1: select /* &comment */
new   1: select /* NEW001 */

1000 rows selected.

SQL_ID: avk3j07ya0k5u
Elapsed: 00:07:19.03   --> note the elapsed time (7 minutes, 19 sec)


Execution Plan
----------------------------------------------------------
Plan hash value: 3256225140

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |     1 |   813 |     6   (0)| 00:00:01 |
|*  1 |  HASH JOIN                   |        |     1 |   813 |     6   (0)| 00:00:01 |
|   2 |   MERGE JOIN CARTESIAN       |        |     1 |   542 |     4   (0)| 00:00:01 |
|   3 |    TABLE ACCESS STORAGE FULL | STAG01 |     1 |   271 |     2   (0)| 00:00:01 |
|   4 |    BUFFER SORT               |        |     1 |   271 |     2   (0)| 00:00:01 |
|   5 |     TABLE ACCESS STORAGE FULL| STAG03 |     1 |   271 |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS STORAGE FULL  | STAG02 |     1 |   271 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."ID"="B"."ID" AND "B"."ID"="C"."ID")
SQL_ID: 3s1hh8cvfan6w

SQL_ID: g72kdvcacxvtf

Statistics
----------------------------------------------------------
     101725  recursive calls
          1  db block gets
       8744  consistent gets
   12817098  physical reads     --> pay attention to the physical reads. This should be the first warning.. Three fairly small tables and 12817098 x 8KB of disk space accessed?
        184  redo size
      21479  bytes sent via SQL*Net to client
       1443  bytes received via SQL*Net from client
         68  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
       1000  rows processed

Ah, ok – Cartesian join. There we have it. A Cartesian Join is always evil, right? But is it truly the biggest issue here?
Let’s find out.
I will use SQL Monitor, which gives me the possibility to measure the time spent on each Plan Id of the SQL Plan.
Below are several snapshots of the SQL Monitor output (text version, still good enough though) up to the final completion of the test query :

SYS@CDBTST1->PDBPS> @sqlmsqlid

First snap :
Enter value for sql_id: avk3j07ya0k5u
old   1:  select dbms_sqltune.REPORT_SQL_MONITOR(sql_id=>'&&sql_id',  report_level=>'ALL', type=>'TEXT') as report from dual
new   1:  select dbms_sqltune.REPORT_SQL_MONITOR(sql_id=>'avk3j07ya0k5u',  report_level=>'ALL', type=>'TEXT') as report from dual
SQL Monitoring Report

SQL Text
------------------------------
select /* NEW001 */ /*+ monitor */ a.pad, b.pad, c.pad --a.paddie, b.paddie, c.paddie from DELME.STAG01 a, DELME.STAG02 b, DELME.STAG03 c where a.id = b.id
and b.id = c.id

Global Information
------------------------------
Status              :  EXECUTING
Instance ID         :  1
Session             :  SYS (1228:2121)
SQL ID              :  avk3j07ya0k5u
SQL Execution ID    :  16777216
Duration            :  29s



Global Stats
==========================================================================
| Elapsed |   Cpu   |    IO    | Buffer | Write | Write |    Offload     |
| Time(s) | Time(s) | Waits(s) |  Gets  | Reqs  | Bytes | Returned Bytes |
==========================================================================
|      32 |      11 |       22 |   7868 | 22854 |  11GB |           11GB |
==========================================================================

SQL Plan Monitoring Details (Plan Hash Value=3256225140)
===================================================================================================================================================================================
| Id   |           Operation            |  Name  |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Write | Write |  Mem  | Temp | Activity |       Activity Detail       |
|      |                                |        | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs  | Bytes |       |      |   (%)    |         (# samples)         |
===================================================================================================================================================================================
| -> 0 | SELECT STATEMENT               |        |         |      |        31 |     +2 |     1 |        0 |       |       |     . |    . |          |                             |
| -> 1 |   HASH JOIN                    |        |       1 |    6 |        33 |     +0 |     1 |        0 | 24430 |  12GB | 241MB | 12GB |   100.00 | Cpu (11)                    |
|      |                                |        |         |      |           |        |       |          |       |       |       |      |          | direct path write temp (19) |
| -> 2 |    MERGE JOIN CARTESIAN        |        |       1 |    4 |        31 |     +2 |     1 |      12M |       |       |     . |    . |          |                             |
| -> 3 |     TABLE ACCESS STORAGE FULL  | STAG01 |       1 |    2 |        31 |     +2 |     1 |      123 |       |       |     . |    . |          |                             |
| -> 4 |     BUFFER SORT                |        |       1 |    2 |        31 |     +2 |   123 |      12M |       |       |  52MB |    . |          |                             |
|    5 |      TABLE ACCESS STORAGE FULL | STAG03 |       1 |    2 |         1 |     +2 |     1 |     100K |       |       |     . |    . |          |                             |
|    6 |    TABLE ACCESS STORAGE FULL   | STAG02 |       1 |    2 |           |        |       |          |       |       |     . |    . |          |                             |
===================================================================================================================================================================================


Second snap :
===================================================================================================================================================================================
| Id   |           Operation            |  Name  |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Write | Write |  Mem  | Temp | Activity |       Activity Detail       |
|      |                                |        | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs  | Bytes |       |      |   (%)    |         (# samples)         |
===================================================================================================================================================================================
| -> 0 | SELECT STATEMENT               |        |         |      |        55 |     +2 |     1 |        0 |       |       |     . |    . |          |                             |
| -> 1 |   HASH JOIN                    |        |       1 |    6 |        57 |     +0 |     1 |        0 | 43198 |  21GB | 241MB | 21GB |    98.18 | Cpu (18)                    |
|      |                                |        |         |      |           |        |       |          |       |       |       |      |          | direct path write temp (36) |
| -> 2 |    MERGE JOIN CARTESIAN        |        |       1 |    4 |        55 |     +2 |     1 |      21M |       |       |     . |    . |          |                             |
| -> 3 |     TABLE ACCESS STORAGE FULL  | STAG01 |       1 |    2 |        55 |     +2 |     1 |      215 |       |       |     . |    . |          |                             |
| -> 4 |     BUFFER SORT                |        |       1 |    2 |        55 |     +2 |   215 |      21M |       |       |  52MB |    . |     1.82 | Cpu (1)                     |
|    5 |      TABLE ACCESS STORAGE FULL | STAG03 |       1 |    2 |         1 |     +2 |     1 |     100K |       |       |     . |    . |          |                             |
|    6 |    TABLE ACCESS STORAGE FULL   | STAG02 |       1 |    2 |           |        |       |          |       |       |     . |    . |          |                             |
===================================================================================================================================================================================


Third snap :
========================================================================================================================================================================================
| Id   |           Operation            |  Name  |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Write | Write |  Mem  | Temp | Activity |         Activity Detail          |
|      |                                |        | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs  | Bytes |       |      |   (%)    |           (# samples)            |
========================================================================================================================================================================================
|    0 | SELECT STATEMENT               |        |         |      |       161 |     +2 |     1 |        0 |       |       |     . |    . |     1.88 | Cpu (3)                          |
| -> 1 |   HASH JOIN                    |        |       1 |    6 |       165 |     +0 |     1 |        0 |  126K |  61GB | 241MB | 62GB |    96.25 | Cpu (50)                         |
|      |                                |        |         |      |           |        |       |          |       |       |       |      |          | ASM IO for non-blocking poll (1) |
|      |                                |        |         |      |           |        |       |          |       |       |       |      |          | direct path write temp (103)     |
| -> 2 |    MERGE JOIN CARTESIAN        |        |       1 |    4 |       161 |     +2 |     1 |      62M |       |       |     . |    . |          |                                  |
| -> 3 |     TABLE ACCESS STORAGE FULL  | STAG01 |       1 |    2 |       161 |     +2 |     1 |      623 |       |       |     . |    . |          |                                  |
|    4 |     BUFFER SORT                |        |       1 |    2 |       161 |     +2 |   623 |      62M |       |       |  52MB |    . |     1.88 | Cpu (3)                          |
|    5 |      TABLE ACCESS STORAGE FULL | STAG03 |       1 |    2 |         1 |     +2 |     1 |     100K |       |       |     . |    . |          |                                  |
|    6 |    TABLE ACCESS STORAGE FULL   | STAG02 |       1 |    2 |           |        |       |          |       |       |     . |    . |          |                                  |
========================================================================================================================================================================================


These three snapshots reveal that the Cartesian Join (Plan Id = 2) is not the biggest problem here.
If you take a look at the “Activity %”, it’s obvious that Plan Id = 1 (HASH JOIN) is the culprit.
To be fair, that is only a partially true statement.
Let’s stick to the facts first. The time spent (lost) on the HASH JOIN indeed constitutes a good 98% of the total 7 minutes.
How is that possible? Typically the HASH JOIN is a relatively fast operation, especially if accommodated within the PGA. Admittedly, that’s not the case here – it clearly travelled to disk, which is pretty obvious if you take a look at the accompanying wait events: “direct path write temp” and “direct path read temp“. This is a clear indicator that the PGA said goodbye at a certain phase.
But still…
If you take a look at the size of the involved segments STAG01, STAG02 and STAG03 – altogether they do not exceed 70 MB.
So, whichever Hash Join you could imagine here, even in the worst-case scenario involving the two biggest tables, you would be exposed to a join of STAG02 (7MB) with STAG03 (62MB). Let’s assume the CBO freaked out and decided to build the Hash-Table on the 62MB table. On a healthy disk subsystem (and I had one) that would take less than 1 sec.
Then that Hash-Table would be probed with the 7MB table – all would be done in 1-2 seconds at most (I am probably even too pessimistic – by the way, I have prepared such a scenario in a further section of this post, hinting the query to force two Hash Joins. Indeed, the elapsed time didn’t exceed 1 sec).
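
As a side note, while such a statement is still running you can watch the HASH JOIN workarea spilling to temp in near real time. A minimal sketch against v$sql_workarea_active, assuming the SQL_ID from this test:

select sid, operation_type, operation_id,
       round(actual_mem_used/1024/1024) mem_used_mb,
       round(max_mem_used/1024/1024)    max_mem_mb,
       number_passes,
       round(tempseg_size/1024/1024)    temp_mb
from   v$sql_workarea_active
where  sql_id = 'avk3j07ya0k5u'
order  by operation_id;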

So what the heck was the problem here?
Let’s look at the summary below, taken upon completion of the test query (you may need to scroll to the right to spot the “Activity %” report column).
The time spent on preparing (building) the Hash-Table, associated with “direct path write temp”, is even longer than its probing phase (“direct path read temp”).
For more about the mechanics of the HASH JOIN take a look here and search for the string: “pseudo-code for HASH-JOIN”.
So why did the building of the Hash-Table take so long, and why did it go to disk (was not accommodated within the PGA) with such fairly small tables?
The answer lies indeed within the CARTESIAN JOIN, which is the usual (and silent) suspect.
First, let’s take a good look at the cardinalities of the children of the CARTESIAN JOIN: Plan Id = 3, STAG01 with 1'000 rows, and effectively STAG03 with 100'000 rows behind Plan Id = 5.
Now, if you follow the CARTESIAN JOIN rule (joining all rows from the “outer” table with all rows of the “inner” table), you get a join cardinality of 100'000'000 (visible as 100M under Rows (Actual) at Plan Id = 2).
The math: 1'000 x 100'000 = 100'000'000.
Further, Plan Id = 2 (the outcome of the CARTESIAN JOIN) is the input for Plan Id = 1, the HASH JOIN – strictly speaking, for the phase of building the Hash-Table.
That means the Hash-Table has been built on a rowset as large as 100'000'000 rows (!). In reality, that join cardinality, without the Cartesian join involved, is just 1'000 – see the test with the hash join below and the reported true join cardinality.
So an object (the Hash-Table) accommodating 100 million rows had to be laboriously prepared on disk, only to be later probed (effectively: scanned) against just a tiny table, STAG02 of size 7MB (Plan Id = 6).
Worth noticing, the space required in the TEMP tablespace to retain such a Hash-Table reached 99GB (see “Temp (Max)”). A rough back-of-envelope check: 100 million rows, each carrying two PAD columns of ~512 bytes plus the join columns and row overhead, lands in the ballpark of 100GB, which matches the observed temp usage. If it takes just two tiny tables (about 63MB in total) to make a Hash Join expand its Hash-Table to 99GB, it’s scary to imagine two much bigger tables involved in such a Cartesian Join feeding a Hash Join higher up in the SQL Plan.

This still does not fully capture the complexity of the issue. A perhaps lesser-known detail is that when building a Hash-Table, it is not only the join predicates (e.g., the IDs, typically of type NUMBER) that are stored there.
Instead, all columns relevant to the query block are included in the Hash-Table. I will demonstrate this later in this post or dedicate a separate section to explore it in more detail.
To illustrate the impact, I used two columns, PAD (512 bytes) and PADDIE (32 bytes), deliberately selecting the larger column (PAD) to emphasize the effect. Had I chosen the smaller column (PADDIE), the elapsed time would have decreased. However, the fundamental issue would remain: the creation of a Hash-Table based on 100 million rows. While the size of the Hash-Table would be reduced, the underlying challenge of its construction would persist.
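
If you want to check for yourself which columns actually travel through each rowsource (and thus end up in the Hash-Table), dbms_xplan can print the column projection information. A minimal sketch, assuming the test SQL_ID is still in the cursor cache (the +PROJECTION section comes from the plan itself, so it works even without rowsource statistics):

select * from table(
  dbms_xplan.display_cursor(
    sql_id          => 'avk3j07ya0k5u',
    cursor_child_no => NULL,
    format          => 'ALLSTATS LAST +PROJECTION'));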

Global Information
------------------------------
Status              :  DONE (ALL ROWS)
Instance ID         :  1
Session             :  SYS (1228:2121)
SQL ID              :  avk3j07ya0k5u
SQL Execution ID    :  16777216
Duration            :  439s


Global Stats
=================================================================================================
| Elapsed |   Cpu   |    IO    | Fetch | Buffer | Read | Read  | Write | Write |    Offload     |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  | Reqs | Bytes | Reqs  | Bytes | Returned Bytes |
=================================================================================================
|     476 |     128 |      348 |    68 |   8745 | 203K |  98GB |  203K |  98GB |           98GB |
=================================================================================================

SQL Plan Monitoring Details (Plan Hash Value=3256225140)
======================================================================================================================================================================================================
| Id |           Operation            |  Name  |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read | Read  | Write | Write |  Mem  | Temp  | Activity |         Activity Detail          |
|    |                                |        | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs | Bytes | Reqs  | Bytes | (Max) | (Max) |   (%)    |           (# samples)            |
======================================================================================================================================================================================================
|  0 | SELECT STATEMENT               |        |         |      |       438 |     +2 |     1 |     1000 |      |       |       |       |     . |     . |     1.17 | Cpu (5)                          |
|  1 |   HASH JOIN                    |        |       1 |    6 |       440 |     +0 |     1 |     1000 | 203K |  98GB |  203K |  98GB | 241MB |  99GB |    98.14 | Cpu (91)                         |
|    |                                |        |         |      |           |        |       |          |      |       |       |       |       |       |          | ASM IO for non-blocking poll (4) |
|    |                                |        |         |      |           |        |       |          |      |       |       |       |       |       |          | direct path read temp (148)      |
|    |                                |        |         |      |           |        |       |          |      |       |       |       |       |       |          | direct path write temp (178)     |
|  2 |    MERGE JOIN CARTESIAN        |        |       1 |    4 |       259 |     +2 |     1 |     100M |      |       |       |       |     . |     . |          |                                  |
|  3 |     TABLE ACCESS STORAGE FULL  | STAG01 |       1 |    2 |       259 |     +2 |     1 |     1000 |      |       |       |       |     . |     . |          |                                  |
|  4 |     BUFFER SORT                |        |       1 |    2 |       259 |     +2 |  1000 |     100M |      |       |       |       |  52MB |     . |     0.70 | Cpu (3)                          |
|  5 |      TABLE ACCESS STORAGE FULL | STAG03 |       1 |    2 |         1 |     +2 |     1 |     100K |      |       |       |       |     . |     . |          |                                  |
|  6 |    TABLE ACCESS STORAGE FULL   | STAG02 |       1 |    2 |         1 |   +260 |     1 |    10000 |      |       |       |       |     . |     . |          |                                  |
======================================================================================================================================================================================================

By the way, as promised, here are the gv$active_session_history based script (with supplementary V$SQL_PLAN_MONITOR) and the Hist-ASH (DBA_HIST_ACTIVE_SESS_HISTORY) based script which, admittedly, do not provide as many fancy details as SQL Monitor does, but the required data (the DBA_HIST_ACTIVE_SESS_HISTORY based one) has a chance to survive much longer.
You may be tempted to add “connect by” accompanied by “rpad(' ',level)” to introduce indentations, but these scripts serve one fundamental goal – showing the time spent on each (registered) Plan Id.
You may need to scroll to the right to spot the most valuable data, that is, the percentage and the time spent on each registered Plan Id.
If you compare this data with the output provided by SQL Monitor above, you’ll surely notice that the relevant numbers match.
In addition, if you compare the outcome from gv$active_session_history with the one from DBA_HIST_ACTIVE_SESS_HISTORY, you’ll notice that the latter is a copy of the former, essentially containing just 10% of it (every 10th sample).

First, gv$active_session_history based script (first the output then the script) :

SYS@CDBTST1->PDBPS> @ashgrbysqlplan.sql
Enter value for sql_id: avk3j07ya0k5u
old   7: where SQL_ID = '&&sql_id'
new   7: where SQL_ID = 'avk3j07ya0k5u'

SQL_ID          SQL_EXEC_ID MIN(SQL_EXEC_START) MAX(LAST_REFRESH_TI MAX(SQL_PLAN_HASH_VALUE)     ELASEC
--------------- ----------- ------------------- ------------------- ------------------------ ----------
avk3j07ya0k5u      16777216 03-01-2025 11:06:44 03-01-2025 11:14:03               3256225140        439

Elapsed: 00:00:00.21
Enter value for sqlid: avk3j07ya0k5u
old   7:      ash.sql_id = nvl('&sqlid',ash.sql_id)
new   7:      ash.sql_id = nvl('avk3j07ya0k5u',ash.sql_id)
Enter value for sql_exec_id:
old   8: and SQL_EXEC_ID = to_number( nvl('&SQL_EXEC_ID',SQL_EXEC_ID) )
new   8: and SQL_EXEC_ID = to_number( nvl('',SQL_EXEC_ID) )
Enter value for how_many_minutes_back:
old   9: and sample_time >= ( sysdate - nvl('&How_Many_Minutes_Back',99999999)/24/60 )
new   9: and sample_time >= ( sysdate - nvl('',99999999)/24/60 )

SQL_EXEC_ID SQL_ID          SQL_PLAN_HASH_VALUE SQL_PLAN_LINE_ID SQL_PLAN_OPE_OPTION                                EVENT                                       SECONDS % of all waits
----------- --------------- ------------------- ---------------- -------------------------------------------------- ---------------------------------------- ---------- --------------
   16777216 avk3j07ya0k5u            3256225140                0 SELECT STATEMENT                                   (on CPU)                                          5              1
                                                ****************                                                                                             ---------- --------------
                                                sum                                                                                                                   5              1

   16777216 avk3j07ya0k5u            3256225140                1 HASH JOIN                                          ASM IO for non-blocking poll                      4              1
   16777216 avk3j07ya0k5u            3256225140                  HASH JOIN                                          (on CPU)                                         91             21
   16777216 avk3j07ya0k5u            3256225140                  HASH JOIN                                          direct path read temp                           148             35
   16777216 avk3j07ya0k5u            3256225140                  HASH JOIN                                          direct path write temp                          178             41
                                                ****************                                                                                             ---------- --------------
                                                sum                                                                                                                 421             98

   16777216 avk3j07ya0k5u            3256225140                4 BUFFER -- SORT                                     (on CPU)                                          3              1
                                                ****************                                                                                             ---------- --------------
                                                sum                                                                                                                   3              1

                                                                                                                                                             ---------- --------------
sum                                                                                                                                                                 429            100


SQL script which produced the above output :

-- ashgrbysqlplan.sql
-- Author: Piotr Sajda PS5. 


compute sum of seconds on SQL_PLAN_LINE_ID
compute sum of "% of all waits" on SQL_PLAN_LINE_ID
compute sum of seconds on report
compute sum of "% of all waits"  on report
break on SQL_PLAN_LINE_ID  skip 1 on report

col "% of all waits" form 999999999
col EVENT form a40
col SQL_PLAN_OPE_OPTION form a50

set pages 100

select SQL_ID,SQL_EXEC_ID, min(SQL_EXEC_START), max( LAST_REFRESH_TIME ) , max(SQL_PLAN_HASH_VALUE)
, EXTRACT(HOUR FROM NUMTODSINTERVAL( max(LAST_REFRESH_TIME) - min(SQL_EXEC_START), 'DAY'))*3600 +
EXTRACT(MINUTE FROM NUMTODSINTERVAL( max(LAST_REFRESH_TIME) - min(SQL_EXEC_START), 'DAY'))*60 +
EXTRACT(SECOND FROM NUMTODSINTERVAL( max(LAST_REFRESH_TIME) - min(SQL_EXEC_START), 'DAY'))
ELASEC
from v$SQL_PLAN_MONITOR
where SQL_ID = '&&sql_id'
group by SQL_ID,SQL_EXEC_ID
order by 3
/

select SQL_EXEC_ID, ash.sql_id,  SQL_PLAN_HASH_VALUE, ash.SQL_PLAN_LINE_ID,
max(SQL_PLAN_OPERATION)||''||  case when max(SQL_PLAN_OPTIONS) is not NULL then ' -- '||max(SQL_PLAN_OPTIONS)  end  SQL_PLAN_OPE_OPTION,
nvl(event,'(on CPU)') event,  count(*) seconds  ,
round( count(*) / sum(count(*)) over (partition by ash.sql_id) * 100 , 2 ) "% of all waits"
from gv$active_session_history ash
where
     ash.sql_id = nvl('&sqlid',ash.sql_id)
and SQL_EXEC_ID = to_number( nvl('&SQL_EXEC_ID',SQL_EXEC_ID) )
and sample_time >= ( sysdate - nvl('&How_Many_Minutes_Back',99999999)/24/60 )
and ash.sql_id is not NULL
group by   SQL_EXEC_ID, ash.sql_id,  SQL_PLAN_LINE_ID,  SQL_PLAN_HASH_VALUE, event
order by   SQL_EXEC_ID, ash.sql_id,  SQL_PLAN_LINE_ID, count(*)
/


Now DBA_HIST_ACTIVE_SESS_HISTORY based script (the output and the script) :

SYS@CDBTST1->PDBPS> @hashgrbysqlplan.sql
Enter value for sqlid: avk3j07ya0k5u
old   8:      ash.sql_id = nvl('&sqlid',ash.sql_id)
new   8:      ash.sql_id = nvl('avk3j07ya0k5u',ash.sql_id)
Enter value for sql_exec_id:
old   9: and SQL_EXEC_ID = to_number( nvl('&SQL_EXEC_ID',SQL_EXEC_ID) )
new   9: and SQL_EXEC_ID = to_number( nvl('',SQL_EXEC_ID) )
Enter value for sql_plan_hash_value:
old  10: and SQL_PLAN_HASH_VALUE = to_number( nvl('&SQL_PLAN_HASH_VALUE',SQL_PLAN_HASH_VALUE) )
new  10: and SQL_PLAN_HASH_VALUE = to_number( nvl('',SQL_PLAN_HASH_VALUE) )
Enter value for how_many_minutes_back:
old  11: and sample_time >= ( sysdate - nvl('&How_Many_Minutes_Back',99999999)/24/60 )
new  11: and sample_time >= ( sysdate - nvl('',99999999)/24/60 )

SQL_EXEC_START      SQL_EXEC_ID SQL_ID          SQL_PLAN_HASH_VALUE SQL_PLAN_LINE_ID SQL_PLAN_OPE_OPTION                                EVENT                                       SAMPLES % of all waits
------------------- ----------- --------------- ------------------- ---------------- -------------------------------------------------- ---------------------------------------- ---------- --------------
03-01-2025 22:06:44    16777216 avk3j07ya0k5u            3256225140                1 HASH JOIN                                          ASM IO for non-blocking poll                      2              5
                       16777216 avk3j07ya0k5u            3256225140                1 HASH JOIN                                          (on CPU)                                          9             21
                       16777216 avk3j07ya0k5u            3256225140                1 HASH JOIN                                          direct path read temp                            15             35
                       16777216 avk3j07ya0k5u            3256225140                1 HASH JOIN                                          direct path write temp                           16             37
                       16777216 avk3j07ya0k5u            3256225140                  SELECT STATEMENT                                   (on CPU)                                          1              2
*******************                                                                                                                                                              ---------- --------------
sum                                                                                                                                                                                      43            100

SQL script which produced the above output :

-- hashgrbysqlplan.sql
-- Author: Piotr Sajda PS5. 

compute sum of samples on SQL_PLAN_LINE_ID
compute sum of "% of all waits" on SQL_PLAN_LINE_ID
compute sum of samples on SQL_EXEC_START
compute sum of "% of all waits"  on SQL_EXEC_START


break on SQL_EXEC_START  skip 1 on report

col "% of all waits" form 999999999
col EVENT form a40
col SQL_PLAN_OPE_OPTION form a50

set pages 100

select SQL_EXEC_START, SQL_EXEC_ID, ash.sql_id,  SQL_PLAN_HASH_VALUE, ash.SQL_PLAN_LINE_ID,
max(SQL_PLAN_OPERATION)||''||  case when max(SQL_PLAN_OPTIONS) is not NULL then ' -- '||max(SQL_PLAN_OPTIONS)  end  SQL_PLAN_OPE_OPTION,
-- nvl(event,'(on CPU)') event,  count(*)*10 seconds  ,
nvl(event,'(on CPU)') event,  count(*) samples  ,
round( count(*) / sum(count(*)) over (partition by ash.sql_id) * 100 , 2 ) "% of all waits"
from DBA_HIST_ACTIVE_SESS_HISTORY ash
where
     ash.sql_id = nvl('&sqlid',ash.sql_id)
and SQL_EXEC_ID = to_number( nvl('&SQL_EXEC_ID',SQL_EXEC_ID) )
and SQL_PLAN_HASH_VALUE = to_number( nvl('&SQL_PLAN_HASH_VALUE',SQL_PLAN_HASH_VALUE) )
and sample_time >= ( sysdate - nvl('&How_Many_Minutes_Back',99999999)/24/60 )
and ash.sql_id is not NULL
group by   SQL_EXEC_START, SQL_EXEC_ID, ash.sql_id,  SQL_PLAN_LINE_ID,  SQL_PLAN_HASH_VALUE, event
order by   SQL_EXEC_START, SQL_EXEC_ID, ash.sql_id,  SQL_PLAN_LINE_ID, count(*)
/


Test with Hash Join

This time I forced two Hash Joins across all three tables. To simplify the output and the analysis, I made sure no Bloom filters were triggered.
The script :

# cat testcartnewhash.sql

alter session set "_bloom_serial_filter" = off ;

select /* &comment */
/*+
monitor
leading (a b c)
use_hash( b )
use_hash( c )
*/
a.pad, b.pad, c.pad
--a.paddie, b.paddie, c.paddie
from DELME.STAG01 a, DELME.STAG02 b, DELME.STAG03 c
where a.id = b.id and b.id = c.id
/

Execution and runtime statistics :

SYS@CDBTST1->PDBPS> @ssqlid
SYS@CDBTST1->PDBPS>  set feedback on sql_id
SYS@CDBTST1->PDBPS> set echo off
SYS@CDBTST1->PDBPS> @setts
SYS@CDBTST1->PDBPS> SET AUTOTRACE TRACE EXPLAIN STATISTICS
SYS@CDBTST1->PDBPS> @testcartnewhash.sql
Enter value for comment: HJ
old   1: select /* &comment */
new   1: select /* HJ */

1000 rows selected.

SQL_ID: gu39nwx15xwhb
Elapsed: 00:00:00.04
SQL_ID: g72kdvcacxvtf

Execution Plan
----------------------------------------------------------
SQL_ID: 30gu3f85r6bwy
Plan hash value: 2925396405

--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |     1 |   813 |     6   (0)| 00:00:01 |
|*  1 |  HASH JOIN                  |        |     1 |   813 |     6   (0)| 00:00:01 |
|*  2 |   HASH JOIN                 |        |     1 |   542 |     4   (0)| 00:00:01 |
|   3 |    TABLE ACCESS STORAGE FULL| STAG01 |     1 |   271 |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS STORAGE FULL| STAG02 |     1 |   271 |     2   (0)| 00:00:01 |
|   5 |   TABLE ACCESS STORAGE FULL | STAG03 |     1 |   271 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("B"."ID"="C"."ID")
   2 - access("A"."ID"="B"."ID")
SQL_ID: 3s1hh8cvfan6w

SQL_ID: g72kdvcacxvtf

Statistics
----------------------------------------------------------
          9  recursive calls
          0  db block gets
       8690  consistent gets
          0  physical reads
          0  redo size
      21479  bytes sent via SQL*Net to client
       1432  bytes received via SQL*Net from client
         68  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1000  rows processed

Clearly, there were no physical reads, indicating that both hash tables fit entirely into the PGA. While it’s worth noting that all tables were cached (a total of approximately 70 MB), this detail is almost irrelevant here. Even if the data hadn’t been cached, the elapsed time wouldn’t have changed significantly.

Let’s analyze the runtime statistics segmented by Plan IDs as reported by SQL Monitor.
As observed, SQL Monitor did not capture any wait events (no surprise, as there were none: all operations completed entirely on the CPU) and so did not report any activity percentages.
It’s reasonable to conclude that SQL Monitor derives metrics such as wait events, and the activity percentages calculated from them, from GV$ACTIVE_SESSION_HISTORY. Given that GV$ACTIVE_SESSION_HISTORY samples active sessions at one-second intervals, it apparently did not register a single sample for this query due to its extremely short runtime.
Nevertheless, the true cardinality of each Plan ID is available, and this information is more than sufficient for our analysis.
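
If you want to double-check that, you can query ASH directly for this SQL_ID; with an execution lasting roughly 30 ms there is practically no chance of it being caught by the one-second sampling, so expect no rows back. A minimal sketch:

select sql_plan_line_id, nvl(event,'(on CPU)') event, count(*) samples
from   gv$active_session_history
where  sql_id = 'gu39nwx15xwhb'
group  by sql_plan_line_id, nvl(event,'(on CPU)')
order  by sql_plan_line_id;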

11:28:04 SYS@GCTEST1->PDB01> @sqlmsqlid
Enter value for sql_id: gu39nwx15xwhb
old   1:  select dbms_sqltune.REPORT_SQL_MONITOR(sql_id=>'&&sql_id',  report_level=>'ALL', type=>'TEXT') as report from dual
new   1:  select dbms_sqltune.REPORT_SQL_MONITOR(sql_id=>'gu39nwx15xwhb',  report_level=>'ALL', type=>'TEXT') as report from dual
SQL Monitoring Report

SQL Text
------------------------------
select /* HJ */ /*+ monitor leading (a b c) use_hash( b ) use_hash( c ) */ a.pad, b.pad, c.pad --a.paddie, b.paddie, c.paddie from DELME.STAG01 a, DELME.STAG02 b, DELME.STAG03 c where a.id = b.id and b.id = c.
id

Global Information
------------------------------
Status              :  DONE (ALL ROWS)
Instance ID         :  1
Session             :  SYS (1945:59522)
SQL ID              :  gu39nwx15xwhb
SQL Execution ID    :  16777216
Duration            :  .0289s
Fetch Calls         :  68

Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    0.03 |    0.03 |     0.00 |    68 |   8690 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=2925396405)
=============================================================================================================================================
| Id |           Operation           |  Name  |  Rows   | Cost |   Time    | Start  | Execs |   Rows   |  Mem  | Activity | Activity Detail |
|    |                               |        | (Estim) |      | Active(s) | Active |       | (Actual) | (Max) |   (%)    |   (# samples)   |
=============================================================================================================================================
|  0 | SELECT STATEMENT              |        |         |      |         1 |     +0 |     1 |     1000 |     . |          |                 |
|  1 |   HASH JOIN                   |        |       1 |    6 |         1 |     +0 |     1 |     1000 |   5MB |          |                 |
|  2 |    HASH JOIN                  |        |       1 |    4 |         1 |     +0 |     1 |     1000 |   5MB |          |                 |
|  3 |     TABLE ACCESS STORAGE FULL | STAG01 |       1 |    2 |         1 |     +0 |     1 |     1000 |     . |          |                 |
|  4 |     TABLE ACCESS STORAGE FULL | STAG02 |       1 |    2 |         1 |     +0 |     1 |    10000 |     . |          |                 |
|  5 |    TABLE ACCESS STORAGE FULL  | STAG03 |       1 |    2 |         1 |     +0 |     1 |     100K |     . |          |                 |
=============================================================================================================================================

The query executed in under one second.
Why?
The first Hash-Table was built from a small rowset of just 1'000 rows (as shown under Rows (Actual) for STAG01, Plan Id = 3). The second Hash-Table was constructed with a similar level of effort, also 1'000 rows, derived from the result of the first hash join. STAG03 (Plan Id = 5) was then used to probe this second Hash-Table, and the whole operation completed efficiently.
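
As a cross-check, v$sql_workarea lets you verify whether both hash joins indeed completed as optimal (purely in-memory) workareas. A minimal sketch, again assuming the SQL_ID reported for this test:

select operation_id, operation_type,
       round(last_memory_used/1024/1024) last_mem_mb,
       last_execution,
       optimal_executions, onepass_executions, multipasses_executions
from   v$sql_workarea
where  sql_id = 'gu39nwx15xwhb'
order  by operation_id;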

Keep reading here
