This post is the continuation of the previous part.
Clustering Factor – tests with block dumps
My test environment comprises a table with 10 million rows and three columns with two indexes:
– ID_SORTED (unique, generated as sorted) with the index on it
– ID_MESS (almost unique, generated as scattered) with the index on it
– PAD (filler with 256 bytes)
SYS@CDBTST1->TPDB1>
create table TSTUS.TSCLUF as
select rownum as ID_SORTED,
trunc( dbms_random.value(1,10e6) ) as ID_MESS,
rpad('X',256,'X') as pad
from dual connect by level <= 10e6 ;
Table created.
SYS@CDBTST1->TPDB1> create index TSTUS.IDX_ID_SORTED on TSTUS.TSCLUF ( ID_SORTED ) ;
Index created.
SYS@CDBTST1->TPDB1> create index TSTUS.IDX_ID_MESS on TSTUS.TSCLUF ( ID_MESS ) ;
Index created.
SYS@CDBTST1->TPDB1> exec DBMS_STATS.GATHER_TABLE_STATS (ownname => '&OWNER' , tabname => '&tablename',cascade => true, estimate_percent => &estimate_percent, granularity => 'ALL', degree => ¶llel_slaves);
Enter value for owner: TSTUS
Enter value for tablename: TSCLUF
Enter value for estimate_percent: 100
Enter value for parallel_slaves: 4
PL/SQL procedure successfully completed.
Data distribution and some other characteristics of these columns (NDV, LOW_VAL, HI_VAL) :
OWNER TABLE_NAME COL_ID COLUMN_NAME DATA_TYPE NDV DENSITY NUM_NULLS AVG_COL_LEN LOW_VAL HI_VAL HISTOGRAM
------------------------- ------------------------------ ------ ------------------------------ ------------ ------------ ----------- ---------- ----------- ----------------------------------- ----------------------------------- ---------------
TSTUS TSCLUF 2 ID_MESS NUMBER 6,321,696 .00000016 0 6 1 9999997 NONE
TSTUS 1 ID_SORTED NUMBER 10,000,000 .00000010 0 6 1 10000000 NONE
TSTUS 3 PAD VARCHAR2 1 1.00000000 0 257 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX NONE
Some data considering the both indexes.
Note, the numbers under CLUSTF for both (calculated values for the clustering factor – you may scroll a bit to the right).
– CLUF for IDX_ID_SORTED is close to the number of table’s blocks: CLUF: 384,616 and Table Blocks: 385,478
whereas :
– CLUF for IDX_ID_MESS – exactly opposite, that is: CLUF : 9,999,973 and number of rows: 10,000,000
OWNER TABLENAME INDEX_NAME INDEX_TYPE UNIQUENES COMPRESSION IDX_PX CLUSTF TabBlks TabRows NDV LEAF_BLOCKS COLS PAR BLEVEL LEAF_BL_KEY DATA_BL_KEY ORPHANED_E STATUS MYLAST_ANALYZED VISIBILIT MB
------------------ ------------------------------ ------------------------------ --------------- --------- ------------- -------- ------------ ---------------- ---------------- --------- ----------- ---- --- ------ ----------- ----------- ---------- ---------- -------------------- --------- ------------
TSTUS TSCLUF IDX_ID_MESS NORMAL NONUNIQUE DISABLED 1 9,999,973 385,478 10,000,000 6355456 23371 1 NO 2 1 1 NO VALID 24-09-24 06:00 VISIBLE 192
TSTUS IDX_ID_SORTED NORMAL NONUNIQUE DISABLED 1 384,616 385,478 10,000,000 9914368 24002 1 NO 2 1 1 NO VALID 24-09-24 06:00 VISIBLE 192
****************************** ------------
sum 384
Let’s have it examined and confirmed down to the bottom, so block dump level.
Here it is.
Test with low Clustering Factor
First, I will dump the entire index tree of IDX_ID_SORTED – the one which indexes the aligned values in the table. For that I’ll need the object_id.
Somewhere in the trace file you’ll find the list of “leaves” (index leaf blocks) – just navigate to “begin tree dump” within the trace file.
OBJECT SUBOBJECT_NAME OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE
------------------------------------------------------------ ------------------------- ---------- -------------- --------------------
TSTUS.IDX_ID_SORTED 181826 181826 INDEX
20:28:21 SYS@CDBTST1->TPDB1>
20:28:21 SYS@CDBTST1->TPDB1> ALTER SESSION SET TRACEFILE_IDENTIFIER='&trfile_suffix';
Enter value for trfile_suffix: IDX_SORT_TREE
old 1: ALTER SESSION SET TRACEFILE_IDENTIFIER='&trfile_suffix'
new 1: ALTER SESSION SET TRACEFILE_IDENTIFIER='IDX_SORT_TREE'
Session altered.
INST_ID SID SERIAL# OSPID PID TRACEFILE
------- -------- ---------- ------------------------ -------- ---------------------------------------
1 252 21007 268114 254 CDBTST1_ora_268114_IDX_SORT_TREE.trc
SYS@CDBTST1->TPDB1> alter session set events 'immediate trace name treedump level 181826' ;
Session altered.
----- begin tree dump
branch: 0x112e80bb 288260283 (0: nrow: 38, level: 2)
branch: 0xc618836 207718454 (-1: nrow: 672, level: 1)
leaf: 0x112e80bc 288260284 (-1: row:485.485 avs:824)
leaf: 0x112e80bd 288260285 (0: row:479.479 avs:816)
leaf: 0x112e80be 288260286 (1: row:479.479 avs:816)
leaf: 0x112e80bf 288260287 (2: row:479.479 avs:816)
leaf: 0x112e80c0 288260288 (3: row:479.479 avs:816)
leaf: 0x112e80c1 288260289 (4: row:478.478 avs:830)
leaf: 0x112e80c2 288260290 (5: row:479.479 avs:816)
leaf: 0x112e80c3 288260291 (6: row:479.479 avs:816)
leaf: 0x112e80c4 288260292 (7: row:479.479 avs:816)
leaf: 0x112e80c5 288260293 (8: row:478.478 avs:830)
leaf: 0x112e80c6 288260294 (9: row:479.479 avs:816)
leaf: 0x112e80c7 288260295 (10: row:479.479 avs:816)
leaf: 0x112e80c9 288260297 (11: row:479.479 avs:816)
leaf: 0x112e80ca 288260298 (12: row:479.479 avs:816)
leaf: 0x112e80cb 288260299 (19: row:478.478 avs:830)
leaf: 0x112e80cc 288260300 (21: row:479.479 avs:816)
leaf: 0x112e80cd 288260301 (15: row:479.479 avs:816)
leaf: 0x112e80ce 288260302 (16: row:479.479 avs:816)
leaf: 0x112e80cf 288260303 (19: row:479.479 avs:816)
leaf: 0x112e80d0 288260304 (18: row:478.478 avs:830)
(...)
(...)
I took the first leaf with 485 entries.
Now let’s dump a specific leaf block (I’ll need file# and block#) :
INST_ID SID SERIAL# OSPID PID TRACEFILE
------- -------- ---------- ------------------------ -------- -------------------------------------
1 6661 56320 328117 307 CDBTST1_ora_328117_idx_bl_dump.trc
SYS@CDBTST1->TPDB1> select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(&&rba), DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(&&rba) from dual;
Enter value for dba: 288260284
old 1: select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(&&rba), DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(&&rba) from dual
new 1: select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(288260284), DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(288260284) from dual
DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(288260284) DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(288260284)
----------------------------------------------- ------------------------------------------------
68 3047612
SYS@CDBTST1->TPDB1> alter system dump datafile 68 block 3047612 ;
System altered.
Indeed, that index block leaf contains 485 entries for SORTED_ID (rows are numbered starting from 0 – navigate in the dump to the row#0) :
row#0[8020] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 02 --> SORTED_ID = 1 as decimal
col 1; len 6; (6): 11 71 ad d3 00 00
row#1[8008] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 03 --> SORTED_ID = 2 as decimal
col 1; len 6; (6): 11 71 ad d3 00 01
row#2[7996] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 04 --> SORTED_ID = 3 as decimal
col 1; len 6; (6): 11 71 ad d3 00 02
row#3[7984] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 05
col 1; len 6; (6): 11 71 ad d3 00 03
row#4[7972] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 06
col 1; len 6; (6): 11 71 ad d3 00 04
row#5[7960] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 07
col 1; len 6; (6): 11 71 ad d3 00 05
row#6[7948] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 08
col 1; len 6; (6): 11 71 ad d3 00 06
row#7[7936] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 09
col 1; len 6; (6): 11 71 ad d3 00 07
(…)
(…) - skipped ..
(…)
row#484[1830] flag: -------, lock: 0, len=13
col 0; len 3; (3): c2 05 56 --> SORTED_ID = 485 as decimal
col 1; len 6; (6): 11 71 ad de 00 10
As expected, the indexed values within the leaf block are sorted (no need to convert the hex values, it’s pretty clear – I took the first 3 entries and the last one) :
for ro#0: col 0; len 2; (2): c1 02
for ro#1: col 0; len 2; (2): c1 03
for ro#3: col 0; len 2; (2): c1 04
(...)
for ro#484: col 0; len 2; (2): c2 05 56
Word of caution.
The decoding of hexadecimal values to the decimal INTEGER values is not straight forward, there are bytes devoted to store the exponent, mantissa, etc.
How you can easily check the matching decimal values (the ones you’d see in your sql*plus session being a result of a SELECT SORTED_ID from TSCLUF) with their hex-representation within the dumps is sort of reverse checks like that :
SQL> select dump(1,16) from dual ;
DUMP(1,16)
-----------------
Typ=2 Len=2: c1,2 --> matching the row#0, col0
SQL> select dump(2,16) from dual ;
DUMP(2,16)
-----------------
Typ=2 Len=2: c1,3 --> matching the row#1, col0
SQL> select dump(485,16) from dual ;
DUMP(485,16)
--------------------
Typ=2 Len=3: c2,5,56 --> matching the row#484, col0
Now, the moment of truth.
Let’s take the range of values: 1-485 which is the equivalent of a single index leaf block (the 485 entries from the dumped leaf) and count the number of blocks in the table for the selected range of ID_SORTED :
SYS@CDBTST1->TPDB1> select count(distinct DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) ) "amount_blks" from TSTUS.TSCLUF i
where id_sorted <= 485 ;
amount_blks
-----------
19
Ah, the values of ID_SORTED ranging from 1 to 485 which fill the first leaf of the index, are distributed across 19 physical table blocks.
It does not tell us much until we compare this number to the corresponding number from high CLUF.
Let’s take a closer look how would this translate to a corresponding range scan of the table for a query with the predicate: WHERE ID_SORTED <= 485
, so being an equivalent of a range of values encompassing just single index leaf block directed to the table.
With the below query I will simply count the number of adjacent values.
As a result, in each of these 19 table blocks you’ll find 26 rows that satisfy this predicate, giving us a clear picture of how the data is distributed and will eventually be accessed from within this table.
I applied quite a big filler (PAD column) – this is why just 26 rows, you’ll see it later.
Here it is :
SYS@CDBTST1->TPDB1> select DBMS_ROWID.ROWID_RELATIVE_FNO(rowid) as "#File", DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) "#Block", count(*) "Number_rows_in_this_block"
from TSTUS.TSCLUF i
where id_sorted <= 485
group by DBMS_ROWID.ROWID_RELATIVE_FNO(rowid), DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)
order by 1,2 ;
#File #Block Number_rows_in_this_block
---------- ---------- -------------------------
54 2646760 26
54 2646761 26
54 2646762 26
54 2646763 26
54 2646764 26
54 2646765 26
54 2646766 26
54 2646767 26
69 3255763 26
69 3255764 26
69 3255765 26
69 3255766 26
69 3255767 26
69 3255769 26
69 3255770 26
69 3255771 26
69 3255772 26
69 3255773 26
69 3255774 17
19 rows selected. --> also: number of table's blocks
How to interpret the above?
Let’s take the #Block = 2646760 (marked red) and play with a simple assumption :
If, for example, a query had the following condition in its predicate section: “Select * from TSTUS.TSCLUF where ID_SORTED between N and N+26”, we can expect that this query would return 26 rows accessing just one physical block of the table.
This obviously makes a huge and positive difference as oppose to the predicate characterized by high CLUF, that is, for the predicate: “where ID_MESS between N and N+26”.
We will see it soon.
For even better understanding let’s combine the two outputs from above and list the block# along with the ordinal number of each row within that block and the true value of ID_SORTED within that block#.
Here it is.
I left just several representative entries.
We can expect 19 different blocks with 26 rows (see #Row ranging from 0-25) in each block with altogether 485 rows.
This is indeed what we achieved :
SQL> break on #Block skip 1
SQL> select rowid, DBMS_ROWID.ROWID_RELATIVE_FNO(rowid) as "#File", DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) "#Block" , DBMS_ROWID.ROWID_ROW_NUMBER(rowid) "#Row", ID_SORTED
from TSTUS.TSCLUF i where id_sorted <= 485
order by "#Block", "#Row" ;
ROWID #File #Block #Row ID_SORTED
------------------ ---------- ---------- ---------- ----------
AAAsZBAA2AAKGLoAAA 54 2646760 0 131
AAAsZBAA2AAKGLoAAB 54 1 132
AAAsZBAA2AAKGLoAAC 54 2 133
AAAsZBAA2AAKGLoAAD 54 3 134
AAAsZBAA2AAKGLoAAE 54 4 135
AAAsZBAA2AAKGLoAAF 54 5 136
AAAsZBAA2AAKGLoAAG 54 6 137
AAAsZBAA2AAKGLoAAH 54 7 138
AAAsZBAA2AAKGLoAAI 54 8 139
AAAsZBAA2AAKGLoAAJ 54 9 140
AAAsZBAA2AAKGLoAAK 54 10 141
AAAsZBAA2AAKGLoAAL 54 11 142
AAAsZBAA2AAKGLoAAM 54 12 143
AAAsZBAA2AAKGLoAAN 54 13 144
AAAsZBAA2AAKGLoAAO 54 14 145
AAAsZBAA2AAKGLoAAP 54 15 146
AAAsZBAA2AAKGLoAAQ 54 16 147
AAAsZBAA2AAKGLoAAR 54 17 148
AAAsZBAA2AAKGLoAAS 54 18 149
AAAsZBAA2AAKGLoAAT 54 19 150
AAAsZBAA2AAKGLoAAU 54 20 151
AAAsZBAA2AAKGLoAAV 54 21 152
AAAsZBAA2AAKGLoAAW 54 22 153
AAAsZBAA2AAKGLoAAX 54 23 154
AAAsZBAA2AAKGLoAAY 54 24 155
AAAsZBAA2AAKGLoAAZ 54 25 156
AAAsZBAA2AAKGLpAAA 54 2646761 0 157
AAAsZBAA2AAKGLpAAB 54 1 158
AAAsZBAA2AAKGLpAAC 54 2 159
AAAsZBAA2AAKGLpAAD 54 3 160
AAAsZBAA2AAKGLpAAE 54 4 161
AAAsZBAA2AAKGLpAAF 54 5 162
AAAsZBAA2AAKGLpAAG 54 6 163
AAAsZBAA2AAKGLpAAH 54 7 164
AAAsZBAA2AAKGLpAAI 54 8 165
AAAsZBAA2AAKGLpAAJ 54 9 166
AAAsZBAA2AAKGLpAAK 54 10 167
AAAsZBAA2AAKGLpAAL 54 11 168
AAAsZBAA2AAKGLpAAM 54 12 169
AAAsZBAA2AAKGLpAAN 54 13 170
AAAsZBAA2AAKGLpAAO 54 14 171
AAAsZBAA2AAKGLpAAP 54 15 172
AAAsZBAA2AAKGLpAAQ 54 16 173
AAAsZBAA2AAKGLpAAR 54 17 174
AAAsZBAA2AAKGLpAAS 54 18 175
AAAsZBAA2AAKGLpAAT 54 19 176
AAAsZBAA2AAKGLpAAU 54 20 177
AAAsZBAA2AAKGLpAAV 54 21 178
AAAsZBAA2AAKGLpAAW 54 22 179
AAAsZBAA2AAKGLpAAX 54 23 180
AAAsZBAA2AAKGLpAAY 54 24 181
AAAsZBAA2AAKGLpAAZ 54 25 182
(…)
AAAsZBABFAAMa3TAAA 69 3255763 0 1
AAAsZBABFAAMa3TAAB 69 1 2
AAAsZBABFAAMa3TAAC 69 2 3
AAAsZBABFAAMa3TAAD 69 3 4
AAAsZBABFAAMa3TAAE 69 4 5
AAAsZBABFAAMa3TAAF 69 5 6
AAAsZBABFAAMa3TAAG 69 6 7
AAAsZBABFAAMa3TAAH 69 7 8
AAAsZBABFAAMa3TAAI 69 8 9
AAAsZBABFAAMa3TAAJ 69 9 10
AAAsZBABFAAMa3TAAK 69 10 11
AAAsZBABFAAMa3TAAL 69 11 12
AAAsZBABFAAMa3TAAM 69 12 13
AAAsZBABFAAMa3TAAN 69 13 14
AAAsZBABFAAMa3TAAO 69 14 15
AAAsZBABFAAMa3TAAP 69 15 16
AAAsZBABFAAMa3TAAQ 69 16 17
AAAsZBABFAAMa3TAAR 69 17 18
AAAsZBABFAAMa3TAAS 69 18 19
AAAsZBABFAAMa3TAAT 69 19 20
AAAsZBABFAAMa3TAAU 69 20 21
AAAsZBABFAAMa3TAAV 69 21 22
AAAsZBABFAAMa3TAAW 69 22 23
AAAsZBABFAAMa3TAAX 69 23 24
AAAsZBABFAAMa3TAAY 69 24 25
AAAsZBABFAAMa3TAAZ 69 25 26
AAAsZBABFAAMa3UAAA 69 3255764 0 27
AAAsZBABFAAMa3UAAB 69 1 28
AAAsZBABFAAMa3UAAC 69 2 29
AAAsZBABFAAMa3UAAD 69 3 30
AAAsZBABFAAMa3UAAE 69 4 31
AAAsZBABFAAMa3UAAF 69 5 32
AAAsZBABFAAMa3UAAG 69 6 33
AAAsZBABFAAMa3UAAH 69 7 34
AAAsZBABFAAMa3UAAI 69 8 35
AAAsZBABFAAMa3UAAJ 69 9 36
AAAsZBABFAAMa3UAAK 69 10 37
AAAsZBABFAAMa3UAAL 69 11 38
AAAsZBABFAAMa3UAAM 69 12 39
AAAsZBABFAAMa3UAAN 69 13 40
AAAsZBABFAAMa3UAAO 69 14 41
AAAsZBABFAAMa3UAAP 69 15 42
AAAsZBABFAAMa3UAAQ 69 16 43
AAAsZBABFAAMa3UAAR 69 17 44
AAAsZBABFAAMa3UAAS 69 18 45
AAAsZBABFAAMa3UAAT 69 19 46
AAAsZBABFAAMa3UAAU 69 20 47
AAAsZBABFAAMa3UAAV 69 21 48
AAAsZBABFAAMa3UAAW 69 22 49
AAAsZBABFAAMa3UAAX 69 23 50
AAAsZBABFAAMa3UAAY 69 24 51
AAAsZBABFAAMa3UAAZ 69 25 52
(…)
485 rows selected --> roughly: 19 (blocks) x 26 (rows in each block)
The block# = 3255763 (marked above) is btw. the one which partially covers the corresponding index leaf block which I dumped above (see the values of ID_SORTED encompassing the range : 1 – 26. So: “row 0” – “row 25”)
Let’s verify that :
SYS@CDBTST1->TPDB1> alter system dump datafile 69 block 3255763 ;
System altered.
The corresponding content of the table block representing the file# = 69 and block# = 3255763 is below (excerpt of the dump).
Start dump data blocks tsn: 8 file#:69 minblk 3255763 maxblk 3255763
….many other (interesting) entries skipped
Finally, the content of the dump of the table block we are after :
block_row_dump:
tab 0, row 0, @0x1e71
tl: 271 fb: --H-FL-- lb: 0x0 cc: 3
col 0: [ 2] c1 02 --> value: 1 (integer)
col 1: [ 5] c4 04 0e 58 36
col 2: [256]
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58
tab 0, row 1, @0x1d62
tl: 271 fb: --H-FL-- lb: 0x0 cc: 3
col 0: [ 2] c1 03 --> value: 2 (integer)
col 1: [ 5] c4 04 30 02 3d
col 2: [256]
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58
…
… more entries … …
… until the last one: row 25 (26th entry) :
tab 0, row 25, @0x3fb
tl: 271 fb: --H-FL-- lb: 0x0 cc: 3
col 0: [ 2] c1 1b --> value: 26 (integer)
col 1: [ 5] c4 07 4e 22 51
col 2: [256]
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58 58
58 58 58 58 58 58
end_of_block_dump
Indeed, 26 rows with the prefix “col 0” (e.g. the first one: “col 0: [ 2] c1 02”) being the hex representation of ID_SORTED.
Most importantly: the values for “col 0” are sorted within the table block whereas the values for “col 1” being such representation for ID_MESS are everything but sorted – absolute mess! Almost needless to carry out the subsequent test : )
For completeness, the “col 2” is obviously the hex representation of PAD with 256 bytes filled with ‘X’ (hex 58 is ‘X’). You can check it with e.g.
select utl_raw.cast_to_varchar2(hextoraw( replace(‘&hex_value’,chr(32),”) )) mystr from dual ;
Obviously, the values 1-26 don’t have to be sorted within the block to keep the CLUF low – it’d be enough if the block was kind of closed compound for the values 1-26.
Meaning, the values 1-26 can also be scattered within the block and CLUF won’t change, hence the efficiency of any associated query with the predicate “where ID_SORTED between 1 and 26” will be exactly the same (and satisfactory).
Reason is simple: Oracle fetches the entire (e.g. 8KB) block from disk, then traverses over the rows within that block, possibly filtering them out.
They are indeed sorted within each block since the table was created in such way.
Test with high Clustering Factor
Let’s repeat the above analysis, this time with the column of MESS_ID, so the one which is also indexed (thus sorted within the index leaf blocks), but it’s scattered around the table’s blocks.
As before: I will dump the index tree, then will dump the selected block from that tree (first leaf block) and will carry on with the analogical analysis.
OBJECT SUBOBJECT_NAME OBJECT_ID DATA_OBJECT_ID OBJECT_TYPE
------------------------------------------------------------ ------------------------- ---------- -------------- -------------
TSTUS.IDX_ID_MESS 181827 181827 INDEX
SYS@CDBTST1->TPDB1> ALTER SESSION SET TRACEFILE_IDENTIFIER='&trfile_suffix';
Enter value for trfile_suffix: IDXTREE_MESS
old 1: ALTER SESSION SET TRACEFILE_IDENTIFIER='&trfile_suffix'
new 1: ALTER SESSION SET TRACEFILE_IDENTIFIER='IDXTREE_MESS'
Session altered.
20:45:28 SYS@CDBTST1->TPDB1>
20:45:28 SYS@CDBTST1->TPDB1>
20:45:28 SYS@CDBTST1->TPDB1> alter session set events 'immediate trace name treedump level 181827' ;
Session altered.
----- begin tree dump
branch: 0x10f640bb 284573883 (0: nrow: 40, level: 2)
branch: 0xc142b14 202648340 (-1: nrow: 638, level: 1)
leaf: 0x10f640bc 284573884 (-1: row:485.485 avs:821)
leaf: 0x10f640bd 284573885 (0: row:479.479 avs:816)
leaf: 0x10f640be 284573886 (1: row:478.478 avs:830)
leaf: 0x10f640bf 284573887 (2: row:479.479 avs:818)
leaf: 0x10f640c0 284573888 (3: row:478.478 avs:829)
leaf: 0x10f640c1 284573889 (4: row:478.478 avs:827)
leaf: 0x10f640c2 284573890 (5: row:479.479 avs:820)
leaf: 0x10f640c3 284573891 (6: row:479.479 avs:821)
leaf: 0x10f640c4 284573892 (7: row:478.478 avs:829)
(…)
leaf: 0xda462ef 228877039 (338: row:422.422 avs:827)
leaf: 0xda462f0 228877040 (339: row:422.422 avs:822)
leaf: 0xda462f1 228877041 (340: row:41.41 avs:7299)
----- end tree dump
The anatomy of the index tree dump has been presented by Richard Foote here and I have nothing to add, just can learn a lot.
Btw. from Oracle 12c there’s been introduced a new format of the index tree and Richard covers that in his post.
Perhaps just one interesting thing to notice.
row:<number> informs you about the number of indexed entries (values) within a given index leaf block.
You might be wondering why the first leaf block contains 485 such entries but the “capacity” of a leaf block gets gradually decreased up until the last before last leaf with just 422 entries (the very last one can be skipped as simply not being fully stuffed with indexed values).
If you recall that B*Tree index presents a sorted structure of indexed values, and in our case the first value stored in the first leaf block is 1, the last is 10’000’000 you’ll surely come to the following conclusion :
along with the increase of the integer values (from 1 to 10’000’000) Oracle will simply need more room (measured in bytes) to store these values as hex values.
Therefore, as the leaf blocks grow from “left to right”, the less ID_MESS (or ID_SORTED – no difference) entries will be able to be accommodated withing a given leaf block because the numbers grow.
So, for example, the first leaf is able to accommodate 485 entries since the values of ID_SORTED start from 1 and ends with 485.
So, its hex representation spans the range 2 – 3 bytes with first hex: c1 02 (integer: 1) and the last: c2 05 56 (integer: 485).
The last before last block leaf (I am taking the one which is the last full) is able to accommodate just 422 entries, as it needs more space to store a single entry of ID_MESS because in that leaf we are around the values of 10’000’000.
Coming back to the test – let’s dump the selected index leaf.
SYS@CDBTST1->TPDB1> ALTER SESSION SET TRACEFILE_IDENTIFIER='&trfile_suffix';
Enter value for trfile_suffix: idx_leaf_dump_mess
old 1: ALTER SESSION SET TRACEFILE_IDENTIFIER='&trfile_suffix'
new 1: ALTER SESSION SET TRACEFILE_IDENTIFIER='idx_leaf_dump_mess'
Session altered.
20:51:25 SYS@CDBTST1->TPDB1>
20:51:26 SYS@CDBTST1->TPDB1> select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(&&rba), DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(&&rba) from dual;
Enter value for rba: 284573884
old 1: select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(&&rba), DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(&&rba) from dual
new 1: select DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(284573884), DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(284573884) from dual
DBMS_UTILITY.DATA_BLOCK_ADDRESS_FILE(284573884) DBMS_UTILITY.DATA_BLOCK_ADDRESS_BLOCK(284573884)
----------------------------------------------- ------------------------------------------------
67 3555516
SYS@CDBTST1->TPDB1> alter system dump datafile 67 block 3555516 ;
System altered.
As in the case of the previous index, also the IDX_ID_MESS contains sorted values within its leaf blocks.
As before, I took the first leaf and dumped it. Also this leaf block contains 485 sorted entries (rows are numbered starting from 0).
See “col 0” entries :
Leaf block dump
===============
header address 5289869412=0x13b4d0064
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 485
kdxcofbo 1006=0x3ee
kdxcofeo 1827=0x723
kdxcoavs 821
kdxlespl 0
kdxlende 0
kdxlenxt 284573885=0x10f640bd
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8032
row#0[8020] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 02 --> MESS_ID = 1 as decimal
col 1; len 6; (6): 0d a4 3f a3 00 15
row#1[8008] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 07 --> MESS_ID = 6 as decimal
col 1; len 6; (6): 11 71 fc b3 00 0e
row#2[7996] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 08 --> MESS_ID = 7 as decimal
col 1; len 6; (6): 10 f6 a3 d6 00 0f
row#3[7984] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 08
col 1; len 6; (6): 10 f6 ef b9 00 0e
row#4[7972] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 08
col 1; len 6; (6): 11 e0 85 48 00 0e
(..)
(..)
row#481[1866] flag: -------, lock: 0, len=13
col 0; len 3; (3): c2 05 5f
col 1; len 6; (6): 11 a1 df 2a 00 01
row#482[1853] flag: -------, lock: 0, len=13
col 0; len 3; (3): c2 05 60
col 1; len 6; (6): 11 71 f7 ab 00 15
row#483[1840] flag: -------, lock: 0, len=13
col 0; len 3; (3): c2 05 62
col 1; len 6; (6): 0d 9d 84 4a 00 08
row#484[1827] flag: -------, lock: 0, len=13
col 0; len 3; (3): c2 05 62 --> MESS_ID = 497 as decimal
col 1; len 6; (6): 11 35 3c ce 00 18
----- end of leaf block Logical dump -----
----- end of leaf block dump -----
You might be wondering, why the values are not consecutive (1,2,3,4…485) – this is because I have used the trunc( dbms_random.value(1,10e6) ) as ID_MESS in order to generate the scattering and the truncating of the outcome of dbms_random ended up with quite a number of the same values (still, its range is broad enough – nearly 6.3 mln distinct values across the range of 10 mln of all rows – see the NDV in “Data distribution and some other characteristics of these columns” somewhere at the beginning of the post).
I could have utilized mod (modulo function) to populate the data. Having applied mod I would have also produced the perfect mess, but with the consecutive values scattered around the table’s blocks.
Every cloud has a silver lining though : ) – with that tiny “imperfectness” I have achieved quite an interesting caveat which I will describe below (or will put it in another short post).
Just as before, let’s check how many table’s blocks Oracle needs in order to store (and eventually visit) the first 485 values which the equivalent of one index leaf block storing the ID_MESS :
SYS@CDBTST1->TPDB1> select count(distinct DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) ) "amount_blks" from TSTUS.TSCLUF i where id_mess <= 485 ;
amount_blks
-----------
469
Oh, it got much worse.
How to understand it?
If you remember, with low CLUF it’d be just 19 physical reads from the table (19 table’s blocks would satisfy the predicate of ID_SORTED <= 485).
Now we’d need 469 physical reads from the same table for the adequate predicate, this time: ID_MESS <= 485.
So, putting it in different way, with the high (admittedly, extremely high) CLUF reading of one index leaf block would trigger 469 physical reads from the corresponding table.
Huge difference, isn’t it?
Again, a bit deeper.
As above, I will run the same simple query grouping by :
DBMS_ROWID.ROWID_RELATIVE_FNO(rowid), DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)
and counting the number of rows in each block satisfying the predicate ID_MESS <= 485.
SYS@CDBTST1->TPDB1> select DBMS_ROWID.ROWID_RELATIVE_FNO(rowid) as "#File", DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) "#Block", count(*) "Number_rows_in_this_block"
from TSTUS.TSCLUF i
where id_mess <= 485
group by DBMS_ROWID.ROWID_RELATIVE_FNO(rowid), DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) order by 1,2
;
#File #Block Number_rows_in_this_block
---------- ---------- -------------------------
45 2048617 1
45 2048794 1
45 2048926 1
45 2049315 1
46 2057193 1
50 2352355 1
50 2352375 1
50 2352679 1
51 2196098 1
51 2196799 1
53 2458165 1
54 1523955 1
(…)
(…)
71 2153163 1
71 2153975 1
71 2156597 1
71 2157035 1
71 2157316 1
71 2158191 1
71 2159030 1
71 2159697 1
71 2162949 1
71 2162979 1
469 rows selected.
No surprise – for this exemplary listing each such block contains just one row which satisfies the predicate!
Note, with low CLUF we had 26 rows!
To be consequent, let’s combine the two outputs from above and list the block# along with the ordinal number of each row within that block and the true value of ID_MESS within that block#.
Here it is (I left several representative entries).
SYS@CDBTST1->TPDB1> break on #Block skip 1
select rowid, DBMS_ROWID.ROWID_RELATIVE_FNO(rowid) as "#File", DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) "#Block" , DBMS_ROWID.ROWID_ROW_NUMBER(rowid) "#Row", ID_MESS from TSTUS.TSCLUF i where id_mess <= 485 order by "#Block", "#Row"
ROWID #File #Block #Row ID_MESS
------------------ ---------- ---------- ---------- ----------
AAAsZBAA2AAF0DzAAC 54 1523955 2 118
AAAsZBAA2AAF0D/AAJ 54 1523967 9 183
AAAsZBABHAAG+SQAAP 71 1827984 15 102
AAAsZBABHAAG+TdAAW 71 1828061 22 103
AAAsZBABHAAG+bZAAF 71 1828569 5 121
AAAsZBABHAAG+skAAS 71 1829668 18 269
AAAsZBABGAAHWXJAAT 70 1926601 19 484
AAAsZBAA2AAHYN8AAG 54 1934204 6 445
AAAsZBAA2AAHYRmAAD 54 1934438 3 458
(….)
(…)
AAAsZBABDAAN1saAAO 67 3627802 14 381
AAAsZBABDAAN1saAAR 67 17 8
AAAsZBABDAAN1tlAAO 67 3627877 14 40
AAAsZBABDAAN18BAAN 67 3628801 13 330
AAAsZBABDAAN18EAAV 67 3628804 21 31
AAAsZBABDAAN2e1AAQ 67 3631029 16 129
AAAsZBABDAAN2zhAAZ 67 3632353 25 318
AAAsZBABDAAN3FQAAS 67 3633488 18 147
AAAsZBABDAAN3QPAAE 67 3634191 4 11
AAAsZBABDAAN3ROAAG 67 3634254 6 233
AAAsZBABDAAN3hrAAW 67 3635307 22 445
AAAsZBABDAAN30gAAL 67 3636512 11 270
AAAsZBABDAAN31/AAW 67 3636607 22 456
AAAsZBABDAAN4CtAAL 67 3637421 11 65
AAAsZBABDAAN4OhAAA 67 3638177 0 165
AAAsZBABDAAN4dSAAB 67 3639122 1 421
470 rows selected.
We were “a bit lucky” in only one case, that is only block# 3627802 (marked above) stores more than 1 row (but still only 2) satisfying the predicate <= 485.
if you will, it’s for the ID_MESS = 381 (at the ordinal position 14) and ID_MESS = 8 (at the position 17 in the table’s block).
In all other cases the rest of the table blocks which would need to be visited contain just ONE row satisfying the predicate.
That, of course, is highly inefficient and we will try to quantify the “scale of evil” in the subsequent sections.
So much about the dumps.
Let’s jump to the cases represented by SELECT statements and examine the true effort measured with the number of physical reads which are hidden behind a query retrieving its data from the table by the means of either one: ID_SORTED or ID_MESS in the predicate section.
Clustering Factor – part3.
Leave a Reply