Let’s take a look at the point within the 10053 trace which, albeit usually not critical, sometimes may be a tip turning the scales.
You have surely noticed that a total cost reported in the trace involves two summands: IO_COST and CPU_COST, where although in real the latter being just a fraction of the entire COST figure typically shows stunningly high value.
For example: 15 and 759715.2, respectively (see below).
Yet the total COST shown is 15.020259.
So, it becomes clear that they cannot simply be added and there must be some conversion formula which needs to be applied (to CPU_COST) before it can be added to IO_COST order to achieve the final number.
Note.
For a comprehensive dive into the calculations of IO_COST, I encourage you to explore this post. It features an in-depth analysis of the topic, complete with examples. Additionally, it provides insights into aligning SQL Plan IDs with the corresponding sections of the 10053 trace.
Particularly valuable when explaining the underlying reasons for selecting a full table scan, even in cases where an index access path is expected to be more favorable. The evaluation is conducted in the context of high and low Clustering Factor (CLUF).
Coming back to CPU_COST.
Let’s look at the example.
For clarity, it’s an excerpt of costing of a full table scan of a table (STAG01), but the conversion formula shown in the subsequent section applies to all cases, regardless whether it’s costing of a FTS or a join, etc.
1075 SINGLE TABLE ACCESS PATH
1076 Single Table Cardinality Estimation for STAG01[A]
1077 SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
1078 Table: STAG01 Alias: A
1079 Card: Original: 1000.000000 Rounded: 1000 Computed: 1000.000000 Non Adjusted: 1000.000000
1080 Scan IO Cost (Disk) = 15.000000
1081 Scan CPU Cost (Disk) = 759715.200000
1082 Total Scan IO Cost = 15.000000 (scan (Disk))
1083 = 15.000000
1084 Total Scan CPU Cost = 759715.200000 (scan (Disk))
1085 = 759715.200000
1086 Access Path: TableScan
1087 Cost: 15.020259 Resp: 15.020259 Degree: 0
1088 Cost_io: 15.000000 Cost_cpu: 759715
1089 Resp_io: 15.000000 Resp_cpu: 759715
1090 Best:: AccessPath: TableScan
1091 Cost: 15.020259 Degree: 1 Resp: 15.020259 Card: 1000.000000 Bytes: 0.000000
So, let’s go through relevant lines marked red :
Total Scan IO Cost = 15.000000 (scan (Disk))
Total Scan CPU Cost = 759715.200000 (scan (Disk))
Cost: 15.020259
First, if we tried to simply add both summands:
Total Scan IO Cost (15.000000) + Total Scan CPU Cost (759715.200000) we would not achieve the total:
(Total) Cost: 15.020259
It turns obvious that the 0.020259 (being the difference between the TOTAL and COST_IO) corresponds to 759715.200000, so there must be a conversion formula to be applied which transforms the 759715.200000 to 0.020259 in order (for us, humans) to be able to compare apples to apples.
Here it is:
converted CPU Cost = Cost_cpu_from_10053 / 1000 / SREADTIM / CPUSPEED(NW)
Now, there is basically one cryptic variable: SREADTIM (average single block read time, its formula explained later), wheres the CPUSPEED (NW = no workload) can be found directly in the 10053 trace in this section representing the SYSTEM STATS :
1030 SYSTEM STATISTICS INFORMATION
1031 -----------------------------
1032 Using dictionary system stats.
1033 Using NOWORKLOAD Stats
1034 CPUSPEEDNW: 3125 millions instructions/sec (default is 100)
1035 IOTFRSPEED: 4096 bytes per millisecond (default is 4096)
1036 IOSEEKTIM: 10 milliseconds (default is 10)
1037 MBRC: 128 blocks (default is 8)
So, first, how is the SREADTIM calculated?
SREADTIM = IOSEEKTIM + db_block_size / IOTFRSPEED
db_block_size – it’s obviously the DB Block Size, in my case 8192 (also to be found in the trace file).
Let’s combine the two formulas based on the above data:
SREADTIM = 10 + 8192 / 4096
SREADTIM = 12
Finally:
Converted CPU Cost = Cost_cpu_from_10053 / 1000 / SREADTIM / CPUSPEED(NW)
Converted CPU Cost = 759715.200000 / 1000 / 12 / 3125
Finally: Converted CPU Cost = 0,020259072 (= ~0.020259 which is the sought number being the difference between the TOTAL COST and COST_IO).
Now if we add them up, we will achieve the total reported cost for the full table scan of STAG01:
Total Scan IO Cost (15.000000)
+ Total [Converted] Scan CPU (0.020259)
========================
= Total Cost (15.020259)
Another example
This example happens to represent the costing of a Nested Loop.
I will go through it putting the comments inline directly underneath the costing section from the 10053 trace, applying the same formula.
Note, the summands are not that obvious as before.
excerpt of 10053 trace :
1330 NL Join: Cost: 138912.140697 Resp: 138912.140697 Degree: 1
1331 Cost_io: 138696.000000 Cost_cpu: 8105276139
1332 Resp_io: 138696.000000 Resp_cpu: 8105276139
138912,140697 (total NL cost = shown as "NL Join: Cost:")
-138696,000000 (Cost_io)
================
216,140697 --> this is the sought number. It will be converted from Cost_cpu: 8105276139 based on the below formula
Conversion:
converted CPU Cost : Cost_cpu_from_10053 / 1000 / SREADTIM / CPUSPEEDNW
SREADTIM = IOSEEKTIM + db_block_size / IOTFRSPEED = 12
CPU Cost Factor SREADTIME CPUSPEEDNW (CPU Speed from "no workload stats")
8105276139 1000 12 3125
Converted CPU Cost:
216,140697
Check :
138912,140697 (I/O Cost) + 216,140697 (Converted CPU Cost) = 138912,140697 (Total NL Join Cost)
Summary
As you may have noticed, this conversion formula heavily relies on system statistics. Consequently, if your CPU Cost component exhibits unusually high values, you should investigate the relevant parameters—starting with the calculated value of SREADTIM and, most importantly, CPUSPEEDNW.
Both parameters serve as divisors in the formula mentioned above. Therefore, if either—or particularly CPUSPEEDNW—shows suspiciously low values, this factor may appear abnormally high.
Why is CPUSPEEDNW more important?
The key reason lies in the typical value ranges of these parameters.
SREADTIM generally fluctuates around a dozen or so, representing relatively low values. In contrast, CPUSPEEDNW is typically a much larger figure, reflecting the number of millions of instructions per second (MIPS) the CPU can process.
Since CPUSPEEDNW acts as a divisor in the formula, its higher magnitude means that even slight deviations can significantly impact the resulting value. Consequently, its influence on the final calculation is considerably greater than that of SREADTIM.
I agree, it’s a tiny thing.
Perhaps it’ll turn out useful one day.
Leave a Reply