Or was it:
I can’t remember…
And when I added I’ve used optimizer_index_cost_adj to deal with it I heard:
No, seriously he was very polite and respectful for some unknown DBA spreading such heresies. I would’ve killed the poor fellow straight away if I heard such things 😉
So with dedication to him for being such a fine bloke let’s get down to the business.
In general Neil is of course right with both answers, however there are some Oracle quirks that may make you think that way. In this part I’m going to start with very core aspects of how things work and some common cause of optimizer falling to the dark side of force.
Oracle 9i introduced the concept of system statistics to the cost based optimizer, and with 10g they become enabled by default (NOWORKLOAD stats). I’m not going to describe all the details, since it has already been done many years ago. For those of you who don’t know how the cost is calculated I recommend excellent blog series by Randolf Geist – really amazing stuff did he – it is necessary to understand it for all the later talks. For this part we only need to note the calculated cost of full table scan (FTS) is directly related to multi-block read times (and inversely to single ones too, but only to convert for backward compatibility, as initially the cost was expressed in single-block reads), which for NOWORKLOAD stats are derived from I/O seek time and transfer speed, with multi block read count (MBRC) setting adjustment for multi block reads. Using default NOWORKLOAD stats or gathering one yourselves won’t rather get you into problems, as the values will just get more realistic with the formulas still being the same. Gathering WORKLOAD stats however is a whole different kettle of fish. First, the values for single and multi-block read times are really measured, not synthesized using formulas based on seek time and transfer speed. MBRC is measured too. Second – other new measured parameters as MAXTHR (maximum I/O throughput) and SLAVETHR (slave throughput) start to play a role in costing the FTS (parallel cost). Jonathan Lewis described them here and here respectively. So what’s the point of all of this?
Well, it is not uncommon for the GATHER_SYSTEM_STATS procedure to get these values wrong. For many reasons known or not to me (read: bugs). You may get wrong ratio of SREADTIM to MREADTIM resulting in some fallback safeguard kicking in behind the scenes or slave throughputs measured too low, raising the cost of parallel FTS to the level of being below index access path. And then you’ll see in your queries the love Oracle has for indexes 🙂 If you use shared storage for your servers you may be just unlucky to pick the wrong time to do the measurement. If you use direct NFS, engineered system like Exadata, or basically any other uncommon setup you’re also more likely to catch some bug.
The solution to above? First off verify the results from gathering WORKLOAD stats. If the numbers are odd retry, check for known bug on MOS and eventually set the stats yourself with SET_SYSTEM_STATS – that is if you know your system. Don’t use any unrealistic numbers, test the results with 10053 trace and you should be fine.
A bit on a side note, there is also another deficiency with current approach of I/O performance measurement procedure. It just assumes the storage for the database to have equal performance characteristics, as it runs on all of it. If you are using different storage tiers for purpose of information lifecycle management, like slower, bigger disks for archive data and fast flash ones for hot data you’ll get some average numbers out of it. Statistically human and a cheetah have each three legs on average, so they should run the same. With queries using tables stored on different storage optimizer can’t currently take that into account when making execution plan. I would very much like it could.
Now, about that evil optimizer_index_cost_adj parameter. It had it’s golden years in Oracle 8 era, where the system stats were not available and single and multi block reads were treated more or less the same when costing. Back then it was used mostly to lower the cost of index access path. With the advent of 9i things have turned around as described above – so now it can rather be more often found used to make index access paths more costly (multiplied by number of hundreds above default 100) when they start to suddenly appear. This approach however has several drawbacks – these three should suffice:
- The issue does not stem from costing of index access paths, which stays the same, but from costing of full table scans – due to many reasons. Changing this parameter won’t solve the root cause of the issue.
- It is system-wide change. Queries you want will get affected, but also the ones you don’t want to – such as using database dictionary – for a worse. Trust me on that 🙂
- It may influence some other places in optimizer in an undesired way (like we know it all how it works…).
So why did I do it? The answer is: to gain more time. When the fire is raging the first priority is to put it out, and then with the situation more or less under control have the time to investigate the cause of it. Still you need to know at least what’s burning, so you don’t use water to douse oil 🙂 Think of it like well known variable on the opposite side of the equation where the unknown lies. I’m not going to recommend this approach, but it’s there when things start to get hot. Just be sure to treat it only as a quick and dirty hack to buy some time, and revert it as soon as real cause and solution is found. Check the system stats first, though.
To me all of this hassle stems from the fact that in the beginning cost was just reflecting a number of single block reads that need to be made. The equations for index access paths are pretty much written in stone and not influenced by anything except few known stats and above mentioned knob. Full table scans on the other hand have to deal with many more complicated parameters, take into account way more mechanisms which in turn evolve over time and as such are more prone to errors. Like partition pruning, which I’m going to show you in part two of this journey… More brevity and tech stuff this time, less mumbling – I promise 🙂