11G OCM Use multi column statistics


References

Oracle® Database Performance Tuning Guide 11g Release 1 (11.1)

Oracle® Database PL/SQL Packages and Types Reference 11g Release 1 (11.1)

Oracle® Database Reference 11g Release 1 (11.1)

Overview

The 11G OCM Upgrade Exam objective is:

Use multi column statistics

This seems to mean:

  1. Collection of multi-column statistics
  2. Influencing plan chosen because of multi-column statistics

Procedure

The use of multicolumn statistics is needed when the table is not in at least third normal form. That is, the value of a column is only dependent on the primary key.

The discussion in MultiColumn Statistics gives the example of the following two columns in the sample SH.CUSTOMERS table:

  1. CUST_STATE_PROVINCE
  2. COUNTRY_ID

With individual column statistics the optimizer has no way of knowing that the cust_state_province and the country_id columns are related. By gathering statistics on these columns as a group (column group), the optimizer will now have a more accurate selectivity value for the group, instead of having to generate the value based on the individual column statistics.

Creating a Column Group

Following the example given in MultiColumn Statistics , the following PL/SQL block is run in the SH schema (using the CREATE_EXTENDED_STATISTICS procedure):

declare
  cg_name varchar2(30);
begin
 cg_name := dbms_stats.create_extended_stats(null,'customers', '(cust_state_province,country_id)');
end;
/

Monitoring Column Groups

The ALL_STAT_EXTENSIONS view shows:

OWNER TABLE_NAME EXTENSION_NAME EXTENSION CREATOR DROPPABLE
SH CUSTOMERS SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ ("CUST_STATE_PROVINCE","COUNTRY_ID") USER YES

Using the query from MultiColumn Statistics , but on the ALL_STAT_EXTENSIONS and ALL_TAB_COL_STATISTICS views instead, the following query returns no rows found:

SELECT
    e.extension col_group,
	t.num_distinct,
	t.histogram
  FROM
      all_stat_extensions e
    INNER JOIN
	  all_tab_col_statistics t
	ON (
	    e.extension_name = t.column_name
	  AND
	    e.table_name = t.table_name
	)
  WHERE
    t.table_name = 'CUSTOMERS';

Gathering Statistics on Column Groups

Onto the new step of Gathering Statistics on Column Groups in MultiColumn Statistics . Here we use the GATHER_TABLE_STATS procedure:

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',
    METHOD_OPT => 'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY'
	);
END;
/

Checking the statistics as before via:

SELECT
    e.extension col_group,
	t.num_distinct,
	t.histogram
  FROM
      all_stat_extensions e
    INNER JOIN
	  all_tab_col_statistics t
	ON (
	    e.extension_name = t.column_name
	  AND
	    e.table_name = t.table_name
	)
  WHERE
    t.table_name = 'CUSTOMERS';
COL_GROUP NUM_DISTINCT HISTOGRAM
("CUST_STATE_PROVINCE","COUNTRY_ID") 145 FREQUENCY

Explain Plan

The Explain Plan for the following query is:

EXPLAIN PLAN FOR
select * from customers where cust_state_province = 'CA' and country_id=52790;
SELECT * FROM TABLE( dbms_xplan.display() );
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 2008213504

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |  3448 |   646K|   406   (1)| 00:00:05 |
|*  1 |  TABLE ACCESS FULL| CUSTOMERS |  3448 |   646K|   406   (1)| 00:00:05 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)

13 rows selected.