“every time you decide something without having a good reason, you jump to Conclusions whether you like it or not.”
-- Norton Juster, The Phantom Tollbooth
In an Oracle forum recently the following question was posted:
I have a request for 3 indices as shown below. Does the 1st index suffice for 2 and 3? Do I need all 3?
CREATE INDEX IDX_ATM_EM_EFF_VER_CURRENT_DEL ON ATM_xxx_SALARY
(EMPLOYEE_KEY, EFFECTIVE_DT, SALARY_VERSION_NUMBER, IS_CURRENT, IS_DELETED);
CREATE INDEX IDX_ATM_EM_VER_CURRENT ON ATM_xxx_SALARY
(EMPLOYEE_KEY, SALARY_VERSION_NUMBER, IS_CURRENT);
CREATE INDEX .IDX_ATM_SAL_CURRENT_DEL ON ATM_xxx_SALARY
(EMPLOYEE_KEY, IS_DELETED, IS_CURRENT);
Answering such a question in absence of any additional information isn’t easy, but looking at the columns it’s not unrealistic to make the following presumptions about the data:
EMPLOYEE_KEY must be not null and unique IS_CURRENT is either a YES or NO value IS_DELETED is either a YES or NO value
The following example, using queries tailored to the indexes to be created, was run in Oracle 11.2.0.4; notice that the first index is used in all three queries:
SQL> --
SQL> -- Create the table
SQL> --
SQL> create table atm_xxx_salary(
2 employee_key varchar2(20) not null,
3 emp_fname varchar2(20),
4 emp_lname varchar2(40),
5 effective_dt date,
6 salary_version_number number,
7 is_current varchar2(3),
8 is_deleted varchar2(3));
Table created.
SQL>
SQL> --
SQL> -- Load data
SQL> --
SQL> begin
2 for i in 1..100 loop
3 insert into atm_xxx_salary
4 values('Employee_'||i, 'Blobbo', 'Fingnestle', sysdate + mod(i, 9), mod(i,11)+1, 'YES','NO');
5 End loop;
6 commit;
7 end;
8 /
PL/SQL procedure successfully completed.
SQL>
SQL> --
SQL> -- Create first index
SQL> --
SQL> CREATE INDEX IDX_ATM_EM_EFF_VER_CURRENT_DEL ON ATM_xxx_SALARY
2 (EMPLOYEE_KEY, EFFECTIVE_DT, SALARY_VERSION_NUMBER, IS_CURRENT, IS_DELETED);
Index created.
SQL>
SQL> exec dbms_stats.gather_schema_stats(user);
PL/SQL procedure successfully completed.
SQL>
SQL> set autotrace on linesize 150
SQL>
SQL> select *
2 from atm_xxx_salary
3 where employee_key like 'Emp%10%'
4 and effective_dt between sysdate and sysdate+4
5 and salary_version_number = 7
6 and is_current = 'YES';
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2729507590
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6 | 288 | 2 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| ATM_XXX_SALARY | 6 | 288 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IDX_ATM_EM_EFF_VER_CURRENT_DEL | 6 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SYSDATE@!+4>=SYSDATE@!)
3 - access("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "EFFECTIVE_DT">=SYSDATE@! AND
"SALARY_VERSION_NUMBER"=7 AND "IS_CURRENT"='YES' AND "EFFECTIVE_DT"<=SYSDATE@!+4)
filter("SALARY_VERSION_NUMBER"=7 AND "EFFECTIVE_DT"=SYSDATE@! AND "IS_CURRENT"='YES')
SQL>
SQL> select *
2 from atm_xxx_salary
3 where employee_key like 'Emp%10%'
4 and is_deleted = 'NO'
5 and is_current = 'YES';
EMPLOYEE_KEY EMP_FNAME EMP_LNAME EFFECTIVE SALARY_VERSION_NUMBER IS_ IS_
-------------------- -------------------- ---------------------------------------- --------- --------------------- --- ---
Employee_10 Blobbo Fingnestle 22-MAR-16 11 YES NO
Employee_100 Blobbo Fingnestle 22-MAR-16 2 YES NO
Execution Plan
----------------------------------------------------------
Plan hash value: 1328304448
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 99 | 4752 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | ATM_XXX_SALARY | 99 | 4752 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_ATM_EM_EFF_VER_CURRENT_DEL | 99 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "IS_CURRENT"='YES' AND "IS_DELETED"='NO')
filter("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "IS_DELETED"='NO' AND "IS_CURRENT"='YES')
SQL>
SQL> select *
2 from atm_xxx_salary
3 where employee_key like 'Emp%10%'
4 and is_deleted = 'NO'
5 and is_current = 'YES';
EMPLOYEE_KEY EMP_FNAME EMP_LNAME EFFECTIVE SALARY_VERSION_NUMBER IS_ IS_
-------------------- -------------------- ---------------------------------------- --------- --------------------- --- ---
Employee_10 Blobbo Fingnestle 22-MAR-16 11 YES NO
Employee_100 Blobbo Fingnestle 22-MAR-16 2 YES NO
Execution Plan
----------------------------------------------------------
Plan hash value: 1328304448
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 99 | 4752 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | ATM_XXX_SALARY | 99 | 4752 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_ATM_EM_EFF_VER_CURRENT_DEL | 99 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "IS_CURRENT"='YES' AND "IS_DELETED"='NO')
filter("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "IS_DELETED"='NO' AND "IS_CURRENT"='YES')
SQL>
SQL> set autotrace off
The above appears to answer the question and prove the remaining two indexes are not necessary, but it can’t hurt to test this theory by creating the remaining indexes and running the same set of queries after each index creation. Creating the second index apparently does no good as it isn’t used in favor of the first:
SQL> --
SQL> -- Create second index
SQL> --
SQL> CREATE INDEX IDX_ATM_EM_VER_CURRENT ON ATM_xxx_SALARY
2 (EMPLOYEE_KEY, SALARY_VERSION_NUMBER, IS_CURRENT);
Index created.
SQL>
SQL>
SQL> set autotrace on
SQL>
SQL> select *
2 from atm_xxx_salary
3 where employee_key like 'Emp%10%'
4 and effective_dt between sysdate and sysdate+4
5 and salary_version_number = 7
6 and is_current = 'YES';
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2729507590
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6 | 288 | 2 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | ATM_XXX_SALARY | 6 | 288 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IDX_ATM_EM_EFF_VER_CURRENT_DEL | 6 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SYSDATE@!+4>=SYSDATE@!)
3 - access("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "EFFECTIVE_DT">=SYSDATE@! AND
"SALARY_VERSION_NUMBER"=7 AND "IS_CURRENT"='YES' AND "EFFECTIVE_DT"<=SYSDATE@!+4)
filter("SALARY_VERSION_NUMBER"=7 AND "EFFECTIVE_DT"=SYSDATE@! AND "IS_CURRENT"='YES')
SQL>
SQL> select *
2 from atm_xxx_salary
3 where employee_key like 'Emp%10%'
4 and is_deleted = 'NO'
5 and is_current = 'YES';
EMPLOYEE_KEY EMP_FNAME EMP_LNAME EFFECTIVE SALARY_VERSION_NUMBER IS_ IS_
-------------------- -------------------- ---------------------------------------- --------- --------------------- --- ---
Employee_10 Blobbo Fingnestle 22-MAR-16 11 YES NO
Employee_100 Blobbo Fingnestle 22-MAR-16 2 YES NO
Execution Plan
----------------------------------------------------------
Plan hash value: 1328304448
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 99 | 4752 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | ATM_XXX_SALARY | 99 | 4752 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_ATM_EM_EFF_VER_CURRENT_DEL | 99 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "IS_CURRENT"='YES' AND "IS_DELETED"='NO')
filter("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "IS_DELETED"='NO' AND "IS_CURRENT"='YES')
SQL>
SQL> select *
2 from atm_xxx_salary
3 where employee_key like 'Emp%10%'
4 and is_deleted = 'NO'
5 and is_current = 'YES';
EMPLOYEE_KEY EMP_FNAME EMP_LNAME EFFECTIVE SALARY_VERSION_NUMBER IS_ IS_
-------------------- -------------------- ---------------------------------------- --------- --------------------- --- ---
Employee_10 Blobbo Fingnestle 22-MAR-16 11 YES NO
Employee_100 Blobbo Fingnestle 22-MAR-16 2 YES NO
Execution Plan
----------------------------------------------------------
Plan hash value: 1328304448
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 99 | 4752 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | ATM_XXX_SALARY | 99 | 4752 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_ATM_EM_EFF_VER_CURRENT_DEL | 99 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "IS_CURRENT"='YES' AND "IS_DELETED"='NO')
filter("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "IS_DELETED"='NO' AND "IS_CURRENT"='YES')
SQL>
SQL> set autotrace off
Let’s create the third index and see what Oracle does:
SQL> --
SQL> -- Create third index
SQL> --
SQL> CREATE INDEX IDX_ATM_SAL_CURRENT_DEL ON ATM_xxx_SALARY
2 (EMPLOYEE_KEY, IS_DELETED, IS_CURRENT);
Index created.
SQL>
SQL> set autotrace on
SQL>
SQL> select *
2 from atm_xxx_salary
3 where employee_key like 'Emp%10%'
4 and effective_dt between sysdate and sysdate+4
5 and salary_version_number = 7
6 and is_current = 'YES';
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2729507590
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6 | 288 | 2 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | ATM_XXX_SALARY | 6 | 288 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IDX_ATM_EM_EFF_VER_CURRENT_DEL | 6 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(SYSDATE@!+4>=SYSDATE@!)
3 - access("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "EFFECTIVE_DT">=SYSDATE@! AND
"SALARY_VERSION_NUMBER"=7 AND "IS_CURRENT"='YES' AND "EFFECTIVE_DT"<=SYSDATE@!+4)
filter("SALARY_VERSION_NUMBER"=7 AND "EFFECTIVE_DT"=SYSDATE@! AND "IS_CURRENT"='YES')
SQL>
SQL> select *
2 from atm_xxx_salary
3 where employee_key like 'Emp%10%'
4 and is_deleted = 'NO'
5 and is_current = 'YES';
EMPLOYEE_KEY EMP_FNAME EMP_LNAME EFFECTIVE SALARY_VERSION_NUMBER IS_ IS_
-------------------- -------------------- ---------------------------------------- --------- --------------------- --- ---
Employee_10 Blobbo Fingnestle 22-MAR-16 11 YES NO
Employee_100 Blobbo Fingnestle 22-MAR-16 2 YES NO
Execution Plan
---------------------------------------------------------
Plan hash value: 1562453262
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 99 | 4752 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | ATM_XXX_SALARY | 99 | 4752 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_ATM_SAL_CURRENT_DEL | 99 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "IS_DELETED"='NO' AND "IS_CURRENT"='YES')
filter("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "IS_DELETED"='NO' AND "IS_CURRENT"='YES')
SQL>
SQL> select *
2 from atm_xxx_salary
3 where employee_key like 'Emp%10%'
4 and is_deleted = 'NO'
5 and is_current = 'YES';
EMPLOYEE_KEY EMP_FNAME EMP_LNAME EFFECTIVE SALARY_VERSION_NUMBER IS_ IS_
-------------------- -------------------- ---------------------------------------- --------- --------------------- --- ---
Employee_10 Blobbo Fingnestle 22-MAR-16 11 YES NO
Employee_100 Blobbo Fingnestle 22-MAR-16 2 YES NO
Execution Plan
----------------------------------------------------------
Plan hash value: 1562453262
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 99 | 4752 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | ATM_XXX_SALARY | 99 | 4752 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_ATM_SAL_CURRENT_DEL | 99 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "IS_DELETED"='NO' AND "IS_CURRENT"='YES')
filter("EMPLOYEE_KEY" LIKE 'Emp%10%' AND "IS_DELETED"='NO' AND "IS_CURRENT"='YES')
SQL>
SQL>
SQL> set autotrace off
Notice that the first and third indexes are used, with the third index used on the second and third queries. Remember that the first index was used on all three queries so given the presumptions made on the nature of the data the second and third indexes appear to not be needed. Of course this is not rigorous testing and other queries and/or data distributions could produce differing results. You really need to test, test, test, test and test again before making a final decision. But, the indications are good that a single index would suffice to provide “speedy” data retrieval.
Please don’t make me repeat myself.