Subscribe to our ezine for more tips and tricks just go to www.asktheoracle.net/oracle-tips-signup.html
This is the 7th and final article in the series providing tips and techniques to improve the performance of your Oracle databases by orders of magnitude. This month we'll look at how to use sub queries in SQL to make a significant difference to Oracle database performance.
The previous issues are available on line just click here
Contact us directly with your questions and feedback and we'll answer them in a future issue.
Let's very quickly recap the other techniques we've discussed in this series:-
Click here to refresh your memory of the first 6 techniques
Sub-queries are exactly what they say they are - queries embedded in another SQL statement, often a query. They allow us to answer multi-part questions and are often interchangeable with a join. In fact, when executed, a query containing a sub-query may well be treated by Oracle exactly as if it were a join.
For example "give me the names of all the employees in the departments that had a budget of more than £20,000 this year".
Assuming that budget is an attribute of dept (department) we can answer the question by breaking it down into two parts: part 1 - "which departments had a budget over £20,000 this year ?" part 2 - "who works for those departments ?"
This can be translated into SQL as follows:
SELECT name FROM emp WHERE dept_id
IN
(SELECT dept_id FROM dept WHERE budget > 20,000).
Note that the query is written top down, but executed bottom up i.e. the sub query is run first.
That was a 2 part question, but suppose we had a 3 part question such as "give me the names of all the employees in the department(s) that had the biggest budget this year". In English this is only a 2 part question: "which departments had the biggest budget, and who works for those departments", but in SQL we have to split part 1 into 2 parts as follows: "what was the biggest budget this year ?" and "which departments had that budget".
Translate all this into SQL and we end up with the following statement:-
SELECT name FROM emp WHERE dept_id
IN
(SELECT dept_id FROM dept WHERE budget =
SELECT max(budget) FROM dept
).
Again the query is executed from the inside out, so the first question answered is "what was the highest budget this year?", the next question is "which departments had a budget equal to the maximum?" and finally "who works for that department ?".
The beauty of sub-queries is that we can keep nesting them almost infinitely to answer more and more complex questions. However, in practice it is unlikely that you would ever nest a query more than 3 levels deep because of performance issues.
The most common use of sub queries is in the WHERE clause of queries to define the limiting condition (i.e. what value(s) rows must have to be of interest), as in the above example.
However, they can also be used in other parts of the query in Oracle, as follows:
Until now we have looked at what are known as non-correlated sub-queries, but there are, in fact, two types of sub query: correlated and non-correlated.
Non-correlated sub queries are executed once for the whole statement, correlated sub queries are executed once per row in the parent query. Correlated sub queries also answer multi-part questions, but are most often used to check for existence or absence of matching records in the parent table and the related table in the sub query. A correlated sub query refers to a column from a table in the parent query.
This type of query can often be performed just as easily by a join query or a non-correlated sub query, but may be significantly faster when a correlated sub-query is used. For example, assuming we have a table of employees called emp and a table of departments called dept, to find out which departments do not have any employees assigned to them, the query can be written in three different ways - as a non-correlated sub query, as an outer join, or as a correlated sub-query.
You need to be aware though, that the outer join would do more than just return the names of those departments which do not have any employees assigned to them, it would also returns the names of those departments that do have employees assigned to them.
The reason for this is that the non-correlated sub-query can not use any indexes - the where clause of the sub query is just checking for NOT NULL rows - so a full table scan would be performed. Also the sub query would be executed once for each row in the dept table.
On the other hand, the correlated sub-query would be able to use the index and since only the dept_id would need to be returned by the sub query, there would no need for any subsequent table access.
For these reasons, the correlated sub-query would normally perform better than the non-correlated sub-query, in this case.
The main performance advantages arise when using sub queries to perform efficient existence checks i.e. when you don't actually need to know the value of one or more columns in a row in a table you just need to know whether or not there is a matching row. The EXISTS and NOT EXISTS operators are used for this purpose.
For example, let's suppose we want to find out the names of our top sales people where a top sales person is defined as an employee with sales over 100000 units of product 3 and let's assume we have that information in our database in our emp table.
We could write the query as a join like this:
SELECT emp.name FROM sales, emp
WHERE emp.id = sales.emp_id
AND sales.prod_id = 3
AND sales.total > 100000;
We could write the query using a non-correlated sub query like this:
SELECT emp.name FROM emp WHERE emp.id
IN
(SELECT emp.id
FROM sales
WHERE sales.prod_id
= 3
AND sales.total>100000);
The performance of this would probably be the same as the join query, but if we change it to a correlated sub-query using the EXISTS clause we could achieve great gains in performance because it only needs to check for a single row returned by the sub query.
SELECT emp.name FROM emp
WHERE EXISTS (
SELECT sales.emp_id FROM emp,
sales
WHERE emp.id = sales.emp_id
AND sales.prod_id = 3
AND sales.total > 100000);
The NOT EXISTS operator is used to determine the absence of a matching row and can also provide big performance gains.
For example, assume we want to find out which employees have not made any sales of any product. The non-correlated sub-query version would be:
SELECT emp.name FROM emp
WHERE emp.id NOT IN
( SELECT
emp_id FROM sales);
This would be very inefficient though, because a full table scan of sales would be performed for every row in emp as there are no limiting conditions. To solve this we need to make it a correlated sub-query using the NOT EXISTS clause as follows
SELECT emp.name FROM emp
WHERE NOT EXISTS(
SELECT emp_id FROM
sales, emp
WHERE emp.id = sales.emp_id);
This way the sub query can use the index (assuming there is one) on sales.emp_id and will only return one row from the table, instead of all the rows.
Sub queries can be great help in tuning queries in Oracle but they also have their drawbacks. As always changes to queries must be tested on reasonable volumes of data and the execution plan examined before and after any changes to ensure that the changes will improve performance rather than harm it.
---------------------------------------
Looking for more Oracle tips and tricks ? If you're looking for no frills, no fluff, just solid, reliable technical information, take a short cut now and subscribe to our jam-packed-full-of-tips-and-tricks ezine to help you make more of your Oracle systems and save you hours of blood, sweat and tears. Subscribe today and your first issue will soon be winging its way to your mailbox.
Send us your questions and we'll answer them in the next newsletter.
Smartsoft Computing Ltd
Bristol, England
Tel: 0845 0031320
Contact Us
Click here to view our privacy policy .
This site uses woopra.com to gather statistical information about our visitors. View woopra privacy policy .
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. UNIX is a registered trademark of The Open Group in the United States and other countries.
© Copyright Smartsoft Computing Ltd 2001-2009. All rights reserved.