exists <-> in

Source: http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:442029737684

Can you please explain the difference between IN and EXISTS, and between NOT IN
and NOT EXISTS? I have read that EXISTS will work better than
IN and NOT EXISTS will work better than NOT IN (I read this in Oracle
server tuning).


Regards,
Madhusudhana Rao.P


and we said...
see
http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:953229842074

It truly depends on the query and the data as to which is BEST.

Note that in general, NOT IN and NOT EXISTS are NOT the same!!!


SQL> select count(*) from emp where empno not in ( select mgr from emp );

COUNT(*)
----------
0

Apparently there are NO rows such that an employee is not a mgr -- everyone is a mgr
(or are they?)


SQL> select count(*) from emp T1
     where not exists ( select null from emp T2 where t2.mgr = t1.empno );

COUNT(*)
----------
9


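Ahh, but now there are 9 people who are not managers. Beware the NULL value and NOT IN!!
If the subquery returns even one NULL (here, an employee whose mgr is NULL), then
"empno NOT IN (...)" can never evaluate to true, so no rows come back.
(This is also the reason why NOT IN is sometimes avoided.)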
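
The NULL effect can be reproduced on a small scale. The following is a minimal sketch, not the original Oracle session: it uses Python's sqlite3 module as a stand-in database and a hypothetical four-row emp table in which employee 1 has no manager.

```python
import sqlite3


def count_non_managers():
    """Return (NOT IN count, NOT EXISTS count) for a tiny hypothetical emp table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table emp (empno integer primary key, mgr integer)")
    # Hypothetical rows: employee 1 is the boss, so its mgr is NULL.
    conn.executemany(
        "insert into emp (empno, mgr) values (?, ?)",
        [(1, None), (2, 1), (3, 1), (4, 2)],
    )
    # The subquery returns a NULL, so NOT IN is never true for any row.
    not_in = conn.execute(
        "select count(*) from emp where empno not in ( select mgr from emp )"
    ).fetchone()[0]
    # NOT EXISTS only asks whether a matching row exists; NULLs never match.
    not_exists = conn.execute(
        "select count(*) from emp t1 "
        "where not exists ( select null from emp t2 where t2.mgr = t1.empno )"
    ).fetchone()[0]
    conn.close()
    return not_in, not_exists


if __name__ == "__main__":
    print(count_non_managers())
```

With this data the NOT IN count is 0, while NOT EXISTS finds the two employees (3 and 4) to whom nobody reports.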



NOT IN can be just as efficient as NOT EXISTS -- many orders of magnitude BETTER even --
if an "anti-join" can be used (if the subquery is known to not return nulls)
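
As a sketch of the "known to not return nulls" condition: once NULLs are filtered out of the subquery, NOT IN and NOT EXISTS return the same rows. The example uses Python's sqlite3 module and a hypothetical emp table. It only shows that the results agree; it says nothing about whether Oracle's optimizer picks an anti-join, which depends on the Oracle version and statistics.

```python
import sqlite3


def non_managers_null_safe():
    """Return the empno lists from a NULL-filtered NOT IN and from NOT EXISTS."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table emp (empno integer primary key, mgr integer)")
    # Hypothetical rows: employee 1 has no manager (mgr is NULL).
    conn.executemany(
        "insert into emp (empno, mgr) values (?, ?)",
        [(1, None), (2, 1), (3, 1), (4, 2)],
    )
    # The extra predicate guarantees the subquery cannot return NULL.
    not_in = conn.execute(
        "select empno from emp "
        "where empno not in ( select mgr from emp where mgr is not null ) "
        "order by empno"
    ).fetchall()
    not_exists = conn.execute(
        "select empno from emp t1 "
        "where not exists ( select null from emp t2 where t2.mgr = t1.empno ) "
        "order by empno"
    ).fetchall()
    conn.close()
    return [r[0] for r in not_in], [r[0] for r in not_exists]


if __name__ == "__main__":
    print(non_managers_null_safe())
```

Declaring the column NOT NULL in the table definition would give the same guarantee without the extra predicate.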



And furthermore:

Source: http://asktom.oracle.com/pls/asktom/f?p=100:11:3608003423518853::::P11_QUESTION_ID:953229842074

You Asked
Tom:

Can you give me some examples of situations in which
IN is better than EXISTS, and vice versa?

and we said...
Well, the two are processed very, very differently.

Select * from T1 where x in ( select y from T2 )

is typically processed as:

select *
from t1, ( select distinct y from t2 ) t2
where t1.x = t2.y;

The subquery is evaluated, distinct'ed, indexed (or hashed or sorted) and then joined to
the original table -- typically.
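
To illustrate why the rewrite needs the DISTINCT, here is a small sketch using Python's sqlite3 module with hypothetical contents for T1 and T2 (the subquery alias d is also an assumption). It compares the IN query, the join against the distinct'ed subquery, and a plain join that duplicates rows whenever T2 contains repeated values. It shows only that the results match, not how any particular optimizer executes the queries.

```python
import sqlite3


def in_versus_join():
    """Return rows from IN, from a join on DISTINCT y, and from a plain join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t1 (x integer)")
    conn.execute("create table t2 (y integer)")
    # Hypothetical data: t2 contains the value 2 twice.
    conn.executemany("insert into t1 (x) values (?)", [(1,), (2,), (3,), (3,)])
    conn.executemany("insert into t2 (y) values (?)", [(2,), (2,), (3,), (5,)])

    def rows(sql):
        return [r[0] for r in conn.execute(sql).fetchall()]

    in_rows = rows(
        "select x from t1 where x in ( select y from t2 ) order by x"
    )
    join_distinct = rows(
        "select t1.x from t1, ( select distinct y from t2 ) d "
        "where t1.x = d.y order by t1.x"
    )
    # Without DISTINCT, the repeated 2 in t2 duplicates the t1 row.
    plain_join = rows(
        "select t1.x from t1, t2 where t1.x = t2.y order by t1.x"
    )
    conn.close()
    return in_rows, join_distinct, plain_join


if __name__ == "__main__":
    print(in_versus_join())
```

The first two lists are identical; the plain join returns the row with x = 2 twice.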


As opposed to

select * from t1 where exists ( select null from t2 where y = x )

That is processed more like:


for x in ( select * from t1 )
loop
   if ( exists ( select null from t2 where y = x.x ) )
   then
      OUTPUT THE RECORD
   end if
end loop

It always results in a full scan of T1 whereas the first query can make use of an index
on T1(x).
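
The loop described above can be written out literally. This sketch uses Python's sqlite3 module, with hypothetical tables and a hypothetical index name t2_y_idx: it scans every row of T1, probes T2 once per row, and checks that the result matches the EXISTS query. It models the row-by-row description only and is not a statement about how a given database actually executes EXISTS.

```python
import sqlite3


def exists_as_loop():
    """Return (rows from the explicit loop, rows from the EXISTS query)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t1 (x integer)")
    conn.execute("create table t2 (y integer)")
    # Hypothetical index so that each probe into t2 is cheap.
    conn.execute("create index t2_y_idx on t2 (y)")
    conn.executemany("insert into t1 (x) values (?)", [(1,), (2,), (3,), (3,)])
    conn.executemany("insert into t2 (y) values (?)", [(2,), (2,), (3,), (5,)])

    loop_rows = []
    # Full scan of t1: every row is visited exactly once.
    for (x,) in conn.execute("select x from t1 order by x").fetchall():
        # One probe into t2 per outer row; the first match is enough.
        probe = conn.execute(
            "select 1 from t2 where y = ? limit 1", (x,)
        ).fetchone()
        if probe is not None:
            loop_rows.append(x)  # "OUTPUT THE RECORD"

    exists_rows = [
        r[0]
        for r in conn.execute(
            "select x from t1 "
            "where exists ( select null from t2 where t2.y = t1.x ) "
            "order by x"
        ).fetchall()
    ]
    conn.close()
    return loop_rows, exists_rows


if __name__ == "__main__":
    print(exists_as_loop())
```

The number of probes equals the number of rows in T1, which is why the size of T1 and the cost of a single probe dominate the cost of this approach.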


So, when is WHERE EXISTS appropriate and when is IN appropriate?

Let's say the result of the subquery ( select y from T2 ) is "huge" and takes a long
time to produce, but the table T1 is relatively small and executing
( select null from t2 where y = x.x ) is very, very fast (nice index on t2(y)). Then the
EXISTS will be faster, as the time to full scan T1 and do the index probe into T2 could be
less than the time to simply full scan T2 to build the subquery we need to distinct on.


Let's say the result of the subquery is small -- then IN is typically more appropriate.


If both the subquery and the outer table are huge -- either might work as well as the
other -- depends on the indexes and other factors.



... with many thanks to Markus!
