I think I read somewhere that using an "IN SELECT..." type of query will
speed up your query performance in Access 2003 better than linking the two
tables by their common field. Is that correct? If so, what is the proper
syntax for creating this type of query criteria?
My scenario is that I have a very long list (1000's) of employees and a
select group of managers (about 30). I only want the employees whose manager
is in my table of selected managers. I should end up with a list of <200.
Is this a good way to use the "IN Select.." as my criteria?
Using northwind: which employees haven't processed an order?
SELECT E1.EmployeeID
FROM Employees AS E1
WHERE NOT EXISTS (
SELECT *
FROM Orders AS O1
WHERE O1.EmployeeID = E1.EmployeeID);
SELECT E1.EmployeeID
FROM Employees AS E1
WHERE E1.EmployeeID NOT IN (
SELECT E1.EmployeeID
FROM Orders AS O1
WHERE O1.EmployeeID = E1.EmployeeID);
SELECT E1.EmployeeID
FROM Employees AS E1 LEFT JOIN Orders AS O1
ON O1.EmployeeID = E1.EmployeeID
WHERE O1.EmployeeID IS NULL;
All three queries are semantically equivalent; a smart optimizer will
recognise this and use the best plan regardless. How Jet handles each
requires detailed knowledge of its optimizer, which I do not possess.
I can tell you that the NOT EXISTS construct *could* outperform the
NOT IN one because it can 'short circuit' as soon as it finds a match
(you only need to find one for EXISTS to return true), as happens with
SQL Server (I read); what Jet *actually* does may not even be
documented. The LEFT JOIN would definitely (I think <g>) be slower in
the event, albeit unlikely, of the columns in the ON clause being
nullable.
Personally, I find the NOT EXISTS the most the intuitive and the LEFT
JOIN one the least intuitive, and that's what I would use to make my
choice. Minimising maintenance time is more important to me than run
time performance, all other things being equal.
In testing the above the LEFT JOIN ran in 11 milliseconds, the other
two in 10 milliseconds. OK, so I'm only using 10 employees and 900
orders i.e. not far off yours. My point is that performance is not a
factor here because I'm only interested in absolute performance rather
than relative performance.
Try googling the exact phase "premature optimization is the root of
all evil".
Jamie.
--