nested queries vs. non-nested

R

Rebecca

Just wondering what Access best practices are.

I always build nested queries -- 1) identify the population then 1b) subset
(men) then 1c) men with blue eyes, brown eyes, etc. 1b) subset2 (women) 1c)
women with blue eyes, brown eyes, etc.

However, a colleague of mine who uses Oracle is now starting to use Access,
and she doesn't use nested queries -- she builds each query independently:
1) men with blue eyes 2) men with brown eyes 3) women with blue eyes, etc.

I think this is poor form -- you want to make sure you have a consistent
universe you're starting with, and then create the subqueries off of it;
there's too great a chance you'll leave out a little detail in one of the
queries, and my way makes certain you're comparing apples to apples.

It also makes it harder to track her queries, since there are no object
dependencies--you have to open each one and read it to see what it's all
about. Also harder to make changes--you have to change every query, instead
of embeddnig the change in one nest level, which will then automatically
carry over into the lower levels.

However--what do others feel? Are both methods equally good?
 
D

Douglas J. Steele

You should be able to do it with a single query, as opposed to multiple.

SELECT Count(*) AS TotalPopulation,
Sum(IIf([Gender] = "M", 1, 0)) AS Males,
Sum(IIf([Gender] = "F", 1, 0)) AS Females,

Sum(IIf([Gender] = "M" AND Eyes = "Blue", 1, 0)) AS BlueEyedMales,
Sum(IIf([Gender] = "F" AND Eyes = "Blue", 1, 0)) AS BlueEyedFemales,
....
FROM MyTable
 
K

Ken Sheridan

I'd have reservations about both approaches.

For the type of scenario you describe then a report will often be more
appropriate, basing this on a single query and grouping/aggregating at
whatever levels are required.

If aggregation at different levels is required in the query's result table,
however, then conditional aggregation as described by Doug can be employed.

If only one level of aggregation is required then a single query can be
grouped accordingly, e.g.

SELECT Gender, EyeColour,
COUNT(*) As NumberOfPeople
FROM YourTable
GROUP BY Gender, EyeColour;

You could of course convert this to a crosstab query if you wish to show the
numbers per eye colour at row/column intersections if desired.

Parameters can also be used to restrict a single query of course:

PARAMETERS [Enter gender;] TEXT(1),
[Enter eye colour:] TEXT(20);
SELECT Gender, EyeColour,
COUNT(*) As NumberOfPeople
FROM YourTable
WHERE Gender = [Enter gender:]
AND EyeColour = [Enter eye colour:]
GROUP BY Gender, EyeColour;

BTW the term 'subquery' refers not, as you've used it, to a query on another
query, but to one embedded within another query as in the following example
(selected pretty randomly from those currently at hand on my machine) of
meter readings where two subqueries are used in the query's SELECT clause to
return the days between readings and the usage over that period, and another
is used in the query's WHERE clause to restrict the result by the outer query
to exclude the first reading per utility type as the first reading will not
have any earlier reading to compute the days from or usage of course:

SELECT R1.ReadingType, R1.ReadingDate, R1.ReadingDate -
(SELECT MAX(R2.ReadingDate)
FROM Readings As R2
WHERE R2.ReadingDate < R1.ReadingDate
AND ReadingType = R1.ReadingType)
AS DaysSinceLastReading,
R1.Reading -
(SELECT MAX(R3.Reading)
FROM Readings As R3
WHERE R3.ReadingDate < R1.ReadingDate
AND ReadingType = R1.ReadingType)
AS [Usage]
FROM Readings AS R1
WHERE R1.ReadingDate >
(SELECT MIN(R4.ReadingDate)
FROM Readings As R4
WHERE ReadingType = R1.ReadingType)
ORDER BY R1.ReadingType, R1.ReadingDate;

Note how in this aliases R1 to R4 are used to distinguish the different
instances of the Readings table, allowing the subqueries to be correlated
with the outer query. Subqueries could have been used like this in Doug's
example to aggregate the values at different levels, but Access does not
always handle correlated subqueries very efficiently, and often better
performance can be obtained by creating separate queries, each aggregating at
a different level, and joining them. This is not the same as basing queries
on queries, as you are doing, as the queries are joined in the final query's
JOIN clause rather than being included in the SELECT clause at each level as
in your case.

Ken Sheridan
Stafford, England
 
R

Rebecca

I appreciate your 3 answers.

I don't think my simple example fully reflects the complexity of the
database I use, in which you cannot always create the universe in one query.
(though sometimes I know I nest more than I have to, just because I'm trying
to be careful)

Nor do I always want counts of the people with blue eyes, brown eyes, etc.
-- I need the actual full records (names, etc.) for all people in each
category.

I also don't need reports--just a query, sometimes for one situation (only
want blue eyed men over 50) sometimes for multiple situations (blue eyed men
50 _and_ brown eyed women <35).

My original question was what is considered best practices, and since I got
3 different answers--I can't conclude my way or my colleague's way is right
or wrong.

If anyone else wants to weigh in on the _theme_ of my question (not the
specifics of the simplified dB I presented), feel free to.



Ken Sheridan said:
I'd have reservations about both approaches.

For the type of scenario you describe then a report will often be more
appropriate, basing this on a single query and grouping/aggregating at
whatever levels are required.

If aggregation at different levels is required in the query's result table,
however, then conditional aggregation as described by Doug can be employed.

If only one level of aggregation is required then a single query can be
grouped accordingly, e.g.

SELECT Gender, EyeColour,
COUNT(*) As NumberOfPeople
FROM YourTable
GROUP BY Gender, EyeColour;

You could of course convert this to a crosstab query if you wish to show the
numbers per eye colour at row/column intersections if desired.

Parameters can also be used to restrict a single query of course:

PARAMETERS [Enter gender;] TEXT(1),
[Enter eye colour:] TEXT(20);
SELECT Gender, EyeColour,
COUNT(*) As NumberOfPeople
FROM YourTable
WHERE Gender = [Enter gender:]
AND EyeColour = [Enter eye colour:]
GROUP BY Gender, EyeColour;

BTW the term 'subquery' refers not, as you've used it, to a query on another
query, but to one embedded within another query as in the following example
(selected pretty randomly from those currently at hand on my machine) of
meter readings where two subqueries are used in the query's SELECT clause to
return the days between readings and the usage over that period, and another
is used in the query's WHERE clause to restrict the result by the outer query
to exclude the first reading per utility type as the first reading will not
have any earlier reading to compute the days from or usage of course:

SELECT R1.ReadingType, R1.ReadingDate, R1.ReadingDate -
(SELECT MAX(R2.ReadingDate)
FROM Readings As R2
WHERE R2.ReadingDate < R1.ReadingDate
AND ReadingType = R1.ReadingType)
AS DaysSinceLastReading,
R1.Reading -
(SELECT MAX(R3.Reading)
FROM Readings As R3
WHERE R3.ReadingDate < R1.ReadingDate
AND ReadingType = R1.ReadingType)
AS [Usage]
FROM Readings AS R1
WHERE R1.ReadingDate >
(SELECT MIN(R4.ReadingDate)
FROM Readings As R4
WHERE ReadingType = R1.ReadingType)
ORDER BY R1.ReadingType, R1.ReadingDate;

Note how in this aliases R1 to R4 are used to distinguish the different
instances of the Readings table, allowing the subqueries to be correlated
with the outer query. Subqueries could have been used like this in Doug's
example to aggregate the values at different levels, but Access does not
always handle correlated subqueries very efficiently, and often better
performance can be obtained by creating separate queries, each aggregating at
a different level, and joining them. This is not the same as basing queries
on queries, as you are doing, as the queries are joined in the final query's
JOIN clause rather than being included in the SELECT clause at each level as
in your case.

Ken Sheridan
Stafford, England

Rebecca said:
Just wondering what Access best practices are.

I always build nested queries -- 1) identify the population then 1b) subset
(men) then 1c) men with blue eyes, brown eyes, etc. 1b) subset2 (women) 1c)
women with blue eyes, brown eyes, etc.

However, a colleague of mine who uses Oracle is now starting to use Access,
and she doesn't use nested queries -- she builds each query independently:
1) men with blue eyes 2) men with brown eyes 3) women with blue eyes, etc.

I think this is poor form -- you want to make sure you have a consistent
universe you're starting with, and then create the subqueries off of it;
there's too great a chance you'll leave out a little detail in one of the
queries, and my way makes certain you're comparing apples to apples.

It also makes it harder to track her queries, since there are no object
dependencies--you have to open each one and read it to see what it's all
about. Also harder to make changes--you have to change every query, instead
of embeddnig the change in one nest level, which will then automatically
carry over into the lower levels.

However--what do others feel? Are both methods equally good?
 
K

Ken Sheridan

The point I was trying to make was that both approaches introduce unnecessary
complexity. You are both using fundamental relational operations; you are
using projection of one query over another; she is presumably using a union
to pull her separate queries together; and you are both using restriction to
limit the final result. All three were included in the original eight
operators of the relational algebra when Codd first proposed the database
relational model. So *in the abstract* I see no qualitative difference, and
neither can be said to be 'best practice' per se; one approach might be more
appropriate in one situation, the other in another. In most situations,
however, I doubt either would be.

I suspect you are right when in thinking that you 'nest' more than you need
to. Although I'm not a database developer as such, but like many in the
scientific community merely use databases as a tool in my own area of work,
I've often been asked to look at other people's applications, and a common
feature has been the degree to which people frequently base queries on
queries on queries….. As the reason I've been asked to take a look at an
application is usually either because they have a problem with it, or because
they want to extend the functionality and are unable to do so themselves, one
of the biggest headaches I've had is in drilling down through layers of
queries, some of which repeat what was done in a query two or three layers
down. It may seem an easy option, when you already have one query, to base
another on it, and then another on that, and so on, but if you are not
careful you can end up with a multi-headed hydra. The Shorter Oxford English
Dictionary defines hydra as, amongst other things:

"A thing or person likened to the mythological hydra in its baneful
character, its multifarious aspects, or the difficulty of its extirpation."

I couldn't have put it better myself!

By all means base a query on a query, but only where necessary. And,
equally, your colleague can continue to create separate queries and pull the
results together via a union operation, but again, only when necessary. The
KISS principle has a lot going for it.

I know you want to address principles, not specifics, but taking your
example, just to show how this can be handled with one very simple query:

SELECT *
FROM People
WHERE
(gender = "M" AND AGE(dob,DATE()) > 50
AND eyecolour = "Blue")
OR
(gender = "F" AND AGE(dob,DATE()) < 35
AND eyecolour = "Brown");

where AGE is the function posted at:


http://www.mvps.org/access/datetime/date0001.htm


You clearly have an interest in writing good queries, and I must say its
good to hear from someone who is concerned with the nitty-gritty of what
databases are all about rather than with getting their forms to flash
multi-coloured lights while playing the theme from The Deer Hunter (I made
that up, its was actually Apocalypse Now). I'd recommend that you get hold
of a copy of Joe Celko's book 'SQL for Smarties'. Its not expensive as
computing books go, but its worth its weight in gold. Its not directed at
Access, but deals with standard SQL. In fact Joe's views on Access don't
bear repetition where the might be seen by those of a sensitive disposition.
I won't pretend its an easy read, but anyone who knows the basics and who's
prepared to put a little mental effort into it can't help but benefit.

Ken Sheridan
Stafford, England

Rebecca said:
I appreciate your 3 answers.

I don't think my simple example fully reflects the complexity of the
database I use, in which you cannot always create the universe in one query.
(though sometimes I know I nest more than I have to, just because I'm trying
to be careful)

Nor do I always want counts of the people with blue eyes, brown eyes, etc.
-- I need the actual full records (names, etc.) for all people in each
category.

I also don't need reports--just a query, sometimes for one situation (only
want blue eyed men over 50) sometimes for multiple situations (blue eyed men
50 _and_ brown eyed women <35).

My original question was what is considered best practices, and since I got
3 different answers--I can't conclude my way or my colleague's way is right
or wrong.

If anyone else wants to weigh in on the _theme_ of my question (not the
specifics of the simplified dB I presented), feel free to.



Ken Sheridan said:
I'd have reservations about both approaches.

For the type of scenario you describe then a report will often be more
appropriate, basing this on a single query and grouping/aggregating at
whatever levels are required.

If aggregation at different levels is required in the query's result table,
however, then conditional aggregation as described by Doug can be employed.

If only one level of aggregation is required then a single query can be
grouped accordingly, e.g.

SELECT Gender, EyeColour,
COUNT(*) As NumberOfPeople
FROM YourTable
GROUP BY Gender, EyeColour;

You could of course convert this to a crosstab query if you wish to show the
numbers per eye colour at row/column intersections if desired.

Parameters can also be used to restrict a single query of course:

PARAMETERS [Enter gender;] TEXT(1),
[Enter eye colour:] TEXT(20);
SELECT Gender, EyeColour,
COUNT(*) As NumberOfPeople
FROM YourTable
WHERE Gender = [Enter gender:]
AND EyeColour = [Enter eye colour:]
GROUP BY Gender, EyeColour;

BTW the term 'subquery' refers not, as you've used it, to a query on another
query, but to one embedded within another query as in the following example
(selected pretty randomly from those currently at hand on my machine) of
meter readings where two subqueries are used in the query's SELECT clause to
return the days between readings and the usage over that period, and another
is used in the query's WHERE clause to restrict the result by the outer query
to exclude the first reading per utility type as the first reading will not
have any earlier reading to compute the days from or usage of course:

SELECT R1.ReadingType, R1.ReadingDate, R1.ReadingDate -
(SELECT MAX(R2.ReadingDate)
FROM Readings As R2
WHERE R2.ReadingDate < R1.ReadingDate
AND ReadingType = R1.ReadingType)
AS DaysSinceLastReading,
R1.Reading -
(SELECT MAX(R3.Reading)
FROM Readings As R3
WHERE R3.ReadingDate < R1.ReadingDate
AND ReadingType = R1.ReadingType)
AS [Usage]
FROM Readings AS R1
WHERE R1.ReadingDate >
(SELECT MIN(R4.ReadingDate)
FROM Readings As R4
WHERE ReadingType = R1.ReadingType)
ORDER BY R1.ReadingType, R1.ReadingDate;

Note how in this aliases R1 to R4 are used to distinguish the different
instances of the Readings table, allowing the subqueries to be correlated
with the outer query. Subqueries could have been used like this in Doug's
example to aggregate the values at different levels, but Access does not
always handle correlated subqueries very efficiently, and often better
performance can be obtained by creating separate queries, each aggregating at
a different level, and joining them. This is not the same as basing queries
on queries, as you are doing, as the queries are joined in the final query's
JOIN clause rather than being included in the SELECT clause at each level as
in your case.

Ken Sheridan
Stafford, England

Rebecca said:
Just wondering what Access best practices are.

I always build nested queries -- 1) identify the population then 1b) subset
(men) then 1c) men with blue eyes, brown eyes, etc. 1b) subset2 (women) 1c)
women with blue eyes, brown eyes, etc.

However, a colleague of mine who uses Oracle is now starting to use Access,
and she doesn't use nested queries -- she builds each query independently:
1) men with blue eyes 2) men with brown eyes 3) women with blue eyes, etc.

I think this is poor form -- you want to make sure you have a consistent
universe you're starting with, and then create the subqueries off of it;
there's too great a chance you'll leave out a little detail in one of the
queries, and my way makes certain you're comparing apples to apples.

It also makes it harder to track her queries, since there are no object
dependencies--you have to open each one and read it to see what it's all
about. Also harder to make changes--you have to change every query, instead
of embeddnig the change in one nest level, which will then automatically
carry over into the lower levels.

However--what do others feel? Are both methods equally good?
 
D

david

Best Access design practice is to use nested queries where
required.

Access is designed to work with nested queries. Nested
queries are optimised collectively all the way down the stack.

Sometimes other database systems are not designed to work
well with nested queries, and perform badly when you try.

Access has a query design environment that makes it easy to
work with nested queries.

Sometimes other design environments make it difficult to work
with nested queries.

When you are working on a small single project, it is easy to
keep track of which query depends on which other query.

Sometimes when you are working on a large group of large
projects, it is better to make each query entirely independent.

The first time I tried to do a simple join query on an Oracle
database, it took 30 minutes to run and never finished. I wouldn't
say that your colleague is demonstrating "poor form", sometimes
things have to work differently in that world.

(david)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top