keyword lookup system guidance needed

D

djc

I have a specific question about being able to retrieve records based on a
certian condition which I will describe below. However, please also let me
know if you think my table structure is not a good idea or if you have other
ways which you think are better for doing this. The basic idea here is being
able to search for records based on one or several keywords.

The 2 tables:

tblMain: (I'll call them articles for this posts description's sake)
MainID
Title
Description

tblKeywords:
MainID
Keyword

tblMain: (example data)
101, How to make stew, description.....
102, vba trim() info, description....
103, Whats in stew, description....

tblKeywords (example data)
101, stew
101, cooking
101, food
102, vba
102, trim
102, programming
103, stew
103, food

So, given one user search field I intend on letting the user type in
multiple keywords seperated by spaces (similar to a google or yahoo search)
and then parsing that to get my multiple keywords. From these keywords I
will construct an SQL statement to find the appropriate records. So my
specific question comes in here. I know I can easily implement a logical OR
operator to get all the records (articles) that have at least one of the
keywords supplied by the user in the tblKeywords table but what about giving
the option to perform an AND logical comparison? Since the tblKeywords table
will have multiple entries for each article but one keyword per record,
simply saying 'WHERE keyword = input1 AND keyword = input2' would always be
false. How could I return only the articles that meet the requirment of
having several keywords? Like a user entering 'cooking AND stew' in the
input box should return 101, How to make stew, but NOT 103, Whats in stew.

anyone? any input is greatly appreciated.
 
J

John Viescas

SELECT * FROM tblMain
WHERE tblMain.MainID IN
(SELECT MainID FROM tblKeywords
WHERE tblKeywords.Keyword IN ("stew", "cooking", "food")
GROUP BY MainID
HAVING Count(*) = 3)

This works only if the combination of MainID and Keyword are unique in
tblKeywords - that is, a particular keyword cannot occur twice for the same
MainID. You'll probably have to build this SQL on the fly based on the
number of AND keywords chosen. The trick is you'll get a MainID from
tblKeywords only for (in this example) the ones that have all three
keywords.

--
John Viescas, author
"Microsoft Office Access 2003 Inside Out"
"Running Microsoft Access 2000"
"SQL Queries for Mere Mortals"
http://www.viescas.com/
(Microsoft Access MVP since 1993)
 
D

djc

sweet. I'm going to play with this after checking out the 'IN'
statement/keyword/operator, whichever it may be, in the jetsql help file. I
have not used it before and so it is unfamiliar. Thanks a lot!
 
D

djc

this gets me close to what I need. The issue I'm having is that the IN
("stew", "cooking", "food") statement does not work with wildcards (*, ?). I
definitely want to add a beginning and trailing * to each keyword entered by
the user to also match anything thats contains what they typed... partial
searches.

any other ideas?

thanks again regardless.
 
J

John Viescas

I would think that a keyword is a stand-alone item. For example, you
wouldn't want to search for "*pea*" when keywords might exist for "pea",
"sweet pea", "peanut", "peach", and "pear". I can't think of a cool way off
the top of my head (though I'm sure there is some solution) to make sure you
get a hit on at least one of each. Remember, I said this works only if the
keywords are not duplicated per MainID, but a wildcard search almost
guarantees duplicate hits.

--
John Viescas, author
"Microsoft Office Access 2003 Inside Out"
"Running Microsoft Access 2000"
"SQL Queries for Mere Mortals"
http://www.viescas.com/
(Microsoft Access MVP since 1993)
 
D

djc

yep. I may just nix the wildcard support for 'AND' type searches but include
them for single keyword searches. Thanks agian for your help though.
 
D

djc

I think I came up with a way. Let me know what you think

SELECT M.MainID, M.Title
FROM tblMain AS M
WHERE M.MainID IN
(
SELECT K1.MainID
FROM tblKeyPhrases AS K1
WHERE K1.Keyphrase Like "*food*"
)
AND M.MainID IN
(
SELECT K2.MainID
FROM tblKeyPhrases AS K2
WHERE K2.Keyphrase Like "*soup*"
)
AND M.MainID IN
(
SELECT K3.MainID
FROM tblKeyPhrases AS K3
WHERE K3.Keyphrase Like "*turtle*"
)

Ignore the actual data (food, soup, turtle), this is just a copy/paste from
my test query. You can see the structure from this though. I definetly don't
have experience writing complex queries but this seems to be returning what
I want. I get one ID returned for every ID that has ALL three keywords food,
soup, and turtle even when each keyword is in a seperate record. Supports
wildcards and I think can continue on to serveral more ANDs following the
same structure. This grew from your original reply to me. So 1. thanks agian
for that and 2. If this sparks any other ideas from you that you would like
to share it would be appreciated and 3. do you even think what I have here
is a decent solution? for example, from a more experienced eye can you see
problems with me implementing this? maybe performance issues? etc..
 
J

John Viescas

That's Perfect! And I'm blushing - I should have been able to come up with
that answer. That should run pretty fast, but if you find you're having
performance problems, you can re-cast it as INNER JOINS:

SELECT M.MainID, M.Title
FROM ((tblMain AS M
INNER JOIN
(SELECT K1.MainID
FROM tblKeyPhrases AS K1
WHERE K1.Keyphrase Like "*food*") AS C1
ON M.MainID = C1.MainID)
INNER JOIN
(SELECT K2.MainID
FROM tblKeyPhrases AS K2
WHERE K2.Keyphrase Like "*soup*") AS C2
ON M.MainID = C2.MainID)
INNER JOIN
(SELECT K3.MainID
FROM tblKeyPhrases AS K3
WHERE K3.Keyphrase Like "*turtle*") AS C3
ON M.MainID = C3.MainID

Note, however, that building the query using the IN technique is going to be
easier to build "on the fly" than the above solution - you just examine each
critera and build one more AND ... IN clause using your technique.


--
John Viescas, author
"Microsoft Office Access 2003 Inside Out"
"Running Microsoft Access 2000"
"SQL Queries for Mere Mortals"
http://www.viescas.com/
(Microsoft Access MVP since 1993)
 
D

djc

Great! Thanks for the positive feedback. Thats a good end to a frustrating
couple days for me. I do have one question with regard to your INNER JOIN
re-cast. I should say I am comfortable with basic SQL but nothing real
complex since this is probably a dumb question but: I know the M and K1
etc.. are table aliases but what do the C1, C2 etc.. represent in your
query? An alias for the results of the whole SELECT statement? I didn't know
you could do that.

I think I need a copy of "SQL Queries for Mere Mortals"! whats the ISBN of
your latest edition? I've heard the title "Database Design for Mere
Mortals", or something similar, are you the author of that book as well?

thanks again for all your help thus far.
 
J

John Viescas

Yes, the C1, C2, C3 are aliases for the (SELECT statement). You could
actually do that in Access 97 with some nonstandard syntax, and it was
undocumented. But, you had to enclose the SELECT in brackets [ ] and follow
the closing bracket with a dot. Go figure. This is really handy when you
want to do an OUTER JOIN and you need to filter the rows on the "right" side
of a "left" join first. (You cannot do this on the query grid.) For
example, if you want to find all customers who did not place an order in
February, you could solve it this way:

SELECT Customers.*
FROM Customers LEFT JOIN
(SELECT CustomerID FROM Orders
WHERE Orders.OrderDate Between #02/01/2004# And #02/29/2004#) As OFeb
ON Customers.CustomerID = OFeb.CustomerID
WHERE OFeb.CustomerID IS NULL;

(Also sometimes referred to in this newsgroup as a "frustrated" outer join.)

You can find SQL Queries for Mere Mortals here:
http://www.amazon.com/exec/obidos/ASIN/0201433362/viescaconsulinc/

and the recent edition of Database Design for Mere Mortals here:
http://www.amazon.com/exec/obidos/ASIN/0201752840/viescaconsulinc/

I was coauthor on the SQL book with my good friend, Mike Hernandez. He
wrote the Design book on his own.

--
John Viescas, author
"Microsoft Office Access 2003 Inside Out"
"Running Microsoft Access 2000"
"SQL Queries for Mere Mortals"
http://www.viescas.com/
(Microsoft Access MVP since 1993)
 
D

djc

Sweet. And I'm definitley going to pick up both those books!
Thanks for all the help.
chow.

John Viescas said:
Yes, the C1, C2, C3 are aliases for the (SELECT statement). You could
actually do that in Access 97 with some nonstandard syntax, and it was
undocumented. But, you had to enclose the SELECT in brackets [ ] and follow
the closing bracket with a dot. Go figure. This is really handy when you
want to do an OUTER JOIN and you need to filter the rows on the "right" side
of a "left" join first. (You cannot do this on the query grid.) For
example, if you want to find all customers who did not place an order in
February, you could solve it this way:

SELECT Customers.*
FROM Customers LEFT JOIN
(SELECT CustomerID FROM Orders
WHERE Orders.OrderDate Between #02/01/2004# And #02/29/2004#) As OFeb
ON Customers.CustomerID = OFeb.CustomerID
WHERE OFeb.CustomerID IS NULL;

(Also sometimes referred to in this newsgroup as a "frustrated" outer join.)

You can find SQL Queries for Mere Mortals here:
http://www.amazon.com/exec/obidos/ASIN/0201433362/viescaconsulinc/

and the recent edition of Database Design for Mere Mortals here:
http://www.amazon.com/exec/obidos/ASIN/0201752840/viescaconsulinc/

I was coauthor on the SQL book with my good friend, Mike Hernandez. He
wrote the Design book on his own.

--
John Viescas, author
"Microsoft Office Access 2003 Inside Out"
"Running Microsoft Access 2000"
"SQL Queries for Mere Mortals"
http://www.viescas.com/
(Microsoft Access MVP since 1993)
djc said:
Great! Thanks for the positive feedback. Thats a good end to a frustrating
couple days for me. I do have one question with regard to your INNER JOIN
re-cast. I should say I am comfortable with basic SQL but nothing real
complex since this is probably a dumb question but: I know the M and K1
etc.. are table aliases but what do the C1, C2 etc.. represent in your
query? An alias for the results of the whole SELECT statement? I didn't know
you could do that.

I think I need a copy of "SQL Queries for Mere Mortals"! whats the ISBN of
your latest edition? I've heard the title "Database Design for Mere
Mortals", or something similar, are you the author of that book as well?

thanks again for all your help thus far.

to examine
each following
the have
here you
see cool
way make
sure only
if the wildcards
(*, based
on if
you
implement
a least
one
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top