P
Petr Danes
I have trouble believing that I can't figure this out, but I'm stuck. How do
I set up a condition in a query where I want all the records that have the
same value in a particular field, when I don't know (or care) what that
value actually is?
Here's what I'm doing: This is a rather large database which reads in around
2000 text files, then indexes every word in each file for subsequent
searching. Each file contains many blocks of entries, sort of a file of card
catalogs. Here is a sample of one, with three cards:
.._HS Storch, Johann OKS
..AU
Storch, Johann
..TI
Der kleine Helfer.
..IM
Prag, 1919.
..RO
..PO
..PR
..VA
Vazba
..LI
..SI
Be¼ov nad Teplou 601
..SF
T/71
..OD
..DA
1919
..TD
Dv, Boh
..JZ
Ger
..GE
@PA
.._HS Dannhauser, Alfred AFFE
..AU
Dannhauser, Alfred
..TI
Die Tragoedie der Frau. Das Problem der reiferen Jahre. 2.«
Aufl.
..IM
Stuttgart, 1928.
..RO
..PO
..PR
..VA
Vazba
..LI
..SI
Be¼ov nad Teplou 602
..SF
T/71
..OD
..DA
1928
..TD
Dv
..JZ
Ger
..GE
@PA
.._HS Smolle, Leo NATOC
..AU
Smolle, Leo
..TI
Das Buch von unserem Kaiser. 1848 - 1888. Festschrift ...
..IM
Wien, 1888.
..RO
..PO
..PR
..VA
Vazba
..LI
..SI
Be¼ov nad Teplou 603
..SF
T/71
..OD
..DA
1888
..TD
Nt
..JZ
Ger
..GE
@PA
The start of every card is "._HS texttexttext..", and all the subsequent
lines are in pairs, the .XX identifier preceding the line or lines with the
actual text of interest.
My initial design had a table with a column for each of the .XX fields, but
in the interest of normalization (and my own sanity, there're over seventy
possible fields in each card) I switched to a table with the following
structure:
SouborAutoID (FK to table of file names)
SlovoAutoID (FK to table of unique words)
Linka (Line number in the original text file)
HSBlok (Card number in the original text file)
Tag (Text of the two-letter field identifier)
I need to assemble queries where I look for things like:
(Tag="IM" AND (Slovo="Prague" OR Slovo="Prag" OR Slovo="Praha"))
AND
(Tag="TI" AND (Slovo="Kunratice" OR Slovo="Kynzvart"))
For simplicity I omitted the JOIN clause; Slovo is in a related table, BUT,
I need to find these things all on the same card, that is, where the field
HSBlok is the same. I don't now what that value might be, there will in fact
be many values retrieved by such queries, all I care is that the value is
the SAME on both the IM and the TI record. That is, I need to add a clause
to the above which does something like:
AND HSBlok Is Identical On Both Records
I thought of doing it with a self-join, but what happens when I have three,
four or even more tags to test? This would have been trivial in my original
model, since every card comprised exactly one record, but this way I'm
stumped. Did I over-normalize my design? Is there even such a thing? Have I
finally gone off the deep end? It seems like some sort of grouping clause
should help, but I haven't been able to come up with anything.
Pete
I set up a condition in a query where I want all the records that have the
same value in a particular field, when I don't know (or care) what that
value actually is?
Here's what I'm doing: This is a rather large database which reads in around
2000 text files, then indexes every word in each file for subsequent
searching. Each file contains many blocks of entries, sort of a file of card
catalogs. Here is a sample of one, with three cards:
.._HS Storch, Johann OKS
..AU
Storch, Johann
..TI
Der kleine Helfer.
..IM
Prag, 1919.
..RO
..PO
..PR
..VA
Vazba
..LI
..SI
Be¼ov nad Teplou 601
..SF
T/71
..OD
..DA
1919
..TD
Dv, Boh
..JZ
Ger
..GE
@PA
.._HS Dannhauser, Alfred AFFE
..AU
Dannhauser, Alfred
..TI
Die Tragoedie der Frau. Das Problem der reiferen Jahre. 2.«
Aufl.
..IM
Stuttgart, 1928.
..RO
..PO
..PR
..VA
Vazba
..LI
..SI
Be¼ov nad Teplou 602
..SF
T/71
..OD
..DA
1928
..TD
Dv
..JZ
Ger
..GE
@PA
.._HS Smolle, Leo NATOC
..AU
Smolle, Leo
..TI
Das Buch von unserem Kaiser. 1848 - 1888. Festschrift ...
..IM
Wien, 1888.
..RO
..PO
..PR
..VA
Vazba
..LI
..SI
Be¼ov nad Teplou 603
..SF
T/71
..OD
..DA
1888
..TD
Nt
..JZ
Ger
..GE
@PA
The start of every card is "._HS texttexttext..", and all the subsequent
lines are in pairs, the .XX identifier preceding the line or lines with the
actual text of interest.
My initial design had a table with a column for each of the .XX fields, but
in the interest of normalization (and my own sanity, there're over seventy
possible fields in each card) I switched to a table with the following
structure:
SouborAutoID (FK to table of file names)
SlovoAutoID (FK to table of unique words)
Linka (Line number in the original text file)
HSBlok (Card number in the original text file)
Tag (Text of the two-letter field identifier)
I need to assemble queries where I look for things like:
(Tag="IM" AND (Slovo="Prague" OR Slovo="Prag" OR Slovo="Praha"))
AND
(Tag="TI" AND (Slovo="Kunratice" OR Slovo="Kynzvart"))
For simplicity I omitted the JOIN clause; Slovo is in a related table, BUT,
I need to find these things all on the same card, that is, where the field
HSBlok is the same. I don't now what that value might be, there will in fact
be many values retrieved by such queries, all I care is that the value is
the SAME on both the IM and the TI record. That is, I need to add a clause
to the above which does something like:
AND HSBlok Is Identical On Both Records
I thought of doing it with a self-join, but what happens when I have three,
four or even more tags to test? This would have been trivial in my original
model, since every card comprised exactly one record, but this way I'm
stumped. Did I over-normalize my design? Is there even such a thing? Have I
finally gone off the deep end? It seems like some sort of grouping clause
should help, but I haven't been able to come up with anything.
Pete