junction tables for 3 many to many relationships

L

LAF

Imagine 3 tables: A, B, C, with many-to-many relationships among all pairs
(AB, AC, BC). Three junction tables can be used to deal with the pairwise
relationships. Question: Would a single junction table work that has
primary keys from each of the 3 tables?
 
A

Allen Browne

Insufficient information.

There may be scenarios where a single table with 3 foreign keys would be
usuable, especially if you use the Validation Rule of the table (not fields)
to require that at least 2 foreign keys are present. That scenario could be
attractive rather than trying to UNION the 3 junction tables regularly, or
if there could actually be a meaningful combination of all 3 (where all 3
foreign keys are present).

But there are many other approaches. Example: a single table in place of A,
B, and C, with a table identifying a "relationship" that exists, with a
related table containing a foreign key record for each record of the
original table that is invovled in this "relationship". The advantage of
this approach is that there is only a single field to examine to find all
the relationship that one of the original records has. (That may be useless
for what you are doing: we just don't know.)
 
L

LAF

Additional information is indeed required. This information can be generic.
There are 3 fields (ID, Place, Time) such that ID and Place, ID and
Date-Time, Place and Date-Time can have many-to-many relationships. In a
single table with these 3 fields, the 3 fields as a primary key "capture" the
relationships that the 3 field junction table would have. With one
additional field (Happening) in this single table, the table would be useful
for many queries. ID and Place would be primary keys in their own tables.
The other table(s) in the database minimally require ID and Date-Time as
primary key, with numerous fields about the ID at each Date-Time. Place
would come from the Happening table.

Would this work? The principle here is whether a multi-field primary key
can capture the many-to-many relationships that those fields contain.

Thanks,

LAF
 
A

Allen Browne

A table of happenings with 3 foreign keys - ID, Place, and Time - sounds
fine.

The 3-field primary key would be suitable if:
- all 3 are always required, and
- there could never be 2 IDs at the same place and time (for example), and
- there are not further related tables that would then need a 3-field
foreign key, where a single field key would be better.
 
L

LAF

The 3-field primary key (ID, Place, Date-Time) serves to identify unique
records, but frequently several ID's occur at the same Date-Time at the same
Place. It is not clear to me why the condition of no 2 ID's at the same
Place at the same Date-Time is necessary.

Thanks, LAF
 
T

Tim Ferguson

The 3-field primary key would be suitable if:
- all 3 are always required, and
- there could never be 2 IDs at the same place and time (for example),
and - there are not further related tables that would then need a
3-field foreign key, where a single field key would be better.

Sorry to butt in, but isn't the argument here dependent on the semantics
rather than what is allowable?

For example, a business that allows People to bring their individual Skills
to particular Projects could have:

Person Skill Project
====== ===== =======
Peter Design Website
Peter Italian Website
Derek JScript Website
Derek Italian Help files
Peter Design Help files

so that Peter does design and translation on the website, but does not do
any translation on the help files, etc. This is, FWIW, straightforward
fifth normal form (I think!).

On the other hand, the database might only be concerned with skills rather
than who-does-what. In this case a set of three relationships is required:

PeopleHaveSkills
Person Skill
====== =====
Peter Design
Peter Italian
Derek JScript
Derek Italian

PeopleOnProjects
Person Project
====== =======
Peter Website
Peter Help files
Derek Help files
Derek Website

ProjectsNeedSkills
Project Skill
======= =====
Website Design
Website Italian
Website JScript
Help files Design
Help files Italian


Although these two designs are in roughly the same area, they represent
different business needs, and are thus completely non-equivalent. Surely
the OP's real problem is establishing his actual needs.

All the best


Tim F
 
L

LAF

Butting in is encouraged because we are going to the heart of data base
design for users. Perhaps one of the considerations that goes beyond
semantics are the queries that might be performed. For example, which ID's
were involved in 2 or more Places? Which place had fewest ID's? Which ID's
were carried over from a previous year, which ID's were newly established in
a given year and were detected again in that year? Many queries involve
looping over recordsets with counters. It is in this spirit that I am
concerned about 3-field primary keys that at least identify unique records.

However, I am still a little confused about the condition that there could
never be 2 ID's at the same Place at the same Date-Time, or any other
analogous situations. Perhaps it would be useful to specify how these
violated conditions can mess up use of the database, even when each other
table in the database has a unique primary key made up of one or two fields
of the 3-field primary key that identifies unique records.

Thanks for the participation,

LAF
 
A

Allen Browne

LAF, I'm really not clear what is the issue here.

The condition that the combination of ID + Place + Date/Time must be unique
swings on that combination being primary key. If there could validly be 2 of
whatever the ID is, at the same time, in the same place, then the
combination is not suitable as a p.k.
 
T

Tim Ferguson

However, I am still a little confused about the condition that there
could never be 2 ID's at the same Place at the same Date-Time, or any
other analogous situations. Perhaps it would be useful to specify how
these violated conditions can mess up use of the database, even when
each other table in the database has a unique primary key made up of
one or two fields of the 3-field primary key that identifies unique
records.

As I remember, this thread began with "Tables A, B, and C". Although
various real-world scenarios have been put forward as examples, I don't
think we've been let into what you are actually modelling; and furthermore
I don't think it's possible to discuss general cases sensibly in any depth.

The 3NF and 5NF solutions are simply solutions to different problems, and
it makes as much sense to compare them as it does to ask, "What is the best
way to sail across the Atlantic - carry more coal or eat more limes?"

All the best


Tim F
 
L

LAF

Thanks to Allen and Tim for pursuing this. The 3 field primary key
absolutely identifies a unique record. Yet, several combinations of two of
these fields have many to many relationships. These are:

The same ID may be in different sites, but at different times.

Different ID's may be in different sites, but at the same time.

By the way, the ID's are aluminum bands placed on the legs of birds.
Banding operations can occur at different sites on the same day, and the time
is when we capture a bird in a mist-net. Several birds can be caught in a
mist-net at the same time.

My original question still stands. Do I need to worry about many to many
relationships among subsets of the 3-field primary key if the 3-filed key
identifies unique records?


Many of the tables in the database deal with lots of measurements we take on
the birds. For these tables, just ID and Date-Time are sufficient as primary
key to identify unique records.

I may be making a mountain out of a molehill, but no book on access deals
with junction tables for more than a single many to many relationship. I am
viewing my 3-field primary key as a junction table because of inferred many
to many relationships among some of the fields that are part of the primary
key.

Thanks, LAF
 
A

Allen Browne

LAF, it looks like neither Tim nor I have understood your question.

Hopefully someone else can contribute.

--
Allen Browne - Microsoft MVP. Perth, Western Australia.

Reply to group, rather than allenbrowne at mvps dot org.

[snip]
 
T

Tim Ferguson

The 3 field primary key
absolutely identifies a unique record.
Of course it does: the relevant question is whether than means anything!

By the way, the ID's are aluminum bands placed on the legs of birds.
Banding operations can occur at different sites on the same day, and
the time is when we capture a bird in a mist-net. Several birds can
be caught in a mist-net at the same time.

Here's an entity you haven't mentioned before (ps * always denotes the PK
field, rest are obviously optional):

MistNetDeployments(*Place, *DateAndTime, WhoOrganised,
WeatherAtTheTime, SizeOfNet, etc...)

Place is a FK referencing Places, and DateAndTime is a FK referencing
Events, or whatever you have called your tables.
Many of the tables in the database deal with lots of measurements we
take on the birds.

Catchings(*Place, *DateAndTime, *BirdID, Weight, SignsOfHairloss, etc)

note that the FK is (Place, DateAndTime) references MistNetDeployment, and
there is _no_ (repeat, no) FK referencing Places or Events.
For these tables, just ID and Date-Time are
sufficient as primary key to identify unique records.

To identify what exactly? A record is just somewhere to store some facts
about a Thing or an Event or an Occurrence: it has to mean something in the
real world.

I may be making a mountain out of a molehill, but no book on access
deals with junction tables for more than a single many to many
relationship.

A lot of basic texts deal stop at third normal form; you need to look at
more advanced material for 4, 5, and 6. In the end, though, it still makes
pretty much common sense once you have a feel for what you are actually
modelling.

From what you have said, though, it looks like a couple of 1-M
relationships:

Events ----+
^
MistNetDeployments --< Catchings >-- Birds
v
Places ----+


The best advice is always to understand your real life business scenario
first, and then the model falls straight out of that. There is general
theory underlying this, but it's no use until you have your own specifics
nailed down flat.

Hope it helps


Tim F
 
L

LeAnne

PMFJI, but as a scientist who has both created and consulted
(informally) on designing biological sampling/monitoring databases, may
I offer a suggestion or two?

LAF, it's difficult to grasp your db design from the descriptions you've
posted here. There is clearly some kind of normalization problem that's
preventing you from utilizing the full capabilities of a relational
database. Were I you, I would consider stepping back and reexamining
your entities (real-life persons, places, things, or events that your db
will keep track of) and their attributes (categories of related
information relevent to each entity). Entities & attributes = tables &
fields.

In my experience, managing data from any field survey of a population or
community of organisms will require at least the following entities &
attributes:

Stations - the sampling sites where data are collected. Fields may
include SiteID, SiteName, Latitude, Longitude, County, State, Ecoregion,
and so on.

Events - actual visits (of which there may be one OR MORE) to sites to
collect raw data. Fields may include EventID, SiteID, SampleDate,
SampleTime, ReplicateNum, SampleType, GearUsed (or GearCode), etc.. A
note on SampleType...some developers might argue that it is "easier" to
break up their events tables to reflect the categories of data being
collected (e.g. Benthic, WaterQuality, Habitat). But strictly speaking
this isn't in keeping with the Normal forms.

Organisms - which critters (of which there may be one OR MORE collected
for each unique combination of site+date+time+rep) were collected at
each site for each sampling event , and the number of each. Fields may
include EventID, Taxon (or Taxon ID if you have a taxonomic lookup table
with LatinNames, Taxon level, etc.), and TotalIndividuals.

Organism Details - necessary if you are recording multiple
characteristics or measurements for individual critters. For example:
at Stream X I collected 2 sunfish, 3 bass, and one pickerel. Each fish
was measured for total length, body weight, and whether or not there
were any visible lesions or other anomalies. Fields for the table in
this example may include EventID, TaxonID, FishNum, ParameterMeasured
(or ParameterID if you have a lookup table of standard measurement
parameters and their descriptions), and ParameterValue. You would
probably substitute BandNumber for FishNum to identify individual birds.
You might also want to include a Memo field for comments.

Once your data are organized in a framework that Access understands,
queries become much, much easier. You can calculate the taxonomic
richness at a site for a particular visit, or get a list of all taxa
collected at a site over multiple visits, or generate a list of all
sites where Black-and-White Warblers were collected, or calculate the
proportion of Parula warbler individuals to total abundance of all
individuals at sights visited in Spring, and so on and so on.

Just my $0.02 worth...

Good luck!

LeAnne

LeAnne
 
L

LAF

Thanks to Tim, Leanne, and Allen for advice.

The original question about junction tables has become transmorgrified into
general database design. Here I will attempt to integrate and indicate the
aspects of design that I think are relevant. I will list the tables and some
fields with primary key as (PK) and foreign key as (FK):

event table: event.id (PK), bird.id (FK), subsite.id (FK), date, time,
capture.type, observer

bird table: bird.id (PK), species.id (FK), leftleg, rightleg

site table: subsite.id (PK), site, elevation, forest.structure

species table: species (PK), residency, listing

status table: event.id (PK), age, sex, disease, molt, fat, weight

morpho table: event.id (PK), bill, wing, tail, tarsus

sampling table: event.id (PK), bloodlog (FK), bloodquan, feather, pictures

genetics table: bloodlog (PK), event.id (FK), prepnum, malaria, cytb, chd

The major aspect of design here is that the 3 fields in the many to many
relationships (bird.id, subsite.id, date, time [actually 4 fields]) are
indicated by event.id as an autonumbered field. In this case, event.id
serves to indicate the unique combinations of the 4 fields. This is true
because the 4 fields could serve as a composite primary key. What I think is
important here is that the autonumbered field is meaningless except as a
primary key. I read somewhere that composite primary keys can cause some
problems in some queries.

The second aspect of design is that there are several one to one
relationships linked by event.id. In principle, all of these tables can be
merged into the event table, but not all individual birds have all data
taken. The event.id field in all of these additional tables will come from
the event table.

Many of my queries will involve looping over sorted recordsets, such as a
query that could add fields to the bird table such as date first captured,
date last captured. Or identifying individuals that were mis-sexed at
different events.

Happy New Year,

LAF
 
L

LeAnne

Hi LAF,

You're very close. However, I suggest:

tblSites 1:m to tblEvents (since each site (subsite?) can be visited on
one OR MORE dates) on SiteID

tblEvents would have EventID as PK. EventID would be an autonumber
providing each unique combination of site+date+time (+capture type if
your are using more than one per event)

tblEvents 1:m to a BirdsCaptured table (since each event can capture one
OR MORE birds) on EventID

The BirdsCaptured table would be the junction table between tblEvents
and tblBirds (since a bird could conceivably be captured one OR MORE
times), and contain BOTH EventID and BirdID as key fields. NetNum could
also be included as a field if multiple nets are being deployed per event.

tblBirdsCaptured m:1 to tblBirds on BirdID (again, each individual bird
may be captured one OR MORE times)

tblBirds m:1 to tblSpecies (assuming that one OR MORE individuals in a
species have been banded) on SpeciesID

tblSpecies is the lookup table with taxon-specific information.

Now, if I read your explanation right, there are multiple parameters
measured for each bird captured (morphological characteristics, age,
genetics, &c.). Not all birds may be measured for all parameters. And
each time a bird is recaptured, some (but not all) parameters are
measured again. Correct? If so, then I would recommend that measurements
be combined into a single Measurements table, joined m:1 to
tblBirdsCaptured on BirdID + EventID. tblMeasurements would include the
fields EventID, BirdID, MeasurementID, and MeasurementValue.
tblMeasurements would be joined m:1 to a MeasurementLookup table on
MeasurementID with MeasurementName, MeasurementUnit, MeasurementCategory
(morpho, genetic, etc.), Description, MethodUsed, etc. included as
fields. Measurements are data, not attributes; by expanding measurements
down instead of across, you eliminate empty fields when a particular
parameter is not recorded for a particular bird captured in a particular
event.

Hope this helps,

LeAnne
Thanks to Tim, Leanne, and Allen for advice.

The original question about junction tables has become transmorgrified into
general database design. Here I will attempt to integrate and indicate the
aspects of design that I think are relevant. I will list the tables and some
fields with primary key as (PK) and foreign key as (FK):

event table: event.id (PK), bird.id (FK), subsite.id (FK), date, time,
capture.type, observer

bird table: bird.id (PK), species.id (FK), leftleg, rightleg

site table: subsite.id (PK), site, elevation, forest.structure

species table: species (PK), residency, listing

status table: event.id (PK), age, sex, disease, molt, fat, weight

morpho table: event.id (PK), bill, wing, tail, tarsus

sampling table: event.id (PK), bloodlog (FK), bloodquan, feather, pictures

genetics table: bloodlog (PK), event.id (FK), prepnum, malaria, cytb, chd

The major aspect of design here is that the 3 fields in the many to many
relationships (bird.id, subsite.id, date, time [actually 4 fields]) are
indicated by event.id as an autonumbered field. In this case, event.id
serves to indicate the unique combinations of the 4 fields. This is true
because the 4 fields could serve as a composite primary key. What I think is
important here is that the autonumbered field is meaningless except as a
primary key. I read somewhere that composite primary keys can cause some
problems in some queries.

The second aspect of design is that there are several one to one
relationships linked by event.id. In principle, all of these tables can be
merged into the event table, but not all individual birds have all data
taken. The event.id field in all of these additional tables will come from
the event table.

Many of my queries will involve looping over sorted recordsets, such as a
query that could add fields to the bird table such as date first captured,
date last captured. Or identifying individuals that were mis-sexed at
different events.

Happy New Year,

LAF
 
L

LAF

Hi LeAnne,

Thanks so much. I will get back to the group on the suggestions. However,
one point that could be made now. The condition data like age, sex, molt,
disease, etc, are really attributes. The sampling data are also really
attributes. The morphometric data are really the values. I am trying to
think of how the measurement table as you identify could be used for
statistical analysis. I frequently bring tables into the statistical program
S-plus, where I have the fields (columns) with BirdId, Species, Yr, and many
of the condition and morphometric fields. Even if the 1:1 table
relationships that I proposed have lots of fields with missing values, it is
easy either in access or in S-plus to just take the subset of records with
non-missing values.

Might it would be better to have the condition fields and the sampling
fields in the event table? I have to think of how I could get tables into
S-plus in which the rows are observations and columns are values.

Thanks,

LAF

LeAnne said:
Hi LAF,

You're very close. However, I suggest:

tblSites 1:m to tblEvents (since each site (subsite?) can be visited on
one OR MORE dates) on SiteID

tblEvents would have EventID as PK. EventID would be an autonumber
providing each unique combination of site+date+time (+capture type if
your are using more than one per event)

tblEvents 1:m to a BirdsCaptured table (since each event can capture one
OR MORE birds) on EventID

The BirdsCaptured table would be the junction table between tblEvents
and tblBirds (since a bird could conceivably be captured one OR MORE
times), and contain BOTH EventID and BirdID as key fields. NetNum could
also be included as a field if multiple nets are being deployed per event.

tblBirdsCaptured m:1 to tblBirds on BirdID (again, each individual bird
may be captured one OR MORE times)

tblBirds m:1 to tblSpecies (assuming that one OR MORE individuals in a
species have been banded) on SpeciesID

tblSpecies is the lookup table with taxon-specific information.

Now, if I read your explanation right, there are multiple parameters
measured for each bird captured (morphological characteristics, age,
genetics, &c.). Not all birds may be measured for all parameters. And
each time a bird is recaptured, some (but not all) parameters are
measured again. Correct? If so, then I would recommend that measurements
be combined into a single Measurements table, joined m:1 to
tblBirdsCaptured on BirdID + EventID. tblMeasurements would include the
fields EventID, BirdID, MeasurementID, and MeasurementValue.
tblMeasurements would be joined m:1 to a MeasurementLookup table on
MeasurementID with MeasurementName, MeasurementUnit, MeasurementCategory
(morpho, genetic, etc.), Description, MethodUsed, etc. included as
fields. Measurements are data, not attributes; by expanding measurements
down instead of across, you eliminate empty fields when a particular
parameter is not recorded for a particular bird captured in a particular
event.

Hope this helps,

LeAnne
Thanks to Tim, Leanne, and Allen for advice.

The original question about junction tables has become transmorgrified into
general database design. Here I will attempt to integrate and indicate the
aspects of design that I think are relevant. I will list the tables and some
fields with primary key as (PK) and foreign key as (FK):

event table: event.id (PK), bird.id (FK), subsite.id (FK), date, time,
capture.type, observer

bird table: bird.id (PK), species.id (FK), leftleg, rightleg

site table: subsite.id (PK), site, elevation, forest.structure

species table: species (PK), residency, listing

status table: event.id (PK), age, sex, disease, molt, fat, weight

morpho table: event.id (PK), bill, wing, tail, tarsus

sampling table: event.id (PK), bloodlog (FK), bloodquan, feather, pictures

genetics table: bloodlog (PK), event.id (FK), prepnum, malaria, cytb, chd

The major aspect of design here is that the 3 fields in the many to many
relationships (bird.id, subsite.id, date, time [actually 4 fields]) are
indicated by event.id as an autonumbered field. In this case, event.id
serves to indicate the unique combinations of the 4 fields. This is true
because the 4 fields could serve as a composite primary key. What I think is
important here is that the autonumbered field is meaningless except as a
primary key. I read somewhere that composite primary keys can cause some
problems in some queries.

The second aspect of design is that there are several one to one
relationships linked by event.id. In principle, all of these tables can be
merged into the event table, but not all individual birds have all data
taken. The event.id field in all of these additional tables will come from
the event table.

Many of my queries will involve looping over sorted recordsets, such as a
query that could add fields to the bird table such as date first captured,
date last captured. Or identifying individuals that were mis-sexed at
different events.

Happy New Year,

LAF
 
L

LeAnne

Hi LAF,

Sorry, this is incorrect. Condition categories like age, sex, molt,
disease, etc. are *data*, not attributes (fields). The values recorded
for each category are also *data.* In a properly relational design,
tables expand down, not across. With a table design using conditions as
attributes, each time you want to add an additional measurement
category, you have to add a new field. All queries you have based on the
table will then have to be changed manually. How data are viewed or
exported should not affect table design. Tables are for *storing* data.
The design I suggested is normalized for a relational model. Use queries
to *display* the data in a wide-flat arrangement...look up "Crosstab
queries" in the Help file. You can export query recordsets as .xls or
..dbf files (File > Save As/Export > To an External File or Database),
which I believe SAS can handle easily.

L.
Hi LeAnne,

Thanks so much. I will get back to the group on the suggestions. However,
one point that could be made now. The condition data like age, sex, molt,
disease, etc, are really attributes. The sampling data are also really
attributes. The morphometric data are really the values. I am trying to
think of how the measurement table as you identify could be used for
statistical analysis. I frequently bring tables into the statistical program
S-plus, where I have the fields (columns) with BirdId, Species, Yr, and many
of the condition and morphometric fields. Even if the 1:1 table
relationships that I proposed have lots of fields with missing values, it is
easy either in access or in S-plus to just take the subset of records with
non-missing values.

Might it would be better to have the condition fields and the sampling
fields in the event table? I have to think of how I could get tables into
S-plus in which the rows are observations and columns are values.

Thanks,

LAF

:

Hi LAF,

You're very close. However, I suggest:

tblSites 1:m to tblEvents (since each site (subsite?) can be visited on
one OR MORE dates) on SiteID

tblEvents would have EventID as PK. EventID would be an autonumber
providing each unique combination of site+date+time (+capture type if
your are using more than one per event)

tblEvents 1:m to a BirdsCaptured table (since each event can capture one
OR MORE birds) on EventID

The BirdsCaptured table would be the junction table between tblEvents
and tblBirds (since a bird could conceivably be captured one OR MORE
times), and contain BOTH EventID and BirdID as key fields. NetNum could
also be included as a field if multiple nets are being deployed per event.

tblBirdsCaptured m:1 to tblBirds on BirdID (again, each individual bird
may be captured one OR MORE times)

tblBirds m:1 to tblSpecies (assuming that one OR MORE individuals in a
species have been banded) on SpeciesID

tblSpecies is the lookup table with taxon-specific information.

Now, if I read your explanation right, there are multiple parameters
measured for each bird captured (morphological characteristics, age,
genetics, &c.). Not all birds may be measured for all parameters. And
each time a bird is recaptured, some (but not all) parameters are
measured again. Correct? If so, then I would recommend that measurements
be combined into a single Measurements table, joined m:1 to
tblBirdsCaptured on BirdID + EventID. tblMeasurements would include the
fields EventID, BirdID, MeasurementID, and MeasurementValue.
tblMeasurements would be joined m:1 to a MeasurementLookup table on
MeasurementID with MeasurementName, MeasurementUnit, MeasurementCategory
(morpho, genetic, etc.), Description, MethodUsed, etc. included as
fields. Measurements are data, not attributes; by expanding measurements
down instead of across, you eliminate empty fields when a particular
parameter is not recorded for a particular bird captured in a particular
event.

Hope this helps,

LeAnne

LAF wrote:

Thanks to Tim, Leanne, and Allen for advice.

The original question about junction tables has become transmorgrified into
general database design. Here I will attempt to integrate and indicate the
aspects of design that I think are relevant. I will list the tables and some
fields with primary key as (PK) and foreign key as (FK):

event table: event.id (PK), bird.id (FK), subsite.id (FK), date, time,
capture.type, observer

bird table: bird.id (PK), species.id (FK), leftleg, rightleg

site table: subsite.id (PK), site, elevation, forest.structure

species table: species (PK), residency, listing

status table: event.id (PK), age, sex, disease, molt, fat, weight

morpho table: event.id (PK), bill, wing, tail, tarsus

sampling table: event.id (PK), bloodlog (FK), bloodquan, feather, pictures

genetics table: bloodlog (PK), event.id (FK), prepnum, malaria, cytb, chd

The major aspect of design here is that the 3 fields in the many to many
relationships (bird.id, subsite.id, date, time [actually 4 fields]) are
indicated by event.id as an autonumbered field. In this case, event.id
serves to indicate the unique combinations of the 4 fields. This is true
because the 4 fields could serve as a composite primary key. What I think is
important here is that the autonumbered field is meaningless except as a
primary key. I read somewhere that composite primary keys can cause some
problems in some queries.

The second aspect of design is that there are several one to one
relationships linked by event.id. In principle, all of these tables can be
merged into the event table, but not all individual birds have all data
taken. The event.id field in all of these additional tables will come from
the event table.

Many of my queries will involve looping over sorted recordsets, such as a
query that could add fields to the bird table such as date first captured,
date last captured. Or identifying individuals that were mis-sexed at
different events.

Happy New Year,

LAF
 
L

LAF

Hi LeAnne,

I think I understand the design. However, the measurement value field in
tblMeasurements will contain numeric data for some measurements, text data
for some measurements, and logical data for still others. I am not aware of
access's ability to mix these data types in the same field.

Thanks, and all the best,

LAF



LeAnne said:
Hi LAF,

Sorry, this is incorrect. Condition categories like age, sex, molt,
disease, etc. are *data*, not attributes (fields). The values recorded
for each category are also *data.* In a properly relational design,
tables expand down, not across. With a table design using conditions as
attributes, each time you want to add an additional measurement
category, you have to add a new field. All queries you have based on the
table will then have to be changed manually. How data are viewed or
exported should not affect table design. Tables are for *storing* data.
The design I suggested is normalized for a relational model. Use queries
to *display* the data in a wide-flat arrangement...look up "Crosstab
queries" in the Help file. You can export query recordsets as .xls or
..dbf files (File > Save As/Export > To an External File or Database),
which I believe SAS can handle easily.

L.
Hi LeAnne,

Thanks so much. I will get back to the group on the suggestions. However,
one point that could be made now. The condition data like age, sex, molt,
disease, etc, are really attributes. The sampling data are also really
attributes. The morphometric data are really the values. I am trying to
think of how the measurement table as you identify could be used for
statistical analysis. I frequently bring tables into the statistical program
S-plus, where I have the fields (columns) with BirdId, Species, Yr, and many
of the condition and morphometric fields. Even if the 1:1 table
relationships that I proposed have lots of fields with missing values, it is
easy either in access or in S-plus to just take the subset of records with
non-missing values.

Might it would be better to have the condition fields and the sampling
fields in the event table? I have to think of how I could get tables into
S-plus in which the rows are observations and columns are values.

Thanks,

LAF

:

Hi LAF,

You're very close. However, I suggest:

tblSites 1:m to tblEvents (since each site (subsite?) can be visited on
one OR MORE dates) on SiteID

tblEvents would have EventID as PK. EventID would be an autonumber
providing each unique combination of site+date+time (+capture type if
your are using more than one per event)

tblEvents 1:m to a BirdsCaptured table (since each event can capture one
OR MORE birds) on EventID

The BirdsCaptured table would be the junction table between tblEvents
and tblBirds (since a bird could conceivably be captured one OR MORE
times), and contain BOTH EventID and BirdID as key fields. NetNum could
also be included as a field if multiple nets are being deployed per event.

tblBirdsCaptured m:1 to tblBirds on BirdID (again, each individual bird
may be captured one OR MORE times)

tblBirds m:1 to tblSpecies (assuming that one OR MORE individuals in a
species have been banded) on SpeciesID

tblSpecies is the lookup table with taxon-specific information.

Now, if I read your explanation right, there are multiple parameters
measured for each bird captured (morphological characteristics, age,
genetics, &c.). Not all birds may be measured for all parameters. And
each time a bird is recaptured, some (but not all) parameters are
measured again. Correct? If so, then I would recommend that measurements
be combined into a single Measurements table, joined m:1 to
tblBirdsCaptured on BirdID + EventID. tblMeasurements would include the
fields EventID, BirdID, MeasurementID, and MeasurementValue.
tblMeasurements would be joined m:1 to a MeasurementLookup table on
MeasurementID with MeasurementName, MeasurementUnit, MeasurementCategory
(morpho, genetic, etc.), Description, MethodUsed, etc. included as
fields. Measurements are data, not attributes; by expanding measurements
down instead of across, you eliminate empty fields when a particular
parameter is not recorded for a particular bird captured in a particular
event.

Hope this helps,

LeAnne

LAF wrote:


Thanks to Tim, Leanne, and Allen for advice.

The original question about junction tables has become transmorgrified into
general database design. Here I will attempt to integrate and indicate the
aspects of design that I think are relevant. I will list the tables and some
fields with primary key as (PK) and foreign key as (FK):

event table: event.id (PK), bird.id (FK), subsite.id (FK), date, time,
capture.type, observer

bird table: bird.id (PK), species.id (FK), leftleg, rightleg

site table: subsite.id (PK), site, elevation, forest.structure

species table: species (PK), residency, listing

status table: event.id (PK), age, sex, disease, molt, fat, weight

morpho table: event.id (PK), bill, wing, tail, tarsus

sampling table: event.id (PK), bloodlog (FK), bloodquan, feather, pictures

genetics table: bloodlog (PK), event.id (FK), prepnum, malaria, cytb, chd

The major aspect of design here is that the 3 fields in the many to many
relationships (bird.id, subsite.id, date, time [actually 4 fields]) are
indicated by event.id as an autonumbered field. In this case, event.id
serves to indicate the unique combinations of the 4 fields. This is true
because the 4 fields could serve as a composite primary key. What I think is
important here is that the autonumbered field is meaningless except as a
primary key. I read somewhere that composite primary keys can cause some
problems in some queries.

The second aspect of design is that there are several one to one
relationships linked by event.id. In principle, all of these tables can be
merged into the event table, but not all individual birds have all data
taken. The event.id field in all of these additional tables will come from
the event table.

Many of my queries will involve looping over sorted recordsets, such as a
query that could add fields to the bird table such as date first captured,
date last captured. Or identifying individuals that were mis-sexed at
different events.

Happy New Year,

LAF
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top