As far as the logical model is concerned you are currently modelling the
Samples and Tests entity types with your two tables. The relationship type
between these is many-to-many, and this is modelled by a third table which
references the primary keys of each of the two referenced tables, so it would
have foreign key columns labIDNumber and TestCodeID. If each test is applied
to each sample only once then these tow columns in combination constitute the
composite primary key of this third table. If not then another column, e.g.
TestDateTime would also be part of the primary key.
As well as modelling the relationship type between the two referenced tables
the third table itself models an entity type. In fact relationship types are
just a special kind of entity type. Consequently the third table can have
other columns which represent attributes of the entity type modelled by the
table, e.g. the results of each test on each sample.
The existence of a row in the third table, with the relevant values in the
columns of the primary key, indicates that a particular test is applied to
that sample. The data in other columns are the results of that test on that
sample.
The above is the logical model in its simplest form. It may or may not need
to be more complex and require further tables. That would require a more
detailed analysis of the entity types involved and the relationship types
between them. The important thing is that the model correctly represents the
reality without introducing any redundancy, i.e. each 'fact' is stored once
and once only in the database. The formal process of normalization
eliminates any redundancies by decomposing tables into several tables.
Here's a brief summary of what normalization (to third normal form at least)
involves:
"Normalization is the process of eliminating redundancy from a database, and
involves decomposing a table into several related tables. In a relational
database each table represents an entity type, e.g. Contacts, Companies,
Cities, States etc. and each column in a table represents an attribute type
of the entity type, e.g. ContactID, FirstName and LastName might be attribute
types of Contacts and hence columns of a Contacts table. Its important that
each attribute type must be specific to the entity type, so that each 'fact'
is stored once only. In the jargon its said that the attribute type is
'functionally dependent' solely on the whole of the primary key of a table.
To relate tables a 'referencing' table will have a foreign key column which
makes the link to the 'referenced' table, e.g. a Contacts table might have a
CompanyID column as a foreign key, while a Companies table has a CompanyID
column as its primary key. Consequently no data other than the CompanyID
needs to be stored in a row in the Contacts table for all the company
information for that contact to be known; its available via the relationship
and can be returned in a query joining the two tables on the CompanyID
columns.
Similarly the Companies table might have a CityID column and the Cities
table a StateID column. If its an international database the States (or more
generically Regions) table would have a CountryID referencing the primary key
of a Countries table. So via the relationships, simply by entering (in
reality this would be selected from a list of Companies in a combo box, not
typed in) a CompanyID in the Contacts table the location of the contact's
company is also known. Redundancy, and therefore repetitive data entry is
avoided.
To see how a database is made up of related tables take a look at the sample
Northwind database. Its not completely normalized in fact (deliberately so
for the sake of simplicity) but illustrates the main principles of how tables
representing entity types relate to each other. An example of its lack of
proper normalization can be found in its Customers table. You'll see that
this has City, Region and Country columns so we are told numerous times that
São Paulo is in SP region (as is Resende) and that SP region is in Brazil.
Not only does this require repetitive data entry, but more importantly it
opens up the risk of inconsistent data, e.g. it would be perfectly possible
to put São Paulo in California in one row and California in Ireland! Proper
normalization as I described above would prevent this as the fact that São
Paulo is in SP region would be stored only once in the database as would the
fact that SP region is in Brazil and California is in the USA.
An example of what at first sight might seem to be redundancy, but in fact
is not, can also be found in Northwind. The Products table and the
OrderDetails table both have UnitPrice columns. It might be thought that the
unit price of a product could always be looked up from the Products table, so
its unnecessary in Order Details. However, the unit price of a product will
change over time, but each order needs to retain the price in force at the
time the order was created. Consequently a UnitPrice column is needed in
both tables; that in products holds the current price and is used to get the
value for that in Order Details (code in the ProductID control's AfterUpdate
event procedure in the Order Details Subform does this), which then remains
static when the current price (in products) changes. In each case UnitPrice
is functionally dependent on the key of the table, so there is no redundancy"
When it comes to the interface its perfectly possible to replicate your
Excel worksheet, but only by means of an unbound form with code in its module
to write data to the underlying table when controls in the form are updated
by the user (or to delete rows if a checkbox is unchecked), and to read data
from the table to update the controls in the form with regard to existing
tests on a sample. This would require some experience of writing VBA code
using the DAO or ADO data access technologies. Its not a difficult task if
you have this experience, albeit a somewhat tedious one when first setting up
and debugging the form.
The more usual approach in Access would be to use a bound form based on the
Samples table, with a subform based on the new third table embedded within in
it and linked to it on labIDNumber. The main parent form would be in single
form view, the subform in continuous form view. This is relatively
straightforward to set up and requires little or no code. The difference
from the user's point of view would be that for each test per sample they'd
insert a new row in the subform, selecting the tests in question from a combo
box in each case, rather than checking pre-existing controls for all possible
tests. Users familiar with your current Excel worksheet would therefore need
to rethink their modus operandi to some extent.
The more detailed form which you mention would really be a modification of
the above bound form/subform approach, perhaps with a single form view
subform rather than continuous form view to facilitate the inclusion of more
detailed data.
Ken Sheridan
Stafford, England